Real-time processing of IoT events with historic data using Apache Kafka and Apache Spark with dashing framework

TLDR

IoT connects devices over the Internet, but conventional communication requires both the sender and the receiver to be online continuously, which is often impractical. The study aims to store and process the large volume of data generated by IoT communication in order to extract insights that benefit the organization. The proposed architecture uses Apache Kafka for message ingestion, Apache Spark for real-time and batch processing, and the Dashing framework to present the processed data in a more attractive and readable form.

Abstract

IoT (Internet of Things) is a concept that broadens the idea of connecting multiple devices to each other over the Internet and enabling communication between them. Traditionally, packets are sent over the network only if both the sender and the receiver are online. This forces the sender and the receiver to be online 24×7, which is not achievable in every environment in which the devices communicate. Given the very large volume of data generated by this communication, it is necessary to store and process the data so that insights can be extracted to benefit the organization. The generated data takes two forms: real-time data and existing (historical) data. When data must be processed as it arrives in real time, traditional big data technologies do not perform adequately, because real-time processing requires streaming, which those technologies do not provide. To achieve these objectives, the proposed architecture uses open source technologies: Apache Kafka, for online and offline consumption of messages, and Apache Spark, to stream, process, and give structure to both the real-time and the existing data. A framework known as Dashing is used to present the processed data in a more attractive and readable manner.
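
The decoupling described above can be illustrated with a minimal, self-contained Python sketch. It is not the paper's implementation and uses neither the Kafka nor the Spark API: `TopicLog` and `update_averages` are hypothetical names, with an append-only log standing in for a Kafka topic and an incremental fold standing in for a Spark job. It only shows the idea that a sender can publish while the receiver is offline, and that the same processing step can cover both stored and newly arriving events.

```python
from collections import defaultdict


class TopicLog:
    """Toy append-only log standing in for one Kafka topic partition (assumption)."""

    def __init__(self):
        self._records = []
        # Next offset to read, tracked separately for each consumer group.
        self._offsets = defaultdict(int)

    def produce(self, record):
        # The producer only appends; it never waits for a consumer to be online.
        self._records.append(record)

    def poll(self, group):
        # Return every record the group has not read yet, then advance its offset.
        start = self._offsets[group]
        batch = self._records[start:]
        self._offsets[group] = len(self._records)
        return batch


def update_averages(state, batch):
    """Fold (device_id, reading) pairs into running (sum, count) state per device."""
    for device_id, reading in batch:
        total, count = state.get(device_id, (0.0, 0))
        state[device_id] = (total + reading, count + 1)
    return {d: total / count for d, (total, count) in state.items()}


log = TopicLog()

# Sensors publish while the consumer is offline.
log.produce(("sensor-1", 20.0))
log.produce(("sensor-2", 30.0))

state = {}
# The consumer comes online later and catches up on the stored (historical) records.
historical = update_averages(state, log.poll("dashboard"))

# New events arrive and are processed incrementally on top of the same state.
log.produce(("sensor-1", 22.0))
realtime = update_averages(state, log.poll("dashboard"))

print(historical)
print(realtime)
```

In a real deployment the log would be a Kafka topic read by a Spark streaming job, and the resulting per-device values would be pushed to the dashboard; the sketch keeps everything in memory so that the offset-based catch-up is easy to follow.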
