Fault Tolerance and Error Handling Techniques in Apache Kafka
Article No.: 18, Pages 1 - 5
Abstract
Real-time streaming frameworks make data available to users in near real time. To achieve high throughput and low latency and to avoid losing events at runtime, each framework has its own message-handling mechanisms. Even so, a very high volume of data at the source, or environment- and infrastructure-related issues, can cause critical data to be lost in the process. This article evaluates methods for handling data rejections together with fault-tolerant error-handling recommendations for real-time data streaming. Many frameworks are available for streaming events in real time; this article uses Apache Kafka to examine the most frequent situations in which errors occur, and evaluates the key performance factors that determine how well the framework recovers from downtime. Apache Kafka [15] is a distributed real-time data streaming platform that was developed at LinkedIn and is now used by many applications across the industry. It follows a producer-consumer model and, with correct configuration and setup, can achieve high throughput with low latency. The article explains the configuration parameters the framework provides for designing and developing optimized stream processing applications, and the options available for handling errors during streaming so that they do not negatively affect the application.
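The abstract does not list the specific settings or error-handling options the article covers, so the following is only a minimal sketch of the kind of configuration and rejection handling it describes. The configuration keys are standard Kafka producer and consumer setting names, but the values are illustrative assumptions. The retry-then-dead-letter routine, the function name `process_with_dead_letter`, and the topic name `events.dlq` are hypothetical and are not taken from the article; the routine is written in plain Python with no Kafka client so that it runs on its own.

```python
from typing import Callable, Iterable, List, Tuple

# Standard Kafka producer setting names. The values are illustrative
# assumptions, not recommendations taken from the article.
RELIABLE_PRODUCER_CONFIG = {
    "acks": "all",                  # wait for all in-sync replicas to acknowledge
    "enable.idempotence": True,     # avoid duplicate writes when the producer retries
    "retries": 5,                   # resend on transient failures
    "retry.backoff.ms": 200,        # pause between resend attempts
    "delivery.timeout.ms": 120000,  # upper bound on a send including its retries
    "max.in.flight.requests.per.connection": 5,
}

# Consumer side: commit offsets manually, only after a record has been handled.
RELIABLE_CONSUMER_CONFIG = {
    "enable.auto.commit": False,
    "auto.offset.reset": "earliest",
}


def process_with_dead_letter(
    records: Iterable[bytes],
    handler: Callable[[bytes], None],
    max_attempts: int = 3,
) -> Tuple[int, List[Tuple[bytes, str]]]:
    """Apply handler to each record, trying each one up to max_attempts times.

    Records that still fail are collected with their last error instead of
    being dropped, so they can later be republished to a dead-letter topic
    (hypothetical name: "events.dlq") for inspection and replay.
    """
    processed = 0
    dead_letters: List[Tuple[bytes, str]] = []
    for record in records:
        last_error = ""
        for _attempt in range(max_attempts):
            try:
                handler(record)
                processed += 1
                break
            except Exception as exc:  # broad on purpose: no record may be lost
                last_error = repr(exc)
        else:
            # The loop ended without a break, so every attempt failed.
            dead_letters.append((record, last_error))
    return processed, dead_letters
```

In a real deployment the two dictionaries would be passed to a Kafka client and the dead-letter list would be written to a separate topic; which client library and topic layout to use is left open here.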
References
[1]
Wu, Han and Shang, Zhihao and Peng, Guang and Wolter, Katinka, "A Reactive Batching Strategy of Apache Kafka for Reliable Stream Processing in Real-time," 2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE), 2020, pp. 207-217.
[2]
Hiraman, Bhole Rahul and Viresh M., Chapte and Abhijeet C., Karve, "A Study of Apache Kafka in Big Data Stream Processing," 2018 International Conference on Information, Communication, Engineering and Technology (ICICET), 2018, pp. 1-3.
[3]
Van-Dai Ta and Chuan-Ming Liu and Nkabinde, Goodwill Wandile, "Big data stream computing in healthcare real-time analytics," 2016 IEEE International Conference on Cloud Computing and Big Data Analysis (ICCCBDA), 2016, pp. 37-42.
[4]
Bang, Jiwon and Son, Siwoon and Kim, Hajin and Moon, Yang-Sae and Choi, Mi-Jung, "Design and implementation of a load shedding engine for solving starvation problems in Apache Kafka," NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium, 2018, pp. 1-4.
[5]
Shree, Rishika and Choudhury, Tanupriya and Gupta, Subhash Chand and Kumar, Praveen, "KAFKA: The modern platform for data management and analysis in big data domain," 2017 2nd International Conference on Telecommunication and Networks (TEL-NET), 2017, pp. 1-5.
[6]
Wu, Han and Shang, Zhihao and Wolter, Katinka, "Learning to Reliably Deliver Streaming Data with Apache Kafka," 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2020, pp. 564-571.
[7]
Wu, Han and Shang, Zhihao and Wolter, Katinka, "Performance Prediction for the Apache Kafka Messaging System," 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2019, pp. 154-161.
[8]
Alaasam, Ameer B. A. and Radchenko, Gleb and Tchernykh, Andrey, "Stateful Stream Processing for Digital Twins: Microservice-Based Kafka Stream DSL," 2019 International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON), 2019, pp. 0804-0809.
[9]
Wu, Han and Shang, Zhihao and Wolter, Katinka, "TRAK: A Testing Tool for Studying the Reliability of Data Delivery in Apache Kafka," 2019 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), 2019, pp. 394-397.
[10]
Ed-daoudy, Abderrahmane and Maalmi, Khalil, "Application of Machine Learning Model on Streaming Health Data Event in Real-Time to Predict Health Status Using Spark," 2018 International Symposium on Advanced Electrical and Communication Technologies (ISAECT), 2018, pp. 1-4.
[11]
Aung, Thandar and Min, Hla Yin and Maw, Aung Htein, "Coordinate Checkpoint Mechanism on Real-Time Messaging System in Kafka Pipeline Architecture," 2019 International Conference on Advanced Information Technologies (ICAIT), 2019, pp. 37-42.
[12]
Navaz, Alramzana Nujum and Harous, Saad and Serhani, Mohamed Adel and Taleb, Ikbal, "Real-Time Data Streaming Algorithms and Processing Technologies: A Survey," 2019 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE), 2019, pp. 246-250.
[13]
van Dongen, Giselle and Van den Poel, Dirk, "Evaluation of Stream Processing Frameworks," IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 8, pp. 1845-1858, 1 Aug. 2020.
[14]
How to choose the number of topics/partitions in a Kafka cluster? https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster/, August 2023
[15]
Apache Kafka, https://kafka.apache.org/, Sep. 2023
[16]
van Dongen, Giselle and Poel, Dirk Van Den, "A Performance Analysis of Fault Recovery in Stream Processing Frameworks," IEEE Access, vol. 9, pp. 93745-93763, 2021.
[17]
Pelle, István and Szőke, Bence and Fayad, Abdulhalim and Cinkler, Tibor and Toka, László, "A Comprehensive Performance Analysis of Stream Processing with Kafka in Cloud Native Deployments for IoT Use-cases," NOMS 2023-2023 IEEE/IFIP Network Operations and Management Symposium, Miami, FL, USA, 2023, pp. 1-6.
[18]
Raptis, Theofanis P. and Passarella, Andrea, "On Efficiently Partitioning a Topic in Apache Kafka," 2022 International Conference on Computer, Information and Telecommunication Systems (CITS), Piraeus, Greece, 2022, pp. 1-8.
[19]
Vyas, Shubham and Tyagi, Rajesh Kumar and Jain, Charu and Sahu, Shashank, "Performance Evaluation of Apache Kafka – A Modern Platform for Real Time Data Streaming," 2022 2nd International Conference on Innovative Practices in Technology and Management (ICIPTM), Gautam Buddha Nagar, India, 2022, pp. 465-470.
[20]
Raj, Pethuru and Vanga, Skylab and Chaudhary, Akshita, "Setting Up Apache Kafka Clusters in a Cloud Environment and Secure Monitoring," in Cloud-native Computing: How to Design, Develop, and Secure Microservices and Event-Driven Applications, IEEE, 2023, pp. 299-315.
Information
Published In
November 2023
1215 pages
ISBN:9798400709418
DOI:10.1145/3647444
Copyright © 2023 ACM.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
Published: 13 May 2024
Qualifiers
- Research-article
- Research
- Refereed limited
Conference
ICIMMI 2023
ICIMMI 2023: International Conference on Information Management & Machine Intelligence
November 23 - 25, 2023
Jaipur, India