Published September 30, 2021 | Version v1
Journal article Open

Latency Reduction Techniques in Kafka for Real-Time Data Processing Applications

Authors/Creators

Description

Apache Kafka has become a cornerstone of modern distributed systems, particularly for real-time data processing applications. However, as data volumes and processing demands increase, minimizing latency becomes crucial for maintaining system performance and responsiveness. This research article explores techniques for reducing latency in Kafka-based systems, focusing on producer-side and consumer-side optimizations as well as broker configurations. We examine strategies such as batching, compression, partitioning schemes, and consumer group designs, and assess their impact on overall system latency. Our findings suggest that a combination of these techniques, when properly implemented, can significantly reduce end-to-end latency in Kafka-based real-time data processing applications.
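As a rough illustration of the producer-side and consumer-side settings the description refers to, the sketch below collects a few latency-related options into plain dictionaries. The keys are names taken from the Apache Kafka (Java client) producer and consumer configuration documentation. The function names and every value are assumptions chosen for illustration; they are not settings reported by the article.

```python
# Illustrative sketch only. Keys are documented Kafka client configuration
# names; the helper names and all values are assumptions, not the article's.


def low_latency_producer_config(bootstrap_servers: str) -> dict:
    """Producer settings that favor low latency over throughput (assumed values)."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "linger.ms": 0,             # do not wait to accumulate a larger batch
        "batch.size": 16384,        # upper bound in bytes for a per-partition batch
        "compression.type": "lz4",  # lightweight compression to shrink network transfer
        "acks": "1",                # leader-only acknowledgement: lower latency, weaker durability
    }


def low_latency_consumer_config(bootstrap_servers: str, group_id: str) -> dict:
    """Consumer settings that ask the broker to respond quickly (assumed values)."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,       # consumers sharing this id split the topic's partitions
        "fetch.min.bytes": 1,       # answer a fetch as soon as any data is available
        "fetch.max.wait.ms": 50,    # cap on how long the broker may hold a fetch request
    }


if __name__ == "__main__":
    # "localhost:9092" and "demo-group" are placeholder values.
    print(low_latency_producer_config("localhost:9092"))
    print(low_latency_consumer_config("localhost:9092", "demo-group"))
```

Each option involves a trade-off: a larger `linger.ms` or `fetch.min.bytes` generally improves throughput at the cost of added delay, so suitable values depend on the workload and should be measured rather than copied.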

Files

EJAET-8-9-115-117.pdf

