An Empirical Performance Evaluation of Apache Kafka

Total Page:16

File Type:pdf, Size:1020Kb

An Empirical Performance Evaluation of Apache Kafka How Fast Can We Insert? An Empirical Performance Evaluation of Apache Kafka Guenter Hesse, Christoph Matthies, Matthias Uflacker Hasso Plattner Institute University of Potsdam Germany fi[email protected] about such study results is a prerequisite for making informed Abstract—Message brokers see widespread adoption in modern decisions about whether a system is suitable for the existing IT landscapes, with Apache Kafka being one of the most use cases. Additionally, it is also crucial for finding or fine- employed platforms. These systems feature well-defined APIs for use and configuration and present flexible solutions for tuning appropriate system configurations. various data storage scenarios. Their ability to scale horizontally The contributions of this research are as follows: enables users to adapt to growing data volumes and changing • We propose a user-centered and extensible monitoring environments. However, one of the main challenges concerning framework, which includes tooling for analyzing any message brokers is the danger of them becoming a bottleneck within an IT architecture. To prevent this, knowledge about the JVM-based system. amount of data a message broker using a specific configuration • We present an analysis that highlights the capabilities of can handle needs to be available. In this paper, we propose a Apache Kafka regarding the maximum achievable rate of monitoring architecture for message brokers and similar Java incoming records per time unit. Virtual Machine-based systems. We present a comprehensive • We enable reproducibility of the presented results by performance analysis of the popular Apache Kafka platform 1 using our approach. As part of the benchmark, we study selected making all needed artifacts available online . data ingestion scenarios with respect to their maximum data The rest of the paper is structured as follows: In Section II ingestion rates. The results show that we can achieve an ingestion we give a brief introduction of Apache Kafka. Section III-A rate of about 420,000 messages/second on the used commodity presents the benchmark setup and the developed data sender hardware and with the developed data sender tool. Index Terms—performance, benchmarking, big data, Apache tool. Subsequently, we describe the results of the ingestion rate Kafka analyses. Section V introduces related work and Section VI elaborates on the lessons learned. The last chapter concludes I. INTRODUCTION the study and outlines areas of future work. In the current business landscape, with an ever-increasing II. APACHE KAFKA growth in data and popularity of cloud-based applications, Apache Kafka is a distributed open-source message broker horizontal scalability is becoming an increasingly common and or messaging system originally developed at LinkedIn in important requirement. Message brokers play a central role in 2010 [7]. The core of this publish-subscribe system is a modern IT systems as they satisfy this requirement and thus, distributed commit log, although it has extended its scope allow for adaptations of the IT landscape to data sources that through extensions. An example is Kafka Streams [8], a client grow both in volume and velocity. Moreover, they can be used library for developing stream processing applications. to decouple disparate data sources from applications using this The high-level architecture of an exemplary Apache Kafka arXiv:2003.06452v3 [cs.PF] 3 Mar 2021 data. Usage scenarios where message brokers are employed are cluster is visualized in Figure 1. A cluster consists of multiple manifold and reach from e.g., machine learning [1] to stream brokers, which are numbered and store data assigned to topics. processing architectures [2], [3], [4] and general-purpose data Data producers send data to a certain topic stored in the cluster. processing [5]. Consumers subscribe to a topic and are forwarded new values In the context of a complex IT architecture, the degree, sent to this topic as soon as they arrive. to which a system aligns with its application scenarios and Topics are divided into partitions. The number of topic parti- the functional and non-functional requirements derived from tions can be configured at the time of topic creation. Partitions it, are key [6]. If non-functional requirements related to of a single topic can be distributed across different brokers of performance are not satisfied, the system might become a a cluster. However, a message order across partitions is not bottleneck. This situation does not directly imply that the guaranteed by Apache Kafka [10], [9]. system itself is inadequate for the observed use case, but might Next to the number of partitions, it is possible to define indicate a suboptimal configuration. Therefore, it is crucial a replication factor for each topic, one being the minimum. to be able to evaluate the capabilities of a system in certain environments and with distinct configurations. The knowledge 1https://github.com/guenter-hesse/KafkaAnalysisTools Copyright ©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. DOI: will follow as soon as the paper is published. Producer A Producer B messages, a producer can require multiple retries, which would increase the difference between the timestamp assigned with a message and the time when it is appended to the log and Apache Kafka Cluster thus, made available for consuming applications [10]. Broker #1 Broker #2 Broker #3 _________________________ _________________________ _________________________ III. BENCHMARK SETUP topic1 / part1 topic1 / part1 This section introduces the monitoring architecture em- topic1 / part2 topic1 / part2 topic2 / part1 topic2 / part1 topic2 / part1 ployed in the ingestion rate study as well as the developed data sender tool. A. Monitoring Architecture Consumer A Consumer B Consumer C The architecture of the monitoring system is shown in Fig- ure 3. We use Grafana [12], an open-source tool for creating Fig. 1. Apache Kafka cluster architecture (based on [9]) and managing dashboards and exporting data, as the interface to the user. The presented benchmarks employ version 5.4.5 This allows preventing data loss in the case of a single broker of its docker image. OS-level virtualization through docker failure. In the context of replication, Apache Kafka defines is used for ease of installation and replicability of results. leaders and followers for each partition. The leader handles all The OS base image used in this image allows a simple reads and writes for the corresponding topic partition, whereas time zone configuration via an environment variable, which is followers copy or replicate the inserted data. In Figure 1, the important for time synchronization among all systems. Later leader partitions are shown in bold type. The first topic, topic1 versions of the image contain a different OS, specifically has two partitions and a replication factor of one, while topic2 Alpine Linux [13], which no longer supports this feature. has only one partition and a replication factor of two [10]. Figure 2 shows the structure of an Apache Kafka topic, Grafana specifically of a topic with two partitions. Each of these Data Visualization (Dashboard) partitions is an ordered and immutable record sequence where new values are appended. A sequential number is assigned to each topic record within a partition, referred to as an offset. Graphite Apache Kafka itself provides the topic consumer offsets for Monitoring Tool w/ Storage Functionality storing the offsets. However, consumers must manage their offset. They can commit their current offset either automati- cally in certain intervals or manually. The latter can be done jmxtrans collectd either synchronously or asynchronously. When polling data, JVM Metrics Collection System Metrics Collection a consumer needs to pass the offset to the cluster. Apache Kafka returns all messages with a greater offset, i.e., all new Apache Kafka messages that have not already been sent to this consumer. As Message Broker / System Under Test the consumer has control over its offset, it can also decide to start from the beginning and to reread messages [10]. Fig. 3. Monitoring architecture in Fundamental Modeling Concepts (FMC) [14] Partition 0 0 1 2 3 4 5 6 7 8 Grafana fetches the data to display from Graphite [15], an open-source monitoring tool. It consists of three components: writes Carbon, Whisper, and Graphite-web. Carbon is a service that Partition 1 0 1 2 3 4 5 retrieves time-series data, which is stored in Whisper, a persis- tence library. Graphite-web includes an interface for designing old new dashboards. However, these dashboards are not as appealing and functionally comprehensive as the corresponding compo- Fig. 2. Apache Kafka topic structure (based on [11]) nents of Grafana, which is why Grafana is employed. For the installation of Graphite, the official docker image in version Furthermore, Apache Kafka can be configured to use the Lo- 1.1.4 is used, again for time zone configuration reasons. gAppendTime feature, which induces Apache Kafka to assign Graphite receives its input from two sources: collectd [16] a timestamp to each message once it is appended to the log. and jmxtrans [17]. The former is a daemon collecting system The existing alternative, which represents the default value, and application performance metrics, that runs on the broker’s is CreateTime. In this setting, the timestamp created by the machines in the described setup. It offers plugins for gathering Apache Kafka producer when creating the message, i.e., before OS-level measurements, such as memory usage, system load, sending it, is stored along with the message. For transmitting and received or transmitted packages over the network. Jmxtrans, the other data source for Graphite, is a tool for TABLE II collecting JVM runtime metrics.
Recommended publications
  • Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka
    White Paper Information Security | Machine Learning October 2020 IT@Intel: Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka Our Apache Kafka data pipeline based on Confluent Platform ingests tens of terabytes per day, providing in-stream processing for faster security threat detection and response Intel IT Authors Executive Summary Ryan Clark Advanced cyber threats continue to increase in frequency and sophistication, Information Security Engineer threatening computing environments and impacting businesses’ ability to grow. Jen Edmondson More than ever, large enterprises must invest in effective information security, Product Owner using technologies that improve detection and response times. At Intel, we Dennis Kwong are transforming from our legacy cybersecurity systems to a modern, scalable Information Security Engineer Cyber Intelligence Platform (CIP) based on Kafka and Splunk. In our 2019 paper, Transforming Intel’s Security Posture with Innovations in Data Intelligence, we Jac Noel discussed the data lake, monitoring, and security capabilities of Splunk. This Security Solutions Architect paper describes the essential role Apache Kafka plays in our CIP and its key Elaine Rainbolt benefits, as shown here: Industry Engagement Manager ECONOMIES OPERATE ON DATA REDUCE TECHNICAL GENERATES OF SCALE IN STREAM DEBT AND CONTEXTUALLY RICH Paul Salessi DOWNSTREAM COSTS DATA Information Security Engineer Intel IT Contributors Victor Colvard Information Security Engineer GLOBAL ALWAYS MODERN KAFKA LEADERSHIP SCALE AND REACH ON ARCHITECTURE WITH THROUGH CONFLUENT Juan Fernandez THRIVING COMMUNITY EXPERTISE Technical Solutions Specialist Frank Ober SSD Principal Engineer Apache Kafka is the foundation of our CIP architecture. We achieve economies of Table of Contents scale as we acquire data once and consume it many times.
    [Show full text]
  • Unravel Data Systems Version 4.5
    UNRAVEL DATA SYSTEMS VERSION 4.5 Component name Component version name License names jQuery 1.8.2 MIT License Apache Tomcat 5.5.23 Apache License 2.0 Tachyon Project POM 0.8.2 Apache License 2.0 Apache Directory LDAP API Model 1.0.0-M20 Apache License 2.0 apache/incubator-heron 0.16.5.1 Apache License 2.0 Maven Plugin API 3.0.4 Apache License 2.0 ApacheDS Authentication Interceptor 2.0.0-M15 Apache License 2.0 Apache Directory LDAP API Extras ACI 1.0.0-M20 Apache License 2.0 Apache HttpComponents Core 4.3.3 Apache License 2.0 Spark Project Tags 2.0.0-preview Apache License 2.0 Curator Testing 3.3.0 Apache License 2.0 Apache HttpComponents Core 4.4.5 Apache License 2.0 Apache Commons Daemon 1.0.15 Apache License 2.0 classworlds 2.4 Apache License 2.0 abego TreeLayout Core 1.0.1 BSD 3-clause "New" or "Revised" License jackson-core 2.8.6 Apache License 2.0 Lucene Join 6.6.1 Apache License 2.0 Apache Commons CLI 1.3-cloudera-pre-r1439998 Apache License 2.0 hive-apache 0.5 Apache License 2.0 scala-parser-combinators 1.0.4 BSD 3-clause "New" or "Revised" License com.springsource.javax.xml.bind 2.1.7 Common Development and Distribution License 1.0 SnakeYAML 1.15 Apache License 2.0 JUnit 4.12 Common Public License 1.0 ApacheDS Protocol Kerberos 2.0.0-M12 Apache License 2.0 Apache Groovy 2.4.6 Apache License 2.0 JGraphT - Core 1.2.0 (GNU Lesser General Public License v2.1 or later AND Eclipse Public License 1.0) chill-java 0.5.0 Apache License 2.0 Apache Commons Logging 1.2 Apache License 2.0 OpenCensus 0.12.3 Apache License 2.0 ApacheDS Protocol
    [Show full text]
  • HDP 3.1.4 Release Notes Date of Publish: 2019-08-26
    Release Notes 3 HDP 3.1.4 Release Notes Date of Publish: 2019-08-26 https://docs.hortonworks.com Release Notes | Contents | ii Contents HDP 3.1.4 Release Notes..........................................................................................4 Component Versions.................................................................................................4 Descriptions of New Features..................................................................................5 Deprecation Notices.................................................................................................. 6 Terminology.......................................................................................................................................................... 6 Removed Components and Product Capabilities.................................................................................................6 Testing Unsupported Features................................................................................ 6 Descriptions of the Latest Technical Preview Features.......................................................................................7 Upgrading to HDP 3.1.4...........................................................................................7 Behavioral Changes.................................................................................................. 7 Apache Patch Information.....................................................................................11 Accumulo...........................................................................................................................................................
    [Show full text]
  • The Geography of Job Tasks
    Working Papers WP 21-27 August 2021 https://doi.org/10.21799/frbp.wp.2021.27 The Geography of Job Tasks Enghin Atalay Federal Reserve Bank of Philadelphia Research Department Sebastian Sotelo University of Michigan-Ann Arbor Daniel Tannenbaum University of Nebraska-Lincoln ISSN: 1962-5361 Disclaimer: This Philadelphia Fed working paper represents preliminary research that is being circulated for discussion purposes. The views expressed in these papers are solely those of the authors and do not necessarily reflect the views of the Federal Reserve Bank of Philadelphia or the Federal Reserve System. Any errors or omissions are the responsibility of the authors. Philadelphia Fed working papers are free to download at: https://philadelphiafed.org/research-and-data/publications/working-papers. The Geography of Job Tasks ∗ Enghin Atalay Sebastian Sotelo Daniel Tannenbaum August 3, 2021 Abstract The returns to skills and the nature of work differ systematically across labor mar- kets of different sizes. Prior research has pointed to worker interactions, technological innovation, and specialization as key sources of urban productivity gains, but has been limited by the available data in its ability to fully characterize work across geographies. We study the sources of geographic inequality and present new facts about the geog- raphy of work using online job ads. We show that the (i) intensity of interactive and analytic tasks, (ii) technological requirements, and (iii) task specialization all increase with city size. The gradient for tasks and technologies exists both across and within occupations. It is also steeper for jobs requiring a college degree and for workers employed in non-tradable industries.
    [Show full text]
  • Apache Flume™
    ™ Apache Flume™ Flume 1.7.0 User Guide Introduction Overview Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating and moving large amounts of log data from many different sources to a centralized data store. The use of Apache Flume is not only restricted to log data aggregation. Since data sources are customizable, Flume can be used to transport massive quantities of event data including but not limited to network traffic data, social-media-generated data, email messages and pretty much any data source possible. Apache Flume is a top level project at the Apache Software Foundation. There are currently two release code lines available, versions 0.9.x and 1.x. Documentation for the 0.9.x track is available at the Flume 0.9.x User Guide. This documentation applies to the 1.4.x track. New and existing users are encouraged to use the 1.x releases so as to leverage the performance improvements and configuration flexibilities available in the latest architecture. System Requirements 1. Java Runtime Environment - Java 1.7 or later 2. Memory - Sufficient memory for configurations used by sources, channels or sinks 3. Disk Space - Sufficient disk space for configurations used by channels or sinks 4. Directory Permissions - Read/Write permissions for directories used by agent Architecture Data flow model A Flume event is defined as a unit of data flow having a byte payload and an optional set of string attributes. A Flume agent is a (JVM) process that hosts the components through which events flow from an external source to the next destination (hop).
    [Show full text]
  • DMI-Technical-Proposal ODOT ESP-2020-11-06
    State of Ohio Department of Transportation ODOT Event Streaming Platform (ESP) Invitation No. 580-21 June 11, 2020 [email protected] This proposal contains data that shall not be disclosed by the Customer and shall not be duplicated, used, or disclosed—in whole or in part—for any reason other than to evaluate this proposal. If, however, a contract is awarded to Digital Management, LLC. as a result of—or in connection with—the submission of this proposal, the Customer shall have the right to duplicate, use, or disclose the data to the extent provided in the resulting contract. This restriction does not limit the Customer’s right to use the information contained in this proposal if it is obtained from another source without restriction. This restriction is in force for all data contained on all pages of this proposal. Cover Sheet Prepared For • State of Ohio, Department of Transportation Office of Contract Sales, Purchasing Services 1st floor 1980 West Broad St. Mail Stop 4110 Columbus, OH 43223 • Attn: Dr. Jack Marchbanks • Email: [email protected] Prepared By • Digital Mobile Innovations, LLC (DMI) 6550 Rock Spring Drive, 7th Floor Bethesda, MD 20817 • Phone: 240-223-4800, Fax: 240-223-4888 • www.DMInc.com DMI Authorized Negotiators • Mr. Thomas Ford, Senior Director of Contracts Email: [email protected] Office: 240-720-0433, Mobile: 202-374-0222 DMI Corporate Information • DUNS: 11-351-2359 • CAGE Code: 3BDL8 • TIN: 80-0882271 REQUEST FOR QUOTES (RFQ) State of Ohio, Department of Transportation Office of Contract Sales, Purchasing Services Jack Marchbanks, Ph.
    [Show full text]
  • Dissertation Zur Erlangung Des Akademischen Grades Des Doktors Der Naturwissenschaften
    Dissertation zur Erlangung des akademischen Grades des Doktors der Naturwissenschaften (Dr. rer. nat) Performance and Reliability Evaluation of Apache Kafka Messaging System eingereicht am Institut für Informatik des Fachbereichs Mathematik und Informatik der Freien Universität Berlin von Han Wu Berlin, 2020 Gutachter: Prof. Dr. Katinka Wolter (der erste Rezensent) Department of Computer Science Freie Universität Berlin, Germany Prof. Dr. Aad van Moorsel (der zweite Rezensent) School of Computing Newcastle University, UK Das Datum der Disputation: 14th. Jan 2021 Selbständigkeitserklärung Ich versichere, dass ich die Doktorarbeit selbständig verfasst, und keine anderen als die angegebenen Quellen und Hilfsmittel benutzt habe. Die Arbeit hat keiner anderen Prüfungsbehörde vorgelegen. Han Wu Berlin, den 4. Februar 2021 Abstract Streaming data is now flowing across various devices and applications around us. This type of data means any unbounded, ever growing, infinite data set which is continuously generated by all kinds of sources. Examples include sensor data transmitted among different Internet of Things (IoT) devices, user activity records collected on websites and payment requests sent from mobile devices. In many application scenarios, streaming data needs to be processed in real-time because its value can be futile over time. A variety of stream processing systems have been developed in the last decade and are evolving to address rising challenges. A typical stream processing system consists of multiple processing nodes in the topology of a DAG (directed acyclic graph). To build real-time streaming data pipelines across those nodes, message middleware technology is widely applied. As a distributed messaging system with high durability and scalability, Apache Kafka has become very popular among modern companies.
    [Show full text]
  • Technology Overview
    Big Data Technology Overview Term Description See Also Big Data - the 5 Vs Everyone Must Volume, velocity and variety. And some expand the definition further to include veracity 3 Vs Know and value as well. 5 Vs of Big Data From Wikipedia, “Agile software development is a group of software development methods based on iterative and incremental development, where requirements and solutions evolve through collaboration between self-organizing, cross-functional teams. Agile The Agile Manifesto It promotes adaptive planning, evolutionary development and delivery, a time-boxed iterative approach, and encourages rapid and flexible response to change. It is a conceptual framework that promotes foreseen tight iterations throughout the development cycle.” A data serialization system. From Wikepedia, Avro Apache Avro “It is a remote procedure call and serialization framework developed within Apache's Hadoop project. It uses JSON for defining data types and protocols, and serializes data in a compact binary format.” BigInsights Enterprise Edition provides a spreadsheet-like data analysis tool to help Big Insights IBM Infosphere Biginsights organizations store, manage, and analyze big data. A scalable multi-master database with no single points of failure. Cassandra Apache Cassandra It provides scalability and high availability without compromising performance. Cloudera Inc. is an American-based software company that provides Apache Hadoop- Cloudera Cloudera based software, support and services, and training to business customers. Wikipedia - Data Science Data science The study of the generalizable extraction of knowledge from data IBM - Data Scientist Coursera Big Data Technology Overview Term Description See Also Distributed system developed at Google for interactively querying large datasets. Dremel Dremel It empowers business analysts and makes it easy for business users to access the data Google Research rather than having to rely on data engineers.
    [Show full text]
  • A Survey of Distributed Message Broker Queues
    A Survey of Distributed Message Broker Queues Vineet John Xia Liu University of Waterloo University of Waterloo [email protected] [email protected] ABSTRACT This paper surveys the message brokers that are in vogue today for distributed communication. Their primary goal is to facilitate the construction of decentralized topolo- gies without single points of failure, enabling fault tol- erance and high availability. These characteristics make them optimal for usage within distributed architectures. However, there are multiple protocols built to achieve this, and it would be beneficial to have a empirical comparison between their features and performance to determine their real-world applicability. Figure 1: Kafka Architecture This paper focuses on two popular protocols (Kafka and AMQP) and explores the divergence in their fea- RQ0 What are the message broker implementations tures as well as their performance under varied testing commonly in use today? workloads. RQ1 What are the common requirements for the im- plementation of message queues? KEYWORDS RQ2 What are the divergent functionalities in the distributed message broker, message queue, kafka, amqp, current message queue offerings? rabbitmq RQ3 How do each of the implementations offer relia- bility, partitioning and fault tolerance? 1 INTRODUCTION 3 KAFKA Kafka was developed at LinkedIn and primarily used Distributed Message Brokers are typically used to decou- for log processing. This worked well for Kafka’s user en- ple separate stages of a software architecture. They per- gagement metrics collection use-case. The fundamental mit communication between these stages asynchronously, features behind Kafka are performance over reliability by using the publish-subscribe paradigm.[1]. These mes- and it offers high throughput, low latency message queu- sage brokers are also finding new applications in the ing.
    [Show full text]
  • The Event Mesh: a Primer Chapter 1: What Are Events? PAGE 3
    The Event Mesh: A Primer Chapter 1: What Are Events? PAGE 3 Chapter 2: The Event-Driven Enterprise PAGE 5 Chapter 3: What Is an Event Mesh? PAGE 8 Chapter 4: Deploying the Event Mesh in the Real World PAGE 10 Chapter 5: Conclusions and More Resources PAGE 11 Copyright © 2020 RTInsights. All rights reserved. All other trademarks are the property of their respective companies. The information contained in this publication has been obtained from sources believed to be reliable. NACG LLC and RTInsights disclaim all warranties as to the accuracy, completeness, or adequacy of such information and shall have no liability for errors, omissions or inadequacies in such information. The information expressed herein is subject to change without notice. Chapter 1: What Are Events? An event is a change, action, or observation in a system that produces a notification. Events are triggered by changes such as master data entity updates like a customer billing address change, or an action like a new order or a trade approval. Other triggers are more basic, like a battery charge level threshold being met or a periodic temperature reading from a sensor. It is common to refer to the notification as the “event.” Rather than use the term trigger, software architects often say “send an event” when a specified condition occurs. Event notifications produced programmatically by the system of record (SOR), can contain a snapshot of the affected entity, a delta In an event-based system, (i.e., changed attributes only), or a reference (e.g., a PO number). a component produces an Snapshot and delta events usually carry business meaning, have a event without necessarily complex structure, and are commonly represented in JSON or XML.
    [Show full text]
  • Building a Scalable Distributed Data Platform Using Lambda Architecture
    Building a scalable distributed data platform using lambda architecture by DHANANJAY MEHTA B.Tech., Graphic Era University, India, 2012 A REPORT submitted in partial fulfillment of the requirements for the degree MASTER OF SCIENCE Department Of Computer Science College Of Engineering KANSAS STATE UNIVERSITY Manhattan, Kansas 2017 Approved by: Major Professor Dr. William H. Hsu Copyright Dhananjay Mehta 2017 Abstract Data is generated all the time over Internet, systems, sensors and mobile devices around us this data is often referred to as 'big data'. Tapping this data is a challenge to organiza- tions because of the nature of data i.e. velocity, volume and variety. What make handling this data a challenge? This is because traditional data platforms have been built around relational database management systems coupled with enterprise data warehouses. Legacy infrastructure is either technically incapable to scale to big data or financially infeasible. Now the question arises, how to build a system to handle the challenges of big data and cater needs of an organization? The answer is Lambda Architecture. Lambda Architecture (LA) is a generic term that is used for a scalable and fault-tolerant data processing architecture that ensure real-time processing with low latency. LA provides a general strategy to knit together all necessary tools for building a data pipeline for real- time processing of big data. LA builds a big data platform as a series of layers that combine batch and real time processing. LA comprise of three layers - Batch Layer, responsible for bulk data processing; Speed Layer, responsible for real-time processing of data streams and Serving Layer, responsible for serving queries from end users.
    [Show full text]
  • Log4j User Guide
    ...................................................................................................................................... Apache Log4j 2 v. 2.14.1 User's Guide ...................................................................................................................................... The Apache Software Foundation 2021-03-06 T a b l e o f C o n t e n t s i Table of Contents ....................................................................................................................................... 1. Table of Contents . i 2. Introduction . 1 3. Architecture . 3 4. Log4j 1.x Migration . 10 5. API . 17 6. Configuration . 20 7. Web Applications and JSPs . 62 8. Plugins . 71 9. Lookups . 75 10. Appenders . 87 11. Layouts . 183 12. Filters . 222 13. Async Loggers . 238 14. Garbage-free Logging . 253 15. JMX . 262 16. Logging Separation . 269 17. Extending Log4j . 271 18. Programmatic Log4j Configuration . 282 19. Custom Log Levels . 290 © 2 0 2 1 , T h e A p a c h e S o f t w a r e F o u n d a t i o n • A L L R I G H T S R E S E R V E D . T a b l e o f C o n t e n t s ii © 2 0 2 1 , T h e A p a c h e S o f t w a r e F o u n d a t i o n • A L L R I G H T S R E S E R V E D . 1 I n t r o d u c t i o n 1 1 Introduction ....................................................................................................................................... 1.1 Welcome to Log4j 2! 1.1.1 Introduction Almost every large application includes its own logging or tracing API. In conformance with this rule, the E.U.
    [Show full text]