Cloud Kafka What's CKafka? Product Introduction What's CKafka? Product Introduction Copyright Notice ©2013-2017 Tencent Cloud. All rights reserved. Copyright in this document is exclusively owned by Tencent Cloud. You must not reproduce, modify, copy or distribute in any way, in whole or in part, the contents of this document without Tencent Cloud's the prior written consent. Trademark Notice All trademarks associated with Tencent Cloud and its services are owned by Tencent Cloud Computing (Beijing) Company Limited and its affiliated companies. Trademarks of third parties referred to in this document are owned by their respective proprietors. Service Statement This document is intended to provide users with general information about Tencent Cloud's products and services only and does not form part of Tencent Cloud's terms and conditions. Tencent Cloud's products or services are subject to change. Specific products and services and the standards applicable to them are exclusively provided for in Tencent Cloud's applicable terms and conditions. ©2013-2017 Tencent Cloud. All rights reserved. Page 2 of 17 What's CKafka? Product Introduction Contents Documentation Legal Notice ............................................................................................................................................ 2 What's CKafka? ....................................................................................................................................................................... 4 Product Overview................................................................................................................................................................. 4 Technical Principle ............................................................................................................................................................... 8 Product Comparison ......................................................................................................................................................... 11 SLA ....................................................................................................................................................................................... 14 ©2013-2017 Tencent Cloud. All rights reserved. Page 3 of 17 What's CKafka? Product Introduction What's CKafka? Product Overview Product Overview Cloud Kafka is a high-throughput and highly scalable message queue service provided by Tencent based on the self-developed CMQ engine. Compatible with Apache kafka 0.9, Cloud Kafka has superior advantages in performance, scalability, business security, and OPS, allowing you to enjoy powerful features at low cost while eliminating the tedious operation OPS work. Application Scenarios 1. Webpage tracking Cloud Kafka processes website activities (PV, search, and other user activities) in real time, and then posts them to topics by type. These info flows can be used for real-time monitoring or offline statistical analysis. Since a large amount of activity information is generated in each user's page view, the website activity tracking requires high throughput. Cloud Kafka can perfectly meet the requirements of high throughput and offline processing. 2. Log aggregation Cloud Kafka provides the features of low-latency processing, easy support for multiple data sources and distributed data processing (consumption). Compared to the centralized log aggregation system, Cloud Kafka can implement stronger persistence guarantees as well as lower end-to-end latency while providing the same performance. The above features make Cloud Kafka an ideal "log collection center". Multiple servers/applications can send the operation logs "asynchronously" to a Cloud Kafka cluster "in batch" without being stored locally or in a DB. Cloud Kafka can submit/compress messages in batch, and the producers can hardly perceive the performance overhead. At this time, the consumers can use systematic storage and analysis systems such as Hadoop to collect and analyze pulled logs. 3. Big Data Scenario For some business scenarios related to big data, a large amount of concurrent data needs to ©2013-2017 Tencent Cloud. All rights reserved. Page 4 of 17 What's CKafka? Product Introduction be processed and aggregated. Therefore, high cluster processing performance and high scalability are required. In the implementation, Cloud Kafka is also suitable for handling massive real-time messages and aggregating distributed application data in terms of its data distribution mechanism, allocation of disk storage spaces, processing of message formats, server selection, and data compression, facilitating system OPS. In a specific big data scenario, Cloud Kafka processes offline data and streaming data very well, and aggregates and analyzes data easily. Advantages 1. Decoupling The relationship between producers and consumers is effectively decoupled. Under the premise that the same API constraint is ensured, the processing between producers and consumers is allowed to be independently expanded or modified. 2. Scalability Because the message processing is decoupled, it only needs to be horizontally scaled to effectively increase the enqueue efficiency and processing efficiency of the message, which is very flexible. 3. Peak Load Shifting Message queue can withstand the sudden access pressure, without completely crashing due to the sudden overloaded requests, which effectively boosts the system robustness. 4. Resiliency When part of the system components fail, the overall system is not affected, which increases the system's fault tolerance. The process failure of a certain message is processed timely, and the messages in the queue can still be processed after the system is restored. 5. Sequential Read/Write Cloud Kafka can guarantee the orderliness of messages within a Partition, which is consistent with most message queues. Besides, Cloud Kafka ensures that data is processed in order, ©2013-2017 Tencent Cloud. All rights reserved. Page 5 of 17 What's CKafka? Product Introduction greatly improving the disk efficiency. 6. Asynchronous Communication In the scenario where the business does not need to process messages immediately, Cloud Kafka provides the asynchronous message processing mechanism. When the traffic is large, messages are put into the queue only, and they will be processed after the traffic is reduced, which relieves the system pressure. Glossary No. Name Explanation 1 Broker The server in the Cloud Kafka cluster 2 Topic The message type, Cloud Kafka is message-oriented 3 Partition A concept in physical partition, where one Topic can contain one or more partitions, and Cloud Kafka uses partition as an allocation unit 4 Replica The copy of partition, used for guaranteeing the high availability of partition 5 Offset The unique serial number of a message in partition 6 Producer The producer, responsible for publishing messages 7 Consumer The consumer, consuming messages from the cluster 8 Consumer group The group of consumers, each consumer must belong to one consumer group. Each message can be consumed by multiple ©2013-2017 Tencent Cloud. All rights reserved. Page 6 of 17 What's CKafka? Product Introduction No. Name Explanation consumer groups, but can only be consumed by one consumer in this group 9 Zookeeper Used to store meta data of cluster, conduct leader election, fault tolerance, etc. ©2013-2017 Tencent Cloud. All rights reserved. Page 7 of 17 What's CKafka? Product Introduction Technical Principle How It Works The architecture of Cloud Kafka is as follows: A typical Cloud Kafka cluster is shown above. Producer may be the messages generated by web activities, service logs, or other information. Producers post the messages to Cloud Kafka's Broker cluster in push mode, and consumers consume the messages from Broker in pull mode. Consumer are divided into a number of Consumer Groups, In addition, the cluster manages the clustering configuration through Zookeeper, and conducts leader election, fault tolerance and so on. Implementation of High Throughput A huge amount of network data in Cloud Kafka are permanently sent to disks and disk files over the network. The performance of this process directly affects Kafka's overall throughput, especially from the following aspects: 1. Efficient Use of Disks 2. Sequential reading/writing data on disks improves the disk utilization. Write message ©2013-2017 Tencent Cloud. All rights reserved. Page 8 of 17 What's CKafka? Product Introduction The message is written to the page cache and flushed by the asynchronous thread. Read message The message is directly transferred from the page cache to the socket and then sent out. When the corresponding data is not found in the page cache, the disk IO is produced at this time, and the messages are loaded from the disk to the page cache. Then, they are sent directly from the socket. 3. Zero Copy Mechanism of Broker Use the sendfile system call to send data directly from the page cache to the network. 4. Reduced Network Overhead 5. Data compression reduces the network load 6. Batch Processing Mechanism: Producer writes data to Broker in batch, and Consumer pulls data from Broker in batch Data Persistence The data persistence of Cloud Kafka is mainly achieved through the following principles: 1. Storage Distribution of Partitions in Topic In the file storage of Cloud Kafka, a topic has multiple different partitions, with each physically corresponding to a folder. Users store the messages and index files in these partitions. For example,
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages17 Page
-
File Size-