Apache Pulsar and its enterprise use cases

Yahoo Japan Corporation Nozomi Kurihara

July 18th, 2018 Who am I?

Nozomi Kurihara • Software engineer at Yahoo! JAPAN (April 2012 ~) • Working on internal messaging platform using Apache Pulsar • Committer of Apache Pulsar

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 2 Agenda

1. What is Apache Pulsar?

2. Why is Apache Pulsar useful?

3. How does Yahoo! JAPAN uses Apache Pulsar?

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 3 What is Apache Pulsar?

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 4 Agenda

1. What is Apache Pulsar?

› History & Users

› Pub-Sub messaging

› Architecture

› Client libraries

› Topic

› Subscription

› Sample codes 2. Why is Apache Pulsar useful? 3. How does Yahoo! JAPAN uses Apache Pulsar?

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 5 Apache Pulsar Flexible pub-sub system backed by durable log storage ▪ History: ▪ Competitors: › 2014 Development started at Yahoo! Inc. › › 2015 Available in production in Yahoo! Inc. › RabbitMQ › Sep. 2016 Open-sourced ( 2.0) › Apache ActiveMQ › June 2017 Moved to Project › Apache RocketMQ › June 2018 Major version update: 2.0.1 etc. ▪ Users: › Oath Inc. (Yahoo! Inc.) › Comcast › The Weather Channel › Mercado Libre › Streamlio › Yahoo! JAPAN

etc. Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 6 Pub-Sub messaging

Message transmission from one system to another via Topic ▪ Producers publish messages to Topics ▪ Consumers receive only messages from Topics to which they subscribe ▪ Decoupled (no need to know each other) → asynchronous, scalable, resilient Subscribe Consumer 1 Publish

Producer Topic Consumer 2

message (log, notification, etc.) Consumer 3 Pub-Sub system

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 7 Architecture

Producer Consumer ■3 components:

‣ Broker

‣ Bookie Broker 1 Broker 2 Broker 3 ‣ ZooKeeper

Configuration Store Local ZK (Global ZK)

Bookie Bookie Bookie 1 2 3

Pulsar Cluster

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 8 Architecture - Broker

Producer Consumer ■Broker

‣ Serving node for clientsʼ requests

‣ No data locality (stateless) Broker 1 Broker 2 Broker 3

Configuration Store Local ZK (Global ZK)

Bookie Bookie Bookie 1 2 3

Pulsar Cluster

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 9 Architecture - Bookie

Producer Consumer ■Bookie (Apache BookKeeper)

‣ Storage node for messages

‣ Durable, Scalable, Consistent, Fault- Broker 1 Broker 2 Broker 3 tolerant, Low-latency

Configuration Store Local ZK (Global ZK) Apache BookKeeper: distributed write-ahead log system Bookie Bookie Bookie 1 2 3

Copyright © 2016 - 2018 The Apache Software Foundation, Pulsar Cluster licensed under the Apache License, version 2.0.

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 10 Architecture - ZooKeeper

Producer Consumer ■Apache ZooKeeper

‣ Store metadata and configuration

‣ Local ZK: within local cluster Broker 1 Broker 2 Broker 3 ‣ Configuration Store: across all clusters

Configuration Store Local ZK (Global ZK)

Bookie Bookie Bookie 1 2 3

Pulsar Cluster Copyright © 2016 - 2018 The Apache Software Foundation, licensed under the Apache License, version 2.0.

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 11 Client libraries

Supported client libraries: ▪ Java ▪ C++ ▪ Python ▪ Go (Coming soon) Also can use WebSocket API

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 12 Topic URI

Service etc. Usage detail etc. persistent://tenant/namespace/topicname persistency Usage etc.

persistent non-persistent Examples:

Producer Consumer Producer Consumer persistent://shopping/user-log/buy persistent://auction/notifications/bid Broker Broker persistent://news/sports/articles

persist … Bookie Bookie

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 13 Subscription

• Acknowledgement(ACK): confirmation that message is received • Cursor: which messages are acknowledged • Subscription: cursor and which consumers are connecting Note: Each message is kept unless all subscriptions acknowledge

sub-A Consumer acked by sub-A 2 1

Producer 6 5 4 3 ✓2 ✓1 acked by all → can be deleted

acked by sub-B sub-B Consumer 4 3 2 1

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 14 Sample code(Java) - Producer

// Create client by specifying the broker URI PulsarClient client = PulsarClient.builder() .serviceUrl("pulsar://localhost:6650") .build(); // Create producer by specifying the topic URI Producer producer = client.newProducer() .topic(”persistent://my-tenant/my-ns/my-topic") .create(); // Send a message producer.send("My message".getBytes());

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 15 Sample code(Java) - Consumer // Create client by specifying the broker URI PulsarClient client = PulsarClient.builder() .serviceUrl("pulsar://localhost:6650") .build(); // Create consumer by specifying the topic URI and subscription name Consumer consumer = client.newConsumer() .topic(" persistent://my-tenant/my-ns/my-topic") .subscriptionName("my-subscription") .subscribe(); // Receive a message Message msg = consumer.receive(); System.out.printf("Message received: %s", new String(msg.getData())); // Acknowledge the message so that it can be deleted by the broker consumer.acknowledge(msg);

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 16 Why is Apache Pulsar useful?

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 17 Agenda

1. What is Apache Pulsar? 2. Why is Apache Pulsar useful?

› High performance

› Scalability

› Multi-tenant

› Geo-replication

› Pulsar Functions

› Pulsar on

› Apache Pulsar vs. Apache Kafka 3. How does Yahoo! JAPAN uses Apache Pulsar?

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 18 High performance

High throughput & low latency even if

▪ huge number of topics Throughput and 99pct latency: 1 Topic – 1 Producer 6 ▪ high reliability is required 10 Bytes

] 5 100 Bytes ms 1 KB ex. At Oath Inc. 4 Requirements: 3

• Topics: 2.3 million 2 • Messages:100 billion [msg/day] 1 99pct Latency [ 99pct Latency • Message loss is not acceptable 0 • Order needs to be guaranteed 1,000 10,000 100,000 1,000,000 10,000,000 Throughput [msg/s] Performance(1KB message): • Throughput: 100 thousand [msg/s] Graph is from: https://yahooeng.tumblr.com/post/150078336821/ • Latency: 5 [ms] open-sourcing-pulsar-pub-sub-messaging-at-scale

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 19 Scalability

Just adding Brokers/Bookies increases serving/storage capacity ▪ Add Brokers when more producer/consumerʼs requests need to be served ▪ Add Bookies when more messages need to be stored

Producer Consumer

Broker 1 Broker 2 Broker 3 Broker X Configuration Store (Global ZK) for more serving capacity

Local ZK

Bookie Y Bookie 1 Bookie 2 Bookie 3

Pulsar Cluster for more storage capacity

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 20 Multi-tenancy Multiple services can share one Pulsar system ▪ Just use Pulsar as a “Tenant” → no need to maintain own messaging system ▪ Authentication/Authorization mechanism protects messages from interception

Service A Producer Topic A Consumer

Service B Producer Topic B Consumer

Service C Producer Topic C Consumer

Service D Producer Topic D Authentication/AuthorizationConsumer blocks unauthorized access

Pulsar System Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 21 Geo-replication

Pulsar can replicate messages to another cluster 1. Producers only have to publish messages to Pulsar in the same data center 2. Pulsar asynchronously replicates messages to another cluster 3. Consumers can receive messages from the same data center

Pulsar Cluster A Consumer Geo-replication Pulsar Cluster B Consumer

Producer Topic Consumer Topic Consumer

Consumer Consumer

Data center A Data center B

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 22 Pulsar Functions

• Lightweight compute framework (e.g., AWS lambda, Google Cloud Functions) • No need to launch extra systems (e.g., , , ) • All you have to do is to implement your logic and deploy it to Pulsar cluster • Java and Python are available and also other languages will be supported in the future

Pulsar Function Worker

Source Consumer F(x) Producer Sink

Hello Hello!

def process(input): return input + ’!’

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 23 Pulsar on Kubernetes • Pulsar can be easily deployed in Kubernetes clusters (e.g., AWS, GKE) • See https://pulsar.incubator.apache.org/docs/latest/deployment/Kubernetes/

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 24 Apache Pulsar vs. Apache Kafka Apache Pulsar Apache Kafka Architecture image: • # of partition=1 • # of storage=3 Producer Consumer Producer Consumer • # of replication=2 Broker 1 Broker 2 Broker 3 Broker 1 Broker 2 Broker 3 (replica) (leader)

seg-1 seg-2 seg-1 seg-3 seg-2 seg-3 Bookie1 Bookie2 Bookie3 partition partition

Replication unit Segment (more granular than Partition) Partition

Data distribution Distributed and balanced across all Bookies Kept in only leader and replica Brokers

Max capacity Not limited by any single node Limited by the smallest brokerʼs capacity

Data Rebalancing Not required Required when scale Geo-replication Built-in Need to start an extra process: MirrorMaker

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 25 OpenMessaging Benchmark http://openmessaging.cloud/docs/benchmarks/ • A suite of tools that can easily compare various messaging systems https://info.streaml.io/benchmarking-streaming-messaging-platforms • Pulsar showed better performance than Kafka

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 26 How does Yahoo! JAPAN use Apache Pulsar?

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 27 Agenda

1. What is Apache Pulsar? 2. Why is Apache Pulsar useful? 3. How does Yahoo! JAPAN uses Apache Pulsar?

› About Yahoo! JAPAN

› System architecture

› Self-service tool

› Case 1 - Notification

› Case 2 - Job queuing

› Case 3 - Log collection

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 28 Yahoo! JAPAN

https://www.yahoo.co.jp/

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 29 Yahoo! JAPAN – 3 numbers

100+ 150,000+ 71,300,000,000

services servers PV/month (bare-metal) (average in 2017)

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. image: aflo 30 Why Yahoo! JAPAN chose Pulsar?

▪ Large number of users → High performance & scalability

▪ Large number of services → Multi-tenancy

▪ Sensitive/mission-critical messages → Durability

▪ Multiple data centers → Geo-replication

Pulsar meets all these requirements!

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 31 System architecture in Yahoo! JAPAN

Service A (Node.js) WebSocket WebSocket Proxy Proxy Service C (C++)

Service B (Java) Geo-replication Broker Broker

Prometheus + Grafana

Collect metrics Bookie ZK Bookie ZK + Visualize

West Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. East 32 Monitoring by Prometheus (1/2)

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 33 Monitoring by Prometheus (2/2)

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 34 Self-service tool

Web UI to manage tenants, namespaces and topics (Yahoo! JAPAN internal)

UI

Create tenant

See topic stats Create namespace

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 35 Case 1 – Notification of contents update

▪ Various contents files are pushed from partner companies to Yahoo! JAPAN ▪ Notification is sent to topic when contents are updated ▪ Once services receive a notification, they then fetch contents from file server ②receive notification Service A weather, map, news etc. Consumer Pulsar FTP server Service B ftpd Producer Topic Consumer ①send notification

Partner Service C Companies ③fetch content files Consumer

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 36 Case 2 – Job queuing in mail service

▪ Indexing of mail can be heavy → letʼs execute it asynchronously ▪ Producers register jobs to Pulsar ▪ Consumers take jobs from Pulsar at their own pace

Pulsar Mail BE server

Mail BE server Consumer Take and process a job

Producer Topic Handler for indexing request Register a job Producer

Re-register if it fails

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 37 Case 3 – Log collection (under development)

▪ Collect logs from all platforms: PaaS, CaaS, … ▪ Filtering / preprocessing by Pulsar Functions ▪ Finally send to other DB/platforms: HBase, Prometheus, twillio, …

app1 app2 app3 … PaaS All logs container1 container2 … Prometheus CaaS …

Pulsar

Functions …

Filtering/Preprocessing Pulsar

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 38 Conclusion

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 39 Conclusion

▪ Apache Pulsar

› Fast, Durable, Scalable pub-sub messaging › Has useful built-in features (Geo-replication, Pulsar Functions, etc.) ▪ Feature releases (2.1.0)

› Go Client

› Pulsar IO connector framework ▪ Documents

› https://pulsar.incubator.apache.org/ ▪ Contact

› https://apache-pulsar.slack.com/

[email protected] Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 40 References

• “Open-sourcing Pulsar, Pub-sub Messaging at Scale” https://yahooeng.tumblr.com/post/150078336821/open-sourcing-pulsar-pub- sub-messaging-at-scale • “Linked In Meetup – Apache Pulsar” https://www.slideshare.net/KarthikRamasamy3/linked-in-stream-processing- meetup-apache-pulsar?qid=f7fade7e-632c-47fe-aa68- aea84a622866&v=&b=&from_search=2 • “Benchmarking Enterprise Streaming Data and Message Queuing Platforms” https://info.streaml.io/benchmarking-streaming-messaging-platforms • “Comparing Pulsar and Kafka: how a segment-based architecture delivers better performance, scalability, and resilience” https://streaml.io/blog/pulsar-segment-based-architecture/

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 41

Apache BookKeeper

Apache BookKeeper: Open-source distributed write-ahead log system ▪ Durable, Scalable, Consistent, Fault-tolerant, Low-latency

Copyright © 2016 - 2018 The Apache Software Foundation, licensed under the Apache License, version 2.0.

the number to replicate is configurable (here 2) Write Bookie

A A Read

Read-ahead B B Write cache cache C C

Journal Persistent storage D D (SSD) (HDD)

Bookie1 Bookie2 Bookie3

Data is evenly distributed and balanced across bookies I/O paths for writes/reads are separated

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 43 Existing messaging systems

Apache Kafka RabbitMQ Apache ActiveMQ

Developed since 2011 2007 2004 LinkedIn Rabbit Technologies LogicBlaze

Server-side Scala Erlang Java languages

Client-side Java Erlang Java languages 3rd party libraries Java Multiple languages .NET/C# over various 3rd party libraries protocols Protocol Original AMQP AMQP (binary protocol over STOMP STOMP TCP) MQTT MQTT JMS JMS(1.1) OpenWire

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 44 Type 1 - Exclusive

▪ Only 1 Consumer is allowed to connect ▪ Default subscription type

Topic 5 4 3 2 1 sub-A Consumer A 5 4 3 2 1 (Exclusive) Producer 5 4 3 2 1 sub-B (Exclusive) Consumer B

Consumer C

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 45 Type 2 - Shared

▪ Multiple Consumers are allowed to connect ▪ Messages are distributed to each consumer in round-robin manner → Useful for load balancing 1 Consumer A 4 Topic

5 4 3 2 1 5 2 sub Producer (Shared) Consumer B

Consumer C 3

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 46 Type 3 - Failover

▪ Multiple Consumers are allowed to connect ▪ Only one of them receives messages ▪ If the primary consumer is down, another consumer starts to receive messages 1 Consumer A → Useful for active / standby 2 Topic Down after receiving 1 and 2 5 4 3 2 1 5 4 3 sub Producer (Failover) Consumer B

Consumer C

Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 47 Multi-tenancy ▪ Multiple services can share one Pulsar system ▪ Each tenant can manage their namespaces/topics by themselves (e.g., ACL)

persistent://tenant/namespace/topicname Create tenant topic1 ns1 topic2 Super user tenant-A (Pulsar maintainer) topic3 ns2 ACL ACL topic4 ACL Register as admin Create and configure namespaces service B (e.g., ACL) Tenant admin service A (Pulsar user) Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. 48