Apache Storm Tutorial

About the Tutorial

Storm was originally created by Nathan Marz and his team at BackType, a social analytics company. Twitter later acquired BackType and open-sourced Storm. In a short time, Apache Storm became a standard for distributed real-time processing: it lets you process large volumes of streaming data, much as Hadoop does for batch data. Apache Storm is written in Java and Clojure, and it continues to be a leader in real-time analytics.

This tutorial explores the principles of Apache Storm, distributed messaging, installation, creating Storm topologies and deploying them to a Storm cluster, and the workflow of Trident. It then covers real-time applications and concludes with some useful examples.

Audience

This tutorial has been prepared for professionals aspiring to make a career in Big Data Analytics using the Apache Storm framework. It will give you enough understanding to create and deploy a Storm cluster in a distributed environment.

Prerequisites

Before proceeding with this tutorial, you must have a good understanding of Core Java and any of the Linux flavors.

Copyright & Disclaimer

© Copyright 2014 by Tutorials Point (I) Pvt. Ltd.

All the content and graphics published in this e-book are the property of Tutorials Point (I) Pvt. Ltd. The user of this e-book is prohibited from reusing, retaining, copying, distributing, or republishing any contents or any part of the contents of this e-book in any manner without the written consent of the publisher.

We strive to update the contents of our website and tutorials as promptly and as precisely as possible; however, the contents may contain inaccuracies or errors. Tutorials Point (I) Pvt. Ltd. provides no guarantee regarding the accuracy, timeliness, or completeness of our website or its contents, including this tutorial.
If you discover any errors on our website or in this tutorial, please notify us at [email protected].

Table of Contents

About the Tutorial
Audience
Prerequisites
Copyright & Disclaimer
Table of Contents

1. APACHE STORM – INTRODUCTION
   What is Apache Storm?
   Apache Storm vs Hadoop
   Use-Cases of Apache Storm
   Apache Storm – Benefits

2. APACHE STORM – CORE CONCEPTS
   Topology
   Tasks
   Workers
   Stream Grouping

3. STORM – CLUSTER ARCHITECTURE

4. APACHE STORM – WORKFLOW

5. STORM – DISTRIBUTED MESSAGING SYSTEM
   What is Distributed Messaging System?
   Thrift Protocol

6. APACHE STORM – INSTALLATION
   Step 1: Verifying Java Installation
   Step 2: ZooKeeper Framework Installation
   Step 3: Apache Storm Framework Installation

7. APACHE STORM – WORKING EXAMPLE
   Scenario – Mobile Call Log Analyzer
   Spout Creation
   Bolt Creation
   Call log Creator Bolt
   Call log Counter Bolt
   Creating Topology
   Local Cluster
   Building and Running the Application
   Non-JVM languages

8. APACHE STORM – TRIDENT
   Trident Topology
   Trident Tuples
   Trident Spout
   Trident Operations
   State Maintenance
   Distributed RPC
   When to Use Trident?
   Working Example of Trident
   Building and Running the Application

9. APACHE STORM IN TWITTER
   Twitter
   Hashtag Reader Bolt
   Hashtag Counter Bolt
   Submitting a Topology
   Building and Running the Application

