In-Memory Performance, Durability of Disk
© 2018 GridGain Systems, Inc. Apache Ignite and Apache Spark
Where Fast Data Meets the IoT
Akmal Chaudhri, GridGain Systems
Agenda
• IoT Demands on Software
• IoT Software Stack
• Device OS/RTOS
• Data Collection and Enrichment
• NewSQL Database
• Application APIs
• Demo
IoT Demands on Software
Real-time Processing
SQL, Geo-Spatial
Analytics (BI, ML)
High-Availability
Simple Scalability
IoT Software Stack
Application APIs
NewSQL Database
Data Collection and Enrichment
Device OS/Real-Time OS
Apache IoT Software Stack
Application APIs
NewSQL Database
Data Collection and Enrichment
Device OS/Real-Time OS
Apache Mynewt
• Open Source RTOS
• Cortex M, MIPS
• Bluetooth, Wi-Fi, TCP/IP
• Secured Bootloader
• Remote Firmware Upgrade
Data Collection and Enrichment
[Diagram: data flows into an Ignite Cluster; its nodes are labeled DURABLE MEMORY.]
Apache Ignite: Database, Caching and Processing Platform
Industries: Financial Services, Telco, Travel & Logistics, E-Commerce, Pharma & Healthcare, IoT
APIs: SQL, Key/Value, Transactions, Compute, Services, Streaming, ML
Memory-Centric Storage
• Ignite Native Persistence (Flash, SSD, Intel 3D XPoint)
• Third-Party Persistence (RDBMS, HDFS, NoSQL)
Ignite and Spark Integration
[Diagram: a Spark Application with three Spark Workers running six Spark Jobs, on top of an In-Memory Shared RDD or DataFrame layer backed by three GridGain Nodes. Also shown: Yarn, Mesos, Docker, HDFS.]
• Share state and data among Spark jobs
• Boost DataFrame and SQL performance
• No data movement
• SQL on top of RDDs
• In-place query execution
SQL Queries Execution Flow
[Diagram: two Ignite Nodes. One holds Canada (Toronto, Montreal, Ottawa, Calgary); the other holds India (Mumbai, New Delhi).]
1. Initial query
2. Query execution over local data
3. Reduce multiple results into one
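The three steps of the query flow can be sketched in plain Python. This is only an illustration of the map-and-reduce pattern under stated assumptions, not Ignite's API: the function names (`local_count`, `reduce_counts`, `run_query`), the node names, and the query itself (count cities per country) are hypothetical, and the rows reuse the cities from the diagram.

```python
from collections import Counter


def local_count(rows):
    """Step 2: aggregate only the rows stored on one node."""
    return Counter(country for _city, country in rows)


def reduce_counts(partials):
    """Step 3: merge the per-node partial results into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


def run_query(nodes):
    """Step 1: the initial query is sent to every node that holds data."""
    partials = [local_count(rows) for rows in nodes.values()]
    return reduce_counts(partials)


# Hypothetical data layout: each node stores the rows of one country.
nodes = {
    "node_1": [("Toronto", "Canada"), ("Montreal", "Canada"),
               ("Ottawa", "Canada"), ("Calgary", "Canada")],
    "node_2": [("Mumbai", "India"), ("New Delhi", "India")],
}

result = run_query(nodes)
print(result)
```

Each node returns only a small partial aggregate, so the reducer merges two counters rather than six rows.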
Comparing Ignite and Spark
Ignite
• Distributed memory-centric database
• Fully fledged compute platform: SQL, transactions, key-value, collocated processing, ML/DL
• OLAP and OLTP
Spark
• Ingests data from HDFS or another storage
• Streaming and compute engine
• Inclined towards OLAP and focused on MR payloads
Ignite and Spark Together
Ignite is a memory-centric store for Spark
• No data movement from Ignite to Spark
• In-place query execution
• Boost DataFrame and SQL performance
• Share state and data among Spark jobs
• Faster data and streaming analytics
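To make "no data movement" and "in-place query execution" concrete, here is a minimal plain-Python sketch that counts how many rows leave the storage nodes under two strategies. It does not use the Ignite or Spark APIs; the function names, the node layout, and the sensor readings are all hypothetical and chosen only for illustration.

```python
def pull_then_filter(nodes, predicate):
    """Move every row to the caller, then filter on the caller's side."""
    moved = [row for rows in nodes.values() for row in rows]
    matches = [row for row in moved if predicate(row)]
    return matches, len(moved)


def filter_in_place(nodes, predicate):
    """Filter on each node; only matching rows leave the node."""
    moved = [row for rows in nodes.values() for row in rows if predicate(row)]
    return moved, len(moved)


def is_hot(row):
    """Hypothetical predicate: readings above 50 degrees."""
    return row["temp"] > 50.0


# Hypothetical sensor readings spread over two storage nodes.
nodes = {
    "node_1": [{"sensor": "s1", "temp": 21.5}, {"sensor": "s2", "temp": 80.1}],
    "node_2": [{"sensor": "s3", "temp": 19.0}, {"sensor": "s4", "temp": 22.3}],
}

hot_pulled, rows_moved_pulled = pull_then_filter(nodes, is_hot)
hot_in_place, rows_moved_in_place = filter_in_place(nodes, is_hot)
print(rows_moved_pulled, rows_moved_in_place)
```

Both strategies return the same matching rows; the difference is how many rows cross the network, which is the saving the bullets above refer to.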
DEMO
Any Questions?
Thank you for joining us. Follow the conversation. http://ignite.apache.org
#apacheignite