Towards Concurrent Stateful Stream Processing on Multicore Processors

Towards Concurrent Stateful Stream Processing on Multicore Processors

Towards Concurrent Stateful Stream Processing on Multicore Processors Towards Concurrent Stateful Stream Processing on Multicore Processors Shuhao Zhang co-author: Yingjun Wu (AWS), !eng"hang (#enmin %ni&ersity of China) Bingsheng He Xtra Computing lead Groupby Prof. Bingsheng He Towards Concurrent Stateful Stream Processing on Multicore Processors Data Stream Processing Systems • [Available] Fast stream processing on large-scale multicore architectures • E.g., BriskStream1, StreamBox2, Grizzly3 • [Inadequate] Support of consistent stateful stream processing 1. BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures, SIGM D!1" 2. StreamBo$: Modern Stream Processing on a Multicore Machine, A%&!19 3. Gri((ly: Efficient Stream Processing %hrough Ada+tive Query Com+ilation, SIGM D!#. ( Towards Concurrent Stateful Stream Processing on Multicore Processors Outline • Motivation • State-of-the-art • Our Approach • Experimental Results ) Towards Concurrent Stateful Stream Processing on Multicore Processors Motivation Example • Road congestion status is shared among different streaming operators • Its consistency needs to be preserved +oll Processing: -- consistent state/ul road computes toll based on stream processing congestion the roadstatus congestion status * Towards Concurrent Stateful Stream Processing on Multicore Processors The Current Common System Design disjoint subset &ommon designs: operator parallelis$ a) Pipelined processing with message passing b) On-demand data parallelism - Towards Concurrent Stateful Stream Processing on Multicore Processors Key-based Stream Partition #oad 0 AY. e/ecutor0 3o 'onflict1 #oad0 copy Road 2 Pay e/tra penalty PIE e/ecutor( if road1 is Road2 e$,ty1 #oad0 Can not efficiently handle general case 2 Towards Concurrent Stateful Stream Processing on Multicore Processors Lock-based State Sharing #oad1 AY. e/ecutor0 #oad0 ( )6 Road 2 PIE e/ecutor( Lead to poor performance 5 Towards Concurrent Stateful Stream Processing on Multicore Processors Access Ordering A toll should based on exactly current road condition, otherwise.. 7 Towards Concurrent Stateful Stream Processing on Multicore Processors Access Ordering &ommon workaround: • Buffer and sort Lead to poor performance 8 Towards Concurrent Stateful Stream Processing on Multicore Processors Outline • Motivation • State-of-the-art • Our Approach • Experimental Results 09 Towards Concurrent Stateful Stream Processing on Multicore Processors Consistent Stateful Stream Processing Employing transactional schematics. • State transaction (): 1. txnt1{Update Road1 Cnt, update Roadx…} @ 10:00; 2. txnt( {#ead Road1 Cnt 6; @ 00:9- • Correct schedule: … 00 Towards Concurrent Stateful Stream Processing on Multicore Processors Existing Solutions Revisited What if we use existing design[1]? • Severe lock contention A new solution for scaling concurrent state access is needed! [1] S-Store: Streaming Meets Transaction Processing, VLDB’15 0( Towards Concurrent Stateful Stream Processing on Multicore Processors Outline • Motivation • State-of-the-art • Our Approach: • TStream (An extension of BriskStream[2]) • Experimental Results [2] BriskStream: Scaling Data Stream Processing on S are!-Memor" M#lticore $rc itect#res, S%&M'D’1( 0) Towards Concurrent Stateful Stream Processing on Multicore Processors Design Overview • Design 1) Dual-Mode Scheduling • Exposing Parallelism • Design 2) Dynamic Restructuring Execution • Exploiting Parallelism. Up to 4.8 times higher throughput with similar or even lower processing latency! 0* Towards Concurrent Stateful Stream Processing on Multicore Processors Design1) Dual-Mode Scheduling • Initial process on an • State access is often the input event PreProcess bottleneck. • It unnecessarily bloc=s • Access to shared states StateAccess all subsequent processes. • Process on an input Let’s postpone it. event with reference to PostProcess obtained state values 0- Towards Concurrent Stateful Stream Processing on Multicore Processors Design 1) Dual-Mode Scheduling (a)PreProcess (b)StateAccess (c)PostProcess (d)PreProcess 'ompute Mode State Access Mode 'ompute Mode Pros: Enlarging inter-event parallelism opportunities. Cons: 3eed to handle ``on-the-flyA e&ents. 02 Towards Concurrent Stateful Stream Processing on Multicore Processors Design 2) Dynamic Restructuring Execution For each state transaction: Dynamically decompose and regroup operations to avoid conflict before actual parallel evaluation is applied. e* e1 e2 e)ecutor1 e)ecutor2 txnt1: {O1, O2} txnt1 t)nt2 txn2: {O3} → O : #ead of Road 1 ' 1 * : %,date to Road 2 Cperation chains: 6 O2 ' ' O3 : #ead of Road 2 1 2 : A thread #oad 1 #oad 2 05 Towards Concurrent Stateful Stream Processing on Multicore Processors Outline • Motivation • State-of-the-art • Our Approach • Experimental Results 07 Towards Concurrent Stateful Stream Processing on Multicore Processors Benchmark Workloads • Four applications: • Grep Sum (GS); Streaming Ledger (SL); Online Bidding (OB); Toll Processing (TP). • Diverse characteristics: • Varying compute/state access ratio • Varying state transaction types: read-only, write-only • Varying data dependencies • Varying multi-partition transactions Experiments conducted on a 4-socket Intel Xeon E7- 4820 server with 128 GB DRAM. 08 Towards Concurrent Stateful Stream Processing on Multicore Processors Overall Performance Comparison (i) In general, TStream perfor$s well at large core counts. (ii) TStrea$ perfor$s slightly better when data dependency is hea&ily presented (SE). (iii) Large room for further impro&ement. (9 Towards Concurrent Stateful Stream Processing on Multicore Processors Recap • [Key Designs] 1) dual-mode scheduling, 2) transaction restructuring • [Results] TStream performs constantly better than prior solutions under varying workloads. https://github.com/Xtra-Computing/briskstream/tree/TStream (0 Towards Concurrent Stateful Stream Processing on Multicore Processors Current Work +e)t &eneration %oT Data 1isit #s 2 Management Plat,orm www.nebula.stream -lea! ." Pro,/ Volker Markl0 +Ditter@nebulastrea$ Thank "ou3 s # ao/4 ang2t#-.erlin/!e (( Towards Concurrent Stateful Stream Processing on Multicore Processors Competitor Schemes Ordering-lock-based approach (LOCK). ① Multi-versioning-based approach (MVLK). ② Static-partition-based approach (PAT). ③ • S-Store No Consistency Guarantee (No-Lock). ④• Remove locks from LOCK scheme. • Represent the performance upper bound. () Towards Concurrent Stateful Stream Processing on Multicore Processors Runtime Breakdown (a) 3o-Lock scheme spends more than 60% in others mainly due to inde/ look up. (b) Synchronization o&erhead dominates in all ,rior solutions regardless of N%BA eGect. (c) +Stream in&ol&es high #BA o&erhead. 3%BA-aDare optimization detailed in our paper. (* Towards Concurrent Stateful Stream Processing on Multicore Processors No free lunch • Currently, TStream has its drawback on efficiently handling transaction abort. • E.g., one that leaves the average speed of a road to become negative. Stay tuned for future work (-.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    25 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us