Efficient Top-Down Planning in Business Intelligence
Tobias Lauer • Alexander Haberstroh
Jedox AG
Collaborators: Christoffer Anselm • Zurab Khadikov • Steffen Wittmer

Business Intelligence and Corporate Planning
[Diagram: Jedox Suite. Labels on the slide: Jedox Suite Web Front-End, Jedox for Excel, Jedox Spreadsheet, Jedox ETL Manager, Spreadsheet, Microsoft Excel, Open Office Calc, Jedox User Manager, Front-End, Jedox Excel AddIn, Jedox OO-Addin, Jedox Web, Jedox Report Manager, Jedox OLAP Manager, Excel2Web, Multiprocessor Scalability, Jedox Analyzer, Supervision (Events, LDAP), Mobile Front-End (requires Mobile Server), Data Analysis, Jedox OLAP, 3rd Party Access (ODBO), iOS App (iPhone, iPad), GPU Acceleration, Android App, Mobile Server, Jedox Mobile, Android widgets, Data Integration, Jedox ETL, SAP Connectivity]

Online Analytical Processing (OLAP)
• Data modeled as multidimensional “cube”
  – Dimensions are structured hierarchically:
    • Base elements
    • Consolidated elements
  – Operations:
    • Analysis: multidimensional aggregation (bottom-up)
    • Planning: data distribution (top-down)
[Figure: example cube. One hierarchy runs All regions > Europe (France, Italy, UK) and North America (USA, Canada, Mexico); a second dimension holds Actual, Budget, Deviation; a third runs Year > Q1 to Q4 > Jan to Dec]

Storage model
• Only base cells with value ≠ 0 are stored
• All higher-level (consolidated) cell values are calculated on demand
• Benefits: memory saving, data consistency
[Figure: paths and values] Zero and consolidated values are not stored!
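The storage model above can be illustrated with a minimal Python sketch. It is not the Jedox implementation: the class name `SparseCube`, the `HIERARCHY` table, and the two-dimensional example are assumptions made only for illustration. Only non-zero base cells are kept in a dictionary, and a consolidated cell is computed on demand by summing the base cells beneath it.

```python
from itertools import product

# Hypothetical hierarchy for illustration: consolidated element -> children.
HIERARCHY = {
    "Europe": ["France", "Italy", "UK"],
    "Q1": ["Jan", "Feb", "Mar"],
}


def base_elements(element):
    """Expand an element into the list of base elements beneath it."""
    children = HIERARCHY.get(element)
    if children is None:          # not consolidated, so it is a base element
        return [element]
    result = []
    for child in children:
        result.extend(base_elements(child))
    return result


class SparseCube:
    """Stores only base cells whose value is non-zero."""

    def __init__(self):
        self.cells = {}           # path (tuple of base elements) -> value

    def set_base(self, path, value):
        if value == 0:
            self.cells.pop(path, None)   # zero values are never stored
        else:
            self.cells[path] = value

    def get(self, path):
        """Return a cell value; consolidated cells are aggregated on demand."""
        ranges = [base_elements(e) for e in path]
        return sum(self.cells.get(p, 0) for p in product(*ranges))


cube = SparseCube()
cube.set_base(("France", "Jan"), 10)
cube.set_base(("Italy", "Feb"), 5)
cube.set_base(("UK", "Apr"), 7)          # outside Q1
europe_q1 = cube.get(("Europe", "Q1"))   # 10 + 5 = 15
```

Because consolidated values are never stored, they cannot drift out of sync with the base cells, which is the data-consistency benefit named on the slide.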
Path compression
[Figure only]

Writeback in Top-Down Planning
• Writeback: the “opposite direction” of aggregation
• A value inserted at a high level of aggregation is broken down to lower levels, all the way to the base level
• All underlying base cells are modified, depending on the type of writeback

Ranges and areas
• Base elements in each dimension are collected in ranges
    D0: { [0,0], [2,2] }    |D0| = 2
    D1: { [0,0], [2,3] }    |D1| = 3
    D2: { [0,2] }           |D2| = 3
• The Cartesian product of ranges across all dimensions forms an area D0 × D1 × D2

Multiply-base distribution
[Figure only]

Set-base distribution
• Every relevant base cell in the area is set to the same given value
• Naïve approach: search for all relevant paths and replace cell values
  – Problem: what about zero-value cells, which are not represented?
• Better approach:
  (1) Delete all existing cells in the area
  (2) Create all cells in the area with the new value

Parallel creation of all cell paths in an area
• “Parallel enumeration” of the area:
  – Each thread computes the path of “its” cell from the thread ID
  – Problem:
    • Gaps between ranges of a dimension prevent simple iteration
    • Iterating over all ranges and counting all visited elements is inefficient
  – Solution:
    • Represent ranges by pre-calculated prefix sums (rather than start and end points)

Prefix sum representation of ranges
• Prefix sums of gap lengths: g
• Prefix sums of range lengths: r
• Index i of the kth relevant element in D:
  (1) Find smallest m such that r[m] ≥ k
  (2) i = g[m] + k

Add-base distribution
• The same given value v is added to the value of each relevant base cell
• Approach:
  – Create all cells of the area, set their value to v (as before), and store them temporarily
  – Find all previously existing relevant cells and add their (old) value to the corresponding cell in the new temporary area
  – Delete the old relevant cells and persist the temporary storage

Performance tests
Timings in ms; speed-up relative to the CPU in parentheses.

                   CPU              2x GPU          3x GPU          4x GPU
                   Intel DualCore   GeForce 260     Tesla C1060     Tesla C2050
Multiply-base 1      3,548            466 (7.6 x)     558 (6.4 x)     435 (8.2 x)
Refresh 1            1,131            200 (5.7 x)     127 (8.9 x)      74 (15.3 x)
Sum                  4,679            666 (7.0 x)     685 (6.8 x)     509 (9.2 x)
Multiply-base 2     21,542            513 (42 x)      580 (37 x)      448 (48 x)
Refresh 2            5,508            961 (5.7 x)     617 (8.9 x)     347 (16 x)
Sum                 27,050          1,474 (18 x)    1,197 (23 x)      795 (34 x)

                   CPU              2x GPU          3x GPU          4x GPU
                   Intel DualCore   GeForce 260     Tesla C1060     Tesla C2050
Set-base            14,979            900 (17 x)      715 (21 x)      572 (26 x)
Refresh              5,598            962 (5.8 x)     610 (9.2 x)     347 (16 x)
Sum                 20,577          1,862 (11 x)    1,325 (16 x)      919 (22 x)

                   CPU              2x GPU          3x GPU          4x GPU
                   Intel DualCore   GeForce 260     Tesla C1060     Tesla C2050
Add-base           110,387          1,465 (75 x)      899 (123 x)     872 (127 x)
Refresh              5,621            953 (5.9 x)     608 (9 x)       346 (16 x)
Sum                116,008          2,418 (48 x)    1,507 (77 x)    1,218 (95 x)

Speed-up factors (compared to CPU)
[Bar chart; values correspond to the Sum rows above]

                   2x GeForce 260   3x Tesla C1060   4x Tesla C2050
Multiply-base 1      7x               7x               9x
Set-base            11x              16x              22x
Multiply-base 2     18x              23x              34x
Add-base            48x              77x              95x

Concluding remarks
• Top-down planning creates and/or manipulates large numbers of data records
  – These updates are systematic and structured
  – GPUs are well-suited for parallel execution
  – The CUDA implementation ran 7x to 95x faster overall than the sequential CPU algorithm in the tests above
  – Can benefit from multiple GPUs
• Approach commercially implemented in Jedox Suite

Contact
www.jedox.com
• Alexander Haberstroh [email protected]
• Tobias Lauer [email protected]
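The prefix-sum lookup and the “parallel enumeration” of an area can be sketched as follows. This is a sequential Python stand-in for what each GPU thread would compute, not the CUDA code from the talk. The function names are hypothetical, and the sketch assumes 0-based counting: k starts at 0, so the search condition becomes r[m] > k instead of the slide's r[m] ≥ k, which presumably counts from 1. The thread ID is decoded as a mixed-radix number, an assumption about how a thread finds “its” cell.

```python
from bisect import bisect_right
from itertools import accumulate


def build_prefix_sums(ranges):
    """ranges: sorted, disjoint, inclusive (start, end) pairs of one dimension.

    Returns (g, r): prefix sums of gap lengths and of range lengths.
    """
    gaps, lengths, prev_end = [], [], -1
    for start, end in ranges:
        gaps.append(start - prev_end - 1)   # unused elements before this range
        lengths.append(end - start + 1)
        prev_end = end
    return list(accumulate(gaps)), list(accumulate(lengths))


def kth_element(g, r, k):
    """Element index of the k-th relevant element (k is 0-based here)."""
    m = bisect_right(r, k)      # smallest m with r[m] > k
    return g[m] + k             # skip all gaps up to range m


def cell_path(thread_id, dims):
    """Decode a thread ID into a cell path; dims holds one (g, r) per dimension.

    The last dimension varies fastest (mixed-radix decoding).
    """
    path = []
    for g, r in reversed(dims):
        size = r[-1]                        # number of relevant elements
        thread_id, k = divmod(thread_id, size)
        path.append(kth_element(g, r, k))
    return tuple(reversed(path))


# The ranges from the "Ranges and areas" slide.
D0 = [(0, 0), (2, 2)]
D1 = [(0, 0), (2, 3)]
D2 = [(0, 2)]
dims = [build_prefix_sums(d) for d in (D0, D1, D2)]
area_size = 2 * 3 * 3                       # |D0| * |D1| * |D2| = 18
paths = [cell_path(t, dims) for t in range(area_size)]
```

Each call to `cell_path` depends only on the thread ID and the precomputed prefix sums, so no thread has to iterate over ranges or count visited elements. The binary search costs time logarithmic in the number of ranges of a dimension.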
Example figures (animation frames; cell grids not reproduced)
• Multiply-base distribution: the consolidated value 240 is overwritten with 480, so every base cell in the area is multiplied by 2
• Set-base distribution (B = 10): every base cell in the area, including cells that were previously empty, is set to 10
• Add-base distribution (B + 5): 5 is added to every base cell in the area, including cells that were previously empty; the consolidated value becomes 340
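The three writeback types can be sketched sequentially on a sparse dictionary of base cells. This is a hedged illustration, not the GPU implementation: the function names and the tiny two-dimensional area are assumptions, and the proportional scaling in `multiply_base` is inferred from the example in which a consolidated value is doubled and every base cell is multiplied by 2. `set_base` and `add_base` follow the delete/create and temporary-area steps listed on the slides.

```python
from itertools import product


def area_paths(area):
    """area: one list of base-element indices per dimension."""
    return list(product(*area))


def multiply_base(cells, area, new_total):
    """Scale stored cells so that the area sums to new_total."""
    stored = [p for p in area_paths(area) if p in cells]
    old_total = sum(cells[p] for p in stored)
    if old_total == 0:
        raise ValueError("cannot scale an area whose total is zero")
    factor = new_total / old_total
    for p in stored:
        cells[p] *= factor


def set_base(cells, area, value):
    """Set every base cell in the area, stored or not, to value."""
    for p in area_paths(area):
        cells.pop(p, None)                 # (1) delete existing cells
    if value != 0:                         # zero values are never stored
        for p in area_paths(area):
            cells[p] = value               # (2) create all cells


def add_base(cells, area, v):
    """Add v to every base cell in the area, stored or not."""
    temp = {p: v for p in area_paths(area)}    # new temporary area
    for p in temp:
        if p in cells:
            temp[p] += cells[p]            # add the old value
    for p in temp:
        cells.pop(p, None)                 # delete old relevant cells
    cells.update({p: x for p, x in temp.items() if x != 0})  # persist


# Hypothetical data: two stored cells inside a 2 x 2 area, one outside.
cells = {(0, 0): 13, (0, 1): 29, (5, 5): 7}
area = [[0, 1], [0, 1]]
add_base(cells, area, 5)   # area total: 42 + 4 * 5 = 62
```

Note that `multiply_base` only touches cells that already exist, while `set_base` and `add_base` must also create the cells that were implicitly zero. That difference is why the latter two need the full enumeration of the area.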