Toward Agile BI by Using In-Memory Analytics
Informatica Economică vol. 18, no. 3/2014
DOI: 10.12948/issn14531305/18.3.2014.03

Mihaela MUNTEAN
Academy of Economic Studies, Bucharest, Romania

This paper explores one of the newer technologies related to the field of Business Intelligence: in-memory technology. The new class of in-memory BI tools turns a BI solution into an agile BI solution. The paper also focuses on the main data models used by in-memory BI technologies and tries to answer the following questions: Which are the main characteristics of an agile data model? And which is the best data model for enabling an agile BI solution?

Keywords: In-Memory Analytics, “Associative” Data Model, Interactive Visualization, Associative Search, In-Memory Technology

1 Introduction
In recent years, emerging technologies such as interactive visualization, in-memory analytics and associative search have marginalized the role of IT in building BI solutions. Figure 1 shows the trend in the use of these technologies (as search terms) using Google Trends. Interest in these technologies has increased in recent years.

Fig. 1. In-memory analytics versus interactive visualization versus associative search. Interest over time

Also, Figure 2 shows how these technologies affect businesses. These technologies allow business people to do basic exploration of larger data sets and to find better answers to business problems. In-memory technology has the potential to help BI systems become more agile, more flexible and more responsive to changing business requirements. This section takes a look at the pros and cons of in-memory BI. The primary goal of in-memory BI technology is to replace traditional disk-based BI solutions. The important differences between them are speed, volume, persistence and price [1].

For decades BI solutions have been plagued by slow response times, but speed is very important in analysis, and in-memory BI technologies are faster than disk-based BI technologies. In-memory BI technologies load the entire dataset into RAM before a query can be executed by users. Also, most of them can save significant development time by eliminating the need for aggregates and for designing cubes and star schemas.

[Figure 2: diagram with three panels (interactive visualization, in-memory analytics, associative search). Labels: easier to use; more intuitive for users; end user can act as an analyst; more focus on key metrics; more analysis speed (reduced response time); dynamic data exploration; flexible analysis; more data discovery (unexpected insights). Outcomes: a better understanding of business problems; less guesswork; better decisions.]
Fig. 2. How in-memory analytics, interactive visualization and associative search affect businesses

The speed of in-memory technology makes more analytics iterations possible within a given time. Ken Campbell, a director at PwC Consulting Services, notes: “Having a big data set in one location gives you more flexibility. T-Mobile, one of SAP’s customers for HANA, claims that reports that previously took hours to generate now take seconds. HANA did require extensive tuning for this purpose.” [2].

But RAM is expensive compared to disk. In-memory technologies use compression techniques to represent more data in RAM. Also, most in-memory technologies use columnar compression to improve compression efficiency.

Traditional disk-based BI solutions use query-based architectures such as ROLAP, MOLAP and HOLAP. ROLAP uses SQL or another query language to extract detail data, calculate aggregates and store them in aggregate tables. Detail data are stored in data warehouses or data marts (disk-based persistence) and are used when necessary. MOLAP pre-aggregates data using MDX or another multidimensional query language. HOLAP (hybrid OLAP) is a combination of the two architectures above. But these query-based solutions don’t maintain the relationships among queries. Some in-memory BI technologies can maintain the relationships among queries.

Today, one of the challenges of BI is to allow users to become less dependent on IT. BI solutions must be easier to use for all BI users. Traditional BI solutions don’t provide dynamic data exploration and interactive visualization. In-memory BI tools like QlikView, Tableau and TIBCO Spotfire can simplify a larger number of tasks in an analytics workflow. The director of Visual Analysis at Tableau Software, Jock Mackinlay, says: “Inside Tableau, we use Tableau everywhere, from the receptionist who’s keeping track of conference room utilization to the salespeople who are monitoring their pipelines” [2]. Tableau Software, a leader in the Gartner Magic Quadrant for Business Intelligence and Analytics Platforms (2014), is an example of how these BI tools change businesses.

This class of BI tools has the following characteristics: interactive visualization, self-service, in-memory processing, speed of analysis, rapid prototyping and more flexibility. Using a self-service BI tool, the end user can act as an analyst. Also, the use of mobile devices and social networking inside the company promotes the adoption of this class of tools. For example, TIBCO Spotfire for iPad 4.0 integrates with Microsoft SharePoint and Tibbr, a social tool [www.tibbr.com/] [3]. Also, QlikView 11 integrates with Microsoft SharePoint and is based on HTML5 [4].

There are many different in-memory BI solutions. Table 1 presents a comparative analysis using the following criteria: 1) the main characteristics; 2) query language; 3) data model [5].

Table 1.
In-memory BI solutions

| Solution | Characteristics | Example | Query language | Data model |
|---|---|---|---|---|
| In-memory OLAP | MOLAP cube and data are all in memory | IBM Cognos-Applix (TM1); Actuate BIRT; Dynamic Cubes - Cognos BI version 10.1 | MDX or another multidimensional query language | hypercube |
| In-memory ROLAP | only ROLAP metadata loaded in memory, although MicroStrategy can build complete cubes from the subset of data held entirely in memory | MicroStrategy | SQL | dimensional model; hypercube |
| In-memory columnar database with data compression techniques | loads and stores data in a columnar database; less modeling required than an OLAP-based solution | Tableau Software | VizQL, a declarative language | relational/multidimensional database |
| In-memory spreadsheet. VertiPaq is the internal column-based database engine used by Power Pivot | spreadsheet loaded into memory | Microsoft Power Pivot | DAX (Data Analysis Expressions) | no data modeling required |
| In-memory “associative” data model. Column-based storage with compression techniques (with compression ratio near 10:1) | loads and stores all data in an “associative” data model that runs in memory; all joins and calculations are made in real time; less modeling required than an OLAP-based solution | QlikView; includes Expressor (ETL tool) | a script language is required to load and transform the data; AQL technology (Associative Logic); doesn’t use a query language or definition language | without aggregations, hierarchies, cubes; can access star schema / snowflake / cubes |
| Hybrid approach/dual-format approach with data compression techniques | relational database + columnar database; both formats are simultaneously active | Oracle Database In-Memory, a pure in-memory columnar technology; Oracle Exalytics In-Memory Machine includes OBIEE, Oracle Essbase, Oracle Endeca Information Discovery and the in-memory Oracle TimesTen database [6]; SAP HANA stores data in both rows and columns | SQL | dimensional model; hypercube |
| Hybrid storage solution (disk + RAM) | the multidimensional model (traditional OLAP cube) organizes summary data into multidimensional structures; aggregations are stored in the multidimensional structure; tabular model (in-memory cube) | SQL Server 2012; with compression algorithms and multi-threaded query processing, the xVelocity engine delivers fast access to tabular model objects and data through reporting client applications such as Microsoft Excel and Microsoft Power View | MDX for multidimensional; DAX for tabular | star schema for MOLAP; tabular solutions use relational modeling |

Figure 3 presents a disk-based BI solution versus an in-memory BI solution (e.g. a QlikView BI solution).

[Figure 3: diagram. Disk-based BI solution: data sources, ETL, DW; ROLAP (SQL queries are generated graphically; flexible, not user-friendly); hypercube, MOLAP (low flexibility, limited number of dimensions, data aggregated into cubes). In-memory BI solution: data sources; data model and interface inside of a QlikView document, in RAM; user interface; a data warehouse is not required; load data and then work off-line.]
Fig. 3. A disk-based BI solution versus an in-memory BI solution

According to [5], an agile BI solution requires: 1) an agile development methodology; 2) agile BA; and 3) an agile information infrastructure. An agile information infrastructure must be able to extract and combine data from any data source, internal or external, including relational, semi-structured XML, multidimensional and “Big Data” sources. In line with these requirements, the main characteristics of a data model for agile BI are:
- adaptable to rapid business changes;
- agile design;
- high flexibility to analysis;

... following data models: dimensional model (star schema, snowflake or combinations), hypercube and “associative” data model. Which of them are agile? The next section tries to answer this question. Also, the next section briefly presents a comparative analysis of these data models using the following criteria: 1) basic concepts; 2) modeling approach; 3) flexibility.

2 Measures and Dimensions versus Free Dimensional Analysis
In the dimensional model and the hypercube we distinguish between measures and dimensions.
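
The distinction between measures and dimensions can be illustrated with a minimal sketch. The rows, field names and the `aggregate` helper below are hypothetical and are not taken from any of the tools discussed in this paper; the sketch only assumes that dimensions are the attributes used for grouping and that measures are the numeric values being aggregated.

```python
from collections import defaultdict

# Hypothetical fact rows: each row carries dimension values (product, region)
# and numeric measures (units, revenue). The data is invented for illustration.
FACTS = [
    {"product": "A", "region": "North", "units": 3, "revenue": 30.0},
    {"product": "A", "region": "South", "units": 1, "revenue": 10.0},
    {"product": "B", "region": "North", "units": 2, "revenue": 50.0},
    {"product": "B", "region": "North", "units": 4, "revenue": 100.0},
]

DIMENSIONS = ("product", "region")  # attributes used to group and filter
MEASURES = ("units", "revenue")     # numeric values that get aggregated


def aggregate(facts, by, measure):
    """Sum one measure, grouped by the values of one dimension."""
    if by not in DIMENSIONS:
        raise ValueError(f"{by!r} is not a dimension")
    if measure not in MEASURES:
        raise ValueError(f"{measure!r} is not a measure")
    totals = defaultdict(float)
    for row in facts:
        totals[row[by]] += row[measure]
    return dict(totals)


revenue_by_region = aggregate(FACTS, by="region", measure="revenue")
units_by_product = aggregate(FACTS, by="product", measure="units")
```

In this sketch the roles are fixed in advance: asking for a total grouped by a measure, or for the sum of a dimension, is rejected, which mirrors the fixed separation of the two roles in a dimensional model or hypercube.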