Business Intelligence

Vol. 7, No. 2 Open Source Business Intelligence

by Brian J. Dooley

Open source business intelligence has moved into the limelight with the recent release of software with improved functionality. Key features like standardization, simplicity, and cost are particularly attractive to smaller organizations and are also helping to bring BI to areas where it might not have been considered previously. There is great promise in open source BI, but it needs to be considered carefully, as it is neither as free nor as simple as it might initially appear.

Cutter Business Technology Council

Rob Austin Tom DeMarco Christine Davis Lynne Ellyn Tim Lister Lou Mazzucchelli Ken Orr Sheleen Quish Ed Yourdon


BUSINESS INTELLIGENCE ADVISORY SERVICE Executive Report, Vol. 7, No. 2

by Brian J. Dooley

Open source business intelligence (BI) is attractive for a number of reasons, especially a lower initial cost, perceived lower overall cost, simplicity of operation, and adherence to standards. The power and capability of business intelligence is growing, and the capacity to perform sophisticated data analysis with easy-access reporting (hallmarks of BI) is increasingly attractive to smaller businesses that do not have the budget to undertake full-scale commercial implementations. Another key emerging market is the incorporation of BI capability directly within applications, either fully embedded or accessed. Open source products are attractive to developers because they can be customized under open source licenses and because many of them are written in Java and conform to standards.

Complete integrated BI suites are a relatively recent development. The prominence of open source BI solutions has risen lately as several integrated suites have come of age. The principal integrated open source solutions today are Pentaho and JasperSoft. The BIRT (Business Intelligence and Reporting Tools) Project has also gained notice, though it is less well advanced than the other two and is currently mainly a reporting solution.

Integrated BI suites bring together a variety of existing open source tools (such as Mondrian OLAP and JPivot), add to them, integrate them, and provide them with an accessible BI platform. The major open source BI projects are offered with a combination of open source and commercial licenses, with the commercial licenses offering added features or added service. Although this departs from a pure open source philosophy, it provides revenues needed to integrate these diverse platforms, as well as an improved ability to try out the software. The “free” version can generally be upgraded later, and the costs of the commercial version of the open source software (OSS) are much lower than those of high-end commercial alternatives.

Open source BI solutions are primarily targeted at smaller businesses and enterprises and at limited implementation scenarios. However, as BI capability continues to be extended to new territories, open source provides an avenue for experimentation.

For companies investigating the use of an open source solution, it is important to note that the available components are modular and generally meet important standards. This means that incremental installation is possible, beginning with only the capabilities that are required. It is also generally possible to mix open source and commercial products within the same environment, using, say, an open source database with a commercial analytic tool or a commercial extraction, transformation, and loading (ETL) tool with an open source BI platform.

To understand open source BI, it is important to consider the overall implications of implementing open source products as mission-critical applications and to look at the basic requirements of a BI solution. Therefore, this Executive Report begins with a discussion of the open source environment before laying out the components of a BI solution. The report then offers an overview of open source BI solutions before focusing on the three major projects introduced above: Pentaho, JasperSoft, and BIRT. The report also includes a brief look at other tools, including Bizgres, the BEE Project, and Jedox Palo-Server. The report concludes with a discussion of selection and implementation considerations.

THE OPEN SOURCE ENVIRONMENT

Open source software is different in culture and economics from commercial software. Development occurs across a wide range of individuals and institutions, many of whom are also users. For purists, payment is in ego points, but, in a practical sense, most payment arrives through service contracts. Development and distribution, though guided by an orchestrating group, tend to be haphazard, and new ideas come from many directions.

Commercial software development tends to be monolithic. Management is hierarchical, and special procedures must be put into place to ensure that maintenance is done, that bugs are fixed, that optimum use is made of available programmers, and so forth. In OSS, these issues are served by the community of interested users, augmented by the license that requires free distribution.

These issues tend to make open source software more standardized, easier to modify, and closer to the programming environment than commercial solutions. While commercial solutions win in complexity and capability, open source can generally win (or put up a strong fight, at least) in flexibility and customization capabilities.

A number of studies have been done over the years on the relative costs of running both open source and commercial products. The results have generally been ambiguous, heavily dependent upon the particular situation under consideration, and on who paid for the survey. The true cost of commercial software is often buried in bundling deals and licensing details; the true cost of open source software tends to lie in a potentially larger implementation effort. In both cases, for complex solutions, the cost (which must include ramp-up, training, customization, trials, installation, and so forth) is likely to be far greater in implementation than in purchase.

Money in computing has been derived, through much of the industry’s history, from service, installation, and support. It comes from customization, from modification of software to unique circumstances, and from direct sales to organizations that require a prebuilt or precustomized solution. This could be a product assembled by specially configuring and tweaking a number of commercial products developed by different organizations. Or, it could just as easily be a preconfigured open source product, such as an integrated BI suite.

The basic software models, characterized by distribution type, are as follows:

• Proprietary/commercial distribution, in binary form with no available source code. This has been extended to include cases where source code is included for reference only or where source code is included under the assumption that only the vendor or vendor-licensed operators will make changes. Sale is generally through licensing, which may be indefinite for smaller products or have a set time frame with conditions for renewal.

• Free, semi-free, and partly free distribution, much of which can be considered either open source or allied with open source. Within the “free” camp, there are a number of important sub-models, the most important of which claim to be open source. In addition to a distribution model, open source licenses have important consequences that encourage product development by an extended community of engineers. “Free” software variants include the following (not all are “free” in the sense of code sharing):

  - Shareware software: typically provided in binary code only, at no cost (and often with some functional limits) for an initial period, but requiring a purchased license after the passage of that period.

  - Freeware: provided with no license fee at all and generally released only as binary code. Freeware is often used as a “loss leader” to draw attention to a vendor’s commercial products.

  - Open source software: provided with source code and permission to modify. Remuneration differs. Most is available via download for free, but there are often configurations available for sale by vendors. The main characteristic is that the users can freely use, modify, and redistribute the software. A variety of licenses are available under this general category, some of which are GNU General Public Licenses (GPLs).

The GPL is the most widely used open source license. Software under this license includes that of the GNU Project and Linux. Key points of the GPL are:

• Software licensed under the GPL can only be copied and distributed under this license.

• Products licensed under the GPL may be sold.

• Users can alter the source code, but if the result is distributed or published, it must be made available under the GPL.

• Ancillary technology can be developed, and as long as such products do not include code licensed under the GPL, they need not themselves be licensed or made available under the GPL.

The GPL represents the Free Software Foundation’s philosophy. Several less restrictive variants exist, including the GNU Lesser General Public License (LGPL) and vendor licenses based on the Mozilla Public License (MPL). While these licenses impose some GPL-like restrictions on the use of software, licensed software can be incorporated into products that can then be licensed without “importing” similar restrictions. Some of the common licenses found in open source BI software are described in Table 1.

OSS and its variants are often also referred to as “free software.” Free software refers to the user’s freedom to run, copy, distribute, study, change, and improve the software, rather than to lack of cost.

OSS has largely entered the enterprise through the back door. Much like the early personal computers, it has been confined in the past to technical areas, scientific workstations, and special applications such as intranet Web servers. Recently, with support from major vendors increasing and with a significant presence developing in key areas, combined with significant publicity, open source software has reached a level of acceptance in the enterprise.

The Business Intelligence Advisory Service Executive Report is published by Cutter Consortium, 37 Broadway, Suite 1, Arlington, MA 02474-5552, USA. Client Services: Tel: +1 781 641 9876 or, within North America, +1 800 492 1650; Fax: +1 781 648 1950 or, within North America, +1 800 888 1816; E-mail: [email protected]; Web site: www.cutter.com. Group Publisher: Chris Generali, E-mail: [email protected]. Managing Editor: Cindy Swain, E-mail: [email protected]. ISSN: 1540-7403. ©2007 by Cutter Consortium. All rights reserved. Unauthorized reproduction in any form, including photocopying, faxing, and image scanning, is against the law. Reprints make an excellent training tool. For information about reprints and/or back issues, call +1 781 648 8700 or e-mail [email protected].


Table 1: Open Source Licenses

Reference | Specific License | Description
Apache | Apache License Version 2.0 | Free software [1] license, incompatible with the GPL due to a specific requirement regarding certain patent issues.
Apache | Apache License Version 1.1 | Free software license, non-copyleft [2], with some requirements making it incompatible with the GNU GPL.
Apache | Apache License Version 1.0 | Free software license, simple, permissive, but non-copyleft. Not fully compatible with the GNU GPL.
Artistic | Artistic License 2.0 | Free software license compatible with the GNU GPL. This license is being considered for use in Perl 6.
Artistic | Original Artistic License | Ambiguously worded license, later modified in version 2.0.
Eclipse | Eclipse Public License Version 1.0 | Free software license, incompatible with the GPL due to a specific requirement regarding certain patent issues.
LGPL | GNU Lesser General Public License | Free software license, with limited copyleft rights permitting linking with non-free modules. It is compatible with the GNU GPL.
BSD | Original BSD License | Free software license, non-copyleft, with an advertising clause that makes it incompatible with the GNU GPL.
BSD | Modified BSD License | Free software license, non-copyleft. Modified by removal of the advertising clause and compatible with the GNU GPL.
MPL or Mozilla | Mozilla Public License | Free software license, weak copyleft. Some restrictions that make it incompatible with the GNU GPL. MPL 1.1 possibly removes these restrictions, but the issue can be complex.
Sun | Sun Public License | Free software license, weak copyleft. Some restrictions that make it incompatible with the GNU GPL. Same as the Mozilla Public License (note that the Sun Community Source Licensing Program is not a free software license).
CPL | Common Public License Version 1.0 | Free software license, incompatible with the GPL due to a specific requirement regarding certain patent issues.
EnterpriseDB | | Commercial license permitting code modification but no release to the public.
GPL | GNU General Public License | This is a free software license and a copyleft license.

[1] Free software means the user is free to run, copy, distribute, study, change, and improve the software.
[2] Copyleft means that anyone who redistributes the software, with or without changes, must pass along the freedom to further copy and change it.
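One way to put Table 1 to work when assembling a BI stack from several components is to treat it as a lookup. The following is a minimal sketch in Python, not a licensing tool: the dictionary simply transcribes the GPL-compatibility remarks in Table 1 as this report states them, the function name `review_stack` is hypothetical, and the example stack borrows product and license names from Table 2. Whether an incompatibility matters in practice depends on how components are combined and distributed, which the sketch does not model.

```python
# GPL compatibility as Table 1 of this report states it. None marks an entry
# for which the table gives no clear answer.
GPL_COMPATIBLE = {
    "Apache License 2.0": False,
    "Apache License 1.1": False,
    "Apache License 1.0": False,
    "Artistic License 2.0": True,
    "Original Artistic License": None,
    "Eclipse Public License 1.0": False,
    "LGPL": True,
    "Original BSD License": False,
    "Modified BSD License": True,
    "Mozilla Public License": False,
    "Sun Public License": False,
    "Common Public License 1.0": False,
    "GPL": True,
}


def review_stack(components):
    """Sort a {component: license} mapping into three lists.

    'compatible' and 'flagged' follow Table 1; 'unknown' collects licenses
    that the table does not list or does not settle.
    """
    result = {"compatible": [], "flagged": [], "unknown": []}
    for name, license_name in sorted(components.items()):
        status = GPL_COMPATIBLE.get(license_name)
        if status is True:
            result["compatible"].append(name)
        elif status is False:
            result["flagged"].append(name)
        else:
            result["unknown"].append(name)
    return result


if __name__ == "__main__":
    # Illustrative stack; license labels are taken from Table 2.
    stack = {
        "JasperReports": "LGPL",
        "Mondrian": "Common Public License 1.0",
        "MySQL": "GPL",
        "PostgreSQL": "BSD",  # Table 2 does not say which BSD variant
    }
    print(review_stack(stack))
```

In this example PostgreSQL lands in the "unknown" list because Table 1 distinguishes the original and modified BSD licenses while Table 2 gives only "BSD"; a result like that is a prompt to read the actual license text.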

Open source and current commercial software models can coexist. Open source is likely to create pressure for lower costs, as well as to continue to develop an alternate pool for innovation. This can be demonstrated in the growing pool of open source BI products.

Open source is likely to continue its movement into the enterprise, ushered in partly by Linux and partly by the fact that, after all, it is not so different from the other ways that software has been developed and distributed in the relatively brief history of computing.

COMPONENTS OF A BI SOLUTION

The basic definition of a BI tool is as follows: an application used for transforming data into information to derive knowledge through reporting and analysis of structured information. This definition has been broadened somewhat in practice and under pressure of competition.

Most BI tools are designed to operate within a data warehouse environment, with structured information brought together through the use of metadata providing data mappings, transformation rules, and control logic. Standard query and reporting tools generate reports based upon the collected data. This area has evolved from single-use products to suites of applications designed for different types of analysis and reporting tasks. It has also spawned lower-cost solutions that do not have the architectural requirements of the full-blown suites.

No matter how complex the means used to achieve it, a successful BI system is really the automation of a set of simple actions that are designed to process business data. The basic elements of a BI system are to gather data, analyze it, and report meaningful results using information locked in a company’s many data stores. The five processes are as follows:

1. Storing, organizing, and grouping information multidimensionally through metadata

2. Aggregating, or gathering information and assembling it in a form that permits analysis

3. Analyzing, using tools that provide detailed what-if analyses and develop more complex statistical relationships

4. Presenting, using “visualization” tools that make data meaningful to users, including charts, graphs, and other appropriate output formats

5. Reporting and providing usable information for planning and decision making

A conventional BI solution consists of two parts: (1) a data warehouse and supporting infrastructure and (2) the tools used to extract data and present it in a meaningful way. The data warehouse is a read-only database that holds consolidated and integrated data that has been extracted from systems throughout the enterprise and entered in a standardized and analyzable form through the use of ETL tools. Data residing in the data warehouse is the most correct, transformed, and validated data available; it is designed for ad hoc queries, analysis, and reporting, and it resides within a database system. For smaller businesses, a true data warehouse represents an ideal that may not always be achievable. However, standardization upon a single database system backed by rules for data use can provide the main functionality: a “single version of the truth” (that is, one set of figures used by all) and standardization of data sources through the use of the database system’s tools.

Business intelligence has now reached a point where it is almost universally accepted as an important part of overall IT strategy, particularly among larger corporations. Over the past several years, there has been consolidation within the industry and a commoditization of ETL tools. The major enterprise resource planning (ERP) firms have attempted to integrate BI into their massive framework systems and have succeeded to a great extent, but the idea that a single vendor will provide a single, integrated, corporate-wide BI suite is beginning to fade, as companies recognize that different areas require different approaches. This has led to a decrease in ad hoc solutions, greater integration and improved standardization between systems, and the growing popularity of “executive dashboard” or “enterprise portal” solutions for bringing data analysis from different sources to the users that require it in a manner that suits their understanding, capabilities, and requirements. All of this favors the openness, flexibility, and smaller size of upcoming open source solutions.
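The gather, standardize, aggregate, and report sequence described in this section can be illustrated at toy scale. The sketch below is an assumption-laden illustration, not a recommended architecture: Python's built-in sqlite3 module stands in for the warehouse database, the two "source systems" and their figures are invented, and the table name `sales_fact` and all function names are hypothetical. It shows the ETL step putting differently shaped source rows into one standardized form, so that every later query draws on the same set of figures.

```python
import sqlite3

# Invented extracts from two hypothetical source systems. They describe the
# same kind of sale but disagree on casing, field layout, and currency units.
ORDER_SYSTEM = [
    {"region": "east", "product": "Widget", "amount_usd": "120.50"},
    {"region": "WEST", "product": "widget", "amount_usd": "80.00"},
]
POINT_OF_SALE = [
    ("East", "Gadget", 4550),  # amount in cents
    ("West", "Widget", 1950),
]


def extract_and_transform():
    """Yield (region, product, amount) rows in one standardized form."""
    for row in ORDER_SYSTEM:
        yield row["region"].title(), row["product"].title(), float(row["amount_usd"])
    for region, product, cents in POINT_OF_SALE:
        yield region.title(), product.title(), cents / 100


def load(conn):
    """Create the consolidated table and load the standardized rows."""
    conn.execute("CREATE TABLE sales_fact (region TEXT, product TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", extract_and_transform())
    conn.commit()


def totals_by_region(conn):
    """Aggregate the consolidated data into one set of figures per region."""
    query = (
        "SELECT region, ROUND(SUM(amount), 2) FROM sales_fact "
        "GROUP BY region ORDER BY region"
    )
    return conn.execute(query).fetchall()


def render_report(rows):
    """Present the aggregated figures as a small fixed-width text report."""
    lines = [f"{'Region':<8}{'Sales':>10}"]
    lines += [f"{region:<8}{total:>10.2f}" for region, total in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    load(warehouse)
    print(render_report(totals_by_region(warehouse)))
```

Keeping the transformation rules in one place (here, `extract_and_transform`) is what allows a single database to approximate the "single version of the truth" for a smaller business; a real deployment would replace the in-memory lists with connections to the actual source systems.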


One important recent develop- and magnitude of a potential solu- of consultant-engineered vertical ment reflects a growing recogni- tion are better known before the market solutions that provide tion that “analysts” are not the project needs to be budgeted. a close fit to the business con- only users of BI information. As cerned. Additional programming the technology continues to infil- OVERVIEW OF OPEN SOURCE will almost always be required to trate the enterprise, it is being BI SOLUTIONS fit the solution to the business, used by an increasing array of There are a number of open and some of it can be complex. different operators with different source solutions available, most Maintenance issues are also likely capabilities and requirements. running on the Linux platform. to arise over the life of the solu- This has led to improvements in Available tools currently range tion, and the original developer user interfaces, simplification of from database systems with some may not always be available. choices, and the provision of dif- BI capability to complete BI suites. However, where such resources ferent levels of presentation. This, The main open source are available, open source may too, tends to favor a solution that are MySQL and PostgreSQL, with provide a convenient and low-cost is smaller, supports standards, and MySQL being the preferred system option for creating a customized can be easily customized. for analytics and integration proj- BI solution. ects. These are joined with a num- BI has now stretched out across It should also be noted that ber of Java and XML-based tools to the enterprise, with no single an increasing number of main- fill out open source BI suites. product dominating the market. 
stream BI applications and data- Major uses include management ETL tools that bring data into the base systems have migrated to of corporate performance, moni- database exist in open source Linux, largely from the proprietary toring of business activity, report- mainly as Java applets providing Unix world. However, these ing or regulatory compliance, and transformations based on XML. are not open source solutions in customer analysis. There is a wide range available, themselves. according to the specific problem Another important implementa- Some of the more popular open to be solved. tion trend is the growing need to source BI solutions are listed in establish ROI and cost justification For analysis, the leading open Table 2. for BI systems. This results from source tool is Mondrian, which The next section takes a look an atmosphere of increased con- provides OLAP capability. at the three most advanced cern over capital expenditure, Reporting and query tools are open source BI suites: Pentaho, combined with a realization that also available as Java applets; one JasperSoft, and the Eclipse BIRT today’s BI solutions can be expen- of the most interesting is JPivot, Project. (Eclipse is an open source sive to deploy and that experi- which works with Mondrian as development initiative led by BI mental systems or systems that well as with other OLAP servers. do not provide valuable informa- vendor Actuate.) Of the three, tion may be even more expensive. Implementation of an open Pentaho and JasperSoft are the This has become the primary source solution may begin with most complete initiatives. motivator for development and a relatively low cost due to the attention given to open source BI availability of “free” source code, MAJOR OPEN SOURCE BI PROJECTS solutions. Although they may not but it can turn into an adventure. 
be a panacea for cost issues, Usability almost always turns Today, the major open source BI they can be implemented initially upon the capabilities of internal projects are generally considered under the radar, so that the details data processing departments or to be Pentaho, JasperSoft, and

VOL. 7, NO. 2 www.cutter.com EXECUTIVE REPORT 7

Table 2 — Open Source Solutions for Business Intelligence

Product License Web Address Description Reporting JasperReports LGPL www.jaspersoft.com JasperReports is a feature-rich, high-performance report development and execution product for developers and end users. It consists of a powerful graphical report design tool and a comprehensive XML-based report definition and execution library. It is a high-performance and massively scalable standards-based system supported by a large and active community of report designers and developers. JFreeChart LGPL www.jfree.org JFreeChart is a free Java chart library. JFreeChart supports pie charts (2D and 3D), bar charts (horizontal and vertical, regular and stacked), line charts, scatter plots, time series charts, high-low-open-close charts, candlestick plots, Gantt charts, combined plots, thermometers, dials, and more. JFreeChart can be used in applications, applets, servlets, and JavaServer Pages. This project is maintained by David Gilbert. JFreeReport LGPL www.jfree.org JFreeReport is a free Java report library. JFreeReport reads data from a TableModel and generates formatted output, supporting features that include headers, footers, page numbering, grouping, totals, averages, embedded images, and more. Reports can be previewed on screen or saved in Acrobat PDF, Excel, HTML, XML, or text format. This project is maintained by Thomas Morgner. BIRT EPL www.eclipse.org/birt The BIRT project is intended to provide a next-generation reporting technology with a Web-centric design metaphor that is open source and extensible and provides an XML report design format. It is designed to act as a foundation for commercial products. The project’s chief sponsors are Actuate, IBM, and InetSoft. Analytics Mondrian CPL http://mondrian.pentaho.org Mondrian is an OLAP server written in Java. It enables you to interactively analyze very large data sets stored in SQL databases without writing SQL. It serves as the Analytic tool of the Pentaho BI suite. 
The latest release of Mondrian, version 2.3.2, is now available. JPivot CPL http://jpivot.sourceforge.net JPivot is a JavaServer Pages custom tag library that renders an OLAP table and lets users perform typical OLAP navi- gations like slice and dice, drill down, and rollup. It uses Mondrian as its OLAP Server. JPivot also supports XMLA data source access. OpenI MPL http://openi.sourceforge.net The first release of OpenI is a simple that does out-of-box OLAP reporting. It can be deployed on any J2EE server, and interactive OLAP reports can be created from existing cubes immediately. Future versions will include data sources other than OLAP cubes (relational databases, data mining models, and so on). ETL Clover LGPL http://cloveretl.berlios.de Clover.ETL is an open source, Java-based ETL framework that can be used to transform structured data. While using Java technology it allows for platform independence. It can be used standalone — as an application run from command line — or can be embedded in applications. Clover.ETL is accompanied by CloverGUI visual transformations designer, which is integrated with the Eclipse platform. Octopus LGPL tk Enhydra Octopus is a Java-based extraction, transformation, and loading (ETL) tool. It may connect to any Java Database Connectivity (JDBC) data sources and perform transforma- tions defined in an XML file. (Table continues on next page.)

©2007 CUTTER CONSORTIUM VOL. 7, NO. 2 8 BUSINESS INTELLIGENCE ADVISORY SERVICE

Table 2 — Open Source Solutions for Business Intelligence (continued)

Product License Web Address Description ETL (continued) Kettle LGPL http://kettle.pentaho.org Kettle is the ETL component of the Pentaho BI suite. It provides: • Data warehouse population with built-in support for slowly changing dimensions, junk dimensions, and much, much more • Export of database(s) to text file(s) or other databases • Import of data into databases, ranging from text files to Excel sheets • Data migration between database applications • Exploration of data in existing databases (tables, views, synonyms) • Information enrichment by looking up data in various information stores (databases, text files, Excel sheets) • Data cleaning by applying complex conditions in data transformations • Application integration Database EnterpriseDB Enterprise www.enterprisedb.com EnterpriseDB Advanced Server is an enterprise-class DB relational database management system (RDBMS) that is compatible with applications written for Oracle. EnterpriseDB Advanced Server is based on PostgreSQL, the world’s most advanced open source database, ensuring the world-class data integrity, security, and performance necessary for enterprise environments. In addition, the total cost of owner- ship (TCO) of an enterprise database solution powered by EnterpriseDB Advanced Server is only a small fraction of the TCO of a comparable Oracle-powered solution. Ingres GPL, www.ingres.com Ingres RDBMS is a full-featured, enterprise-class database. 
commercial Among its feature highlights are: • 64-bit architecture support for large-scale enterprise deployments • A small footprint requiring fewer system resources • Industry-standard data access via JDBC, ODBC, and .NET as well as support for the latest open source development environments including Eclipse, PHP, Perl, Python, and Ruby • Transaction journaling providing point-in-time recovery • Parallel query processing • Key range table partitioning • High availability cluster support • Advanced query optimization techniques MySQL GPL, www.mysql.com The MySQL database provides consistent fast performance, commercial high reliability, and ease of use. It is used in more than 10 million installations ranging from large corporations to special- ized embedded applications. It is also the database of choice for a new generation of applications built on the LAMP stack (Linux, Apache, MySQL, PHP/Perl/Python). MySQL runs on more than 20 platforms, including Linux, Windows, OS/X, HP-UX, AIX, NetWare, giving you the kind of flexibility that puts you in control. PostgreSQL BSD www.postgresql.org PostgreSQL is a powerful, open source relational database system. An enterprise class database, PostgreSQL boasts sophisticated features such as multi-version concurrency control (MVCC), point-in-time recovery, tablespaces, asynchronous replication, nested transactions (savepoints), online/hot backups, a sophisticated query planner/optimizer, and write ahead logging for fault tolerance.

VOL. 7, NO. 2 www.cutter.com EXECUTIVE REPORT 9

BIRT, though the latter is currently primarily a reporting solution. Each of these projects has major industry support, has a commercialization strategy, and offers a suite of integrated open source products with an established roadmap for growth.

The Pentaho approach is based on Eclipse and provides enhancements to Mondrian OLAP: a Dashboard component, data mining, reporting, workflow, and business framework (including a business rules engine and document repository). JasperSoft’s open source BI suite variant is offered under a curious “commercial open source” license. In fact, all three of these product suites offer either a commercial license or commercial product version that provides extended capabilities.

Let’s start by examining Pentaho.

Pentaho

The Pentaho BI Project provides a comprehensive BI suite. This includes reporting, analysis, dashboards, data integration, data mining, and a BI platform for production deployment. The suite is built around the Pentaho BI Platform, which provides the architecture and infrastructure required to build solutions to BI problems. The platform includes an embedded workflow engine and can be easily integrated into business processes. Core platform services include authentication, logging, auditing, workflow, Web services, and rules engines.

The included solution engine integrates reporting, analysis, dashboards, and data mining components to form a sophisticated and complete BI infrastructure. Components are all open source components, written in Java. The solution engine executes BI Process Flows. BI Process Flows are also available as Web service calls that may be used by orchestration technologies such as the Business Process Execution Language (BPEL).

The server architecture has been built for J2EE. The client design environment is built around the Eclipse workbench, with end-user access provided through HTML and other thin-client technologies.

The Pentaho BI Platform is normally deployed as a standalone server with a standalone Design Studio. In order to solve business problems, a solution will need to be deployed as well. New solutions can be created with the Pentaho Design Studio, or preexisting solutions can be obtained from Pentaho or other sources. Customers can start with one component, such as Reporting, and add other components, such as Analysis and Dashboards, as required.

The Pentaho Design Studio requires the Eclipse framework and can only be deployed to platforms supported by Eclipse. Currently all versions of Windows from ME to XP are supported as are many versions of Linux, Mac OS X, Solaris, and other Unix platforms.

The Pentaho platform was built to be open and flexible and to allow developers to integrate third-party components. The Pentaho BI framework uses Acegi Security for a pluggable framework to implement and combine multiple authentication schemes and credential stores with central authentication storage for platform-neutral single sign-on.

The major investment themes of 2007 for Pentaho include ease of use, maintenance and administration, and platform extensions.

BI Platform

The Pentaho BI Platform provides the infrastructure and core services that integrate business intelligence components to complete the BI suite. This includes the infrastructure necessary to build, deploy, execute, and support applications. It is a comprehensive development and runtime environment for building complete solutions to BI problems.

The Pentaho BI Platform centers its solutions around a workflow core and a service-oriented architecture (SOA). Pentaho BI solutions are made up of collections of XML documents. The Pentaho Design Studio is built with plug-ins to make editing and managing these documents easier; however, the solution documents can also be edited with any simple text editor, if necessary.
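Because solutions are stored as XML documents, they can in principle be inspected or modified with ordinary XML tooling as well as with a text editor. The Python sketch below illustrates the idea on an invented document. The element and attribute names (`solution`, `step`, `component`, `output`) are assumptions made up for this example and are not Pentaho's actual solution schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical solution document. The element and attribute names are
# invented for illustration; they are NOT Pentaho's real schema.
SOLUTION_XML = """
<solution name="monthly-sales">
  <step id="1" component="reporting" output="pdf"/>
  <step id="2" component="email" to="sales-team"/>
</solution>
""".strip()


def list_steps(xml_text):
    """Return (id, component) pairs for every step in the document."""
    root = ET.fromstring(xml_text)
    return [(step.get("id"), step.get("component")) for step in root.findall("step")]


def set_output_format(xml_text, step_id, fmt):
    """Return the document text with one step's output format changed."""
    root = ET.fromstring(xml_text)
    for step in root.findall("step"):
        if step.get("id") == step_id:
            step.set("output", fmt)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(list_steps(SOLUTION_XML))
    print(set_output_format(SOLUTION_XML, "1", "html"))
```

The same change could be made by hand in a text editor; a script simply makes it repeatable across many solution documents.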

©2007 CUTTER CONSORTIUM VOL. 7, NO. 2 10 BUSINESS INTELLIGENCE ADVISORY SERVICE

The Pentaho BI Platform uses a sophisticated combination of business rules, services, assured messaging, workflow, clustering, and auditing to ensure efficient operation.

The main open source components that have been integrated into the Pentaho BI infrastructure are listed in Table 3.

Reporting

Pentaho Reporting is designed to meet a wide range of business reporting needs, from simple reports on a Web site to highly formatted reports for applications like financial reporting. It also provides both scheduled and on-demand report publishing in formats including Adobe/PDF, Microsoft Excel, HTML, RTF, and text. Reports can be filtered and delivered to targeted users.

Pentaho Reporting can access data from relational and OLAP data sources as well as XML or Web services sources. It is a highly extensible platform with flexible customization points and a sophisticated architecture based on the JFreeReport project, which Pentaho owns and sponsors.

Technical capabilities include:
• Generating and delivering thousands of 1-to-20-page documents in a single BI process
• Generating reports in Microsoft Excel, HTML, PDF, RTF, and text formats
• Generating report content from relational, OLAP, XML, and Web services data sources
• Providing intuitive interface components for business users

Table 3 — Pentaho Open Source Components

Project/Component | Description and Usage | License
Apache Commons Logging | Logging | Apache
Apache HttpClient | Server-to-server HTTP communications | Apache
OpenSymphony Quartz | Scheduler | Apache compatible
Apache log4j | Logging | Apache
Chiba | Server-side xForm to HTML conversion | Artistic
Eclipse Platform | Desktop workbench | Eclipse
Eclipse Modeling Framework | Workbench Modeling Framework | Eclipse
Eclipse Graphical Editing Framework | Workbench Graphical Editor | Eclipse
JBoss Application Server | Application server used for sample deployments | LGPL
JBoss Hibernate | Object persistence layer | LGPL
JBoss Portal | JSR 168 compliant portal server used for sample deployments | LGPL
JFreeChart | Chart engine (www.jfree.org/jfreechart) | LGPL
Acegi | Single sign-on and LDAP integration (pro version) | BSD
MetaStuff dom4j | XML parser | BSD
Mozilla Rhino | JavaScript processor | Mozilla
Sun JavaMail | E-mail delivery | Sun
Sun Java Database Connectivity | Database access | Sun
Sun JIMI | Image management | Sun


• Providing a drag-and-drop graphical report design environment using the Pentaho Report Designer
• Integrating with Pentaho Analysis components, such as Dashboards and pivot views
• Supporting deployment of reporting as an embedded Java library, as a packaged Web reporting application, or as part of an entire BI suite
• Conforming to all standards and interfaces provided by the Pentaho BI Platform

The main open source components that have been integrated into Pentaho Reporting are shown in Table 4.

In addition to these components, Pentaho Reporting takes advantage of Pentaho BI Platform components such as JBoss, JFreeChart, and Eclipse.

Analysis

Pentaho Analysis provides extensive analysis capabilities that include a pivot table viewer (JPivot), advanced graphical displays using SVG or Flash, integrated dashboard widgets, data mining integration, portal integration, and workflow integration. These capabilities are supported by scheduling, Web services, content navigation and management, security, application integration, and auditing. Analysis can be performed on relational data sources using Pentaho Analysis Services (based on Mondrian OLAP).

Pentaho Analysis provides tools that make it possible to freely explore business information in a Web-based environment; to drill down into data; and to cross-tabulate data. The incorporated Pentaho Spreadsheet Services allow users to browse, drill, pivot, and chart against Pentaho Analysis Services from within Microsoft Excel.

Technical objectives include:
• Providing pivot table views of OLAP data
• Providing advanced analytic graphical views of OLAP data
• Providing intuitive interface components for business users
• Providing drill-through to and from reporting content
• Integrating analysis with business processes
• Providing user-by-user customization and preferences
• Using the best open source analysis components available
• Providing technology encompassing an out-of-the-box reporting server, embeddable analysis components, and anything in between

The main open source components that have been integrated into Pentaho Analysis are shown in Table 5.

In addition to these components, Pentaho Analysis takes advantage of the open source software included in the Pentaho BI Platform.

Dashboards

Pentaho Dashboards display, arrange, and control BI content. All Pentaho components including Reporting and Analysis can contribute content to Pentaho Dashboards. Dashboard widgets can be created to display dials and gauges. External content such as Web pages, third-party applications, and RSS feeds can also be integrated. Filter controls can be added to provide subject-based content filtering.

Table 4: Open Source Components in Pentaho Reporting

Project/Component | Description and Usage | License
Apache FOP (Formatting Objects Processor) | PDF document production | Apache
Eclipse BIRT | HTML and PDF report designer and engine | Eclipse
Sun JIMI | Image management | Sun


Table 5 — Open Source Components in Pentaho Analysis

Project/Component | Description and Usage | License
JPivot | Web pivot tables/charts | Common Public License
Mondrian | ROLAP modeler | Common Public License
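The two components in Table 5 divide the work between a relational OLAP (ROLAP) engine, which derives aggregates from relational rows, and a pivot-table front end. The Python sketch below is only a toy illustration of the operations the report names for Pentaho Analysis: cross-tabulating, pivoting (swapping axes), and drilling through to the underlying rows. The sales data and function names are invented for this example, and the code does not reflect how Mondrian or JPivot are implemented.

```python
from collections import defaultdict

# Invented fact rows: (region, quarter, amount).
SALES = [
    ("East", "Q1", 100),
    ("East", "Q2", 150),
    ("West", "Q1", 80),
    ("West", "Q2", 120),
    ("East", "Q1", 50),
]


def cross_tab(rows):
    """Aggregate amounts into a region-by-quarter table."""
    table = defaultdict(lambda: defaultdict(int))
    for region, quarter, amount in rows:
        table[region][quarter] += amount
    return {region: dict(cells) for region, cells in table.items()}


def swap_axes(table):
    """Pivot the table so that columns become rows and rows become columns."""
    swapped = defaultdict(dict)
    for row_key, cells in table.items():
        for col_key, value in cells.items():
            swapped[col_key][row_key] = value
    return dict(swapped)


def drill_through(rows, region, quarter):
    """Return the detail rows behind one cell of the cross-tab."""
    return [row for row in rows if row[0] == region and row[1] == quarter]


if __name__ == "__main__":
    table = cross_tab(SALES)
    print(table)
    print(swap_axes(table))
    print(drill_through(SALES, "East", "Q1"))
```

In a real ROLAP server the aggregation step would typically be pushed down to the database as SQL rather than performed in application memory as it is here.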

Role-based security and filtering is easily built in.

Pentaho Dashboards can be embedded into applications, JavaServer Pages (JSPs), or within JSR 168–compliant portals using the provided portlets.

Technical objectives include:
• Providing browser- and portal-independent dashboard displays
• Providing reusable display widgets (gauges, dials, charts, etc.)
• Integrating reporting, analysis, and dashboard content
• Providing configurable, common filtering controls
• Providing role-based security and filtering
• Providing user-by-user customization and preferences
• Using the best open source components available
• Providing technology encompassing an out-of-the-box reporting server, embeddable dashboard components, and anything in between
• Adhering to all the design goals of the Pentaho BI Platform

Pentaho Dashboards is a Pentaho project. Third-party open source components are limited to utilities like XML parsing and the common libraries. The Eclipse framework is used for the dashboard creation tool.

Data Mining

Pentaho Data Mining incorporates Weka, a collection of machine-learning algorithms applied to data mining tasks. These algorithms are combined with OLAP technologies to provide machine-intelligent data analysis to end users. Data mining tools can analyze historical data to create predictive models. Pentaho Reporting and Analysis components can then be used to distribute this information.

Technical objectives include:
• Integrating open source data mining with reporting and OLAP data sources
• Integrating data mining with business processes
• Enhancing compliance and corporate governance by applying data mining tools to business process data
• Using the best open source data mining components available
• Adhering to all the technical objectives of the Pentaho BI Platform

The main open source component integrated into Pentaho Data Mining is Weka, which has a GPL license.

JasperSoft BI Suite

JasperSoft provides reporting, analysis, and data integration in a system that is meant to be easily operated by anyone, from casual business users to analysts to executives, in a form that can be incorporated in businesses of all sizes. It is designed to provide a comprehensive, flexible, seamlessly embeddable, and affordable solution.

Key features include:
• Interactive data analysis/OLAP
• Interactive and managed reporting
• High-performance data integration
• Reports for screen or print

It incorporates a production report designer and provides

advanced on-demand reporting for Salesforce.com.

JasperSoft is available in two versions: the “free” open source system and a professional edition, which provides added value. The professional edition provides several assurances and conveniences that may be important to many customers, including:
• Advanced features
• Broader platform support
• Managed release cycles
• Premium support subscriptions
• Commercial license with indemnities
• Rights to bundle with other commercial products

JasperSoft BI Suite is designed principally to be a standalone BI solution for workgroups and small organizations. It includes a fully functional, production-ready reporting and BI server that deploys quickly. Faster deployment yields a faster ROI. It is built on a modular architecture and designed to leverage existing enterprise software and in-house skills.

The JasperSoft BI Suite can also be embedded into other business applications, such as CRM, finance, ERP, and human capital management systems. JasperSoft is pure Java and offers a clean business API. Non-Java applications, such as those written in PHP or Perl, can access JasperSoft BI capabilities through prebuilt Web services connectors.

Technical details are provided in Table 6.

JasperSoft BI Platform

The BI platform provides the framework for products in the modular JasperSoft BI Suite, making it possible to manage, secure, and deliver full business intelligence capabilities to different user communities.

The modular components of the BI suite are:
• JasperServer: interactive and managed reporting for JasperReports
• JasperAnalysis: interactive data analysis/OLAP server
• JasperETL: high-performance data integration system
• JasperReports: report generator for screen or print

JasperServer, JasperAnalysis, and JasperETL run on a shared BI platform that provides a common framework to allow reporting, analysis, and data integration. The system is designed to meet the needs of very small to very large organizations. The BI platform is an integral part of the JasperSoft BI Suite. It is a production-ready environment, with security features that include external authentication, role-based authorization, and single sign-on. User interfaces are configurable and replaceable. It provides easy integrations with existing IT infrastructure servers and services and can be embedded in applications using public Java APIs and Web services.

JasperServer

JasperServer is a high-performance, interactive, standalone report server designed to provide managed reporting for workgroups, small businesses, and enterprises. It includes a secure report management repository, standards-based report definitions, integration interfaces, JasperReports, and the iReport graphical report designer. It supports drag-and-drop, Web-based ad hoc reporting; self-service; and interactive parameterized reports. Report scheduling and distribution capabilities are provided, and historical report versioning and auditing are available for regulatory compliance. JasperServer can be seamlessly embedded into other software applications.

JasperServer’s enterprise-level RDBMS-based repository is optimized for speed and stores report definitions, images, fonts, data source definitions, OLAP views, and so on for fast and secure access.

Using a standard Web browser, users can create, run, save, and interact with reports. Users only see reports they are authorized to use and can select from multiple prompted parameters to retrieve the data they need.
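The last point, that users see only the reports they are authorized to use and supply prompted parameters when running them, can be pictured with a small Python sketch. This is a generic illustration of role-based visibility and parameter prompting, not JasperServer's API. The report catalog, role names, and function names are all hypothetical.

```python
# Hypothetical report catalog: report names, roles, and parameter names
# are invented for illustration and do not come from JasperServer.
REPORTS = {
    "sales_summary": {"roles": {"sales", "executive"}, "params": ("region", "quarter")},
    "payroll_detail": {"roles": {"hr"}, "params": ("month",)},
    "pipeline": {"roles": {"sales"}, "params": ()},
}


def visible_reports(user_roles):
    """List the reports that at least one of the user's roles may use."""
    roles = set(user_roles)
    return sorted(name for name, spec in REPORTS.items() if spec["roles"] & roles)


def run_report(name, user_roles, **answers):
    """Check authorization and prompted parameters, then describe the run."""
    if name not in visible_reports(user_roles):
        raise PermissionError(f"not authorized to run {name!r}")
    required = REPORTS[name]["params"]
    missing = [param for param in required if param not in answers]
    if missing:
        raise ValueError(f"missing prompted parameters: {missing}")
    return {"report": name, "parameters": {param: answers[param] for param in required}}


if __name__ == "__main__":
    print(visible_reports(["sales"]))
    print(run_report("sales_summary", ["executive"], region="East", quarter="Q1"))
```

Filtering the catalog before display, rather than rejecting requests afterward, is what keeps unauthorized reports out of the user's view altogether.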


Table 6 — JasperSoft Technical Overview (Source: JasperSoft)

JasperReports JasperServer JasperAnalysis JasperETL Overview Product Report design Interactive Interactive analysis/ Data integration/ classification and execution and managed OLAP server ETL reporting Architecture Java API (Library): Servlet Engine, Web services; Used in Server, Future: Portal integration Java Application, Applet JasperReports JasperServer Architecture Architecture Language Java Java Java Java (generates Perl) APIs Java Java, JSP, SOAP, XML/A, Java Command line HTTP, Web services Embeddable Yes Yes Yes Yes Standalone Yes Yes Yes Yes User Features Output HTML, PDF, RTF/ HTML, PDF, RTF/ HTML, PDF, XLS Perl code Word, XLS, CSV, Word, XLS, CSV, XML XML Parameterized Programmable Built-in and reports extensible Ad hoc Yes, using iReport Yes, using iReport reporting or Web interface Interactive Yes data analysis Dashboard Programmable Server Features BI objects Yes, can access Yes Yes Pro. only repository using iReport plug- in for JasperServer Scheduler Yes Yes Administration Yes, can Yes Yes Yes administer using iReport plug-in for JasperServer Security Yes Yes Yes Clustering Yes Yes Planned Data sources JDBC, EJB, POJO, JDBC, Future JDBC, JNDI, Bean 30+ connectors XML Extensions Custom data Yes Programmable Yes sources Data caching Yes Yes and reuse


Table 6 — JasperSoft Technical Overview (continued) (Source: JasperSoft)

JasperReports JasperServer JasperAnalysis JasperETL Report Design Features Graphical iReport iReport and ad Yes designer hoc Web interface Report definition JRXML JRXML language Scripting Java, Groovy Java, Groovy Perl language Query language SQL, Hibernate SQL, Hibernate MDX SQL (HQL), XPath (HQL), XPath (XML), EJBQL, (XML), EJBQL, MDX MDX Query builder SQL, MDX SQL, MDX SQL Business Features License LGPL and GPL and GPL and GPL and commercial commercial commercial commercial OEM-able Yes Yes Yes Yes Technical Yes Yes Yes Yes support Consulting Yes Yes Yes Yes services Training Yes Yes Yes Yes

JasperServer includes the following scheduling and distribution features:
• Time-zone selection
• Scheduled run begin and end dates
• Maximum number of recurrences
• Run report every x minutes, hours, days, weeks
• Calendar-based recurrence: every day/weekday/day of month
• Output to PDF, HTML, Excel, RTF
• E-mail notifications with report links or optional report attachments

JasperServer Professional feature extensions include ad hoc reporting.

JasperAnalysis

JasperAnalysis provides capabilities to explore causes, trends, patterns, anomalies, and correlations using a standard Web browser. It is designed to complement and extend the power of JasperReports by providing drill-down, “slice and dice,” pivot, filter, and charting capabilities. It enables dynamic drill-down to detail data and is accessible through an easy-to-use Web interface designed for business users. JasperAnalysis is backed by a high-performance relational OLAP (ROLAP) server engine, and is MDX- and XML/A-compliant. It is pre-integrated with JasperServer.

Features include:
• Analytic operations
• Sort across/within hierarchy
• Show empty rows/columns
• Swap axes (pivot)
• Show chart
• Show source data/drill-through to detail


• Color-coded exception alerts
• Change data cube
• Filter
• Show MDX query
• Configurable output to Excel or PDF
• OLAP server administration
• Manage/tune OLAP server configuration
• Flush OLAP cache
• Secure user and role-based permissions

JasperAnalysis Professional feature extensions include drillable charts, an enhanced user interface, and an OLAP server management utility.

JasperETL

JasperETL is based on the open source Talend engine and provides a complete and ready-to-run data integration platform with high-performance data ETL capabilities. JasperETL works with the JasperSoft BI Suite but can also be used in standalone mode to provide comprehensive ETL capabilities for other applications and systems.

JasperETL simplifies and automates data integration through easily created, managed, and maintained data integration processes. It provides an intuitive GUI and is usable by small, medium-sized, and large organizations. It simplifies and standardizes data warehouse/data mart updates, thus simplifying and standardizing the end-user experience. JasperETL can routinely extract, transform, and load data from operational systems into a “star schema”–style database, where it can be safely and quickly accessed for interactive end-user reporting and analysis.

JasperETL meets operational data integration needs that include data consolidation, duplication, synchronization, quality, migration, and change data capture. Components include:
• Job Designer: provides a graphical editor and functional view of the ETL process
• Transformation Mapper: provides a graphical editor and view of complex mappings and transformations
• Business Modeler: provides a nontechnical graphical view of the business information workflow

Numerous included connectors (30-plus) permit output and input from and to many different data sources including flat files, XML files, all databases, POP and FTP servers, and more. Included metadata configuration wizards help configure heterogeneous data sources and complex file formats including positional, delimited, CSV, RegExp, XML, and LDIF data.

JasperETL Professional feature extensions include a multiuser metadata repository that enables team development of data integration projects.

JasperReports

JasperReports is a feature-rich, high-performance report development and execution product for developers and end users. It consists of a powerful graphical report design tool and a comprehensive XML-based report definition and execution library. It is a high-performance and massively scalable standards-based system supported by a large and active community of report designers and developers. It is capable of creating dashboards, tables, crosstabs, and charts with complex layouts for screen or print. It supports flexible and extensible data sources and a wide range of output formats. Built-in virtualization enables output of arbitrarily large reports, limited only by disk storage resources. JasperReports also delivers detailed printer-ready output and is often used for detailed forms, invoices, and other complex operational reports.

The iReport graphical report designer component for JasperReports provides easy access to all JasperReports capabilities. iReport is a powerful Java client application that includes optional integration with the popular Eclipse integrated development environment (IDE). It provides capability to retrieve, store, and modify reports when

used with JasperServer and Jasper4Salesforce.

Features include:
• Dashboards, tables, crosstabs, charts, and gauges
• Report output in PDF, XML, HTML, CSV, XLS, RTF, and text
• Page-oriented or continuous output for screen or print
• Integrated bar code support
• Visual text rotation
• Styles library
• Drill-through/hypertext links, including support for PDF bookmarks
• No limit to report size

The iReport graphical report and chart designer component provides:
• Comprehensive library of chart types, including meter, thermometer, and multi-axis charts
• Built-in expression builder with syntax checker, object methods list, and wizards
• Graphical query builders for SQL and MDX
• Ability to be used in Eclipse or as a pure Java (Swing) application
• Ability to build, test, and run JasperReports from the desktop environment
• Preintegrated with the JasperServer repository

Complex reports and report features are supported, along with internationalization.

JasperReports Professional includes the JasperReports reporting library, iReport graphical report designer, and updates for the iReport User Manual and JasperReports Ultimate Guide.

Jasper4Salesforce

Jasper4Salesforce delivers advanced on-demand reporting for Salesforce.com. This includes exception reports, reports on any combination of standard and custom objects and fields, drag-and-drop ad hoc, and highly complex and ready-to-print reports. It provides on-demand, subscription-based usage and advanced reporting; it is Salesforce.com-certified at the professional, enterprise, and unlimited levels.

Features include:
• Exception reports
• Flexible queries
• Integration of graphs and charts directly into other areas of Salesforce.com
• Drag-and-drop ad hoc reporting
• Ability to incorporate graphical report designer
• High performance
• Advanced report management

Jasper4Salesforce also includes a library of predefined reports.

JasperSoft BI Applications

JasperSoft and its partners offer prebuilt BI applications that integrate with or are embedded in popular applications. With these applications, any CRM, ERP, or other application can have complex reporting, interactive reporting, integrated analysis, visual dashboards, and data integration to and from other data sources.

BIRT

BIRT is an open source Eclipse-based reporting system that integrates with Java/J2EE applications to produce sophisticated reports. It provides core reporting features such as report layout, data access, and scripting. The BIRT Project is a top-level project of Eclipse (www.eclipse.org). The current released version is 2.1.2.

The BIRT Project addresses a wide range of reporting needs within a typical Java application. BIRT aims to address the problem of ad hoc reporting capabilities by providing Eclipse-based open source and extensible tools and frameworks that allow developers to easily incorporate reporting functionality.

The BIRT Project’s initial releases have focused upon reporting. Other projects will expand BIRT into other areas of BI such as data modeling, ETL, APIs, and frameworks to build business-user query tools for flexible ad hoc access to data, as well as analysis tools.


BIRT is built on the open source Eclipse platform. Eclipse is an open source software development project dedicated to providing a robust, full-featured, commercial-quality, industry platform for the development of highly integrated tools. The Eclipse platform is designed for building IDEs that can be used to create diverse C++- and Java-based applications.

Actuate Corporation also provides Actuate BIRT, a commercial version of the BIRT Project. It is now MySQL Network Certified and a recommended reporting and analytics component of a MySQL data warehouse scale-out solution set. Actuate BIRT delivers functionality that is identical to the Eclipse BIRT project, its open source (EPL) equivalent.

Actuate BIRT includes reporting and charting offerings that are immediately available on an annual subscription pricing basis.

BIRT features include:
• A palette of report components: text (character large object, or CLOB; HTML), data, images (binary large object, or BLOB), tables, grids, lists, labels
• Sorting, grouping, filtering, conditional highlighting, mapping
• Scripting in JavaScript/Java
• Cascading and dynamic report parameters
• Hyperlinking, bookmarks
• TOC, paging
• A direct XML source editor for report design
• Multipass processing (top N/bottom N)
• Data source access including POJOs, JDBC, CSV, XML
• A WYSIWYG editor
• An integrated chart wizard
• Report component libraries
• Report templates
• Styles, CSS import, themes
• Wizards for guided development
• Report outline capability
• Preview within report designer for iterative development
• Context pass-through to data source
• Capability to call stored procedures

The BIRT Project is intended to provide a next-generation reporting technology with a Web-centric design metaphor that is open source and extensible and provides an XML report design format. It is designed to act as a foundation for commercial products. The project’s chief sponsors are Actuate, IBM, and InetSoft.

In its current version, BIRT is designed to provide a personal desktop report development tool or reporting technology for integration into corporate Web applications, corporate desktop applications, ISV Web applications, or ISV Eclipse-based applications.

Extensibility is a key principle for the BIRT Project. It provides:
• Data source extensibility
• Application-specific, design-time query builders
• Custom design-time and runtime data access
• Custom business logic extensibility
• Capability to incorporate complex business logic scripting
• Capability to access existing and new Java code
• Visualization extensibility
• Capability to build new visual data presentation “widgets”
• Capability to extend charting with new chart types; new output formats
• Capability to target report output for specific devices and formats

The next major release, 2.2, is targeted for June 2007.

Eclipse Report Designer (ERD)

The ERD is an Eclipse-based desktop authoring environment for report development. Eclipse Report Designer enables application and report developers to create simple and complex reports for use within their organization.


The tool caters to the broad range of report development skills, from the nonprogrammer report developer focused on report layout to the application developer looking for sophisticated control over report creation.

Eclipse Report Engine (ERE)

The ERE allows Java application developers to quickly integrate powerful report generation and viewing capabilities into their applications without having to build the infrastructure from lower-level Java components.

The Eclipse Report Engine project enables reports to be generated using the XML report designs created by the Eclipse Report Designer, Web Based Report Designer, or any other tool. To support this, the ERE provides two core services: generation and presentation.

The generation service within the ERE is responsible for connecting to the specified data source(s), retrieving and processing the data (sorting, grouping, aggregations, and so on), creating the report layout, and generating the report document.

The presentation service within the ERE provides a rich set of viewing capabilities for report content. This includes the infrastructure for viewing a document online, for printing a document, and for generating alternate output documents such as PDFs.

Eclipse Chart Engine (ECE)

The ECE project provides a rich business-chart generation capability that can be used as a standalone charting component and also provides the chart generation service within the ERE project. Visual presentation of business data in the form of charts is a common and key aspect of many reports and other forms of business intelligence. As such, a robust charting capability is essential within the overall BIRT Project.

Web Based Report Designer (WRD)

The WRD project delivers a fully customizable and extensible 100% HTML-based tool for creating reports with basic layouts and data manipulation.

The tool will leverage components such as style sheets and templates created using the ERD. In addition, the tool will provide an Eclipse-based customization facility for the user interface for full branding and embedding within Java applications.

The goal of this project is to provide an accessible and easy-to-use report design environment to meet the needs for ad hoc report creation by business users within any Java application.

Comparisons

Comparisons between open source BI suites can be difficult because the suites frequently either incorporate or interoperate with the same open source products, and both the supporting communities and companies involved are in active communication with each other. The result is to create a different spin on competition. The major focus is on collaboration, and products need to be selected according to how well they fit company needs. Implementation and integration issues are often of greater consequence than the software platform selected.

The Pentaho BI Project is currently the most comprehensive BI suite. It includes reporting, analysis, dashboards, data integration, data mining, and a BI platform. JasperSoft is the second most comprehensive, but it lacks the data mining capabilities of Pentaho and focuses upon reporting. Movement between open source components and commercial versions or upgrades is also important and differs among open source BI solutions. The upgrade path from BIRT is to the proprietary Actuate iServer. For JasperSoft, the upgrade path is also to a proprietary system, though it is closer to the open source version and there is more versatility in upgrade paths. Similarly, Pentaho’s Professional Edition builds on and extends Pentaho’s open source capabilities with additional features added for mission-critical or large-scale deployment.

JasperSoft’s approach differs from that of Pentaho and BIRT in that

it focuses more on embeddable software. Embedding eases integration, extends functionality, and stimulates a strong partnership community. The JasperSoft BI Suite is already integrated with products from innovative companies like Novell, Salesforce.com, SpikeSource, and SugarCRM. Additional partners include BEA, IBM, Oracle, Sun, Eclipse Foundation, EnterpriseDB, JBoss Open Source Federation, and MySQL.

BIRT’s strongest feature is its association with Eclipse. Eclipse is an active, growing open source development organization with support from IBM, HP, SAP AG, and other major players in the IT industry. BIRT is developed by the Eclipse community and features tight integration with Eclipse. The BIRT committee has an active dialog with these other projects to understand where BIRT can leverage existing or proposed functionality.

Other areas of comparison are in overall infrastructure, or how the BI platform brings the open source components together. Infrastructure factors vary in areas such as security, administration, auditing, failover, scalability features, portal, and other functions.

OTHER TOOLS

While Pentaho, JasperSoft, and BIRT have shared the limelight recently, there are other open source BI efforts in development that are worthy of note. Three of these are the Bizgres Project, the BEE Project, and the Jedox Palo-Server.

The Bizgres Project

The Bizgres Project is a commercially sponsored and community-supported open source project (www.bizgres.org). The main supporters currently are Greenplum, JasperSoft, Kinetic Networks, and Loyalty Matrix. The goal of Bizgres is to build a complete database system for business intelligence exclusively from free software.

Bizgres is targeted at propelling PostgreSQL into practical, real-world use within mainstream businesses needing high-quality RDBMSs for business intelligence. Bizgres is designed to make PostgreSQL an alternative to Oracle, Sybase, Informix, and Microsoft proprietary databases.

The stated goals of Bizgres are to:

• Create a complete database system for business intelligence with capabilities exceeding competing database systems, built for and by the community
• Build the database system completely from open source software
• Provide a robust development platform for building software, particularly open source software
• Emphasize usability and a “just works” philosophy in selecting default configuration and designing features
• Include a range of popular packages, beyond those included in Greenplum’s commercially supported products
• Create an environment where commercial vendors can easily get involved to make PostgreSQL a supported platform for their offering
• Promote a global perspective by supporting as many languages and geographic locales as possible
• Provide binary releases and a robust build environment to enable business users to easily install and test Bizgres

The BEE Project

BEE is a suite of tools supporting BI project implementation within midsized companies under an open source GPL license. It is being developed by Insight Strategy (http://bee.insightstrategy.cz). It aims to optimize data storage for analysis and to focus on ETL process and multilayer applications. The architecture is developed on the ROLAP methodology with the aim of covering projects having data volumes up to 50 GB. BEE is released under the GNU General Public License, with a stable version of the product available under a commercial license (similar to MySQL). For the commercial product, technical support is provided.
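The ROLAP approach mentioned for BEE can be illustrated in general terms: facts and dimensions are kept in ordinary relational tables, and multidimensional questions are answered with SQL joins and aggregation. The following is a minimal sketch of that idea only. It does not reflect BEE's actual schema or interfaces; the table names, column names, and sample figures are all hypothetical, and SQLite is used purely as a convenient stand-in for a relational database.

```python
import sqlite3


def build_demo_warehouse():
    """Create an in-memory star schema: one fact table and two dimensions.

    All table names, column names, and values are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_month  (month_id  INTEGER PRIMARY KEY, label TEXT);
        CREATE TABLE fact_sales (region_id INTEGER, month_id INTEGER, amount REAL);
        INSERT INTO dim_region VALUES (1, 'North'), (2, 'South');
        INSERT INTO dim_month  VALUES (1, '2007-01'), (2, '2007-02');
        INSERT INTO fact_sales VALUES
            (1, 1, 100.0), (1, 2, 150.0), (2, 1, 80.0), (2, 2, 70.0), (1, 1, 50.0);
    """)
    return conn


def sales_by(conn, dimension):
    """Roll the fact table up along one dimension using a SQL GROUP BY."""
    queries = {
        "region": """SELECT r.name, SUM(f.amount)
                     FROM fact_sales f
                     JOIN dim_region r ON r.region_id = f.region_id
                     GROUP BY r.name ORDER BY r.name""",
        "month": """SELECT m.label, SUM(f.amount)
                    FROM fact_sales f
                    JOIN dim_month m ON m.month_id = f.month_id
                    GROUP BY m.label ORDER BY m.label""",
    }
    return conn.execute(queries[dimension]).fetchall()


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    print(sales_by(warehouse, "region"))
    print(sales_by(warehouse, "month"))
```

Because the aggregates are computed by the database at query time, this style depends heavily on relational query performance, which is one reason a ROLAP design might state a target data volume.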


The BEE Project ETL tool is based on simple daemon services, ensuring distributed data transmission and transformation. The environment is administered by a centralized supervising application. The suite is designed for distributed processing with robust encryption. A GUI for administration of ETL processes and modeling is available within the system.

The ROLAP server provides its functionality through a SOAP application interface for a potentially rich set of client applications. The incorporated Web portal has been designed as a primary user interface for report design, presentation, and data manipulation through a Web browser. It uses XML, HTML, CSS, and JavaScript to aid in communications and ensure easy administration of the whole environment. Integration with tools of the R Project for Statistical Processing is available.

The BEE project also includes the BeeWebAnalyser 1.0, a new open source tool that uses its own ROLAP technology for data analysis and decision support. It is embedded in the BEE Project decision support system and enables a detailed interactive analysis of Web page visits (Apache Web server logs) using a broad range of different customizable views.

Jedox Palo-Server

Palo-Server is a cell-oriented, multidimensional OLAP (MOLAP) data server, specifically developed for spreadsheet data storage and analysis. Palo-Server is sponsored by Jedox GmbH (www.jedox.com), which develops and markets Worksheet-Server, an Excel-to-Web solution with support for Palo-enabled Excel Workbooks. Palo-Server provides a central database for enterprise spreadsheets in Microsoft Excel and offers interfaces to many third-party databases provided by SAP, IBM, Microsoft, and Oracle.

Palo-Server is based on spreadsheet cells, rather than operating as a traditional relational database. It offers a powerful, multidimensional data model based on OLAP principles (data cubes). A hierarchical data storage structure allows for incremental data inputs into central hierarchies with results computed automatically.

It can be installed locally or in a company network. In network mode, all users work with the same data; changes in one spreadsheet are immediately visible on other workstations in the intranet.

In addition to providing multidimensional data analysis of existing data, Palo-Server allows inputting of new data to the dataset. This helps differentiate Palo-Server from relational OLAP servers.

SELECTION AND IMPLEMENTATION

Selection of an open source BI approach requires careful determination of the areas in which analysis will prove most fruitful, the actual budget available, and factors of existing infrastructure. Open source systems are provided on the basis of low initial cost and simplicity. In some cases, implementation will be relatively straightforward. If extensive data integration is required, however, a simplified approach may be impossible. Working through the possibilities in this case will almost invariably require a consultant.

Implementation Costs

Implementation and deployment costs for BI projects can contain a high proportion of consulting fees. Depending on complexity, there may also be development, customization, and integration costs. One of the most important and expensive areas of deployment is creation and integration of a data warehouse, if necessary. This is likely to involve a considerable effort, and cost depends upon the type of data required, where it is located, and how it is transformed.

Light open source BI solutions implemented without a data warehouse need to be evaluated carefully, as they carry the risk of developing multiple versions of data that will result in derived figures and analysis being out of sync. This can be remedied through a variety of technologies and data usage rules, but the solution itself can create additional cost. Light solutions may also fail to provide the advantages that

would be available from a full BI suite approach.

Added Requirements

In addition to implementation concerns, deployments of open source BI capabilities require careful consideration of the areas around the implementation. Although cost is a key factor, it is important to avoid unnecessary sacrifices in the rush to find a solution. Key additional areas to consider include:

• Security
• Training
• Scaling

Security issues are always a concern in implementation of a database solution, particularly since the BI solution inevitably incorporates critical data. Access needs to be guarded, and if the solution is delivered online, adequate safeguards need to be installed on the network to prevent break-ins.

Training is important and must be considered from the beginning. Not only must information be made available in an easily understandable way, but users must be trained in how to make the most of the system.

Scaling is important in keeping costs low, because cost will grow according to the breadth of the project and to who must have access to the data. Analysis of possible candidates should be rigorous and should include a careful evaluation of appropriateness, compatibility with existing systems, availability of implementation and maintenance help, availability of training, and any need to modify systems or data to meet requirements of the proposed solution.

CONCLUSION

Open source BI software has come into focus recently as a result of both new advances and increasing need. The software itself is starting to provide useful analytic capabilities in preintegrated suites that fill out capabilities and reduce the problems of selection and implementation. At the same time, the power and sophistication of components, including the open source database products, are continuing to develop. These technological developments are coming at a time when the need for low-cost, testable, and more easily used BI solutions is becoming apparent. The areas in which this need is most visible include a wide range of smaller businesses, which require BI solutions to handle an ever-increasing information and analysis workload; the extension of BI possibilities into nontraditional areas; and use as a directly embedded component in software.

Open source BI does not necessarily compete with the major commercial solutions because it is generally targeted at a lower level of user. It does, however, put price pressure on BI solutions aimed at the lower end of the market. The combined commercial/open source initiatives represented in each of the major suites demonstrate how they are moving into this sector. Offering a highly standardized system with low initial cost makes it possible for companies to experiment with the technology without large-scale expenditure and still ensure that anything developed can be integrated with a more comprehensive offering in the future.

The jury is still out with respect to the overall adequacy of open source BI solutions, particularly for larger organizations. The software has only very recently reached a level where it is likely to provide adequate and reasonably complete capabilities. However, the suite projects, and open source BI software in general, are currently in rapid development mode. They have ambitious roadmaps for rolling out advanced features, particularly in analytics. Lesser-known open source “competition” is continuing to develop, and many integrated solutions are likely to fall below the radar entirely as integrators and consultants build their own suites out of available components.

One thing is certain: open source BI is beginning to have a positive impact upon the development and implementation of BI solutions in general. By providing an alternative model for pricing and delivery, it is shaking the complacency out of the market and yielding new ideas. This is an area of great potential that needs to be watched for developments in the near future.


ABOUT THE AUTHOR

Brian J. Dooley is an author, analyst, and journalist with more than 20 years’ experience in analyzing and writing about IT trends. He has written six books, numerous user manuals, hundreds of reports, and more than 2,000 magazine features. Mr. Dooley is the founder and past president of the New Zealand chapter of the Society for Technical Communication. He initiated and is on the board of the Graduate Certificate in Technical Communication program at Christchurch Institute of Technology, and he is on the editorial advisory board for Faulkner Technical Reports. Mr. Dooley currently resides in New Zealand, where he maintains a Web site at http://bjdooley.com. He can be reached at [email protected]

