JAVASCRIPT CHARTING

Scaling for the Enterprise with Metric Insights

© 2013 Copyright Metric Insights, Inc.

A REVOLUTION IS HAPPENING
    Challenges
    Borrowing From The Enterprise BI Stack
    Visualization Layer
    Source Data Layer
        RDBMS sources
        Big Data sources
        SaaS and Cloud Application data
        Business Intelligence Tool data
    Data Connection Layer
        Load Management
        Error Handling
        Schedules
        Data Dependencies
    Data Caching Layer
        Data Formatting
        Data Thinning
    Lifecycle Management
    User Management and Security Layer
    Alerting & Email Distribution Layer
    Collaboration Layer
METRIC INSIGHTS SOLUTION
BI TOOLS DON’T SUPPORT THIS

A REVOLUTION IS HAPPENING

A revolution is happening. New JavaScript Visualization libraries now make it possible for anybody to create beautiful and powerful web-based visualizations of data. The days of requiring a team of developers or expensive business intelligence tools are over. Inspired by the cool, interactive graphics in Nate Silver’s NYT election blog, web developers and business intelligence professionals are embracing these technologies across the enterprise. A visualization smorgasbord awaits us all, but a number of challenges are holding the feast in check.

Challenges

The accessibility of the JavaScript Visualization libraries makes it easy for anybody with a little programming expertise and a sense of design to build engaging visualizations. The challenge arises when you go beyond a handful of visualizations with static data sets to an enterprise-scale solution with many visualizations, changing data from different data sources, and scores of users with different consumption needs and security requirements.

Most of the JavaScript Visualization examples on the web today show a dynamic chart with a static data set. But what has to happen to scale that across the enterprise? How do you load balance across multiple data sources? How do you ensure the wrong users do not see sensitive data? How do you go to sleep at night knowing that you have a stable system and will wake up in the morning with happy users?

Borrowing From The Enterprise BI Stack

It turns out that traditional business intelligence (BI) tools have solved many of these problems. The traditional BI stack is worth examining to identify the components that a JavaScript Visualization deployment needs to address. Most notable in the stack diagram comparison below is the number of areas that are missing from a JavaScript Visualization deployment. Each of these areas must be taken into consideration to create an enterprise-scale JavaScript Visualization solution.

Visualization Layer

The traditional BI stack has a proprietary visualization component built in as part of the solution. Typically, you can choose different visualizations from a pre-set library, but you don’t have an option to change the libraries.

A major advantage of a JavaScript Visualization solution is that you can choose the visualization library that makes the most sense for you. JavaScript Visualization libraries abound, and different libraries are optimized for different purposes and audiences, so choose the library that fits your needs. For example, commercial support might be most important for your organization. Or, perhaps, creating visualizations with a unique look and feel might be your highest concern.

Another advantage of JavaScript Visualization libraries is that you can easily combine different visualization libraries to leverage the strengths of each library to most elegantly solve a given visualization challenge.

Some of the more popular libraries include:

• Highcharts - www.highcharts.com
• FusionCharts - www.fusioncharts.com
• D3 - www.d3js.org
• Google Charts - developers.google.com/chart/

There are many other libraries with different purposes as well. For a more comprehensive list, see this article: 50 JavaScript Libraries for Charts and Graphs

Source Data Layer

All enterprises are characterized by a growing diversity of data sources. The ideal of a single data warehouse that captures all the data in an enterprise has quickly given way to the reality of multiple data silos, as new services are implemented at departmental levels and new SaaS applications are introduced. The following are some considerations for the different types of data sources.

RDBMS sources

SQL Server, Oracle, MySQL, and many other relational databases are deployed in most enterprises. A successful JavaScript Visualization system must be able to connect through an ODBC or JDBC connection and reliably pull data on a regular basis.

Big Data sources

Hadoop, MongoDB, Cassandra, and other Big Data sources are now becoming common additions to most data-driven companies. Each of these has particular attributes that must be taken into consideration when creating robust JavaScript Visualizations based on these sources. For example, the long query times that are a common characteristic of such systems mandate some sort of caching layer to shield end users from long waits. Further, load control is important, as such systems typically can’t handle a simultaneous spike in user requests.

SaaS and Cloud Application data

Salesforce.com, Google Analytics, Omniture, Twitter, and many other cloud applications are a treasure trove of corporate data, but can be difficult to access. Pulling data from each of these systems typically requires custom development against that system’s proprietary API. Doing regular data dumps via CSV files is another common alternative, but it requires a batch file management system.

Business Intelligence Tool data

Tableau Software, Business Objects, SAS, IBM Cognos, MicroStrategy, and other business intelligence tools typically hold key corporate data that is already defined in a way consistent with corporate goals. Getting access to such data, and combining it with other data, is a key success factor for JavaScript Visualization deployments. There are two ways to get access to such data: connecting directly to the underlying data sources (RDBMS, Big Data, etc.) or connecting directly to the reports in the Business Intelligence tools. For the latter, each Business Intelligence tool typically has an API that can be used to build a custom script to pull data out of reports.

Data Connection Layer

Getting the data is just the beginning of the process, but this in itself requires effort. Data often resides in multiple places, so the developer must understand how the data is stored and how it is related, and then write the appropriate scripts to extract what is important. These scripts must be able to incrementally update data as new information becomes available in the underlying source.

Load Management

As you roll out more and more data visualizations, it is necessary to manage query load so as not to overwhelm source systems. At scale, this means controlling query processing so that no more than a maximum number of data fetches run concurrently. The logic to support intelligent query load balancing across multiple data sources can be complex to build and time-consuming to maintain.
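As a rough illustration of the concurrency cap described above, the following TypeScript sketch runs a list of data fetches while keeping at most a fixed number in flight. The names `FetchTask` and `runWithLimit` are hypothetical and chosen for this example only. A production system would also need per-source limits, prioritization, and error handling.

```typescript
// Hypothetical helper, not part of any library: runs data fetches with at
// most `maxConcurrent` of them in flight at any moment.
type FetchTask<T> = () => Promise<T>;

async function runWithLimit<T>(
  tasks: FetchTask<T>[],
  maxConcurrent: number,
): Promise<T[]> {
  const results: T[] = [];
  let next = 0; // index of the next task that has not been started

  // Each worker repeatedly claims the next unstarted task until none remain.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      const task = tasks[index];
      if (task !== undefined) {
        results[index] = await task(); // results keep the original task order
      }
    }
  }

  // Start no more workers than the cap allows (and at least one).
  const workerCount = Math.max(1, Math.min(maxConcurrent, tasks.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) {
    workers.push(worker());
  }
  await Promise.all(workers); // rejects if any fetch rejects
  return results;
}
```

One way to extend this sketch to several source systems is to keep a separate pool, with its own limit, for each source, so that a slow source does not consume the capacity of the others.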

Error Handling

A scalable environment must deal gracefully with situations where source systems become unavailable or data sources have not been updated as expected. Capturing errors, building in retry logic, and reloading data automatically all contribute to a scalable solution. An enterprise-scale solution must also detect source data that is stale and inform users that the chart data is out of date.
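The retry and staleness checks described above might look something like the following TypeScript sketch. It assumes a hypothetical `CachedDataSet` shape and an exponential backoff between attempts. The function names, default values, and backoff policy are illustrative assumptions, not a description of any particular product.

```typescript
// Hypothetical shape of a cached data set; used for illustration only.
interface CachedDataSet<T> {
  rows: T[];
  fetchedAt: number; // epoch milliseconds of the last successful load
}

// Retry a fetch up to `maxAttempts` times, doubling the wait after each failure.
async function fetchWithRetry<T>(
  fetchRows: () => Promise<T[]>,
  maxAttempts: number = 3,
  baseDelayMs: number = 1000,
  now: () => number = Date.now,
): Promise<CachedDataSet<T>> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { rows: await fetchRows(), fetchedAt: now() };
    } catch (error) {
      lastError = error; // capture the error so the caller can report it
      if (attempt < maxAttempts) {
        const delayMs = baseDelayMs * Math.pow(2, attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError; // every attempt failed
}

// A data set is stale when it is older than the maximum age allowed for it.
function isStale(
  dataSet: CachedDataSet<unknown>,
  maxAgeMs: number,
  now: number = Date.now(),
): boolean {
  return now - dataSet.fetchedAt > maxAgeMs;
}
```

A chart page could call `isStale` before rendering and, when it returns true, show an "out of date" notice next to the chart rather than presenting old data as current.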

Schedules

No enterprise use case involves extracting data only once for your charts. It is necessary to manage data updates as new information becomes available. Usually, different source data should be refreshed on different schedules. For example, weekly charts may require updates only once the week is complete. A robust system to manage multiple data load schedules is necessary.

Data Dependencies

Most organizations must manage complex dependencies around collecting data for visualizations. There are often multiple processes that load an underlying data warehouse, and you must ensure that you pull the data for a visualization only when it is in a final and consistent state. Managing complex dependencies across multiple data sources and systems can be a daunting challenge.
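One simple way to express the "pull only when final" rule is a readiness check against the completion status of each upstream load. The TypeScript sketch below assumes a hypothetical `LoadStatus` record that some scheduler or warehouse log would provide. The field names and the notion of a "period start" are assumptions made for the example.

```typescript
// Hypothetical status record for an upstream load process.
interface LoadStatus {
  name: string;
  completedAt: number | null; // epoch ms of the last completed run, or null
}

// Data for a visualization is safe to pull only when every upstream load it
// depends on has completed at or after the start of the current period.
function readyToPull(
  dependsOn: string[],
  statuses: LoadStatus[],
  periodStart: number,
): boolean {
  return dependsOn.every((name) => {
    const status = statuses.find((s) => s.name === name);
    return (
      status !== undefined &&
      status.completedAt !== null &&
      status.completedAt >= periodStart
    );
  });
}
```

An unknown dependency name is treated as "not ready", which errs on the side of not showing inconsistent data.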

Data Caching Layer

Getting the relevant data out of source systems and caching it in a way that makes it easily accessible and usable by a JavaScript Visualization library is key to any enterprise-scale deployment. This must be managed over multiple data sources, and the data must be prepared for each visualization to be rendered.

Data Formatting

Once extracted, the data needs to be structured in a format that can drive the visualizations, such as JSON objects or simple file formats such as CSV.
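As a small example of this formatting step, the TypeScript sketch below turns a simple CSV extract into a series object that can be serialized to JSON. The `Series` shape (a name plus `[timestamp, value]` pairs) is an assumption for illustration, although many charting libraries accept something similar. The parser assumes a header row, ISO dates (`YYYY-MM-DD`, which `Date.parse` treats as UTC), and no quoted or escaped fields.

```typescript
// Hypothetical series shape used for illustration.
interface Series {
  name: string;
  data: [number, number][]; // [epoch milliseconds, value] pairs
}

// Convert simple CSV text (header row, then "date,value" rows) into a series.
// Assumes no quoted fields; real extracts may need a full CSV parser.
function csvToSeries(name: string, csv: string): Series {
  const rows = csv.trim().split(/\r?\n/).slice(1); // drop the header row
  const data = rows.map((row): [number, number] => {
    const fields = row.split(",");
    return [Date.parse(fields[0] ?? ""), Number(fields[1] ?? "")];
  });
  return { name, data };
}

// Serialize for delivery to the browser.
function seriesToJson(series: Series): string {
  return JSON.stringify(series);
}
```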

Data Thinning

JavaScript charting inherently runs in the browser, and therefore has a limit to the amount of data it can deal with in one go. When scaling charting to the enterprise, you are often dealing with large data sets where data thinning is required. For instance, you may want to visualize data collected every minute over long time periods. Rather than overloading the browser with millions of data points, a scalable approach requires that the data be “thinned” so as to reduce data volumes without removing any key data points in the trendline.
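There are several ways to thin a series. One common technique, shown in the TypeScript sketch below, is min/max decimation: split the points into equal-sized buckets and keep only the lowest and highest point of each bucket, so that peaks and dips survive. This is offered as one possible approach under simple assumptions (points already sorted by x), and is not necessarily the method any given product uses. The name `thinMinMax` is hypothetical.

```typescript
type Point = [number, number]; // [x, y], assumed sorted by x

// Keep the lowest and highest point of each bucket so extremes are preserved.
// Returns at most 2 * bucketCount points.
function thinMinMax(points: Point[], bucketCount: number): Point[] {
  if (bucketCount < 1) {
    throw new RangeError("bucketCount must be at least 1");
  }
  if (points.length <= bucketCount * 2) {
    return points.slice(); // already small enough; nothing to thin
  }
  const bucketSize = Math.ceil(points.length / bucketCount);
  const thinned: Point[] = [];
  for (let start = 0; start < points.length; start += bucketSize) {
    const bucket = points.slice(start, start + bucketSize); // never empty here
    const lowest = bucket.reduce((a, b) => (b[1] < a[1] ? b : a));
    const highest = bucket.reduce((a, b) => (b[1] > a[1] ? b : a));
    if (lowest === highest) {
      thinned.push(lowest); // flat or single-point bucket
    } else if (lowest[0] <= highest[0]) {
      thinned.push(lowest, highest); // keep points in x order
    } else {
      thinned.push(highest, lowest);
    }
  }
  return thinned;
}
```

For example, a year of per-minute readings (roughly 525,600 points) thinned with `bucketCount` set to 1,000 would yield at most 2,000 points, while still including the highest and lowest reading in each bucket.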

Lifecycle Management

A major difference between building a visualization on a static data set and building one on a dynamically changing data set is the need for ongoing maintenance of the solution. It is necessary to ensure that data loads incrementally every period and that old data is purged at the appropriate time. As underlying data sources are changed, updated, or replaced, the design of the data management layer needs to be able to evolve as well.

User Management and Security Layer

A necessity for enterprise-scale solutions is to have access control lists and users with different viewing and editing privileges. Further, the ideal solution is to be integrated with the enterprise’s existing LDAP, Active Directory, or other access control systems already in place.
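To make the idea of viewing and editing privileges concrete, the TypeScript sketch below checks a user's group memberships against an access control list attached to a visualization. The types and the rule that edit privilege implies view privilege are assumptions for this example. In practice the group names would come from the directory service (for example LDAP or Active Directory) rather than being hard-coded.

```typescript
type Privilege = "view" | "edit";

// Hypothetical access control entry: one group and the privilege it grants.
interface AccessControlEntry {
  group: string; // e.g. a group name synced from the directory service
  privilege: Privilege;
}

// Returns true when any of the user's groups grants the wanted privilege.
// In this sketch, "edit" also implies "view".
function isAllowed(
  userGroups: string[],
  acl: AccessControlEntry[],
  wanted: Privilege,
): boolean {
  return acl.some(
    (entry) =>
      userGroups.indexOf(entry.group) !== -1 &&
      (entry.privilege === wanted || entry.privilege === "edit"),
  );
}
```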

Alerting & Email Distribution Layer

An automated process that alerts users to changes in their dashboards helps increase engagement. Traditional BI dashboards actually have very low engagement rates in an enterprise. Industry-wide engagement has been documented as low as 10-15% of an enterprise’s total users regularly using the BI dashboards (Source: BARC).

For your JavaScript Visualizations, you want to ensure that your hard work is not being lost. A great way to increase engagement in your dashboards is to set up an alerting and email distribution list infrastructure so you can let your users know when to look at the appropriate dashboards.

Collaboration Layer

Data visualizations in general focus on answering the What? question. Context as to why something happened is often lost. Overlaying context on a visualization representing a static data set is straightforward, but in an enterprise, where data is changing constantly and where there are many influencing and impacting factors, it is a complex process. Often context is held by people within the organization, and therefore it is necessary to capture this alongside the presentation layer in a way that allows users to collaborate around the information being presented.

METRIC INSIGHTS SOLUTION

Metric Insights offers a turnkey solution that enables you to deploy leading-edge JavaScript Visualization libraries at an enterprise scale. As depicted in the following diagram, Metric Insights covers all the major areas that are missing from ad hoc JavaScript Visualization deployments.

In addition to the above, some of the key features supported include:

• Robust JavaScript API support. This enables you to use any JavaScript Visualization library. You can also incorporate multiple JavaScript Visualization libraries into your solution to build on the unique capabilities of each library.

• Visualization Container Management. As you create many different visualizations, Metric Insights provides a mechanism to group and manage them in one framework. By delivering security, organization, and search together with a favorites management system, you enable your users to focus on the visualizations that are most important to them.

• Rich Source Data Connector Support. Metric Insights connects to hundreds of different data sources in the enterprise. Alongside the library of pre-built connectors there are a number of generic connectors, such as JDBC and web services, that allow you to connect to any open data source. Further, it typically takes only two weeks to create a new data connector to a new or proprietary data source.

BI TOOLS DON’T SUPPORT THIS

Most enterprises already have one or more BI tools deployed in the organization. At first glance, it seems that they could be a natural platform to support JavaScript Visualizations, but this is not the case for several reasons.

First, each BI tool vendor’s data model is optimized for its own proprietary visualization layer. These tools are typically optimized for in-memory data management for ad hoc analysis and other concerns. These models change and evolve frequently, which would require continual reworking of your visualizations to keep them functional.

Second, these tools do not provide the well-defined JavaScript API that is required for rapid deployment of JavaScript Visualizations and that provides a level of abstraction between your data layer and the visualizations themselves.

Third, these tool vendors typically define themselves by their visualization and reporting capabilities. As organizations, they are reluctant to support 3rd-party offerings that might be perceived as bypassing the main functionality of their tool.

Have questions?

Metric Insights 123 10th Street San Francisco CA 94104 www.metricinsights.com [email protected] 1-800-993-346

About Metric Insights

Metric Insights (metricinsights.com) bridges the last mile to Business Intelligence and Big Data. Metric Insights lets your users cut through the noise, focus immediately on the critical business issues that warrant their attention, and take action. Our Push Intelligence platform connects quickly and easily to your existing business intelligence tools, big data and SaaS applications. Metric Insights uniquely delivers a patented KPI warehouse, collaboration and notification technologies that tell you when your key business metrics have changed, and, more importantly, why.