On Improving User Response Times in Tableau

Total Page:16

File Type:pdf, Size:1020Kb

On Improving User Response Times in Tableau On Improving User Response Times in Tableau Pawel Terlecki Fei Xu Marianne Shaw Tableau Software Tableau Software Tableau Software [email protected] [email protected] [email protected] Valeri Kim Richard Wesley Tableau Software Tableau Software [email protected] [email protected] ABSTRACT responsiveness of the interface. To achieve that across all the The rapid increase in data volumes and complexity of applied supported data sources, one often needs to apply dedicated analytical tasks poses a big challenge for visualization solutions. optimization techniques to alleviate efficiency and capability It is important to keep the experience highly interactive, so that shortcomings of various architectures. users stay engaged and can perform insightful data exploration. During data exploration, each user interaction with the application Query processing usually dominates the cost of visualization generates an adhoc query workload. This workload is difficult to generation. Therefore, in order to achieve acceptable response predict and requires powerful analytic databases, e.g. column times, one needs to utilize backend capabilities to the fullest and stores, to process requests with low latencies. Additionally, we apply techniques, such as caching or prefetching. In this paper we leverage caching and query optimization techniques to provide discuss key data processing components in Tableau: the query high levels of interactivity. processor, query caches, Tableau Data Engine [1, 2] and Data Sharing the results of data exploration via Tableau Server in the Server. Furthermore, we cover recent performance improvements intranet or Tableau Online, in the cloud, presents a different set of related to the number and quality of remote queries, broader reuse of workload challenges. Users can combine visualizations created cached data, and application of inter and intra query parallelism. during data exploration into dashboards and publish them to Server. A dashboard consists of multiple visualizations; they are 1. INTRODUCTION often linked by actions, such as filtering of one view based on Data visualization and big data processing have become key selection in another. elements of modern analytic platforms. Lower hardware prices The query workload generated by published dashboards is easier allow customers to purchase powerful computational clusters, to predict than the adhoc workload generated during exploratory collect high volumes of detailed data and perform sophisticated ad analysis. This is because queries are generated by interacting with hoc analysis. Moreover, there is a growing demand for schema components already defined in the dashboard. Moreover, standardized solutions in related areas, such as: data federation, caching is more efficient as it can be applied across multiple users exploration of semi-structured and graph data sets, and data accessing the same dashboards. preparation or knowledge discovery. The increasing popularity of Although with different emphasis, providing a highly interactive visual analytics is driving development in these fields and experience in both aforementioned scenarios depends upon the provides the research community with new data processing and efficient retrieval of new data from external sources and maximal usability challenges. utilization of previously obtained results. In this paper, we In the past years, Tableau has become a leader in providing easy- present recent performance improvements focused on increasing to-use data visualization tools. It currently connects to over 40 product interactivity, including: different data sources, from simple file-based sources, through SQL/MDX databases or web data sources, to Hadoop clusters. In Query batch preparation and concurrent query a common intuitive interface, it allows for rich analysis of execution to improve dashboard generation; different areas of data, including filtering, custom calculations – Persisted and distributed intelligent cache to improve potentially at different levels of detail, window and statistical response times across users and their interactions; functions and many others. Parallel plans and fast scanning of RLE encoded Tableau aims to provide a fast interactive environment for data columns in Tableau Data Engine; exploration. A key challenge, both today and as we continue to Managing data models and extracts in Data Server add more sophisticated data analysis, is maintaining high along with temporary table support. Permission to make digital or hard copies of all or part of this work for personal The text is organized accordingly. Section 2 covers the general or classroom use is granted without fee provided that copies are not made or background. In Sect. 3 we describe how dashboards are generated distributed for profit or commercial advantage and that copies bear this notice focusing on query processing and caching. Section 4 gives an and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is overview of the Tableau Data Engine and its recent performance permitted. To copy otherwise, or republish, to post on servers or to redistribute features. In Sect. 5 we look at connection and data model to lists, requires prior specific permission and/or a fee. Request permissions from management in the Data Server. Section 6 covers related work. [email protected]. Future plans are given in Sect. 7 and the paper is concluded in SIGMOD’15, May 31–June 4, 2015, Melbourne, Victoria, Australia. Sect. 8. Copyright © 2015 ACM 978-1-4503-2758-9/15/05…$15.00. http://dx.doi.org/10.1145/2723372.2742799 Figure 1: Sample dashboard for the FAA Flights On-Time dataset view it from different angles. Quality of experience depends on 2. BACKGROUND the degree of interactivity, a factor that distinguishes dashboards Tableau is an application for rapid interrogation of structured from old-fashioned static reports. data. The application has a simple, intuitive drag-and-drop A dashboard is a collection of zones organized according to a interface that generates queries against a wide spectrum of data certain layout. Different zone types include various charts, maps, sources. The application has a large number of built-in chart filters, calculated text, images etc. One defines the behavior of types. Users are offered a rich environment for analyzing their individual zones first and then specifies dependencies between data, including defining new calculations, forecasting, filtering, them. Complicated scenarios might involve tens or even hundreds and combining data from heterogeneous data sources [47]. of zones. User actions that add and remove items in a worksheet are Let us consider a simple example of a dashboard (Fig. 1) defined represented with VizQL [40], a set of specialized algebraic for the popular FAA Flights On-time dataset [43]. It covers all the operations. This representation is used to automatically generate flights in the US in the past decade. The two upper maps show the queries against a specified data source. Queries are issued and the number of flight origins and destination by state and, at the same results are interpreted to produce the requested visualization. As time, allow specifying origins and destinations for the slave charts the user interacts with elements of the visualization, additional at the bottom. Each chart is annotated with average delays and queries may be issued to the underlying database. flights per day. The bottom charts cover airlines operating the The Tableau Data Engine (TDE) is a proprietary column store flights, destination airports, breakdown of cancellations and database that can be used to allow users to extract data from an delays by weekdays, and distribution of arrival delays broken existing database or text file. When a TDE extract is incorporated down by hours of a day. The right-hand side has filtering, total into a user's workbook, a user can continue to interrogate their count of visible records and static legends. data when a network connection to the database is not available. Rendering a dashboard requires retrieving necessary data and Additionally, the extract reduces the query load on the backing post-processing it for visualization. Due to dependencies between database. Users can easily switch from an extract back to a live zones, the entire process might take several iterations in which connection, and extracts can be refreshed when appropriate. obtained responses are used to specify subsequent requests; the Tableau Server allows users to easily share their visualizations process is discussed in detail in Sect. 3.3. with others. Users publish data sources or visualizations to Server Most zones require data from one or more data sources in order to and specify sharing permissions for them. These visualizations be rendered. The requests are specified internally by queries with can be viewed and interacted with via another Tableau expressiveness matching the capabilities of the data sources. application, or simply through a Web browser. If visualizations These internal queries eventually get turned into textual queries in are published with accompanying TDE extracts, a schedule can be appropriate dialects and sent to the corresponding data sources. created to automatically refresh the extracts, ensuring the data is always current. In the next section we discuss efficient data retrieval and generation of high-quality remote queries as both vastly 3. QUERY PROCESSING FOR contribute to the latency
Recommended publications
  • Dashboard Design for Real-Time Situation Awareness
    Dashboard Design for Real-Time Situation Awareness Stephen Few, author of Information Dashboard Design Dashboard Design for Real-Time Situation Awareness Few if any recent trends in business information delivery have inspired as much enthusiasm as dashboards. When they work, they provide a powerful means to tame the beast of data overload. Despite their popularity, however, most dashboards live up to only a fraction of their potential. They fail, not because of poor technology—at least not primarily—but because of poor design. The more critical that information is to the well being of the business, the more grievous is the failure, because the remedy is so readily available. The term “dashboard” refers to a single screen information display that is used to monitor what’s going on in some aspect of the business. The key word is “monitor.” A dashboard presents the key data that you must effi ciently monitor to maintain awareness of what’s going on in your area of responsibility. Most dashboards are used to monitor information once a day, because more frequent use is unnecessary given the rate at which the information changes and speed at which responses must be made. Some jobs, however, require constant monitoring in real time, or close to it, because the activities that you track are happening right now and delays in responding can’t be tolerated. There is perhaps no better example of this type of dashboard than one that monitors the brisk and sometimes harried activities of a call center. Much like air traffi c control systems or cockpits in airplanes, call center dashboards must be designed to support real-time “situation awareness.” They must grab your attention when it’s needed, they must make it easy to spot what’s most important in a screen full of data, and they must give you the means to understand what’s happening and respond without delay.
    [Show full text]
  • Business Intelligence: a Discussion on Platforms, Technologies, and Solutions
    Business Intelligence: A Discussion on Platforms, Technologies, and solutions Tutorial RCIS 2013-Paris May 29-31 Introduction • Presenters: – Noushin Ashrafi • [email protected] – Jean-Pierre Kuilboer • [email protected] • Time and Discussion Frame – 90 minutes • Audiene can ask questions any time during presentation. 4/19/2013 RCIS 2013-Paris May 29-31 2 Tutorial overview knowledge Discovery from Data • The Thrust of the tutorial is to examine Business Intelligence (BI) Platforms to develop business analytics applications. • Specifically, we will address: – Requirements, development , and capabilities of the BI. – Self-service, enterprise, and cloud BI. – Confusing BI concepts such as: Platform, infrastructure, technology and architecture . – Solutions by three well-known vendors: Microsoft, TableauSoftware and IBM. 4/19/2013 RCIS 2013-Paris May 29-31 3 Overview of Business Intelligence • BI is revolutionizing decision making and information technology across all industries. This phenomenon is largely due to the ever-increasing availability of data. • The explosive volumes of data are available in both structured and unstructured formats, and are analyzed and processed to become information within context hence providing relevance, and purpose to the decision making process. 4/19/2013 RCIS 2013-Paris May 29-31 4 4/19/2013 RCIS 2013-Paris May 29-31 5 What is Business Intelligence? • BI is a content-free expression, so it means different things to different people. • BI is neither a product, nor a service – BI refers to people, processes, technologies and practices used to support business decision making. 4/19/2013 RCIS 2013-Paris May 29-31 6 4/19/2013 RCIS 2013-Paris May 29-31 7 What is Business Intelligence? • BI is an umbrella term that combines architectures, technology, analytical tools, applications, and methodologies to help transform data, to information, to knowledge, to decisions, and finally to action.
    [Show full text]
  • IBM Infosphere Master Data Management
    Data and AI Solution Brief IBM InfoSphere Master Data Management Maintain a trusted, accurate view of Highlights critical business data in a hybrid environment • Support comprehensive, cost-effective information Organizations today have more data about products, governance services, customers and transactions than ever before. • Monitor master data quality Hidden in this data are vital insights that can drive business using dashboards and results. But without comprehensive, well-governed data proactive alerts sources, decision-makers cannot be sure that the information • Simplify the management they are using is the latest, most accurate version. As data and enforcement of master volumes continue to grow, the problem is compounded over data policies time. Businesses need to find new ways to manage and • Integrate master data govern the increasing variety and volume of data and deliver management into existing it to end users as quickly as possible and in a trusted manner, business processes regardless of the computing modality. • Quickly scale to meet changing business needs IBM InfoSphere Master Data Management (InfoSphere MDM) provides a single, trusted view of critical business data to • Optimize resources using users and applications, helping to reduce costs, minimize risk flexible deployment options and improve decision-making speed and precision. InfoSphere MDM achieves the single view through highly accurate and comprehensive matching that helps overcome differences and errors that often occur in the data sources. The MDM server manages your master data entities centrally, reducing reliance on incomplete or duplicate data while consolidating information from across the enterprise. Data and AI Solution Brief With InfoSphere MDM, you can enforce information governance at an organizational level, enabling business and IT teams to collaborate on such topics as data quality and policy management.
    [Show full text]
  • Montana COVID-19 Cases Map and Dashboard Metadata
    Montana COVID-19 Cases Map and Dashboard Metadata For assistance, please e-mail or call the Montana Joint Information Center 1-888-333-0461 A list of FAQs and answers related to COVID-19 can be found here. Updates to the information are made by 11:30 a.m. daily. Description: The data for this dashboard is derived from local health officials at the county level who report cases to the Montana Department of Health and Human Services (DPHHS). DPHHS tabulates case data and then gives the data to the Montana State Library to publish through this web feature service. The daily updates are managed by the Disaster & Emergency Service State Emergency Coordination Center. Montana public health agencies and the Governor's Coronavirus Task Force are actively working to limit the spread of novel coronavirus in Montana. The Montana State Library is assisting the effort by geo-enabling public health information to help decision-makers, State Emergency Coordination Center and the Governor's Coronavirus Task Force, understand the spread of the disease. Information is reported based on the previous day. Information on the dashboard and maps will change based on further public health investigation. Cumulative Cases Indicator: This includes all reported Montana residents meeting the CDC case definition for novel coronavirus infections, and some non-Montana residents who were diagnosed with novel coronavirus infections while in Montana. New Cases Indicator: Daily aggregate of new cases in Montana, reported to DPHHS, for the previous full day. These newly reported cases can be classified as either active or recovered. Updates to the information are run by 11:30 a.m.
    [Show full text]
  • Governance by Dashboard
    Governance by Dashboard A Policy Paper Jamie Bartlett Nathaniel Tkacz Open Access. Some rights reserved. As the publisher of this work, Demos wants to encourage the circulation of our work as widely as possible while retaining the copyright. We therefore have an open access policy which enables anyone to access our content online without charge. Anyone can download, save, perform or distribute this work in any format, including translation, without written permission. This is subject to the terms of the Demos licence found at the back of this publication. Its main conditions are: · Demos and the author(s) are credited · This summary and the address www.demos.co.uk are displayed · The text is not altered and is used in full · The work is not resold · A copy of the work or link to its use online is sent to Demos. You are welcome to ask for permission to use this work for purposes other than those covered by the licence. Demos gratefully acknowledges the work of Creative Commons in inspiring our approach to copyright. To find out more go to www.creativecommons.org PARTNERS CREDITS This report was commissioned by: Published by Demos March 2017 © Demos. Some rights reserved. Unit 1, 2-3 Mill Street London SE1 2BD T: 020 7367 4200 [email protected] www.demos.co.uk CONTENTS Foreword 5 Introduction 7 The rise of the data dashboard 9 Dashboards in action 11 Dashboards transformation 14 Factors for successful dashboard implementation 18 4 5 FOREWORD Data dashboards – interfaces which display complex data to a user, often displayed in real time, and typically drawn from multiple sources – are an increasingly important way in which government departments make decisions.
    [Show full text]
  • Business Intelligence Dashboard for Academic Program Management
    Paper ID #13300 Business Intelligence Dashboard for Academic Program Management Dr. Mudasser Fraz Wyne, National University I have a Ph.D. in Computer Science, M.Sc. in Engineering, and B.Sc. in Electrical Engineering. Currently I serve as a Chair for Department of Computer Science, Information and Media Systems and Professor of Computer Science at School of Engineering and Computing, National University. I am also the Program Lead for MS in Information Technology Management and have also served recently as the lead for BSc in Information Systems, the co-Lead for MSc Computer Science and Program Lead for MSc in Database Administration programs. My association with ABET (Accreditation Board of Engineering and Technol- ogy) US dates back to 2001, as a certified program evaluator for BSc in Computer Science and BSc in Information Systems. At present, I am also serving as the Commissioner for the Computer Accredita- tion Commission (CAC). Previously, I have taught at 6 different countries for over 25 years. I have been privileged to be part of the DESY Group (Deutches Elecktronen Synchrotron), Hamburg Germany, as a research fellow, and worked with an MIT group, led by a Nobel laureate. On the research side, I have been fortunate enough to secure a number of grants and have served on numerous international Ph.D. Thesis committees, been a member of the editorial boards for 7 interna- tional journals, and served as the Chair and Co-Chair for 12 international conferences. For recognition of my research activities, I have been invited to a number of international conferences as Invited Speaker, chaired panel discussions and numerous international conference sessions.
    [Show full text]
  • Business Intelligence: Dashboard
    International Journal of Engineering Research & Technology (IJERT) ICONECT' 14 Conference Proceedings Business Intelligence: Dashboard Tushar Mulik1, Pravin Malusare2, Omkar Takalkar3,Prof. Monali Patil4 Department of Information Technology Mumbai University Mumbai, India [email protected],2 [email protected], 3 [email protected], 4 [email protected] Abstract - Business intelligence systems combine operational intelligence is viewed as being proactive. Essential data with analytical tools to present complex and competitive components of proactive BI are: information to planners and decision makers. The objective is • Real-time data warehousing, to improve the timeliness and quality of inputs to the decision • Data mining, process. Business Intelligence is used to understand the • Automated anomaly and exception detection, capabilities available in the firm; the state of the art, trends, and future directions in the markets, the technologies, and the •Proactive alerting with automatic recipient regulatory environment in which the firm competes; and the determination, actions of competitors and the implications of these actions. • Seamless follow-through workflow, The emergence of the data warehouse as a repository, advances • Automatic learning and refinement, in data cleansing, increased capabilities of hardware and • Geographic information systems software, and the emergence of the web architecture all • Data visualization combine to create a richer business intelligence environment than was available previously. II. LITERATURE SURVEY Much has been said and written about BI systems. To begin I. INTRODUCTION the research, study begins with collecting information from Demand for Business Intelligence (BI) applications periodicals, journals and reports. BI systems are used almost continues to grow even at a time when demand for most in all the sectors of industry for different purposes like information technology (IT) products is soft.
    [Show full text]
  • Design Concepts for an Outage Control Center Information Dashboard
    INL/EXT-15-37425 Rev. 0 De sign Concepts for an Outage Control Center Information Dashboard Jacques Hugo Shawn St. Germain Ronald Farris Cheradan Thompson McKenzie Whitesides December 2015 The INL is a U.S. Department of Energy National Laboratory operated by Battelle Energy Alliance DISCLAIMER This information was prepared as an account of work sponsored by an agency of the U.S. Government. Neither the U.S. Government nor any agency thereof, nor any of their employees, makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness, of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. References herein to any specific commercial product, process, or service by trade name, trade mark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the U.S. Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the U.S. Government or any agency thereof. INL/EXT-15-37425 Rev. 0 Design Concepts for an Outage Control Center Information Dashboard Jacques Hugo Shawn St. Germain Ronald Farris Cheradan Thompson McKenzie Whitesides December 2015 Idaho National Laboratory Idaho Falls, Idaho 83415 http://www.inl.gov Prepared under Work Package L-12N060305 Prepared for the U.S. Department of Energy Office of Nuclear Energy Under DOE Idaho Operations Office Contract DE-AC07-05ID14517 EXECUTIVE SUMMARY The nuclear industry, and the business world in general, is facing a rapidly increasing amount of data to be dealt with on a daily basis.
    [Show full text]
  • Prototype and Deployment of a Big Data Analytics Platform
    Prototype and Deployment of a Big Data Analytics Platform CTT2.0 Carbon Track and Trace Deliverable D3.1 Atle Vesterkjær, Patrick Merlot (Numascale) Oslo, Norway| 14 December 2016 climate-kic.org Contents Preface ................................................................................................................................................................. 2 Requirement Analysis ........................................................................................................................................ 3 Previous E2E Analytics IoT platform ....................................................................................................................... 3 CTT Analytics IoT platform ......................................................................................................................................... 5 State of the art technology overview ..................................................................................................................... 6 Adaption of standard and develop-ment of custom components ............................................................. 7 Standard components ................................................................................................................................................. 7 Custom components .................................................................................................................................................... 9 Deployment of the platform ............................................................................................................................
    [Show full text]
  • Study on Business Intelligence Tools for Enterprise Dashboard Development
    International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 04 | Apr -2017 www.irjet.net p-ISSN: 2395-0072 Study on Business Intelligence Tools for Enterprise Dashboard Development K.Gowthami1, M.R. Pavan Kumar2 1Student(M.Tech), Dept of CSE, Sree Vidyanikethan Engineering College, Tirupati, India. 2Assistant Professor, Dept. Of CSE, Sree Vidyanikethan Engineering College, Tirupati, India. ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - This paper presents detailed information on dashboards, OLAP and data mining to charting and ETL. It Business Intelligence (BI) tools for developing a dashboard covers all the analytic areas of Business Intelligence in a typical enterprise setup. Few most popular BI tools like projects, with innovative themes and engines. The SpagoBI SpagoBI, Power BI, Tableau, Qliksense and Jaspersoft are server includes an interactive dashboard and integration with other reporting tools. considered based on the ease of use, support in terms of training and minimal initial cost at the end, a sample SpagoBI mainly consists of the following modules: dashboard has been developed using one of the tools Power BI to demonstrate the feasibility of the same for the Business data visualization. Key Words: Dashboard, Business Intelligence, Data Reports, Visualization, Data Analytics. 1.INTRODUCTION Business Analytic or Business Intelligence (BI) is becoming more needed by the top management of any enterprise to visualize, analyze and prepare the strategic Fig 1: SpagoBI Architecture planning for the future. However, the availability of business intelligence tools, the support on the existing SpagoBI Server, It includes analytic tools and features infrastructure (including software and hardware), act as a core for the suite.
    [Show full text]
  • Fast Database Restarts at Facebook
    Fast Database Restarts at Facebook ∗ Aakash Goel, Bhuwan Chopra, Ciprian Gerea, Dhrúv Mátáni, Josh Metzler, Fahim Ul Haq, and Janet L. Wiener Facebook, Inc. ABSTRACT 1. INTRODUCTION Facebook engineers query multiple databases to monitor and Facebook engineers query multiple database systems to analyze Facebook products and services. The fastest of monitor and analyze Facebook products and services. Scuba[5] these databases is Scuba, which achieves subsecond query is a very fast, distributed, in-memory database used exten- response time by storing all of its data in memory across sively for interactive, ad hoc, analysis queries. These queries hundreds of servers. We are continually improving the code typically run in under a second over GBs of data. Scuba pro- for Scuba and would like to push new software releases at cesses almost a million queries per day for over 1500 Face- least once a week. However, restarting a Scuba machine book employees. In addition, Scuba is the workhorse behind clears its memory. Recovering all of its data from disk | Facebook's code regression analysis, bug report monitoring, about 120 GB per machine | takes 2.5-3 hours to read and ads revenue monitoring, and performance debugging. format the data per machine. Even 10 minutes is a long One significant source of downtime is software upgrades, downtime for the critical applications that rely on Scuba, yet upgrades are necessary to introduce new features and such as detecting user-facing errors. Restarting only 2% of apply bug fixes. At Facebook, we are accustomed to the the servers at a time mitigates the amount of unavailable agility that comes with frequent code deployments.
    [Show full text]
  • Designing Dashboards for Multiple Target Audiences with SAS Visual
    Paper SAS4323-2020 Designing Dashboards for Multiple Target Audiences with SAS® Visual Analytics Kunal Shah, SAS Institute Inc. ABSTRACT There is no doubt that organizations have a vast amount of data available at their disposal. Yet, they often fall into the trap of creating a single dashboard for a wide variety of audiences to derive meaningful insights from the available data. Different audiences expect different insights from the data. Even the dashboard design thinking differs based on the target audience. This presentation focuses on three key audiences (executives, middle management, and individual contributors) and how to use SAS® Visual Analytics to create dashboards that generate powerful insights for each of these. The presentation also covers dashboard design best practices and recommendations from field consultants to help you weave a compelling story from your organizational data. INTRODUCTION Dashboards have become a popular tool in today’s business world. They are designed and implemented by nearly every industry, non-governmental organization (NGO), and even educational institution to enable data-driven decision making. There are literally hundreds of solutions that incorporate some kind of dashboard building capability. Yet, very little attention has been paid to the design and usability of these dashboards. Deriving value out of data isn’t just about defining a laundry list of success metrics but rather communicating a story and compelling the user to take action. Traditionally, dashboards are designed with data in mind - the amount of data, the kind of data, and the type of graph/chart to visualize the data. Users end up cramming a bunch of charts and gauges into a single dashboard and would eventually find themselves trying to fit in a large amount of organizational data into limited screen space.
    [Show full text]