Advice from IBM Watson Health on Tuning Tableau Dashboards for Fast Load Times and Rapid Performance
Welcome
#TC18

Brad Wheeler - VP, Analytics & Outcomes, IBM Watson Health
Eman Alvani - Tableau Analyst, IBM Watson Health

Agenda
A Brief Background of our Tableau Story
Why Performance Matters
General Guidelines
Using Performance Recorder
Dashboard Tuning Examples
Monitoring Dashboard Performance

A Brief Background
[Career timeline, 2000 to 2018] 17 years of healthcare technology experience. Milestones shown: started at OmniSYS in Marketing during college; BBA in Marketing and MBA from Texas A&M - Commerce; Director, Business Operations; VP, Implementations & Analytics; transitioned to Software Development; Sr. Director, Software Development; created Business Intelligence Department as Director, Business Intelligence; VP, Analytics & Outcomes.
[Career timeline, 2006 to 2018] 10+ years of business analysis experience. Milestones shown: BA from Southern Methodist University; began working as associate at GS; Sr. Financial Analyst, FP&A; MBA from Pepperdine University; Business Data Analyst at Dimont; CRM Reporting Lead; Tableau Analyst.

Our Legacy Environment
SQL Server data warehouse consolidates multiple production databases into single source for reporting
Excel reports driven by complex SQL data aggregations and Excel macros
Individual reports generated for each client and manually uploaded to web portal

Our New Environment: Powered by Tableau
Online dashboards powered by Tableau and refreshed nightly
Clients login to web portal for access to data
New dashboards provide many more insights and greater level of detail

Before & After Report Example
Very heavy table and text usage.
Lack of “headline” or key takeaway
User clicks to view terminology stored in separate table
Individual filters used for each table and chart in the dashboard

Before & After Report Example
“Headlines” and bold callouts call attention to key takeaways
Filters apply across all dashboard elements and are hidden by default
Terminology and field definitions stored in tooltips
Visual heavy representation of all the same data shown in original version

After One Year of Tableau
95% of all client reports are automated and delivered via Tableau
Clients have near real-time access to their data, compared to monthly static reports
Dedicated client reporting staffing reduced from 13 to 3, freeing team up to work on more meaningful tasks across Watson Health
A typical client user has access to 30-40 dashboards
Fast loading dashboards are critical, especially when dealing with large data sets!
5.3 second average dashboard load time
93% of all dashboards load in under 10 seconds

A Few General Performance Guidelines
Extracts generally perform 30% to 40% faster than live connections, so use them whenever possible
Count Distinct functions are notoriously slow and a common cause of slow performance
Using calculated fields in filters can cause slow performance, so filter on native fields when possible
Many filters with “Show Relevant Values Only” can cause slow dashboard interactions
When using Set filters, the In/Out option is usually more efficient than the Members in Set option
Watch out for too many fields in the Detail shelf or ATTR() fields in the tooltip
Reducing the number of queries required to load a dashboard can often help improve performance
Most importantly: performance tuning is often more of an art than a science and requires curiosity, creativity, experimentation, and patience

Why Performance Matters

Using Performance Recorder

Accessing Performance Recorder
From Tableau Desktop: Help -> Settings and Performance -> Start Performance Recording

Reading Performance Recorder Results
After stopping the recording, a new Tableau workbook will open with the results
Dashboard events sorted chronologically
Dashboard events sorted by time
Clicking an event will show the query

The Results Workbook is Editable
The performance recording results are returned in a Tableau workbook that is fully editable, so you may customize it to help understand the data
[Figure: original results workbook beside a customized version]

Dashboard Tuning Example 1
Clinical Medical Conditions Overview

The Starting Point
This dashboard examines medical conditions, service categories, and totals by state in four broad sections:
Key Metrics
Top conditions (bar chart)
Filled map by state
Service Category (tree map)
Parameters allow users to set the number of top conditions to show and toggle between various metrics
Original Load Time: 49 seconds

Original Performance Recording Results
Running the Performance Recorder, we can see that the Top Conditions sheet is the “long pole in the tent”, so we’ll start there

Testing the Filters
Just as a test, let’s delete all the filters to see if that’s the problem
We’ll keep the action filter, Top N, and Clinical Condition to force a refresh
[Figure: original filters on Top Conditions sheet]

Filters Are Not the Problem
Removing all filters had minimal effect, so we’ll undo that change
Removing all filters only decreased the load time by 2 seconds. They’re not the problem, so we can add them back

A Closer Look at the Detail Shelf
Having calculated fields in the Detail shelf can often cause slow performance; let’s see what removing c_sort Measure Name does

A Significant Reduction in Load Time
Removing that one field from the detail reduced the load time by 33%!
Load time dropped from 49 seconds to 34

Field Required for Title
But now that the field is gone, we get “None” in the chart title

Alternative: New Sheet
We can solve that by making a new sheet just for the title

Putting it Together on the Dashboard
We’ll hide the original sheet title and replace it with the newly created “Title Sheet”
[Figure: Title Sheet placed above the Original Sheet]

Calculated Field Being Used as a Filter
Let’s take a closer look at this Top N filter

Combination of Two Fields
This filter is using a combination of two calculations to limit our results to top N

Alternative: Using a Set
We can achieve the same result by using a Set
Set the filter to “In” the Top Clinical Conditions Set. All filters must be in context when using sets

Checking the Results
Our results are exactly the same when using a Set filter compared to the calculated field
[Figure: results using Set filter beside results using original filter]

Checking the Results
Using the Set filter reduced the query time for Top Locations to 19 seconds!
After these two changes, Top Locations is no longer the longest running query and our overall load time decreased to 25 seconds
Let’s move on to Geo next

Another Calculated Field in the Detail Shelf
We have another calculated field in the Detail shelf.
Let’s take the same approach as the Top Locations and create a new title sheet

Checking the Results
Moving the c_Metric_Name field from the Geo Detail shelf to its own title sheet reduced the Geo load time by 58%
Geo load time dropped from 24 seconds to 10

Still Room for Improvement
Let’s see if there’s something we can do to decrease the Top Conditions query time
Top Conditions running at nearly 18 seconds

A Calculated Field in the Text Field
After some experimenting, the C_Cost Label field stands out as an improvement opportunity

A Closer Look at the Calculation
This field uses a calculated field whose sole purpose is to show or hide a “$” symbol in the chart label

An Alternative: Two New Fields with Default Formatting
After trying multiple options, we found that the most effective way to achieve this functionality for this workbook was to:
Step 1: create two “label” fields
Step 2: set default formatting for each field
Step 3: add both fields to the data label

Checking the Results
Using this method still provides the “$” when needed and reduces the query time by about 6 seconds

Dashboard Load Time Comparison
Total run time decreased from 49 seconds to 12!
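As a quick sanity check on the reductions quoted for this example, the short sketch below recomputes the percentage decrease from the before and after timings given on the slides. The helper name `pct_reduction` is ours, chosen for illustration; it is not part of Tableau or the Performance Recorder.

```python
def pct_reduction(before_s: float, after_s: float) -> float:
    """Percentage decrease going from before_s to after_s (both in seconds)."""
    if before_s <= 0:
        raise ValueError("before_s must be positive")
    return (before_s - after_s) / before_s * 100.0


# Before/after timings (seconds) as quoted on the slides for Example 1.
timings = {
    "Geo sheet": (24, 10),
    "Whole dashboard": (49, 12),
}

for name, (before_s, after_s) in timings.items():
    reduction = pct_reduction(before_s, after_s)
    print(f"{name}: {before_s}s -> {after_s}s ({reduction:.0f}% reduction)")
```

Run as written, this reports a 58% reduction for the Geo sheet and a 76% reduction for the whole dashboard.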
Original: 49 seconds
Revised: 12 seconds

A Review of our Changes
All changes are “behind the scenes” and invisible to the end user
76% reduction in load time compared to original view

Dashboard Tuning Example 2
Outreach Modality Overview

The Starting Point
Outreach Modality Overview dashboard showing count and rates by communication modality
Original load time: 135 seconds

Update 1: Default Sorting
Sorting requires running an extra query to determine the sort order for a chart
Sorting queries are usually fast, but in this instance it’s taking 33 seconds

This Chart is Being Sorted by Patient Response Rate

Let’s Take a Closer Look at that Field
Patient Response Rate is calculated by comparing two COUNT DISTINCT fields

What if We Don’t Sort by Default?
Clearing the default sort makes very little difference in the presentation since there are so few rows
[Figure: sorted (original) chart beside non-sorted chart]

30 Seconds Saved
No extra query is being run to determine sort order, saving approximately 33 seconds

Update 2: Hiding vs.