IDC Infobrief, Sponsored by Alteryx Data Is the Lifeblood of Digital Transformation
Total Page:16
File Type:pdf, Size:1020Kb
State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx Data is the lifeblood of digital transformation of organizations leverage data across 80% multiple organizational processes IDC’s Digital Transformation Platform Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 2 State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx Survey respondents indicate data is being used across all business functions Customer Acquisitions/ Retention Supply Chain/ Fraud/Risk Market Product Logistics Intelligence Management R & D Service Delivery Plant/ Pricing Facilities Customer service/ Support Finance HR Manufacturing Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 3 State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx There are approximately 54M data workers worldwide Regional Distribution of Data Workers Worldwide ROW 6% 39% 28% NA AP 21% 6% EU LA Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 4 State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx Data workers spend 90% of their work week on data related activities Activities Performed by Data Workers Weekly Time by Activity AppDev Preparation 96% Data 7% Science Searching Searching 92% 13% 15% Analytics 88% Analytics Preparation Data Science 76% 32% 33% Data App Dev 59% % of Data Workers Searching for and preparing data are the most common activities regardless of role of data worker Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 5 State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx 44% of time is being wasted every week because data workers are unsuccessful in their activities Average Percentage of Time Wasted by Activity Searching 51% Preparation 47% 44% Science Unsuccessful Successful 42% Time Time Analytics 42% 56% Data App Dev 37% % of Weekly Time On average this represents 16 hours #fail per week per data worker Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 6 State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx Top challenges cited by data workers are indicative of underlying issues that are responsible for failures Too much time spent in data preparation 33% Also indicative of productivity issues were the highest ranked skills gaps Slow response time to requests 29% Lack of collaboration 28% Lack of creative and Too many versions of analytic models 28% analytical thinking Lack of clarity in requirements 25% Analytical/statistical skills Analytic tools are too complicated/lack of training 25% Lack of the knowledge of data sources 22% Data preparation skills Lack of understanding of analytics and data science 22% % of Data Workers Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 7 State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx Variety of data sources, diversity of data types, data volumes, multiple targets for analytics, and data science outputs result in complex solutions Data Sources Types of Data Processed Outputs Spreadsheets 50% Survey 42% Cloud Dbs 40% Demographic 40% Trend analysis 45% Desktop Dbs 38% Transactional 37% Decision making 45% Analytical Dbs 34% Master 36% Static reports 44% SaaS Apps 33% Log file 32% Insight sharing 38% In-Memory 30% Relationship 31% Dynamic dashboards 38% RDBMS 28% Multimedia 30% Business predictions 37% Mainframe 26% IoT 28% Issue identification 37% Flat files 25% Semi-structured 27% Data applications 32% NoSQL 23% Social media 27% % of Data Workers Hadoop 12% Spatial 25% % of Data Workers % of Data Workers Average number of data sources per Average number of rows processed Average number of target outputs per 6 analytics or data science activities 40M per analytics or data science activity 7 analytics or data science activity Workers are using four to seven different tools to perform data activities Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 8 State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx 88% of data workers, approximately 47M people worldwide, use spreadsheets for data activities Data Visualization 3 hrs 45 min Spreadsheet functions are used as a Data Shaping 3 hrs 40 min proxy for data preparation, analytics Preparing data for visualization 3 hrs 36 min and data application dev tools: in an external tool Data Cleansing 3 hrs 33 min The most used spreadsheet functions by 50-70% of users Data Blending Data Activities 3 hrs 23 min include data summarization, date- Using Macros 3 hrs 5 min time manipulation, transpose, lookup, conditional formulas. Performing “What-if” Analysis 2 hrs 58 min Average hours per week per user Data workers that use spreadsheets are spending 60% of their work week (24 hours) in spreadsheets Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 9 State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx Most frequently used methods of getting data into spreadsheets lack control and governance, which can lead to data compliance and trust issues Spreadsheet file links 50% Inefficiencies are also reflected in the use of spreadsheets when source data is updated: Importing data 50% Workers are spending on average, Spreadsheet data insertion 7 hours per week manually updating (e.g. VLOOKUP) 46% formulas, pivot tables, cell and sheet Manual Copy/Paste 43% references. % of Data Workers Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 10 State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx The State of Data Science and Analytics: Key Findings Data is becoming increasingly important to success in the digital economy Users across all geographies, company sizes, industries and departments use data Data workers spend the majority of their work week on data activities More time is being spent on analytics than on data science or application development Complexity, diversity and scale are top challenges for data workers, resulting in inefficiencies and wasted time Users have to overcome skills gaps, learn Searching for and preparing data are the most and use multiple tools to work with data frequent and the most inefficient activities Users resort to the ubiquitous spreadsheet in face of complexity Resulting in a lack of data and analytics traceability, uncontrolled distribution and time intensive effort People and organizations are still resistant to change Inefficiencies are costing time and money, restricting opportunities for simplification Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 11 State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx Call to Action – Reduce friction, address data activity inefficiencies to improve worker effectiveness Reduce friction within data related activities to Data workers say the top 3 barriers to bringing simplify solutions, improve productivity and the in a new tool: lives of data workers: Technical Cost Time Friction of having to use multiple tools: 1 2 3 Compatibility consolidate activities onto one end-to-end data platform. These three barriers are the result of the friction Friction of skills gaps: look for platforms currently being experienced by data workers. that offer self-service access to data preparation, analytics and data science. Multiple tools result in higher 1 maintenance and training costs Friction of productivity: platforms focused on ease of use and automation to improve Manual activities are costing opportunities for success. 2 people time = money Friction of data trust: platforms that Multiple tools and incompatibility implement control and provide end-to-end 3 challenges integration traceability. Business cases for a new tool that reduces friction can be quantified by the cost of data worker failures: the opportunity cost of wasted time. Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 12 State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx Survey demographics by country and role Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 13 State of Data Science and Analytics An IDC InfoBrief, Sponsored by Alteryx Survey demographics by department and industry Operations 22.3% Services 22.6% Finance (incl. Tax or Audit) Transportation, Communication, 15.9% Utilities 12.6% IT 15.0% Manufacturing 12.6% Sales 13.8% Retail or Wholesale 11.7% Risk/Compliance 11.3% Finance 11.6% Marketing 8.3% Healthcare and Life Science 11.1% Supply Chain/Procurement 6.8% Other 8.7% HR 6.3% Government 4.9% Other 0.3% Education 4.2% Survey demographics by company size 19.1% 12.1% 12.6% 11.0% 11.1% 7.9% 7.7% 4.6% 4.8% 6.0% 3.0% 100 to 199 200 to 249 250 to 499 500 to 999 1,000 to 1,499 1,500 to 1,999 2,000 to 2,999 3,000 to 4,999 5,000 to 9,999 10,000 to 20,000 to 19,999 49,999 Source: IDC Data Preparation, Analytics and Science Survey Commissioned by Alteryx, February 2019 N=836 pg 14 .