Why Predictive Analytics Should Be “A CPA Thing”
Total Page:16
File Type:pdf, Size:1020Kb
Why Predictive Analytics Should Be “A CPA Thing” August 2014 AUTHOR: Adam Haverson, CPA/CITP Senior Manager, Business Analytics CapTech Consulting Richmond, VA REVIEWERS: Tom Burtner, CPA/CITP Partner, Technology Consulting McGladrey McLean, VA April Cassada, CPA/CITP Director, Data Analysis Auditor of Public Accounts Commonwealth of Virginia Richmond, VA Maritza Cora Associate Project Manager — IMTA Division Member Specialization & Credentialing AICPA Durham, NC Iesha Mack Project Manager — IMTA Division Member Specialization & Credentialing AICPA Durham, NC Karen Percent Director of Internal Audit Duke Medicine Durham, NC Susan Pierce Senior Technical Manager — IMTA Division Member Specialization & Credentialing AICPA Durham, NC Copyright © 2014 American Institute of CPAs. All rights reserved. DISCLAIMER: The contents of this publication do not necessarily reflect the position or opinion of the American Institute of CPAs, its divisions and its committees. This publication is designed to provide accurate and authoritative information on the subject covered. It is distributed with the understanding that the authors are not engaged in rendering legal, accounting or other professional services. If legal advice or other expert assistance is required, the services of a competent professional should be sought. For more information about the procedure for requesting permission to make copies of any part of this work, please email [email protected] with your request. Otherwise, requests should be written and mailed to the Permissions Department, AICPA, 220 Leigh Farm Road, Durham, NC 27707-8110. Table of Contents Predictive Analytics: The Concept .............................................................3 The Challenges of Modeling ...................................................................4 Best Practices .................................................................................5 The CPA’s Role ................................................................................5 IMTA Resources ...............................................................................6 External Resources ............................................................................6 “Predictive Analytics” (PA) means using facts about the past to make inferences and then using these inferences to make assertions about the future. The purpose of this paper is to explain the value of PA and explain why CPAs are well positioned to be crucial participants in its adoption. The paper will highlight the growing importance of PA, the challenges of modeling and also define best practices, including how to initiate PA projects, how a CPA provides value and resources for further research and learning. Predictive Analytics: The Concept The primary difference between fact-based/historical The potential for improved decision-making has created reporting and predictive analytics is inference. growth in predictive analytics applications that is not Historical reporting (by definition) tells us where we are restricted to private enterprise — as adoption by and where we have been. How much budget is left? charitable organizations, government and regulatory What was the volatility of the portfolio last month? How bodies is increasing as well. Some common active has year-over-year sales revenue changed by product uses of PA are audit risk models, actuarial science and and region? This type of reporting is a summary of facts insurance policy risk modeling, predictive failure of and any uncertainties in this summary typically stem computer equipment and bridges, marketing analytics, from data accuracy, completeness and timing. customized coupon and advertisement placement, fraud risk modeling and stock and option valuation modeling. However, the boom also has increased dependence upon Predictions combine facts about the past with and risk resulting from predictive analytics. Dependence inferences to anticipate the future. How much on PA has grown because PA techniques provide higher budget will be left in three months given that predictive value than simple traditional techniques such my business units typically underestimate as gut instinct, trend lines and flat reporting. Business Q3 spending by 10%? How much is portfolio risk from model failure increases as models are trusted volatility likely to decrease next month given for direct input into the decision-making process. The that volatility is at historically high levels and combination of these factors means that model failure is more likely to shake the house and there is an typically reverts to average levels? How will ever-growing need for business professionals who can sales revenue of a core product change help build, validate and responsibly use PA. Advanced given the combination a recent slowdown business users with PA concepts fluency are needed but increased marketing expenditures on to help build and validate models along with their IT the product? partners. When decision-makers trust a prediction, it will be used to improve tactics and strategy, enhancing decision quality. Without this trust, investment in PA will The simplest predictive model asserts that what has be limited and inferences will be ignored. happened in the past will continue to happen in the future. Even a simple model of this type can create “Trust” is a familiar concept to CPAs, who often are powerful explanatory value. For example, electrical recognized as their clients or organizations most trusted consumption, retail spending and the commute time advisers. Integrity, objectivity and subject matter between two addresses in a city follow powerful and expertise within business are optimal characteristics consistent cycles. In fact, much of the variation in for effective product owners in the PA space. CPAs are these variables can be explained by the time of year disciplined consumers and producers of information and (electrical consumption or retail spending) and time are adept at documenting the reasoning process. These of day/week (commute time). Additional predictive or behaviors are prerequisites for trust building. “independent” variables can be layered into a model to increase its stability and power and specialized modeling techniques can be used to extract more value. The application of predictive analytics ranges from these simple models that we use every day to complex multivariate models implemented in specialized systems such as SAS, R and SPSS. Why Predictive Analytics Should Be “A CPA Thing” | 3 The Challenges of Modeling Data quality is one of the most significant reporting This all sounds easy enough. But, in practice, it is problems. Missing or biased data can result in difficult to build and validate a model. Some reasons erroneous reporting results and ultimately can lead to for this difficulty include: bad decisions. We have to understand and account 1. It often is difficult to understand causality. for data quality issues before the results of our reports Does a fast car cause a driver to exceed the can be considered reliable. Perhaps the “sales revenue” speed limit? column that we mentioned earlier contains values in different currencies. We have to find a way to convert 2. Data quality issues affect model results. these values to a standard base before our report is One source rounded dollars up to millions but the meaningful. other used standard rounding. PA also depends on solid data quality but also is 3. It often is difficult to obtain the data that you want. dependent on model choice. Let’s look at a simple You did not collect historical data from one of the model predicting future inflation as its five-year locations in your model. historical average. Is this model valuable? If there is a demonstrable relationship between inflation from 4. The real world is the lab — you can’t always be period to period (and this is the case) then the use flexible with experiment design. of a trailing average is reasonable. But should it be a Would there have been fewer audit findings if you five-year trailing period, a three-year trailing period had used a different sampling system? or a one-year trailing period? How can one make this 5. Independent variables affect each other. determination, and is it appropriate? There is nuance The type of customer who is complaining behaves in finding relationships between the variable you want differently during price increases. to predict (called the dependent variable) and the variables that will provide input to your model (called 6. The relationships between variables often are the independent variables). This requires specialized not linear. subject-matter expertise. It is necessary to show that How much would a client care about a price change independent variables are causal to the predicted of 1%, 5% or 500%? outcome as opposed to being non-causally correlated. 7. Relationships are not necessarily stable over time. Imagine that you are trying to predict the speed that Who is watching the same advertisement at 2pm a person is driving but that your model only contains and 6pm? information about the type of car that they own. The model might have some predictive power. However There are tools to test which model results in the best adding a variable containing the number of speeding predictions and the stability of the model over time. tickets the driver has received over the past three years This exceeds the scope of this paper. But the resources may result in a more accurate prediction. Furthermore, listed in the last section provide more technical adding