
Predictive Analytics

Mani Janakiram, PhD Director, Supply Chain Intelligence & Analytics, Intel Corp. Adjunct Professor of Supply Chain, ASU October 2017

" is very difficult, especially if it's about the future." --Nils Bohr, Nobel laureate in Physics

M. Janakiram Oct 2017 Training Objective

• Data analytics is becoming a game changer in industry, and it is especially relevant for supply chain functions. Predictive analytics is an advanced form of data analytics that combines historical data with models to predict future outcomes. It can be used across the entire gamut of the supply chain: Plan, Source, Make, Deliver, and Return.
• In this session, we will focus on what predictive analytics is, and where and how it can be used. Techniques covered in this session include regression, forecasting, basic algorithms (clustering, CART), and decision trees. Through use cases, we will demonstrate the application of predictive analytics to supply chain functions.

What if you were able to….

• Predict the outcome of your new product launch?

• Evaluate the impact of your campaigns hourly and make inventory adjustments in real-time?

• Predict potential failure of your critical equipment and plan contingencies?

• Predict the market sentiments and buying behavior of your consumers in real-time?

• Predict the outcome of the US Presidential elections?

Data to Insights: Driving Industry 4.0

[Figure: analytics maturity progression, culminating in cognitive analytics]

Supply Chain Analytics

Predictive Modeling

Data History (x1, x2, x3, …, y) → Correlated Variables → Build Model Y = f(x1, x2, x3, …) → Predict Future Y (given the x's, calculate Y)

• Learn the model based on historical data

• Correlation is used to understand the strength of a linear relationship. Beware of confusing correlation with causation

• Predictive techniques (y = f(x)) such as regression, time series forecasting, machine learning, and decision trees are used to build models

Terms such as predictive analytics, data mining, and machine learning are often used interchangeably

Predictive Model Algorithms (y = f(x))

• Supervised learning, numeric Y (Regression): Linear, Stepwise, Best Subsets, Lasso, Ridge, Time Series Forecasting
• Supervised learning, categorical Y (Classification): Logistic Regression, Linear Discriminant, Neural Networks, Bayesian Networks, Decision Trees, Support Vector Machines, Random Forests, Gradient Boosting, K Nearest Neighbors
• Unsupervised learning, no Y (Clustering): Hierarchical, K Means, Principal Components Analysis

We will look at a subset of these today.

Better Prediction → Better Model!

[Figure: three models fit to the same (x, y) data: under fit (bias), over fit (variance), and just right]

Balance accuracy of fit on the training data against generalization to future or test data to get a better model!

Predictive Modeling

• Regression Analysis

• Time Series Forecasting

• Machine Learning & Decision Trees

Regression and Correlation

• Regression analysis is a tool for building statistical models that characterize relationships between a dependent variable and one or more independent variables, all of which are numerical.
– A regression model that involves a single independent variable is called simple regression; one that involves several independent variables is called multiple regression.
• Correlation is a measure of the linear relationship between two variables, X and Y, and is measured by the (population) correlation coefficient, which ranges from −1 to +1. A scatter plot helps evaluate the relationship between two variables.
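As a quick illustration of the correlation coefficient's range, a small sketch using only the Python standard library (the data values are made up):

```python
import math

def correlation(xs, ys):
    """Sample (Pearson) correlation coefficient between two numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# A perfect linear relationship gives r = +1; a perfect inverse one gives r = -1.
x = [1, 2, 3, 4, 5]
print(correlation(x, [2 * v + 1 for v in x]))   # 1.0
print(correlation(x, [-3 * v for v in x]))      # -1.0
```

Real data falls between these extremes, which is exactly why the scatter plot should accompany the number.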

• Find the line that minimizes the squared error of the fit (the least-squares loss function)

Fit Y = β0 + β1x (note: the model can extend to any number of x's). The vertical distance from a data point to the fitted line is the error of fit at that point; this is called a residual.

Predictive model for the example scatter: Y = 1 + 0.67x

Watch out for Nonsense Correlations!

• Beware lurking variables!
• Correlation does not mean causation

NOTE: images captured from the web

Simple Linear Regression

ŷ = b0 + b1x

Example: Reed Auto. Reed Auto periodically has a special week-long sale. As part of the advertising campaign, Reed runs one or more television commercials during the weekend preceding the sale. Based on a sample of 5 previous sales, determine whether the number of cars sold is influenced by the number of TV ads.

TV Ads (x)   Cars Sold (y)
1            14
3            24
2            18
1            17
3            27
Σx = 10      Σy = 100
x̄ = 2        ȳ = 20

See Excel output.

Simple Linear Regression Excel Output
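The Excel output itself is not reproduced here, but the least-squares coefficients for the Reed Auto data can be computed directly; a sketch in plain Python:

```python
def least_squares(xs, ys):
    """Fit y = b0 + b1*x by minimizing the squared error of the fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b0 = my - b1 * mx  # the fitted line passes through (x-bar, y-bar)
    return b0, b1

ads = [1, 3, 2, 1, 3]        # TV ads (x)
cars = [14, 24, 18, 17, 27]  # cars sold (y)
b0, b1 = least_squares(ads, cars)
print(f"y-hat = {b0} + {b1}x")  # y-hat = 10.0 + 5.0x
```

So each additional TV ad is associated with roughly five more cars sold in this sample.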

Multiple Regression Model

ŷ = b0 + b1x1 + b2x2 + … + bpxp

Example: Programmer Salary Survey. A software firm collected data for a sample of 20 computer programmers. A suggestion was made that regression analysis could be used to determine whether salary was related to years of experience and the score on the firm's programmer aptitude test. The years of experience, aptitude test score, and corresponding annual salary ($1000s) for the 20 programmers are shown in the table.

Exper.  Score  Salary     Exper.  Score  Salary
4       78     24.0       9       88     38.0
7       100    43.0       2       73     26.6
1       86     23.7       10      75     36.2
5       82     34.3       5       81     31.6
8       86     35.8       6       74     29.0
10      84     38.0       8       87     34.0
0       75     22.2       4       79     30.1
1       80     23.1       6       94     33.9
6       83     30.0       3       70     28.2
6       91     33.0       3       89     30.0

See Excel output.

Multiple Linear Regression Excel Output
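Likewise, in place of the Excel output, a sketch that fits the same multiple regression with numpy's least-squares solver (numpy assumed available; the variable names are mine):

```python
import numpy as np

# (experience, aptitude score, salary in $1000s) for the 20 programmers
data = [(4, 78, 24.0), (7, 100, 43.0), (1, 86, 23.7), (5, 82, 34.3),
        (8, 86, 35.8), (10, 84, 38.0), (0, 75, 22.2), (1, 80, 23.1),
        (6, 83, 30.0), (6, 91, 33.0), (9, 88, 38.0), (2, 73, 26.6),
        (10, 75, 36.2), (5, 81, 31.6), (6, 74, 29.0), (8, 87, 34.0),
        (4, 79, 30.1), (6, 94, 33.9), (3, 70, 28.2), (3, 89, 30.0)]
X = np.array([[1.0, e, s] for e, s, _ in data])  # leading 1s for the intercept
y = np.array([sal for _, _, sal in data])
b, *_ = np.linalg.lstsq(X, y, rcond=None)  # minimizes the squared error
b0, b1, b2 = b
print(f"salary-hat = {b0:.3f} + {b1:.3f}*exper + {b2:.3f}*score")
```

Both fitted slopes come out positive, consistent with salary rising with both experience and test score.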

Predictive Modeling

• Regression Analysis

• Time Series Forecasting

• Machine Learning & Decision Trees

Time Series Forecasting?

• Forecasting is an estimate of future performance based on past and current performance
• Qualitative and quantitative techniques can be used for forecasting
• Forecasting gives insight into future performance for better planning and management of future activities
• Forecasting examples:
– Revenue forecasting
– Product demand forecasting
– Equipment purchases
– Inventory planning, etc.

Forecasting Inputs

Sources:
• Internal: performance history
• External: economy, market, customers, competition

Methods:
• Qualitative: judgment, intuition, experience, expectations
• Quantitative: data, statistical analysis, models

Factors: promotions, seasonal changes, customer tastes, price changes, trends/cycles, gov't regulations

The forecast model should be better than naïve models (which use the average or the last period's value)!

Forecasting Methods

[Figure: landscape of forecasting methods, including graphical techniques and life cycle models]

A combination of qualitative and quantitative techniques can be used for better forecasting.

Qualitative Models

• Field Sales Force – bottom-up aggregation of salesmen's forecasts (subjective feel). Surveys are also used very often to get a feel for what the future holds.
• Jury of Executives – a forecast developed by combining the subjective opinions of the managers and executives who are most likely to have the best insights. We use it internally (called Forecast Markets) for product forecast and transition.
• Users' Expectations – a forecast based on users' inputs (mostly subjective).
• Delphi Method – similar to the Jury of Executives, but the process is repeated until consensus among the panel members is reached.

Quantitative Time Series Models

• Naïve Forecast – simple rules based on prior experience
• Trend – linear, exponential, S-curve, and other types of pattern projection
• Moving Average – un-weighted linear combination of past actual values
• Filters – weighted linear combination of past actual values
• Exponential Smoothing – forecasts obtained by smoothing and averaging past actual values
• Decomposition – time series broken into trend, seasonality, cyclicality, and randomness
• Data Mining – mostly CART and neural network techniques are used for forecasting

Causal/Explanatory Models

• Regression Models: modeling the value of an output based on the values of one or more inputs. Also called causal models (can have multiple inputs).
• Econometric Models: simultaneous systems of multiple regression equations.
• Life Cycle Models: a diffusion-based approach, used to predict market size and duration based on product-acceptance behavior.
• ARIMA Models: a class of time series models that account for various behaviors of the response in question, based on past time series data.
• Transfer Function Models: a class of time series models that account for various behaviors of the response along with causal factors, based on past time series data.

Simple Moving Average (or UWMA)

• The Simple Moving Average, or Uniformly Weighted Moving Average (UWMA), for a time period t with a range (span) of k is:

SMA_t = (1/k) · Σ_{i=1}^{k} x_{t−i}

(e.g., with k = 3, the first three data points are averaged)

• The forecasted value for time t+1 becomes:

SMA_{t+1} = (1/k) · Σ_{i=1}^{k} x_{(t+1)−i}
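Both formulas reduce to averaging the most recent k observations; a minimal sketch (the demand numbers are made up):

```python
def sma_forecast(series, k):
    """Forecast the next value as the mean of the last k observations (UWMA)."""
    if len(series) < k:
        raise ValueError("need at least k observations")
    return sum(series[-k:]) / k

demand = [120, 132, 101, 134, 90, 110]  # illustrative demand history
print(sma_forecast(demand, k=3))        # (134 + 90 + 110) / 3
```

Larger k smooths out noise but reacts more slowly to real shifts in the series.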

Exponentially Weighted Moving Average (EWMA)

• In contrast to the Simple Moving Average, the Exponentially Weighted Moving Average (EWMA) for a time period t uses all previously observed data, instead of just a fixed range.

• The smoothed value for a time period t is denoted by Zt.

Zt = α yt−1 + (1−α) Zt−1 or

Zt = Zt−1 + α (yt−1 − Zt−1)

• Each previous observation receives a different weight, with more recent observations weighted more heavily and weights decreasing exponentially.

• How quickly these weights decrease is determined by the smoothing parameter, α. Normally it ranges from 0.2 to 0.4.
– When α approaches 0, all prior observations have nearly equal weight
– When α approaches 1, most weight is given to the more recent observations

• Smoothing should be done on a stationary time series without seasonality.
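The recursive form Zt = Zt−1 + α(yt−1 − Zt−1) is one line of code; seeding Z with the first observation is a common choice, though not the only one:

```python
def ewma_forecast(series, alpha=0.3):
    """One-step-ahead exponential smoothing forecast.

    Each smoothed value uses all prior observations, with weights
    that decay geometrically at rate (1 - alpha).
    """
    z = series[0]  # seed with the first observation (an assumption, not mandated)
    for y in series[1:]:
        z = z + alpha * (y - z)  # Zt = Z(t-1) + alpha * (y(t-1) - Z(t-1))
    return z

flat = [50.0, 50.0, 50.0, 50.0]
print(ewma_forecast(flat))  # a constant series forecasts its own level: 50.0
```

Unlike the SMA, no window of past values needs to be stored; the single value z carries the entire smoothed history.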

Forecasting Scenario*

• Process: Sets of 10 coin flips

• What's the naïve model?
• Can you beat it? If so, how?

Forecasting: Common Pitfalls

• A good-fitting model ≠ an accurate forecast
– Standard modeling methods can be (ab)used to over-fit the data


Fitting a model to history is easy; good forecasting is not. (SAS Institute)

Example: Regression vs. Time Series

Time series: a collection of observations made over equally spaced time intervals, e.g., cost, inventory levels, shipment times, weekly forecast values

Time Series Example: Monthly CO2 concentrations from Mauna Loa, captured over 14 years
• Estimate the CO2 data for the next 2 years
• Intuitively, we could do better than simple linear regression, right?

[Figure: bivariate fit of CO2 by year/month (01/1973 to 01/1990) with a linear fit, next to the forecast. The simple linear regression model shows the overall trend of the data, but it is not the best representation of all the patterns in the data!]

Time series models can yield better predictive models! They take advantage of the correlation(s) between the current value and responses that occurred previously in time.

Components of a Time Series

• Time series data can have the following main components:

Level, Trend, Cyclical, Seasonal, Irregular

[Figure: example series illustrating level, a linear and a polynomial trend, and seasonality]

Level: absence of trend, seasonality, or noise. Also check for stability (stationary vs. non-stationary).
Trend: a continuing pattern of increase or decrease, either straight-line or curved.
Seasonal: a repeating pattern of increases and decreases that occurs with a regular frequency.

First plot your data to understand any behavior and relationships.

Models to Account for Seasonality

• Think of a model that accounts for seasonality as an extension of exponential smoothing.

• There are three terms to estimate in a seasonal model: the level, the trend, and the seasonal components.
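A seasonal smoothing recursion with those three terms can be sketched as follows (an additive Holt-Winters-style model; the initialization from the first two seasons and the default smoothing constants are illustrative assumptions, not values from the slides):

```python
def holt_winters_additive(y, m, alpha=0.3, beta=0.1, gamma=0.1, horizon=4):
    """Additive seasonal smoothing: level + trend + seasonal of period m.

    Returns forecasts for the next `horizon` periods.
    """
    # Initialize level/trend from the first two seasons; seasonal
    # indices from the detrended first season (one common scheme).
    mean1 = sum(y[:m]) / m
    mean2 = sum(y[m:2 * m]) / m
    trend = (mean2 - mean1) / m
    level = mean1 + trend * (m - 1) / 2          # level at t = m - 1
    seasonal = [y[i] - (level - (m - 1 - i) * trend) for i in range(m)]

    for t in range(m, len(y)):
        prev_level = level
        level = alpha * (y[t] - seasonal[t % m]) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (y[t] - level) + (1 - gamma) * seasonal[t % m]

    n = len(y)
    return [level + h * trend + seasonal[(n - 1 + h) % m]
            for h in range(1, horizon + 1)]

# Illustrative series: level 10, trend 0.5/period, seasonal pattern of period 4.
pattern = [2.0, -1.0, -3.0, 2.0]
series = [10 + 0.5 * t + pattern[t % 4] for t in range(16)]
print(holt_winters_additive(series, m=4))
```

On a noiseless level + trend + seasonal series this recursion reproduces the continuation exactly; on real data, the smoothing constants trade responsiveness against stability.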

Now look at the CO2 data from this perspective: it has seasonal, trend, and level components!

[Figure: forecast from the seasonal model, which tracks the CO2 data far more closely than the linear fit]

Sample Size Guidance

50–100 observations are recommended; more is better.

• 1 to 2 years of weekly data. • 4 to 8 years of monthly data. • 12 to 24 years of quarterly data.

Predictive Modeling

• Regression Analysis

• Time Series Forecasting

• Machine Learning & Decision Trees

Machine Learning (ML)/Data Mining (DM)

• Set a clear business goal
• Get the right data
• Pre-process the data and split it into training and testing data
• Mine the data (ML/DM follows the problem-solving methodology)
• Model/result: determine what is interesting
• Present the results and apply the model to new data

Decision Trees (Classification)

• Finds the best factor split points to minimize the classification error (Y is a 2-level categorical variable)

[Figure: scatter of x1 vs. x2 with split points at x1 < 4.5 and x2 < 3, and the resulting model classification error]

The model is a piecewise constant function.

Predictive Model: if x1 < 4.5 and x2 < 3, then Y = one class; else Y = the other class
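The split search described above can be sketched directly: evaluate every candidate threshold on every factor and keep the one that leaves the purest child nodes. Gini impurity is used here as the purity measure (an assumption; minimizing misclassification error behaves similarly), and the toy data mimic the slide's x1/x2 example:

```python
def gini(labels):
    """Gini impurity of a list of class labels (0 = pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(X, y):
    """Return (feature index, threshold) minimizing weighted child impurity."""
    best_j, best_thr, best_score = None, None, float("inf")
    n = len(y)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2  # midpoint between adjacent observed values
            left = [y[i] for i in range(n) if X[i][j] < thr]
            right = [y[i] for i in range(n) if X[i][j] >= thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_j, best_thr, best_score = j, thr, score
    return best_j, best_thr

# Toy data in the spirit of the slide: the class depends only on x2 < 3.
X = [[1, 1], [2, 2], [4, 1], [5, 2], [1, 4], [3, 5], [5, 4]]
y = ["red", "red", "red", "red", "blue", "blue", "blue"]
print(best_split(X, y))  # (1, 3.0): split on x2 at 3
```

Growing a full tree simply applies this search recursively to each child node until the nodes are pure enough.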

Iris data – Clustering Analysis

• The Iris data, first published by geneticist and statistician Ronald Fisher in 1936, are widely used in machine learning for benchmarking.
• The sepal length, sepal width, petal length, and petal width are measured in mm on fifty Iris specimens from each of three species:
1. Iris setosa
2. Iris versicolor
3. Iris virginica
• The machine learning goal is to classify the species based on the flower characteristics.
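Since the slide frames this as a clustering analysis, here is a minimal unsupervised sketch: Lloyd's k-means algorithm on made-up 2-D points (not the actual Iris measurements; the deterministic first-k initialization is a simplifying assumption):

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm: assign each point to its nearest centroid,
    then recompute each centroid as the mean of its cluster."""
    centroids = list(points[:k])  # simple deterministic initialization
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[j].append(p)
        for j, cl in enumerate(clusters):
            if cl:  # keep the old centroid if a cluster empties out
                centroids[j] = tuple(sum(v) / len(cl) for v in zip(*cl))
    return centroids, clusters

# Two well-separated blobs of made-up 2-D measurements (not real Iris values).
pts = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (4.8, 5.1), (5.1, 4.9)]
centroids, clusters = kmeans(pts, k=2)
print(sorted(len(c) for c in clusters))  # [3, 3]
```

Unlike the decision tree, no species labels are used; the groups emerge from the measurements alone.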

Iris data

Y: 1 = Setosa, 2 = Versicolor, 3 = Virginica; X: Sepal Length, Sepal Width, Petal Length, and Petal Width

[Figure: scatterplot matrix of the four measurements, colored by species]

Tree building

[Figure: scatterplot matrix with the optimal split boundaries overlaid]

Tree building

[Figure: species proportions for all rows, then after Split 1 (petal width ≥ 0.8 vs. < 0.8), then after a second split (petal width < 1.8 vs. ≥ 1.8)]

Single Tree: Classification vs. Regression

• Classification – categorical response (Species); output: classification counts and misclassification % for each node
• Regression – continuous response (species number); output: mean and variance for each node

Gartner Magic Quadrant - Data Science

Questions?