
Index

A correlation, cont. approximate 95% Confidence Intervals, to choose lags, 249 43 cross sectional attribute importance, 282, 295–299 difference between cross sectional , 242, 253–257 and , 243 Crystal Ball, 44–47, 65–69 90% , 45 B assumptions, 44–47, 65–68 bounded dependent variable, 378–398 cumulative distribution, 7, 23 built in synergies, 315, 334, 346, 353, 381, 394 D descriptive , 5–30 C dispersion, 11 categorical, 11–12, 15–16 dummy variables, 275–305 , 11–12 Durbin Watson, 242–246, 253–257 column , 15–16, 27–28, 61–65, 71–72 E confidence interval, 41–43, 49–58, 60–61, Empirical Rule, 13–14 70–1, 74–77 equations, 91, 103–104, 202, 224, 275, alternate scenarios, pairs, 54–58, 74–77 277, 279, 288–289, 292–293, 301–303, conservative, 55 319–320, 333–334, 343–344, 347–348, margin of error, 45 354, 377, 379–381, 387–389, 392 one sample, 41–44, 60–61 in logits, 377, 379–381, 387–389, 392 proportion, 50–54 interactions, 343–344, 347–348, 354 two sample, two segment, 49–50, natural logarithms, 347–348 70–71 rescaling from logits, 380–381, conjoint analysis, 278–283, 295–299 388–389 attribute importance, 282, 295 square roots, 320, 334, 354 hypotheticals, 279–280, 295 standard format, 103–104 orthogonal array, 280–281 with indicator variables, 275, 277, 279, part worth utilities, 279–283, 295–296 288–289, 292–293, 302–303 contingency analysis, 171–192 Excel chi square, 174–177, 187–190 autocorrelation, assess, 253–257 chi square, sparse cells, 175–177 chi square, PivotTable, 187–190 conditional probability, 171–174 column chart, 27–28, 61–65, 71–72 crosstabulation, 171–172 confidence interval, 60–63, 70–71, joint probability, 171–172 76–77 Simpsons Paradox, 177–182 alternate scenarios, pairs, 76–77 sparse cells, 175–177 one sample, 60–63 continuous, 11–13 two segments, 71–72 correlation, 105–113 conjoint analysis, 295–299 and regression, 109–113 contingency analysis, 185–194


Excel, contingency analysis, cont. Excel, logit regression, cont. chi square, 187–190 sensitivity analysis, 392–398 summary , 190–192 synergies, 394–398 correlation, 124–125 marginal impact of drivers, 221–227, crosstabulation, PivotTable, 185–187 263–264, 334–336, 367–369, Crystal Ball, 65–69 393–396 Durbin Watson, 253–257 model building, 224–35 fit and forecast, 260–263 autocorrelation, assess, 253–257 forecasting, 258–271 Durbin Watson, 253–257 Durbin Watson, 262 forecasting, 250–265 illustrate fit and forecast, 260–263, illustrate fit and forecast, 260–263, 365–367 365–367 impact of drivers, 263–264, impact of drivers, 263–264, 334–337, 367–369 334–337, 367–369 lag, choice of, 250–253 lag, choice of, 250–253 prediction intervals, 258–260, multicollinearity symptoms, 216 301–302 partial F test, 217–220 predictions from model equation, prediction intervals, 258–260, 257–260, 301–303, 333–334, 301–302 336–337, 363, 367–368, 392–394 predictions from model equation, recalibrate, 259–260, 302–303, 257–260, 301–303, 333–334, 364–365 336–337, 363, 367–368, 392–394 validation, 257–259, 301–302, sensitivity analysis, 221–226, 363–364 263–265, 297–299, 303–305, , 20 334–336, 367–369, 393–394 hypothesis test, 59–60, 69, 74–76 time series, 250–265 alternate scenarios, pairs, 74–76 model validation, 257–259, 301–302, one sample, 59–60 363–364 two sample, 69 monte carlo simulation, 67–71 indicator variables, 295–305 multicollinearity symptoms, 216 interactions, 326–337 multiple regression, 216–227 adding, 361–362 partial F test, 217–220 illustrate fit and forecast, 365–367 sensitivity analysis, 221–226 sensitivity analysis, 367–369 , 326–337 lag, choice of, 250–253 assess , 326–327 logit regression, 386–398 equation, square roots, 334 equations, 393 marginal impact, 334–337 marginal impact, 392–398 marginal response, 334–337 rescale, 391–398 rescale, 327–328, 334, 336 bounded dependent variable to back from square roots, 334 logits, 391 inverses, 328 bounded dependent
variable to natural logarithms, 327–328 odds, 391 square roots, 327–328 from logits, 394 sensitivity analysis, hypotheticals, from odds, 394 336 odds to logits, 391 synergies, 335–336

Excel, cont. forecasting, cont. partial F test, 217–220 correlation to choose lags, 241, 244, , 74–75 252–253, 256 PivotChart, PivotTable, 26 Durbin Watson, 242–246, 253–257 portfolio analysis, 170–175 hold out observations, 241 beta, 172 inertia, 238–239 Efficient Frontier, 172–175 interactions, 343–344 expected rate of return, beta, 170–171 lag, choice of, 239–241, 244, 250–253, prediction intervals, 258–260, 301–302 256 predictions from model equation, Leading Indicator, 238 257–260, 301–303, 333–334, recalibration, 246, 259–260 336–337, 363, 367–368, 392–394 residual analysis to identify recalibrate, 259–260, 302–303, 364–365 unaccounted for trend or cycles, regression, 114–127 242–244, 253–256 rescale, 326–328 validation, 235, 241, 246, 257–259 sensitivity analysis, multiple variable selection, time series, 237–239 regression, 221–226 shortcuts, 29–30, 78–79, 126–127, G 193–194 gains from nonlinear regression, 324 t test, 59–60, 69, 74–76 one sample, 59–60 H paired, alternative scenarios, 74–76 histogram, 5–6, 17–19 two segments, two samples, 69 hold out observations, 249 time series, 253–264, 301–303, hypothesis, 38–40, 48–49, 54–57, 59–60, 333–337, 363–369 69, 74–76 autocorrelation, assess, 253–257 alternate scenarios, pairs, 54–57, 74–76 Durbin Watson, 253–257 alternative, 38 illustrate fit and forecast, 260–263, null, 38 365–367 one sample, 38–40, 59–60 impact of drivers, 263–264, paired, alternate scenarios, 54–57, 334–337, 367–369 74–76 lag, choice of, 250–253 two segment, two sample, 48–49, 69 prediction intervals, 258–260, hypotheticals, 222–223, 279–280, 295, 301–302 334–336, 356–357, 368, 381–384, predictions from model equation, 392–393 257–260, 301–303, 333–334, 336–337, 363, 367–368, 392–394 I recalibrate, 259–260, 302–303, indicator variables, 275–305 364–365 conjoint analysis, 278–283, 295–299 validation, 257–259, 301–302, hypotheticals, 279–280, 295 363–364 part worth utilities, 279–283, 295 validation, 257–259, 301–302, 363–364 equations,
275–277, 279, 286, 288–289 F modify intercept, 275–276 seasonality, 283–290 forecasting, 235–265 segment differences, 276–278 autocorrelation, 242, 254–257 structural shift, 291–293, 299–305 indicator variables, cont. model building, cont. value of product attributes, 278–283, autocorrelation, 242, 253–257 295–299 correlation to choose lags, 241, 244, 252–253, 256 inertia, 238–239, 255 inference, 35–77 cross sectional versus time series, 243 interactions, 343–369 equation, 202, 206, 209, 224 baseline, 343–344, 347, 351, 361 F test, multiple regression, 204 built in synergies, 346, 348–349, forecasting, 239–244, 246, 253–257, 353–355 259–260 equations, 343–344, 347–348, 354 autocorrelation, 242, 253–257 main effect not significant, 347 lag, choice of, 239–241, 250–253 modify slope, 343–344, 348–349 recalibration, 246, 259–260 segment response differences, 343–350 residual analysis to identify sensitivity analysis, 356–357, 367–369 unaccounted for trend or cycles, structural shifts, 351–69 242–244, 253–256 time series, 359–69 goals, 201, 235 indicator variables, 275–305 J inertia, 238–239 jointly significant, 209 joint significance, 209 Leading Indicator, 238 L marginal response, multiple regression, lag, choice of, 239–241, 244, 250–253, 202 256 multicollinearity, 203–209, 217–220 Leading Indicator, 238 joint significance, 209 limited, dependent variable, 377–398 partial F test, 207–209, 217–220 logit regression, 377–398 remedies, 206–207 built in synergies, 381–384, 394–396 symptoms, 205 equations, 377, 379–381, 387–389 multiple regression, 201–227 limited or bounded dependent variable, equation, 202, 224, 275, 277, 279, 377 288–289, 292–293, 301–303, logits, 377, 379–380, 387–388, 319–320, 333–334, 343–344, 391–392 347–348, 354, 377, 379–381, odds, 377, 380, 388 387–389, 392 rescaling, 377, 379, 380, 387–388, 391, F test, 204 394 joint significance, 209 back from logits, 380, 388, 394 marginal response, 202 to logits, 377, 379, 387, 391
multicollinearity, 203–209, 217–220 to odds, 380, 388, 394 partial F test, 207–209, 217–220 s shaped response, 377 remedies, 206–207 symptoms, 205 M RSquare, 212 margin of error, 43–44, 60–62, 70–71, 73, sensitivity analysis, 211–213, 76–77 221–227, 320–322, 334–337, memos, 147–148 356–357, 367–369 model building, 201–227, 235–265, partial F test, 207–209 275–305 RSquare, multiple regression, 212

model building, cont. Normally distributed, 12–14 sensitivity analysis, 211–213, 221–227, 320–322, 334–337, 356–357, O 367–369 one tail test, 39–41 time series, 235–246, 250–259 orthogonal array, 279–280 autocorrelation, 242, 253–257 outliers, 7–10, 20–22 hold out observations, 241 lag, choice of, 239–241, 244, P 250–253, 256 recalibration, 246, 253–257 p value, 39, 59–60, 69, 74 residual analysis to identify part worth utilities, 279–283, 295–299 unaccounted for trend or cycles, partial F test, 207–209, 217–220 242–244, 253–256 pie chart, 54, 72–73 validation, 235, 241, 246, 257–259 PivotChart, PivotTable, 24–28, 172–173, validation, 235, 241, 246, 257–259 185–187, 190–192 variable selection, logic, 201–202 portfolio analysis, 149–168 variable selection, time series, 237–246 beta, 152–160, 165–166 model building process, 201–227, Efficient Frontier, 161, 166–168 235–246 expected rate of return, 149–151, 158, monte carlo simulation, 44–47, 65–69 164–165 PowerPoints, 145–147 N predicted performance, y hat, 91 prediction intervals, 99–102, 118–123 nominal, 12 nonlinear regression, 331–337 Q built in synergies, 315, 334–338 equation, square roots, 320, 334 quantitative, 11–12 nonconstant response, 313 R Normalize positively skew, 314–315, 327 recalibration, 246, 259–260 relative strength of drivers, 320–322, regression, 91–127 334–337 ANOVA, 95 rescaling, 314–315, 317, 320, 324, conditional prediction intervals, 327–328, 334, 348 101–102, 122–123 back from square roots, 320, 334 equation, 92–93, 114–115 from natural logarithms, 348 equation, standard format, 114–115 gains, 324 F test, 93–96 negative values, inverses, 314–315 heteroskedasticity, 98, 116 square roots, natural logarithms, mean square error, MSE, 94 317, 327–328 prediction intervals, 99–100, 118–123 sensitivity analysis, 320–322, 334–337 regression sum of squares, SSR, 94–95 square roots, natural logarithms, 317, residuals, 93–94, 98–99, 116–117 320, 327–328, 334 plot, 98, 114, 116 Tukey’s
Ladder of Powers, 313–315, Normal, 99, 117 327 RSquare, 95, 107 Normalize positively skewed, 314–315, sensitivity analysis, 101 327 slope, 96–98, 109–112 regression, cont. skewness, 313–319, 326, 328 , 94–95, 99–100, 116 assess, 315–316, 326–327 sum of squared errors, SSE, 94 correction, 317–318, 327–328 relative strength of drivers, 320–322, Normalize positively skew, 317, 334–337 327–328 rescaling, 318, 320, 324, 334, 348, rescaling negative values, inverses, 377–379, 387, 391–392 318, 328 from bounded dependent variable to Tukey’s Ladder of Powers, 313–315, logits, 377, 379, 387, 391 327 from limited dependent variable to standard error, 36–38, 51, 53, 57, 59, 70 logits, 377, 379, 387, 391 structural shift, 291–293, 299–305 from natural logarithms, 348 Student t, 36–38 from square roots, 320, 334 gains, 324 T negative values, inverses, 318 time series s shaped response, 377–378 autocorrelation, 242, 254–257 to logits, 377, 379, 387, 391 correlation to choose lags, 241, 244, to odds, 392 252–253, 256 to square roots, natural logarithms, 317, difference from cross sectional, 243 327–328 Durbin Watson, 242–246, 253–257 residual analysis to identify unaccounted interactions, 351–377 for trend or cycles, 242–244, 253–256 residual analysis to identify round, 10 unaccounted for trend or cycles, 242–244, 253–256 S variable selection, 237–239 scale, 11–12 Tukey’s Ladder of Powers, 313–315, 327 seasonality, 283–289 sensitivity analysis, 219–222, 328–331 V significance level, 39, 69 validation, 235, 241, 246, 249, 257–259