Subject index

Symbols
∆β ...... 274–275
∆χ² ...... 275
β ...... see regression, standardized coefficient
* ...... see do-files, comments
+ ...... see operators
// ...... see do-files, comments
#delimit ...... 36–37
& ...... see operators
_N ...... 82–83, 95
_all ...... 47–48
_b[ ] ...... 185
_merge (variable) ...... 310–312
_n ...... 81–82, 95
| ...... see operators
|| ...... 104
 ...... see operators
~ ...... see operators

A
Academic Technology Service ...... see ATS
added-variable plot ...... 206–209
additive index ...... 79
adjusted count R² ...... 266–267
adjusted R² ...... 195–196
ado-directories ...... 359
ado-files
    basics ...... 335–337
    programming ...... 337–351
aggregate ...... see collapse
AIC ...... 267
Akaike’s information criterion ...... see AIC
Aldrich–Nelson’s p2 ...... 326
alphanumerical variables ...... see strings
analysis of ...... see regression
angle() (axis-label suboption) ...... 124–125
ANOVA ...... see regression
Anscombe quartet ...... 199, 201, 226
anycount() (egen function) ...... 88
append ...... 316–319
arithmetic mean ...... see average
arithmetical expressions ...... see expressions
ascategory (graph dot option) ...... 149
ASCII files ...... 292–301
ATS ...... 354
augmented component-plus-residual plot ...... 204
autocode() (function) ...... 152
autocorrelation ...... see regression, autocorrelation
average ...... 15–16, 153, 156
avplots ...... 207–208
aweight (weighting type) ...... 68–69
axis
    labels ...... 107–108, 122–125
    scales ...... 117–119
    titles ...... 107–108, 126–127
    transformations ...... 118–119

B
balanced panel data ...... 312–313
bands() (twoway mband option) ...... 202–203
bar (graph type) ...... 105, 108, 146–147, 159
bar charts ...... 146–147, 159
batch jobs ...... see do-files
Bayesian information criterion ...... see BIC

Bernoulli’s distribution ...... see binomial distribution
beta ...... see regression, standardized coefficient
bias ...... 200
BIC ...... 267
bin() (histogram option) ...... 164–165
binary variables ...... see variables, dummy
binomial distribution ...... 254–255
BLUE ...... see Gauss–Markov assumptions
book materials ...... 2–3
bookstore ...... 353
bootstrap ...... 228–229
box (graph type) ...... 105, 161–163
box plots ...... 22, 161–163
Box–Cox transformation ...... 227
bcskew0 ...... 227
break ...... see commands, break
browse ...... 291–292
by prefix ...... 18–19, 60–61, 81–86, 95, 156
by() (graph option) ...... 131
by() (tabstat option) ...... 157
bysort ...... 60–61
byte (storage type) ...... 100

C
calculator ...... see pocket calculator
caption() (graph option) ...... 128–129
capture ...... 38–39
categories ...... 138–139
cd ...... 9–10
center ...... see variables, center
chi-squared
    likelihood-ratio ...... 143
    Pearson ...... 143
classification tables ...... 264–267
clock position ...... 121
cluster samples ...... 229–230
cmdlog ...... 32–34
CMYK ...... 111
CNEF ...... 317
coefficient of determination ...... 191
collapse ...... 81
comma-separated values ...... see spreadsheet format
command + ...... see commands, break
command line ...... see windows, command
commands
    abbreviations ...... 14, 46–47
    access previous ...... 14
    break ...... 14
    e-class ...... 71
    end of commands ...... 36–37
    external ...... 45–46, 337, 354
    internal ...... 45–46, 337, 354
    long ...... see do-files, line breaks
    r-class ...... 71
    search ...... 23
comments ...... see do-files, comments
component-plus-residual plot ...... 203–204
compound quotes ...... 347–348
compress ...... 319
compute ...... see generate
cond() (function) ...... 347
conditional-effects plot ...... 222–223
conditions ...... see if qualifier
confidence interval ...... 187–188
connect() (scatter option) ...... 112–115
connected (plottype) ...... 112–115
contingency table ...... see frequency tables, two-way
contract ...... 68
Cook’s D ...... 209–212
cooksd (predict option) ...... 209
correlation
    coefficient ...... 178–179
    negative ...... 178
    positive ...... 178
    weak ...... 178
count R² ...... 265–267
covariate ...... see variables, independent
covariate pattern ...... 267
cplot ...... 179
cprplot ...... 203–204
Cramer’s V ...... 143
cross-tabs ...... see tables
.csv ...... see spreadsheet format
Ctrl + Break ...... see commands, break
Ctrl + C ...... see commands, break
cumulated probability function ...... see probit model

D
Data Editor ...... 301–302
data matrix ...... 291–292
data region ...... 107
data types ...... see storage type
datasets
    ASCII files ...... 294–301
    combine ...... 306, 308–319
    describe ...... 11–12
    export ...... 319–320
    hierarchical ...... 85, 313–316
    import ...... 293–294
    load ...... 10–11
    nonmachine-readable ...... 301–306
    oversized ...... 322–323
    panel data ...... 232–233
    preserve ...... 66
    rectangular ...... 292
    reshape ...... 232–236
    restore ...... 66
    save ...... 27–28, 319
    sort ...... 14–15
    titanic ...... 246
dates
    combining datasets ...... 305
    elapsed dates ...... 93–94
    from strings ...... 94
degrees of freedom ...... see df
delete ...... see erase
density ...... 163, 165–167
describe ...... 8–9, 11–12
destring ...... 90
df ...... 190
DFBETA ...... 205–206
dfbeta ...... 206
dictionary ...... 298–301
dir ...... 10, 27
directory
    change ...... 9–10
    contents ...... 10
    working directory ...... 3, 9–10, 58–59
discard ...... 337
discrepancy ...... 210–211, 272–273
discrete (histogram option) ...... 144
display ...... 55, 328–329
distributions
    describe ...... 137–174
    grouped ...... 150–152
do ...... 26
do-files
    analyzing ...... 40
    basics ...... 25–27
    comments ...... 36
    create ...... 40
    editors ...... 25–26
    error messages ...... 26–27
    execute ...... 26
    exit ...... 39
    from interactive work ...... 29–34
    line breaks ...... 36–37
    master ...... 40–43
    organization ...... 39–43
    set more off ...... 37–38
    version control ...... 37
doedit ...... 25, 30
dot (graph type) ...... 105, 148–149, 159–161
dot charts ...... 148–149, 159–161
double (storage type) ...... 100
drop ...... 12, 48
dummy variables ...... see variables, dummy

E
e() (saved results) ...... 71–72
e(b) (saved result) ...... 271
e-class ...... see commands, e-class
edit ...... 301–302
egen ...... 87–88
EMF ...... 135–136
Encapsulated PostScript ...... see EPS
encode ...... 90
endogenous variable ...... see variables, dependent
enhanced metafile ...... see EMF
Epanechnikov kernel ...... 167–168
EPS ...... 135–136
erase ...... 43, 310
ereturn list ...... 72
error messages
    ignore ...... 38–39
    invalid syntax ...... 17
error-components model ...... 240–242
estat bootstrap ...... 228
estat classification ...... 265, 290
estat dwatson ...... 217
estat effects ...... 230
estat gof ...... 268, 290
estat ic ...... 267, 277
Excel files ...... 292–293
exit (in do-files) ...... 39
exit Stata ...... 27–28
exogenous variable ...... see variables, independent
expand ...... 68
export ...... see datasets, export
expressions ...... 55–57
extensions ...... 59

F
F test ...... 191–192
FAQ ...... 23, 353
fence ...... 162
filenames ...... 58–59
Fisher’s exact test ...... 143
five-number summary ...... 156, 161
fixed format ...... 298–301
fixed-effects model ...... 236–240
float() (function) ...... 101
float (storage type) ...... 100
foreach ...... 61–64, 313
forvalues ...... 64–65
free format ...... 296–298
frequencies
    absolute ...... 139–140
    conditional ...... 141–142
    relative ...... 139–140
frequency tables ...... 20–21
    one-way ...... 139–140
    two-way ...... 140–143
frequency weights ...... see weights
function (plottype) ...... 105
functions ...... 57
fweight (weighting type) ...... 66–68
fxsize() (graph option) ...... 133–134
fysize() (graph option) ...... 133–134

G
gamma coefficient ...... 143
Gauss curve ...... see
Gauss distribution ...... see normal distribution
Gauss–Markov assumptions ...... 199
GEE ...... 242
generalized estimation equations ...... see GEE
generate ...... 24, 75–86
generate() (tabulate option) ...... 147, 219
gladder (statistical graph) ...... 107
graph ...... 103–136
graph region ...... 107, 116
graphs
    3D ...... 107
    combining ...... 132–134
    connecting points ...... 113–114
    elements ...... 107–108
    export ...... 135–136
    multiple ...... 129–134
    overlay ...... 129–131
    print ...... 134–135
    titles ...... 128–129
    types ...... 104–107
    weights ...... 208
grid lines ...... 119, 124–125
grouping
    by quantiles ...... 151
    intervals with arbitrary width ...... 152
    intervals with same width ...... 151–152
GSOEP ...... 4, 11–12, 96–97, 229, 232, 306–308, 317

H
help ...... 22–24
help files ...... 351–352
histogram ...... 144–146, 163–165
histogram (plottype) ...... 105
homoskedasticity ...... see regression, homoskedasticity
Hosmer–Lemeshow test ...... 268
Huber/White/sandwich estimator ...... 216

I
if qualifier ...... 17, 52–55
imargin() (graph combine option) ...... 133
importing ...... see data, import
in qualifier ...... 14–15, 51–52
infile ...... 296–301
influential cases ...... 205–213, 272–275
input ...... 302–304
inputting data ...... 301–306
insheet ...... 295–296
inspect ...... 138
interaction terms ...... 220–221, 280–281
invnormal() (function) ...... 77
iscale() (graph combine option) ...... 133
iteration block ...... 262–263

K
kdensity ...... 165–170
Kendall’s tau-b ...... 143
kernel density estimator ...... 165–170
key variable ...... 309–311

L
label data ...... 319
labels
    and values ...... 99–100
    datasets ...... 319
    display ...... 99–100
    values ...... 21, 98–99
    variables ...... 21, 97–98
legend ...... 107–108, 127–128
leverage ...... 210, 272
lfit (plottype) ...... 129–130
likelihood ...... 255–256
likelihood-ratio χ² ...... 264
likelihood-ratio test ...... 275–277, 279–280
limits ...... 9
line (plottype) ...... 108, 112–115
linear combination ...... 180–181
linear probability model ...... see regression, LPM
linear regression ...... see regression
linearity assumption ...... 202–204, 268–272
list ...... 12–14
local ...... 73, 327–331
local macros ...... see macros
local mean regression ...... 269
loess ...... see LOWESS
log() (function) ...... 77
log (scale suboption) ...... 118–119
log files
    finish recording ...... 39
    interrupt recording ...... 32
    log commands ...... 31–34
    SMCL ...... 38
    start recording ...... 38–39
logarithm ...... 77
logical expressions ...... see expressions
logistic ...... 261
logistic regression
    coefficients ...... 259–262
    command ...... 257–259
    dependent variable ...... 249–254
    diagnostic ...... 268–275
    estimation ...... 254–257
    fit ...... 263–268
    marginal effect ...... 287
logit ...... 257–259
logit model ...... see logistic regression
logits ...... 252–253
loops
    foreach ...... 61–64
    forvalues ...... 64–65
lower() (function) ...... 90
LOWESS ...... 204, 269–270
lowess (plottype) ...... 269–270
lowess (statistical graph) ...... 269–270
LPM ...... see regression, LPM

M
macro
    extended macro functions ...... 348–349
    local ...... 73–74, 327–331
manuals ...... 4
margin() (graph option) ...... 116–117
marker
    colors ...... 111–112
    labels ...... 107, 120–122
    options ...... 109–110
    sizes ...... 112
    symbols ...... 107, 110–111
master data ...... 310
match ...... see datasets, combine
matrix (command) ...... 271
matrix (graph type) ...... 105, 202, 205
maximum ...... 16, 155–156
maximum likelihood
    principle ...... 254–257
    search domain ...... 263
mband (plottype) ...... 202–203
mean ...... see average
median ...... 155–156
median regression ...... 212–213, 231–232
median-trace ...... 202–203
memory ...... see RAM
merge ...... 308–316
metadata ...... 313
minimum ...... 16, 155–156
missing
    encode ...... 96–97
missing (tabulate option) ...... 140
missing values ...... see missings
missings
    coding ...... 304–305
    definition ...... 13
    in expressions ...... 54–55
    set ...... 17–18, 96
ML ...... see maximum likelihood
mlabel() (scatter option) ...... 120–122
mlabposition() (scatter option) ...... 121
mlabsize() (scatter option) ...... 121
mlabvposition() (scatter option) ...... 121
MLE ...... see maximum likelihood
mlogit ...... 285–286
more off ...... 37–38
MSS ...... 189–190
multicollinearity ...... 213–214, 218–219
multinomial logistic regression ...... 284–288
mvdecode ...... 18, 96
mvencode ...... 96–97

N
net install ...... 357
NetCourses ...... 354
newlist ...... 63
nonlinear relationships ...... 224–225, 277–278
normal distribution
    density ...... 281–282
    density function ...... 282–283
note() (graph option) ...... 128–129
notes ...... 97
null model ...... 263
numlabel ...... 100
numlist ...... 57–58, 63–64

O
observations
    definition ...... 12
    list ...... 12–14
odds ...... 250–252
odds ratio ...... 251, 260–261
odds-ratio interpretation ...... 260–261
OLS ...... 181–183
operators ...... 55–56
options ...... 19–20, 49–51
order ...... 319
ordered logistic regression ...... see proportional odds model
ordinal logit model ...... see proportional odds model
ordinary least squares ...... see OLS

P
package description ...... 358
panel data ...... see datasets, panel data
partial correlation ...... see regression, standardized coefficient
partial regression plot ...... see added-variable plot
partial residual plot ...... see component-plus-residual plot
PDF ...... 135–136
Pearson residual ...... 267–268
Pearson-χ² ...... 267–268
percentiles ...... see quantiles
PICT ...... 135–136
pie (graph type) ...... 105, 148, 159
pie charts ...... 148, 159
plot region ...... 107, 116–117
plotregion() (graph option) ...... 116–117
PNG ...... 135–136
pocket calculator ...... 55
portable document format ...... see PDF
PostScript ...... see PS
ppfad.dta ...... 313
predict ...... 186
predicted values ...... 185–186
Pregibon’s ∆β ...... see ∆β
preserve ...... 66
probability interpretation ...... 261, 262
probit ...... 283–284
probit model ...... 281–284
program define ...... 331–335
program drop ...... 333
programs
    and do-files ...... 332
    debugging ...... 333–334, 343
    define ...... 331–332
    in do-files ...... 334–335
    naming ...... 333
    redefine ...... 332–333
    syntax ...... 340–343, 346–348
    syntax checks ...... 348
proportional odds model ...... 289–290
PS ...... 135–136
pseudo R² ...... 263–264
PSID ...... 307, 317
pwd ...... 27, 58–59
pweight (weighting type) ...... 69–70

Q
quantile plot ...... 170–173
quantile regression ...... 231
quantiles ...... 154–156
quartiles ...... 155–156
quietly ...... 345
Q–Q plots ...... 173–174

R
r ...... see correlation coefficient
r() (saved results) ...... 71–72
r(max) (saved result) ...... 71–72
r(mean) (saved result) ...... 71–72
r(mean) (saved result) ...... 64
r(min) (saved result) ...... 71
r(N) (saved result) ...... 71–72
r(sd) (saved result) ...... 71–72
r(sum) (saved result) ...... 71
r(sum_w) (saved result) ...... 71
r(Var) (saved result) ...... 71–72
r-class ...... see commands, r-class
R² ...... 191
RAM ...... 9, 320–322
random numbers ...... 63, 77–78
random-effects model ...... 241–242
range() (scale suboption) ...... 117–118
RAW ...... 294
raw data ...... see RAW
recode ...... see variables, replace
recode() (function) ...... 151–152
recode ...... 86–87
reference lines ...... 107, 119–120
regress ...... 25, 183–184
regression
    ANOVA table ...... 188–190
    autocorrelation ...... 216–217
    coefficient ...... 184–186, 194–195
    command ...... 183–184, 193–194
    control ...... 197–199
    diagnostics ...... 199–217
    fit ...... 190–192
    homoskedasticity ...... 214–216
    linear ...... 24–25, 177–243
    LPM ...... 246–249
    multiple ...... 192–193
    nonlinear relationships ...... 224–226
    omitted variables ...... 213–214
    panel data ...... 232–242
    residuals ...... 186
    simple ...... 180–183
    standard error ...... 188
    standardized coefficient ...... 196–197
    with heteroskedasticity ...... 226–227
replace ...... 24, 75–86
reshape ...... 234–236
residual (predict option) ...... 186
residual
    definition ...... 180
    sum ...... 182–183, 189
residual sum of squares ...... see RSS
residual-versus-fitted plot ...... 200–201, 215–216, 227
response variable ...... see variables, dependent
restore ...... 66
Results window ...... see windows, Results window
return list ...... 72
reverse (scale suboption) ...... 118–119
Review window ...... see windows, Review window
RGB ...... 111
robust ...... 216
root MSE ...... 191
round() (function) ...... 215
rowmiss() (egen function) ...... 88
RSS ...... 189
rstudent (predict option) ...... 215
running counter ...... 81–82
running sum ...... 83–85
rvfplot ...... 200–201

S
sampling weights ...... see weights
SAS files ...... 292–294
save ...... 27–28, 319
saved results ...... 64, 71–74, 185, 271
scatter (plottype) ...... 105, 108
scatterplot ...... 177–178
scatterplot matrix ...... 202, 205
scatterplot smoother ...... 202–203
search ...... 359
sensitivity ...... 265–266
separate ...... 173
sign interpretation ...... 260
SJ ...... 23, 353
SJ-ados ...... 356–357
SMCL ...... 38, 351–352
SOEP ...... see GSOEP
sort ...... 14–15
sort (scatter option) ...... 114–115
specificity ...... 265–266
spreadsheet format ...... 294–296
SPSS files ...... 292–294
SSC ...... 358
ssc install ...... 358
SSC-ados ...... 358
standard deviation ...... 16, 153
Stata Journal ...... 353
Stata Press ...... 353
Stata Technical Bulletin ...... 353
stata.toc ...... 358
Statalist ...... 353
statistical inference ...... 187
STB ...... 23, 353
STB-ados ...... 356–357
stereotype model ...... 288–289
storage types ...... 89–90, 100–101
strings ...... 297–298, 305
    display format ...... 95
    in expressions ...... 90
    replace substrings ...... 91–92
    storage type ...... 89–90
    to dates ...... 94
    to numeric ...... 90
strpos() (function) ...... 90–91
subinstr() (function) ...... 92
subscripts ...... 84–86
substr() (function) ...... 91–92
subtitle() (graph option) ...... 128–129
sum() (function) ...... 83–85
summarize ...... 15–16, 155–156
summarize() (tabulate option) ...... 157
summary graphs ...... 159–161
summary tables ...... 157–159
superposition ...... 149, 160–161
survey data ...... 229–230
svmat ...... 271
svy ...... 229–230
symmetry plot ...... 214–215
symplot (statistical graph) ...... 214–215
syntax ...... 340–343, 346–348
syntax diagram ...... 45
system files ...... 292–293

T
tab-separated values ...... see spreadsheet format
tab1 ...... 140
tab2 ...... 143
table ...... 157–159
tabstat ...... 156
tabulate ...... 139–143
tagged-image file format ...... see TIFF
tau-b ...... see Kendall’s tau-b
tempvar ...... 350–351
text() (twoway option) ...... 121–122
textbox options ...... 127
tick lines ...... 107–108, 125–126
TIFF ...... 135–136
title() (graph option) ...... 128–129
total sum of squares ...... see TSS
total variance ...... see TSS
trace ...... 333–334
TSS ...... 189
two-way table ...... see frequency tables, two-way
twoway (graph type) ...... 105

U
U-shaped relationship ...... 226
unbalanced panel data ...... 313
uniform() (function) ...... 63, 77–78
update ...... 354–355
updating Stata ...... 354–355
upper() (function) ...... 90
use ...... 10–11
using ...... 58–59
using data ...... 310

V
V ...... see Cramer’s V
value labels ...... see labels, values
valuelabel (axis-label suboption) ...... 124–125
variable list ...... see variables, varlist
variables
    all ...... 47–48
    allowed names ...... 76–77
    categorical ...... 217–218
    center ...... 32–33, 64, 72, 195, 221
    definition ...... 12
    delete ...... 12, 48
    dependent ...... 177
    dummy ...... 78–79, 147, 194, 217–219, 278–280
    generate ...... 24, 75–97
    group ...... 150–152
    identifier ...... 304
    independent ...... 177
    multiple codings ...... 305
    names ...... 97
    ordinal ...... 288
    replace ...... 24, 75–97
    temporary ...... 350–351
    transformations ...... 107, 204, 213, 216, 223–227
    varlist ...... 14, 47–49
Variables window ...... see windows, Variables window
variance ...... see
variance of residuals ...... see RSS
variation ...... 189
varlist ...... see variables, varlist
version ...... 37
view ...... 38

W
weights ...... 65–70
whisker ...... 162
wildcards ...... 48
windows
    change ...... 26
    Command window ...... 8
    font sizes ...... 8
    Graph window ...... 22
    preferences ...... 8
    Results window ...... 8
    Review window ...... 8, 14
    scroll back ...... 10
    Variables window ...... 8
windows metafile ...... see WMF
WMF ...... 135–136
working directory ...... see directory, working directory
wstata.exe ...... 354

X
xi: ...... 219
xlabel() (twoway option) ...... 122–125
xline() (twoway option) ...... 119–120
xtick() (twoway option) ...... 125–126
xscale() (twoway option) ...... 117–119
xsize() (graph option) ...... 116
xt commands ...... 232
xtgee ...... 242
xtick() (twoway option) ...... 125–126
xtitle() (twoway option) ...... 126–127
xtreg ...... 239–242

Y
ylabel() (twoway option) ...... 122–125
yline() (twoway option) ...... 119–120
ytick() (twoway option) ...... 125–126
yscale() (twoway option) ...... 117–119
ysize() (graph option) ...... 116
ytick() (twoway option) ...... 125–126
ytitle() (twoway option) ...... 126–127

Z
zip archive ...... 3