Introduction to Stata

Econ 5/Poli 5D Lecture 5

Intergenerational Mobility

The Data

Example of No Intergenerational Mobility

Example of Complete Intergenerational Mobility

Our Goal Today

display (abbreviated di)

In [1]: di 2+2

4

In [2]: di 2*3

6

In [3]: di "Hello World"

Hello World

In [4]: log using stata_class1_logfile.log, text replace

------------------------------------------------------------------------
      name:
       log:  /Users/Brian/Dropbox/Grad School/Sixth Year/Econ:Poli 5/Lectures/Week 3/stata_class1_logfile.log
  log type:  text
 opened on:  22 Mar 2021, 23:05:04

pwd

In [5]: pwd

/Users/Brian/Dropbox/Grad School/Sixth Year/Econ:Poli 5/Lectures/Week 3

cd

In [6]: cd "/Users/Brian/Dropbox/Grad School/Sixth Year/Econ:Poli 5/Lectures/Week 3/data"

/Users/Brian/Dropbox/Grad School/Sixth Year/Econ:Poli 5/Lectures/Week 3/data

Commenting code

Comment markers: * and //

In [7]: * print working directory
        pwd

/Users/Brian/Dropbox/Grad School/Sixth Year/Econ:Poli 5/Lectures/Week 3/data

use , replace

In [8]: use ./oi_ranks_short.dta, replace

(Intergenerational cross section for children in US)

describe

In [9]: describe

Contains data from ./oi_ranks_short.dta
  obs:           100                  Intergenerational cross section for children in US
 vars:             2                  19 Nov 2020 14:33
------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------------
par_pctile      float   %9.0g                 Parent Household Income Rank
kfr_pooled      double  %10.0g                All Children: Household Income Rank
------------------------------------------------------------------------
Sorted by: par_pctile

summarize

In [10]: summarize par_pctile kfr_pooled

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
  par_pctile |        100        50.5    29.01149          1        100
  kfr_pooled |        100     51.0177    10.22296      32.43      73.76

help

In [11]: help summarize

[R] summarize -- Summary statistics (View complete PDF manual entry)

Syntax ------

summarize [varlist] [if] [in] [weight] [, options]

options            Description
------------------------------------------------------------------------
Main
  detail           display additional statistics
  meanonly         suppress the display; calculate only the mean; programmer's option
  format           use variable's display format
  separator(#)     draw separator line after every # variables; default is separator(5)
  display_options  control spacing, line width, and base and empty cells
------------------------------------------------------------------------

varlist may contain factor variables; see fvvarlist.
varlist may contain time-series operators; see tsvarlist.
by, rolling, and statsby are allowed; see prefix.

aweights, fweights, and iweights are allowed. However, iweights may not be used with the detail option; see weight.

Menu ----

Statistics > Summaries, tables, and tests > Summary and descriptive statistics > Summary statistics

Description ------

summarize calculates and displays a variety of univariate summary statistics. If no varlist is specified, summary statistics are calculated for all the variables in the dataset.

Links to PDF documentation ------

Quick start

Remarks and examples

Methods and formulas

The above sections are not included in this help file.

Options ------

Main

detail produces additional statistics, including skewness, kurtosis, the four smallest and four largest values, and various percentiles.

meanonly, which is allowed only when detail is not specified, suppresses the display of results and calculation of the variance. Ado-file writers will find this useful for fast calls.

format requests that the summary statistics be displayed using the display formats associated with the variables rather than the default g display format; see [D] format.

separator(#) specifies how often to insert separation lines into the output. The default is separator(5), meaning that a line is drawn after every five variables. separator(10) would draw a line after every 10 variables. separator(0) suppresses the separation line.

display_options: vsquish, noemptycells, baselevels, allbaselevels, nofvlabel, fvwrap(#), and fvwrapon(style); see [R] Estimation options.
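The options described above can be tried out in a few lines. The following is a minimal sketch, not part of the original lecture: it assumes Stata's bundled auto dataset (the one the help file's own examples load with sysuse) and uses its variables mpg, weight, and price purely for illustration.

```stata
* Load Stata's bundled example dataset; clear discards any data in memory
sysuse auto, clear

* No varlist: summary statistics for every variable in the dataset
summarize

* separator(0) suppresses the separator line between variables
summarize mpg weight price, separator(0)

* meanonly prints nothing but still stores the mean in r(mean)
summarize mpg, meanonly
display "Mean mpg: " r(mean)

* detail adds percentiles, skewness, and kurtosis; the median is stored in r(p50)
summarize price, detail
display "Median price: " r(p50)
```

After any summarize call, typing return list shows which r() results are available, which is a convenient way to check what meanonly and detail each store.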

Examples ------

. sysuse auto
. summarize

summarize, detail

In [12]: summarize kfr_pooled, detail

          All Children: Household Income Rank
-------------------------------------------------------------
      Percentiles      Smallest
 1%       33.155          32.43
 5%       36.035          33.88
10%       37.785           34.7       Obs                 100
25%        42.27          35.31       Sum of Wgt.         100

50%        50.69                      Mean            51.0177
                        Largest       Std. Dev.      10.22296
75%       58.855          69.59
90%       65.155          70.33       Variance       104.5089
95%       68.435          71.59       Skewness       .1880136
99%       72.675          73.76       Kurtosis        2.07493

browse

In [13]: help graph

[G-2] graph -- The graph command (View complete PDF manual entry)

Syntax ------

graph ...

The commands that draw graphs are

Command            Description
------------------------------------------------------
graph twoway       scatterplots, line plots, etc.
graph matrix       scatterplot matrices
graph bar          bar charts
graph dot          dot charts
graph box          box-and-whisker plots
graph pie          pie charts
other commands to draw statistical graphs
------------------------------------------------------

The commands that save a previously drawn graph, redisplay previously saved graphs, and combine graphs are

Command            Description
------------------------------------------------------
graph save         save graph to disk
graph use          redisplay graph stored on disk
graph display      redisplay graph stored in memory
graph combine      combine multiple graphs
graph replay       redisplay graphs stored in memory and on disk
------------------------------------------------------

The commands for printing a graph are

Command            Description
------------------------------------------------------
graph print        print currently displayed graph
set printcolor     set how colors are printed
graph export       export .gph file to PostScript, etc.
------------------------------------------------------

The commands that deal with the graphs currently stored in memory are

Command            Description
------------------------------------------------------
graph display      display graph
graph dir          list names
graph describe     describe contents
graph rename       rename memory graph
graph copy         copy memory graph to new name
graph drop         discard graphs in memory
graph close        close Graph windows
------------------------------------------------------
Also see [G-2] graph manipulation.

The commands that describe available schemes and allow you to identify and set the default scheme are

Command                Description
------------------------------------------------------
graph query, schemes   list available schemes
query graphics         identify default scheme
set scheme             set default scheme
------------------------------------------------------
Also see [G-4] Schemes intro.

The command that lists available styles is

Command            Description
------------------------------------------------------
graph query        list available styles
------------------------------------------------------

The command for setting options for printing and exporting graphs is

Command            Description
------------------------------------------------------
graph set          set graphics options
------------------------------------------------------

graph twoway scatter

In [14]: graph twoway scatter kfr_pooled par_pctile

[Figure: scatter plot. y-axis: All Children: Household Income Rank (30 to 70); x-axis: Parent Household Income Rank (0 to 100)]

In [15]: graph twoway scatter kfr_pooled par_pctile, msymbol(smcircle_hollow)

[Figure: scatter plot. y-axis: All Children: Household Income Rank (30 to 70); x-axis: Parent Household Income Rank (0 to 100)]

In [16]: graph twoway scatter kfr_pooled par_pctile, mcolor(maroon) msize(small) msymbol(smcircle_hol

[Figure: scatter plot. y-axis: All Children: Household Income Rank (30 to 70); x-axis: Parent Household Income Rank (0 to 100)]

In [17]: twoway scatter kfr_pooled par_pctile, mcolor(maroon) msize(small) msymbol(smcircle_hollow) ///
             text(70 20 "Slope = 0.351", placement(e) size(large)) ///
             legend(off) ///
             ytitle("Mean Child Rank in National Income Distribution") ///
             ylabel(30(10)80, nogrid) ///
             xtitle("Mean Parental Rank in National Income Distribution") ///
             xlabel(0(10)100) ///
             graphregion(color(white) fcolor(white))

[Figure: scatter plot annotated with "Slope = 0.351". y-axis: Mean Child Rank in National Income Distribution (30 to 80); x-axis: Mean Parental Rank in National Income Distribution (0 to 100)]
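The slope in the final graph is typed in by hand through the text() option. The lecture does not show where that number comes from, so the following is only a sketch of one way it could be obtained, under the assumption that it is the coefficient from a linear regression of child rank on parent rank. The output filename rank_rank_scatter.png is hypothetical, and the /// line continuation assumes the code is run from a do-file or notebook cell rather than typed at the command line.

```stata
* Assumes oi_ranks_short.dta is in the current working directory, as in the session above.
* clear is the documented option of use for discarding any data already in memory.
use ./oi_ranks_short.dta, clear

* Regress child rank on parent rank; the coefficient on par_pctile is the fitted slope
regress kfr_pooled par_pctile
display "Slope = " %5.3f _b[par_pctile]

* Overlay the fitted line on the scatter plot to see the slope visually
twoway (scatter kfr_pooled par_pctile, mcolor(maroon) msize(small) msymbol(smcircle_hollow)) ///
       (lfit kfr_pooled par_pctile), legend(off)

* Save the graph currently displayed to disk (hypothetical filename)
graph export rank_rank_scatter.png, replace
```

If the annotated value was indeed computed this way, the displayed coefficient should be close to the 0.351 shown in the graph; a noticeably different number would suggest the slide's figure came from another specification or sample.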

Conclusion of Stata introduction