Introduction to Stata
Econ 5/Poli 5D Lecture 5
Intergenerational Mobility The Data Example of No Intergenerational Mobility Example of Complete Intergenerational Mobility Our Goal Today
display di display di
In [1]: di 2+2
4 display di
In [1]: di 2+2
4
In [2]: di 2*3
6 display di
In [1]: di 2+2
4
In [2]: di 2*3
6 display display di
In [1]: di 2+2
4
In [2]: di 2*3
6 display
In [3]: di "Hello World"
Hello World
In [4]: log using stata_class1_logfile.log, text replace
------name:
------name:
In [5]: pwd
/Users/Brian/Dropbox/Grad School/Sixth Year/Econ:Poli 5/Lectures/Week 3 pwd
In [5]: pwd
/Users/Brian/Dropbox/Grad School/Sixth Year/Econ:Poli 5/Lectures/Week 3
cd pwd
In [5]: pwd
/Users/Brian/Dropbox/Grad School/Sixth Year/Econ:Poli 5/Lectures/Week 3
cd
In [6]: cd "/Users/Brian/Dropbox/Grad School/Sixth Year/Econ:Poli 5/Lectures/Week 3/data"
/Users/Brian/Dropbox/Grad School/Sixth Year/Econ:Poli 5/Lectures/Week 3/da ta Commenting code
* // Commenting code
* //
In [7]: * print working directory pwd
/Users/Brian/Dropbox/Grad School/Sixth Year/Econ:Poli 5/Lectures/Week 3/da ta use , replace use , replace
In [8]: use ./oi_ranks_short.dta, replace
(Intergenerational cross section for children in US) use , replace
In [8]: use ./oi_ranks_short.dta, replace
(Intergenerational cross section for children in US) describe describe
In [9]: describe
Contains data from ./oi_ranks_short.dta obs: 100 Intergenerational cross sect ion for children in US vars: 2 19 Nov 2020 14:33 ------storage display value variable name type format label variable label ------par_pctile float %9.0g Parent Household Income Rank kfr_pooled double %10.0g All Children: Household Inco me Rank ------Sorted by: par_pctile describe
In [9]: describe
Contains data from ./oi_ranks_short.dta obs: 100 Intergenerational cross sect ion for children in US vars: 2 19 Nov 2020 14:33 ------storage display value variable name type format label variable label ------par_pctile float %9.0g Parent Household Income Rank kfr_pooled double %10.0g All Children: Household Inco me Rank ------Sorted by: par_pctile summarize summarize
In [10]: summarize par_pctile kfr_pooled
Variable | Obs Mean Std. Dev. Min Max ------+------par_pctile | 100 50.5 29.01149 1 100 kfr_pooled | 100 51.0177 10.22296 32.43 73.76 summarize
In [10]: summarize par_pctile kfr_pooled
Variable | Obs Mean Std. Dev. Min Max ------+------par_pctile | 100 50.5 29.01149 1 100 kfr_pooled | 100 51.0177 10.22296 32.43 73.76 summarize
help In [11]: help summarize
[R] summarize -- Summary statistics (View complete PDF manual entry)
Syntax ------
summarize [varlist] [if] [in] [weight] [, options]
options Description ------Main detail display additional statistics meanonly suppress the display; calculate only the mean; programmer's option format use variable's display format separator(#) draw separator line after every # variables; default is separator(5) display_options control spacing, line width, and base and empty cell s
------varlist may contain factor variables; see fvvarlist. varlist may contain time-series operators; see tsvarlist. by, rolling, and statsby are allowed; see prefix.
aweights, fweights, and iweights are allowed. However, iweights may n ot be used with the detail option; see weight.
Menu ----
Statistics > Summaries, tables, and tests > Summary and descriptive statistics > Summary statistics
Description ------
summarize calculates and displays a variety of univariate summary statistics. If no varlist is specified, summary statistics are calcul ated for all the variables in the dataset.
Links to PDF documentation ------
Quick start
Remarks and examples
Methods and formulas
The above sections are not included in this help file.
Options ------
+------+ ----+ Main +------
detail produces additional statistics, including skewness, kurtosis, t he four smallest and four largest values, and various percentiles.
meanonly, which is allowed only when detail is not specified, suppress es the display of results and calculation of the variance. Ado-file writers will find this useful for fast calls.
format requests that the summary statistics be displayed using the dis play formats associated with the variables rather than the default g display format; see [D] format.
separator(#) specifies how often to insert separation lines into the output. The default is separator(5), meaning that a line is drawn after every five variables. separator(10) would draw a line after every 10 variables. separator(0) suppresses the separation line.
display_options: vsquish, noemptycells, baselevels, allbaselevels, nofvlabel, fvwrap(#), and fvwrapon(style); see [R] Estimation opti ons.
Examples ------
. sysuse auto . summarize , detail , detail
In [12]: summarize kfr_pooled, detail
All Children: Household Income Rank ------Percentiles Smallest 1% 33.155 32.43 5% 36.035 33.88 10% 37.785 34.7 Obs 100 25% 42.27 35.31 Sum of Wgt. 100
50% 50.69 Mean 51.0177 Largest Std. Dev. 10.22296 75% 58.855 69.59 90% 65.155 70.33 Variance 104.5089 95% 68.435 71.59 Skewness .1880136 99% 72.675 73.76 Kurtosis 2.07493 browse
In [13]: help graph
[G-2] graph -- The graph command (View complete PDF manual entry)
Syntax ------
graph ...
The commands that draw graphs are
Command Description ------graph twoway scatterplots, line plots, etc. graph matrix scatterplot matrices graph bar bar charts graph dot dot charts graph box box-and-whisker plots graph pie pie charts other more commands to draw statistical graphs ------
The commands that save a previously drawn graph, redisplay previously saved graphs, and combine graphs are
Command Description ------graph save save graph to disk graph use redisplay graph stored on disk graph display redisplay graph stored in memory graph combine combine multiple graphs graph replay redisplay graphs stored in memory and on d isk ------
The commands for printing a graph are
Command Description ------graph print print currently displayed graph set printcolor set how colors are printed graph export export .gph file to PostScript, etc. ------
The commands that deal with the graphs currently stored in memory are
Command Description ------graph display display graph graph dir list names graph describe describe contents graph rename rename memory graph graph copy copy memory graph to new name graph drop discard graphs in memory graph close close Graph windows ------Also see [G-2] graph manipulation.
The commands that describe available schemes and allow you to identify and set the default scheme are
Command Description ------graph query, schemes list available schemes query graphics identify default scheme set scheme set default scheme ------Also see [G-4] Schemes intro.
The command that lists available styles is
Command Description ------graph query list available styles ------
The command for setting options for printing and exporting graphs is
Command Description ------graph set set graphics options ------graph twoway
scatter In [14]: graph twoway scatter kfr_pooled par_pctile 0 7 k n a R
e m 0 o 6 c n I
d l o h e s 0 u 5 o H
: n e r d l i 0 h 4 C
l l A 0 3 0 20 40 60 80 100 Parent Household Income Rank In [14]: graph twoway scatter kfr_pooled par_pctile 0 7 k n a R
e m 0 o 6 c n I
d l o h e s 0 u 5 o H
: n e r d l i 0 h 4 C
l l A 0 3 0 20 40 60 80 100 Parent Household Income Rank In [15]: graph twoway scatter kfr_pooled par_pctile, msymbol(smcircle_hollow) 0 7 k n a R
e m 0 o 6 c n I
d l o h e s 0 u 5 o H
: n e r d l i 0 h 4 C
l l A 0 3 0 20 40 60 80 100 Parent Household Income Rank
In [16]: graph twoway scatter kfr_pooled par_pctile, mcolor(maroon) msize(small) msymbol(smcircle_hol 0 7 k n a R
e m 0 o 6 c n I
d l o h e s 0 u 5 o H
: n e r d l i 0 h 4 C
l l A 0 3 0 20 40 60 80 100 Parent Household Income Rank
In [17]: twoway scatter kfr_pooled par_pctile, mcolor(maroon) msize(small) msymbol(smcircle_hollow) / text( 70 20 "Slope = 0.351", placement(e) size(large) ) /// legend(off) /// ytitle("Mean Child Rank in National Income Distribution") /// ylabel(30(10)80, nogrid) /// xtitle("Mean Parental Rank in National Income Distribution") /// xlabel(0(10)100) /// graphregion(color(white) fcolor(white)) 0 8 n o i t u b i r t 0 s i 7 Slope = 0.351 D
e m o c n 0 I
6 l a n o i t a N
0 n i 5
k n a R
d l i 0 h 4 C
n a e M 0 3 0 10 20 30 40 50 60 70 80 90 100 Mean Parental Rank in National Income Distribution
Conclusion of Stata introduction