Constructing a Boxplot and Computing Descriptive Statistics in Stata and More…

Constructing a Boxplot and Computing Descriptive Statistics in Stata and More…

University of California, Los Angeles Department of Statistics Statistics 13 Instructor: Nicolas Christou Constructing a boxplot and computing descriptive statistics in Stata and more… Retrieve the file “textbookex1_38” as follows: . use http://www.stat.ucla.edu/~nchristo/textbookex1_38 In this file there are four variables.. The first (year1) is the year in which the Boston marathon was completed by men. The second (male_tim) is the completion time rounded to the nearest minute for the years 1959-97. The third (year2) is the year in which women were allowed to compete in the Boston Marathon. The fourth (female_tim) is the completion time also rounded to the nearest minute. You can see the data by typing . edit or . list Let’s construct a box plot for the finishing time of men and women. First the box plot of the finishing time for men: . graph box male_tim, t1title(Finishing marathon time for male runners) Finishing marathon time for male runners 145 140 135 130 125 Then the boxplot for the finishing time for women: . graph box female_t, t1title(Finishing marathon time for female runners) Finishing marathon time for female runners 190 180 170 160 150 140 We can also do a side-by-side boxplot to compare the 2 variables. graph box male_tim female_t, t1title(Side-by-side boxplot) Side-by-side boxplot 200 180 160 140 120 male_tim female_t Now let’s use Stata to compute descriptive statistics for the completion time of men and women. Here is the command: . summarize male_tim female_t And here is what Stata gives you: Variable | Obs Mean Std. Dev. Min Max ---------+----------------------------------------------------- male_tim | 39 133.641 5.106689 127 144 female_t | 26 153.6538 13.07499 142 190 If you want more detailed output you should type: . summarize male_tim female_t, detail And you will receive this: male_tim ------------------------------------------------------------- Percentiles Smallest 1% 127 127 5% 128 128 10% 128 128 Obs 39 25% 129 128 Sum of Wgt. 39 50% 132 Mean 133.641 Largest Std. Dev. 5.106689 75% 137 142 90% 142 143 Variance 26.07827 95% 144 144 Skewness .6219865 99% 144 144 Kurtosis 2.113501 female_t ------------------------------------------------------------- Percentiles Smallest 1% 142 142 5% 143 143 10% 144 144 Obs 26 25% 145 144 Sum of Wgt. 26 50% 147 Mean 153.6538 Largest Std. Dev. 13.07499 75% 162 167 90% 168 168 Variance 170.9554 95% 186 186 Skewness 1.482637 99% 190 190 Kurtosis 4.361222 - Question: Find the median, the first and third quartiles, and compute the interquartile range of the completion time for men and women. Now go back to the boxplots and locate these numbers. Which dataset has larger variation? Try to find a reason for that. Think about this… Another data set gives the boxplot below. What happened here? Generate data that give you approximately the following boxplot. income 60 53 46 39 32 25 18 A different data set has the following boxplot. Why? Generate data that give you approximately the following boxplot. income 88 81 74 67 60 53 46 39 32 25 18.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    4 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us