Exploratory Data Analysis and JMP

<p>September 29, 2009 Lecture #2 240A-1 L. Phillips Exploratory Data Analysis and JMP</p>
<p>I. Open the JMP program by going to Start, Programs, Statistics, and selecting JMP 5.0.1.</p>
<p>II. Open the data file students by clicking the open data table button in the JMP starter window and scrolling to the file students.jmp in the folder Sample Data.</p>
<p>The five columns contain the five variables:</p>
<p>Age: an ordinal variable</p>
<p>Sex: a nominal or categorical variable</p>
<p>Height: a cardinal or numeric variable</p>
<p>Weight: a cardinal or numeric variable</p>
<p>Idnum: id number, a nominal variable</p>
<p>Note: there are 233 observations or rows.</p>
<p>III. To display ordinal and nominal variables, from the menu bar choose analyze/distributions.</p>
<p>In the distribution dialog box, select the variables age and sex and drag them to the y, columns window. Hit the OK button.</p>
<p>You can see there are more boys than girls and more twelve-year-olds than students of any other age. The graph on the left for the variable age is a histogram, plotting the frequency, or number of observations, for each age category. The graph on its right is a mosaic bar chart, showing the fraction of observations in each category. By hitting the red triangle button to the left of the word age and choosing histogram options, you can add a count axis to the histogram.</p>
<p>IV. To display a numerical variable, click on the data window to make it active and, from the menu bar, choose analyze/distributions.</p>
<p>In the distribution dialog box, select the variables height and weight and drag them to the y, columns window. Hit the OK button.</p>
<p>Use the hand icon and drag to the right on the histogram columns to obtain finer categories of height. You can see that the mode is 62 inches. 
The maximum height is 72 inches and the minimum height is 51 inches. The graph on the left for the variable height is a histogram, plotting the frequency, or number of observations, for each height. The graph on its right is an outlier box plot.</p>
<p>The ends of the box are the 25th and 75th percentiles (the first and third quartiles), 58 and 64 inches, respectively: 25% of the observations lie below 58 and 25% lie above 64. The difference between these quartiles, 6, is the inter-quartile range, a measure of dispersion. The median height is 61 inches, with 50% of the observations above this height. The median is illustrated in the box by a line.</p>
<p>The lines on either end of the box are whiskers. Each extends to the outermost data point that lies within 1.5 times the inter-quartile range of the box. On the upper side this limit is the third quartile + 1.5 * inter-quartile range, i.e. 64 + 1.5*6, or 73. Since the maximum height is 72 inches, the whisker ends there, and there are no outliers, or heights to plot beyond this whisker. On the lower side the limit is 58 - 1.5*6, or 49, but the minimum height is 51 inches, so the whisker ends at 51, and there are no outlier heights below it.</p>
<p>The diamond is called the means diamond. Note that the mean, or average, height is 61.33 inches, above the median of 61. The extent of the diamond is a 95% confidence interval around the mean, i.e. the probability of the mean height lying above or below the diamond is only 5%. 
We will study the calculation of these confidence intervals in the weeks ahead.</p>
<p>Note that there is an outlier observation for the weight variable, so this may be an individual who requires medical diagnosis. The red bracket in the box plot designates the range of the shortest half of the data, i.e. the 50% of the observations that are most dense, clustered around the central tendency.</p>
<p>The moments list gives the mean and standard deviation of the observed values, for example for height.</p>
<p>V. The Spinning Plot</p>
<p>Select the data window and, from the graph menu, choose spinning plot. In the dialog box, select the height, weight, and age variables (hold the control key to select several) and drag them to the y, columns box. Click OK.</p>
<p>Note the positive relationship, or correlation, between weight and height as age increases. Use the hand icon to rotate the three-dimensional data plot. Try using the white background (red triangle to the right of the rotation icons). You can use the lasso icon to select the outlier point and, from the data table, identify the idnum of this individual.</p>
<p>VI. Help Menu</p>
<p>The manuals are available online and provide instructions for using the JMP program. Select help from the menu bar and select contents.</p>
<p>VII. Analysis of a Subset of Female Students</p>
<p>Use the students window and repeat the instructions at the beginning of section III above, i.e. from the menu bar choose analyze/distributions, select age and sex, and drag them to the y, columns window. Highlight females in the histogram. Note that all of the observations for females are now selected in the data window. From the Tables menu in the menu bar, select subset. In the dialog box, choose a name such as female subset of students. 
This data file can then be used to conduct analysis on the height, weight, and age variables, as before, including producing histograms and box plots, as well as a rotating plot, but restricted to females.</p>
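<p>The whisker arithmetic from section IV can be checked outside JMP. The following is a minimal Python sketch, not part of JMP: the function names fences and whiskers_and_outliers are hypothetical, and the sample list is made-up data that only borrows the minimum (51) and maximum (72) quoted above. The quartiles are passed in directly rather than recomputed, since different programs use slightly different quantile conventions.</p>

```python
def fences(q1, q3, k=1.5):
    """Return (lower, upper) limits: q1 - k*IQR and q3 + k*IQR."""
    iqr = q3 - q1  # inter-quartile range
    return q1 - k * iqr, q3 + k * iqr


def whiskers_and_outliers(values, q1, q3, k=1.5):
    """Return (low_whisker, high_whisker, outliers) for a box plot.

    Each whisker ends at the outermost data point that still lies
    inside the limits; any point beyond a limit is an outlier.
    """
    lower, upper = fences(q1, q3, k)
    inside = [v for v in values if lower <= v <= upper]
    outliers = sorted(v for v in values if v < lower or v > upper)
    return min(inside), max(inside), outliers


# Quartiles quoted in the handout for height (inches).
print(fences(58, 64))  # (49.0, 73.0)

# Hypothetical heights, NOT the students.jmp data.
sample = [51, 56, 58, 60, 61, 62, 62, 64, 66, 72]
print(whiskers_and_outliers(sample, 58, 64))  # (51, 72, [])
```

<p>With these inputs the limits are 49 and 73, so the whiskers stop at the observed extremes of 51 and 72 and no point is flagged. Adding a value such as 80 to the list would leave the whiskers unchanged and report 80 as an outlier.</p>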
