
Lecture 1: Introduction to Statistics and Distributions

Chapter 1

Important Things

• www.stat.purdue.edu/~xuanyaoh/stat350
• Syllabus
  – Textbook
  – Classroom Locations
  – Policy: Hw/Lab/Class Participation/Exams
  – Exam Schedule
  – SAS

The Required Textbook

What is Statistics?

• Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data

• Suppose we want an idea of how well Purdue students have done in MA162 over the past 5 years. What can one do to find out?

– Find the MA162 records and check them all: too many to look at, not realistic in most cases
– Draw a number of records and try to make a reasonable guess: statistics comes into play!

Population vs. Samples
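The idea of drawing a number of records and making a reasonable guess can be sketched in a few lines of code. This is only an illustration: the records are simulated (they are not real MA162 data), the function names `simulate_population` and `sample_mean` are made up for this sketch, and Python is used for convenience even though the course itself uses SAS.

```python
import random
import statistics


def simulate_population(n, seed=0):
    """Build a made-up population of n exam scores between 0 and 100.

    Assumption: scores are roughly bell-shaped around 70. Not real MA162 data.
    """
    rng = random.Random(seed)
    return [min(100.0, max(0.0, rng.gauss(70, 12))) for _ in range(n)]


def sample_mean(population, sample_size, seed=1):
    """Draw a simple random sample (without replacement) and return its mean."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample)


population = simulate_population(20000)   # all objects of interest
true_mean = statistics.mean(population)   # usually too costly to compute in practice
estimate = sample_mean(population, 200)   # the "reasonable guess" from a subset

print(f"population mean: {true_mean:.2f}")
print(f"sample estimate: {estimate:.2f}")
```

The sample uses only 200 of the 20,000 simulated records, yet its mean should land close to the population mean. How close, and how confident we can be about it, is the subject of inferential statistics.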

• Population: all objects of interest
• Sample: a subset of the population

Examples of Data

• Results from making observations on one or more variables
  – x = score on a STAT 350 midterm exam
  – Univariate data, or one variable

• (x, y) = height and weight of a STAT 350 student
  – Bivariate data, or two variables

• Etc.

Types of Variables

Two Terminologies

• Descriptive Statistics
  – Summarize and describe important features of data
  – Numerical summary measures, such as the mean and standard deviation
  – Graphical, visual displays, such as a histogram or scatterplot

• Inferential Statistics
  – Formal “guesses” we make about the population by looking at the sample
  – Common types of inferential statistics are confidence intervals and significance tests

1.2: Graphical

• The scores of 30 undergraduate students

• A graphical display, such as the histogram in the previous slide, gives us a rough idea of the data as a whole; it is very informative and clear

• Numerical measures, such as the mean and standard deviation in the previous slide, give us a quantitative measure of the center and spread of the data

Visual Displays of Data

• Histogram: seen in the previous example; will be discussed in detail
• Dot plot: self-reading (sec 1.2)
• Stem and leaf: see the later example (sec 1.2)
• Bar graph: self-reading (sec 1.2)
• Scatterplot: discussed later
• We won’t discuss them all, but you should cover them in your reading and be comfortable with them all.

Histogram for Discrete Data

• Based on the previous example:
  – To get the histogram, just count the occurrences of each value of the variable and plot the counts (frequencies) on the vertical axis
  – Can display as frequencies (counts) or percents

Continuous Data

• Subdivide the x-axis into a number of class intervals (or classes), and plot the frequency or relative frequency for each class

• Define the boundaries of the classes carefully to prevent observations from falling on boundaries (read Pg.14)

• The class size may greatly influence how the histogram looks
  – Big class interval: a few big rectangles
  – Small class interval: many small rectangles

Example: Ex. 8 from Text (Pg. 14)

How to choose the class width?
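To see the effect of class width concretely, here is a small sketch that bins the same data with three different widths and prints a text histogram for each. The 30 scores are invented for illustration (they are not the scores from the earlier slide), the helper `histogram` is a hypothetical name, and Python stands in for SAS here. The class boundaries start at 39.5 so that no whole-number score can fall exactly on a boundary. Density is computed as relative frequency divided by class width.

```python
def histogram(data, start, width, n_classes):
    """Count observations in classes [start + k*width, start + (k+1)*width).

    Returns one tuple per class:
    (lower, upper, frequency, relative_frequency, density).
    """
    counts = [0] * n_classes
    for x in data:
        k = int((x - start) // width)
        if not 0 <= k < n_classes:
            raise ValueError(f"observation {x} falls outside the classes")
        counts[k] += 1

    n = len(data)
    rows = []
    for k, freq in enumerate(counts):
        lower = start + k * width
        rel = freq / n
        rows.append((lower, lower + width, freq, rel, rel / width))
    return rows


# Made-up scores for 30 students (illustrative only).
scores = [52, 61, 67, 70, 71, 73, 74, 75, 76, 78,
          78, 79, 80, 81, 82, 83, 84, 85, 85, 86,
          87, 88, 89, 90, 91, 92, 94, 95, 97, 45]

# Same data, three class widths: few big rectangles vs. many small ones.
for width, n_classes in [(20, 3), (10, 6), (5, 12)]:
    print(f"class width = {width}")
    for lower, upper, freq, rel, dens in histogram(scores, 39.5, width, n_classes):
        print(f"  {lower:5.1f}-{upper:5.1f} | {'*' * freq} ({freq}, {rel:.0%})")
```

In every case the frequencies add up to 30 and the relative frequencies add up to 1; only the picture changes.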

• Although the class width doesn’t change the distribution, it can change your visual understanding of the distribution
• A rule of thumb for determining a reasonable number of classes is provided by your text

Relative Frequency vs. Density

Why “Densities”?

Interpreting Histograms

Dot Plot (Self-reading in Sec 1.2)

Stem Plot (Self-reading)

Hank Aaron Example

1.3 Distributions

Continuous Distributions

Continuous Distribution: Density Function

Examples

About SAS …

• Read the “SAS” section in the syllabus, and also the instructions on the course website

When you go home…

• Read over the syllabus carefully before you make a decision!
• Get the textbook
• Read the “SAS” part of the syllabus and the instructions on the course website
• Read/review sections 1.1, 1.2 and 1.3
• Start doing Hw#1 and Lab#1, posted on the website
• No lab this Wednesday, so go to the regular Wed classroom.

• To preview, read sections 1.3 (discrete distribution, mass function), 1.4 and 1.5