(Put your answers in this document and submit it through Blackboard Assignment Dropbox.)

Part 1: Confidence Interval Estimation

The SPSS data file ncbirth2625.sav contains a sample of 1000 birth records taken within the state of North Carolina by the North Carolina State Center for Health and Environmental Statistics. Of particular interest is the incidence of low infant birth weight. Low birth weight has been associated with weaker development of many characteristics, such as intelligence, coordination, and strength. Low birth weight is commonly defined as less than 2500 grams (approximately 88 ounces).

The variables examined are:
sex: Sex of newborn child (1=Male, 0=Female)
race: Race of child (0=other Nonwhite, 1=White, 2=Black, 3=American Indian, 4=Chinese, 5=Japanese, 6=Hawaiian, 7=Filipino, 8=Other)
mothage: Age of mother, in years
mothed: Education level of mother, in years
gest: Completed Weeks of Gestation, in weeks
marital: Marital status (1=married, 0=not married)
cigs: Smoking Cigarette (1=Smoke, 0=Do not smoke)
drinks: Drink Alcohol (1=Drink Alcohol, 0=Do not drink alcohol)
apgar1: Apgar score at 1 minute (0-10)
fas: Fetal Alcohol Syndrome (0=No, 1=Yes)
plural: Number of children born of the pregnancy
totounc: Birth Weight in total ounces
btotounc: Birth Weight Categories (1=Below normal birth weight “less than 88 ounces”, 0=Normal birth weight)

Data values are labeled in the SPSS data file provided above. The data were from North Carolina Vital Statistics Institute for Research in Social Science. Visiting the web site may help you gain more insight into the data. The data for this project come from the 1995 birth registry at the North Carolina State Center for Health and Environmental Statistics. Use is allowed if the above agency is cited as the source.

Use one of the data files below to answer the questions on the next page. In the following questions, the term population means the population of mothers who successfully gave birth to a child during the period of data collection in the area where the data were collected.

Data in EXCEL format: http://people.ysu.edu/~gchang/stat/ncbirth2625.xls
Data in SPSS format: http://people.ysu.edu/~gchang/stat/ncbirth2625.sav

(If you use R Commander, you can import either the EXCEL or the SPSS file into R Commander.)

SPSS gives you the two end points of the confidence interval estimates. To answer the following questions, you need to convert them into estimate plus/minus margin of error.
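The conversion described above is simple arithmetic: the estimate is the midpoint of the two end points, and the margin of error is half the width of the interval. A minimal Python sketch follows. The function name and the end points in the example are hypothetical, chosen only for illustration, and are not values from the data set.

```python
def to_estimate_and_margin(lower, upper):
    """Convert confidence interval end points to (estimate, margin of error)."""
    estimate = (lower + upper) / 2   # midpoint of the interval
    margin = (upper - lower) / 2     # half the interval width
    return estimate, margin


# Hypothetical end points, for illustration only (not from ncbirth2625).
estimate, margin = to_estimate_and_margin(110.0, 114.0)
print(f"{estimate} ± {margin}")
```

This works for any symmetric interval, which is the case for the t-based intervals for a mean reported by the software.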

Estimation of mean:

1. Find the 95% confidence interval for the average birth weight in the population. ______±______

2. Find the 95% confidence interval for the average birth weight in the population of smoking mothers. ______±______

3. Find the 95% confidence interval for the average birth weight in the population of nonsmoking mothers. ______±______

Estimation of population proportion:

(SPSS does not have an option for estimating a proportion. Please use the confidence interval estimation for a mean to approximate the confidence interval for a proportion. Please check the video instruction for Inference on One Population on my SPSS Reference web page. If you use R Commander, please view the video instruction on my R Instruction page for the confidence interval for one proportion.)
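The mean-based approximation works because the mean of a variable coded 0/1 is exactly the proportion of 1s, and its sample variance is nearly p(1-p). The short Python sketch below illustrates this with a small made-up 0/1 column. The values are hypothetical and are not taken from the data set.

```python
from statistics import mean, variance

# Hypothetical 0/1 indicator values (1 = has the characteristic), not real data.
indicator = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

n = len(indicator)
p_hat = sum(indicator) / n   # sample proportion of 1s
x_bar = mean(indicator)      # sample mean of the same column

# The sample variance of a 0/1 column equals n/(n-1) * p_hat * (1 - p_hat),
# which is close to p_hat * (1 - p_hat) when n is large.
s2 = variance(indicator)
s2_from_p = n / (n - 1) * p_hat * (1 - p_hat)

print(p_hat, x_bar, s2, s2_from_p)
```

With a sample as large as 1000 records, the factor n/(n-1) is close to 1, so the interval for the mean of the 0/1 variable should be close to the usual large-sample interval for a proportion.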

4. Find the 95% confidence interval for the percentage of smoking mothers in the population. ______±______

5. Find the 95% confidence interval for the percentage of babies that were below normal birth weight in the population. ______±______

6. Find the 95% confidence interval for the percentage of mothers having below normal birth weight babies among those mothers who were smokers. ______±______(from software output)

______±______(from your own computation using the formula in the lecture notes and the count and percentage statistics from software)
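For the hand computation, a minimal Python sketch is given below. It assumes the lecture-note formula is the standard large-sample interval, p̂ ± z*·sqrt(p̂(1-p̂)/n); if the lecture notes use a different formula, follow the notes. The function name and the counts are hypothetical, for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample interval for a proportion, returned as (p_hat, margin)."""
    p_hat = successes / n
    # Critical value z*, about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin


# Hypothetical counts (not from ncbirth2625): 30 of 200 in the category.
p_hat, margin = proportion_ci(successes=30, n=200)
print(f"{p_hat:.3f} ± {margin:.3f}")
```

Multiply both numbers by 100 to report the result as a percentage.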

Part 2: Hypothesis Test for Mean

Adam claimed that his average morning pulse rate is 65. A group of his friends do not think so and they took a random sample of 12 morning pulse rates from Adam on 12 randomly selected days. The data is the following: 70, 67, 67, 68, 66, 72, 75, 68, 68, 66, 68, 69.

The following link has an SPSS example for performing the one-sample t-test: http://people.ysu.edu/~gchang/SPSSE/SPSSOneSampleTTest.pdf
The following link has an SPSS instruction video for performing the one-sample t-test: http://people.ysu.edu/~gchang/SPSS/TTestOneMean.html

The objective of this research is to see if Adam’s average morning pulse rate is different from 65 per minute. (Perform a one-sample t-test at 5% level of significance.)
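As a cross-check on software output, the one-sample t statistic can be computed by hand as t = (x̄ - μ0) / (s / sqrt(n)) with n - 1 degrees of freedom. The Python sketch below shows the arithmetic on a small made-up sample. The function name, the sample values, and the hypothesized mean are hypothetical and are not the pulse data.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean = mu0."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)        # sample standard deviation (n - 1 divisor)
    se = s / sqrt(n)         # standard error of the mean
    return (x_bar - mu0) / se, n - 1


# Hypothetical measurements and hypothesized mean, for illustration only.
sample = [12.1, 11.8, 12.4, 12.0, 12.3, 11.9, 12.2, 12.5]
t_stat, df = one_sample_t(sample, mu0=12.0)
print(f"t = {t_stat:.3f}, df = {df}")
```

The p-value comes from the t distribution with the stated degrees of freedom, which is what SPSS or R Commander reports. For a two-sided alternative, it is the area in both tails beyond |t|.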

Use the pulse rates data above to answer the following questions:

1) State the null and the alternative hypotheses:

H0 (Null): ______

Ha (Alternative): ______

2) Is the normality assumption valid? (Verify this using the p-value of the Shapiro-Wilk normality test.)

3) What is the value of the t-test statistic: ______

4) What is the value of the p-value of the test: ______

5) Would you reject the null hypothesis? (Use the p-value of the t-test.)

6) Draw a conclusion for this test:

7) Find the 95% confidence interval estimate for Adam’s average morning pulse rate. ______ ______

8) If one wishes to see if Adam’s average morning pulse rate is higher than 65, what would you conclude from the analysis, using the statistics above?

Please use the following space to show your software output that can support the t-test and normality assumption for questions 3), 4), and 7) in Part 2.

Part 3: Hypothesis Test for Proportion

A group of researchers wishes to see if “more than 40% of the people living in a major city often ate dinner after 7 PM”. A random sample of 400 people from this city was surveyed, and 224 of them often had dinner after 7 PM. Please perform a test at the 5% level of significance to test their hypothesis, and answer the following questions.
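For a "more than" claim about a proportion with a large sample, the usual z statistic is z = (p̂ - p0) / sqrt(p0(1 - p0)/n), and the p-value is the upper-tail area of the standard normal distribution. The Python sketch below shows that arithmetic as a way to check software output. The function name and the counts are hypothetical and are not the survey figures above.

```python
from math import erfc, sqrt


def one_proportion_z_upper(successes, n, p0):
    """Return (z, upper-tail p-value) for H0: p = p0 vs Ha: p > p0."""
    p_hat = successes / n
    se0 = sqrt(p0 * (1 - p0) / n)     # standard error under the null
    z = (p_hat - p0) / se0
    p_value = 0.5 * erfc(z / sqrt(2))  # P(Z >= z) for standard normal Z
    return z, p_value


# Hypothetical counts, for illustration only: 60 of 100 with p0 = 0.5.
z, p_value = one_proportion_z_upper(successes=60, n=100, p0=0.5)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

The null hypothesis is rejected at the 5% level when the p-value is below 0.05. An exact Binomial test may give a slightly different p-value than this normal approximation.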

1) State the null and the alternative hypotheses:

H0 (Null): ______

Ha (Alternative): ______

2) What is the value of the z-test statistic: ______

3) What is the value of the p-value of the test: ______

4) Would you reject the null hypothesis? (Use the p-value from the Binomial test or z-test under a large sample situation.)

5) Draw a conclusion for this test:

Please use the following space to show your software output that can support your answers for 2) and 3) in Part 3.
