Estimation Steps Point Estimators Statistics Point Estimates
POINT & INTERVAL ESTIMATION AND (INTRODUCTION TO TESTING)

Introduction to Estimation: Basic definitions and concepts

The assignment of value(s) to a population parameter based on a value of the corresponding sample statistic is called estimation. The value(s) assigned to a population parameter based on the value of a sample statistic is called an estimate. The sample statistic used to estimate a population parameter is called an estimator.

Estimation steps

The estimation procedure involves the following steps:
1. Select a sample.
2. Collect the required information from the members of the sample.
3. Calculate the value of the sample statistic.
4. Assign value(s) to the corresponding population parameter.

Point Estimators

A Point Estimate
The value of a sample statistic that is used to estimate a population parameter is called a point estimate. Usually, whenever we use point estimation, we calculate the margin of error associated with that point estimation, which is calculated as follows:

\[ \text{Margin of error} = \pm 1.96\,\sigma_{\bar{x}} \quad \text{or} \quad \pm 1.96\,s_{\bar{x}} \]

Because a point estimate is based on just one sample, we cannot expect it to be equal to the corresponding population parameter. Indeed, each sample will have a different \( \bar{x} \), and in general none of them is exactly equal to \( \mu \). But they are all unbiased estimates of \( \mu \). (Recall that unbiased means their expected value is equal to \( \mu \).)

Parameters
• In statistical inference, the term parameter is used to denote a quantity, say \( \theta \), that is a property of an unknown probability distribution.
• For example, the mean, variance, or a particular quantile of the probability distribution.
• Parameters are unknown, and one of the goals of statistical inference is to estimate them.

Estimation
• A procedure of "guessing" properties of the population from which data are collected.

Statistics
• A statistic is a property of a sample from the population.
• A statistic is defined to be any function of random variables. So, it is also a random variable. For example, the sample mean, sample variance, or a particular sample quantile.
• The observed value of the statistic can be calculated from the observed data values of random variables.

Examples of statistics:
\[ \text{sample mean:}\quad \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} \]

\[ \text{sample variance:}\quad S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]

Point Estimates
• A point estimate of an unknown parameter \( \theta \) is a statistic \( \hat{\theta} \) that represents a "guess" of the parameter of interest.
• There may be more than one sensible point estimate of a parameter.

[Figure: the relationship between an unknown parameter \( \theta \) and its point estimator \( \hat{\theta} \).]

Properties of Estimators that We Desire
• Unbiasedness: \( E(\hat{\theta}) = \theta \). In other words, we would wish that the expected value of the estimator is the same as its true value. We define the bias of an estimator as the difference between the expected value of the estimator and the true value in the population: \( \text{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta \).
• Efficiency: we wish to minimize the mean square error around the true value. The efficiency tells us how well the estimator performs in predicting \( \theta \). Among unbiased estimators, therefore, we want the one with the smallest variance.
• Consistency: as the sample size increases, the variation of the estimator from the true population value decreases.

[Figure: Unbiasedness. Sampling distributions P(X) of an unbiased and a biased estimator.]
[Figure: Efficiency. Sampling distribution of the median compared with the sampling distribution of the mean.]
[Figure: Consistency. Sampling distributions P(X) for a larger sample size (B) and a smaller sample size (A).]

Interval estimation: General approach

Interval Estimation: Definition
In interval estimation, an interval is constructed around the point estimate, and it is stated that this interval is likely to contain the corresponding population parameter.

[Figure: an interval from $1130 to $1610 constructed around the point estimate \( \bar{x} \) = $1370.]

Confidence Interval Estimation: Outline
Procedure:
1. Sample → point estimator (\( \bar{X} \) or p)
2. Confidence level and table → Z or \( t_{n-1} \)
3. Formulas → compute UCL and LCL: point estimator ± margin of error

Interval estimation of a population mean: The case of known \( \sigma \)

Interval Estimation of the Population Mean
Each interval is constructed with regard to a given confidence level and is called a confidence interval. The confidence level associated with a confidence interval states how much confidence we have that this interval contains the true population parameter. The confidence level is denoted by \( (1 - \alpha)100\% \).
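The three-step outline above (point estimator, critical value from the confidence level, then point estimator ± margin of error) can be sketched in Python. This is only an illustration under stated assumptions: the data are invented, the function name `interval_estimate` is hypothetical, the standard error is taken to be the usual \( s/\sqrt{n} \), and `statistics.NormalDist().inv_cdf` stands in for the printed z table. With a sample this small and \( \sigma \) unknown, a t critical value would normally replace z.

```python
import math
import statistics

def interval_estimate(sample, confidence=0.95):
    """Return (point_estimate, z, margin_of_error, lcl, ucl) for the mean."""
    n = len(sample)
    # Step 1: sample -> point estimator (the sample mean).
    x_bar = statistics.mean(sample)
    # Step 2: confidence level -> critical value z (stands in for the z table).
    alpha = 1 - confidence
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    # Step 3: point estimator +/- margin of error.
    s = statistics.stdev(sample)      # sample standard deviation (n - 1 divisor)
    std_error = s / math.sqrt(n)      # assumed standard error: s / sqrt(n)
    margin = z * std_error
    return x_bar, z, margin, x_bar - margin, x_bar + margin

# Hypothetical data, invented purely for illustration.
prices = [68.0, 72.5, 70.0, 71.5, 69.0, 73.0, 70.5, 69.5]
x_bar, z, margin, lcl, ucl = interval_estimate(prices)
print(f"point estimate = {x_bar:.2f}, z = {z:.2f}")
print(f"margin of error = {margin:.2f}, interval = [{lcl:.2f}, {ucl:.2f}]")
```

The lower and upper limits are symmetric about the point estimate, which is why the whole interval can be reported as a single "estimate ± margin" pair.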
The \( (1 - \alpha)100\% \) confidence interval for \( \mu \) (population mean) is:

\[ \bar{x} \pm z\,\sigma_{\bar{x}} \ \text{ if } \sigma \text{ is known, and } \ \bar{x} \pm z\,s_{\bar{x}} \ \text{ if } \sigma \text{ is not known,} \]

\[ \text{where } \ \sigma_{\bar{x}} = \sigma/\sqrt{n} \ \text{ and } \ s_{\bar{x}} = s/\sqrt{n}. \]

The value of z used here can be found from the standard normal distribution table, for the given confidence level.

The maximum error of estimate for \( \mu \), denoted by E, is the quantity that is subtracted from and added to the value of \( \bar{x} \) to obtain a confidence interval for \( \mu \). Thus,

\[ E = z\,\sigma_{\bar{x}} \quad \text{or} \quad z\,s_{\bar{x}} \]

Interval Estimation of the Population Mean when \( \sigma \) is known: Example

A publishing company has just published a new college textbook. Before the company decides the price at which to sell this textbook, it wants to know the average price of all such textbooks in the market. The research department at the company took a sample of 36 comparable textbooks and collected information on their prices. This information produces a mean price of $70.50 for this sample. It is known that the standard deviation of the prices of all such textbooks is $4.50.
(a) What is the point estimate of the mean price of all such textbooks? What is the margin of error for the estimate?
(b) Construct a 90% confidence interval for the mean price of all such college textbooks.

Interval Estimation of the Population Mean when \( \sigma \) is known: Answers to the Example

Here we take advantage of our knowledge of the distribution of \( \bar{x} \) to develop a confidence interval for \( \mu \).

a) n = 36, \( \bar{x} \) = $70.50, and \( \sigma \) = $4.50, thus:

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{4.50}{\sqrt{36}} = \$.75 \]

Point estimate of \( \mu \) = \( \bar{x} \) = $70.50
Margin of error = \( \pm 1.96\,\sigma_{\bar{x}} = \pm 1.96(.75) = \pm\$1.47 \)

b) Confidence level is 90% or .90; and z = 1.65.

\[ \bar{x} \pm z\,\sigma_{\bar{x}} = 70.50 \pm 1.65(.75) = 70.50 \pm 1.24 \]

\[ = (70.50 - 1.24) \text{ to } (70.50 + 1.24) = \$69.26 \text{ to } \$71.74 \]

Based on our results, we can say that we are 90% confident that the mean price of all such college textbooks is between $69.26 and $71.74.

1. Interval Estimation for Population Mean (\( \sigma \) known case)

Example 1:
In an effort to estimate the mean amount spent per customer for dinner at a major Atlanta restaurant, data were collected for a sample of 49 customers over a three-week period. Assume a population standard deviation of $5.
Given: n = 49, \( \bar{X} \) = $24.80, \( \sigma \) = $5.
a. At 95% confidence, what is the margin of error?
b. If the sample mean is $24.80, what is the 95% confidence interval for the population mean?

Answer:
• Z: \( (1 - \alpha)/2 = 0.95/2 = 0.475 \); Table 1 gives \( Z_{\alpha/2} = 1.96 \).
• 1. Margin of error:

\[ Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = (1.96)\,\frac{5}{\sqrt{49}} = 1.4 \]

• 2. Confidence limits:

\[ UCL = \bar{X} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 24.8 + 1.4 = 26.2 \]

\[ LCL = \bar{X} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 24.8 - 1.4 = 23.4 \]

The 95% confidence interval for \( \mu \) is [23.4, 26.2].

Interval estimation of a population mean: The case of unknown \( \sigma \)

Interval Estimation of the Population Mean when \( \sigma \) is unknown
• Instead of the population standard deviation \( \sigma \), we have the sample standard deviation s.
• Instead of the normal distribution, we have the t distribution.

The t distribution is used to construct a confidence interval about \( \mu \) if:
1. The population from which the sample is drawn is (approximately) normally distributed;
2. The sample size is small (that is, n < 30);
3. The population standard deviation, \( \sigma \), is not known.

The t Distribution
The t distribution is a specific type of bell-shaped distribution with a lower height and a wider spread than the standard normal distribution. As the sample size becomes larger, the t distribution approaches the standard normal distribution. A specific t distribution depends on only one parameter, called the degrees of freedom (df). The mean of the t distribution is equal to 0 and, for df > 2, its standard deviation is \( \sqrt{df/(df - 2)} \).

[Figure: the t distribution for df = 9 compared with the standard normal distribution, both centered at \( \mu = 0 \). The standard deviation of the standard normal distribution is 1.0; the standard deviation of the t distribution is \( \sqrt{9/(9 - 2)} = 1.134 \).]

The t Distribution: Example
Find the value of t for 16 degrees of freedom and .05 area in the right tail of a t distribution curve.

Area in the Right Tail Under the t Distribution Curve
df     .10      .05      .025     …    .001
1      3.078    6.314    12.706   …    318.309
2      1.886    2.920    4.303    …    22.327
3      1.638    2.353    3.182    …    10.215
.      …        …        …        …    …
16     1.337    1.746    2.120    …    3.686
.      …        …        …        …    …

The required value of t for 16 df and .05 area in the right tail is 1.746.

[Figure: the t distribution with 16 degrees of freedom, areas under the right and the left tails. An area of .05 lies to the right of t = 1.746 and, by symmetry, an area of .05 lies to the left of t = -1.746.]
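The table lookup above (t for 16 df and a right-tail area of .05) can be checked numerically. The sketch below is not how the printed table was produced. It is one possible standard-library approach, assuming Simpson's rule integration of the t density combined with bisection, and the function names `t_pdf`, `right_tail_area`, and `t_critical` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def right_tail_area(t, df, steps=2000):
    """Area to the right of t (t >= 0), using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry, half of the total area lies to the right of 0.
    return 0.5 - total * h / 3

def t_critical(area, df):
    """Value of t that leaves the given area in the right tail."""
    lo, hi = 0.0, 400.0  # assumed bracket; wide enough for the table entries shown
    for _ in range(60):
        mid = (lo + hi) / 2
        if right_tail_area(mid, df) > area:
            lo = mid   # too much area remains, so move right
        else:
            hi = mid
    return (lo + hi) / 2

print(f"{t_critical(0.05, 16):.3f}")  # 1.746, matching the table entry
```

Bisection works here because the right-tail area decreases steadily as t grows, so each comparison halves the bracket that contains the required value.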
Confidence Interval for Population Mean Using the t Distribution

The \( (1 - \alpha)100\% \) confidence interval for \( \mu \) is

\[ \bar{x} \pm t\,s_{\bar{x}} \quad \text{where} \quad s_{\bar{x}} = \frac{s}{\sqrt{n}} \]

The value of t is obtained from the t distribution table for n - 1 degrees of freedom and the given confidence level.

Confidence Interval for Population Mean Using the t Distribution: Example
Dr. Moore wanted to estimate the mean cholesterol level for all adult men living in Hartford. He took a sample of 25 adult men from Hartford and found that the mean cholesterol level for this sample is 186 with a standard deviation of 12. Assume that the cholesterol levels for all adult men in Hartford are (approximately) normally distributed. Construct a 95% confidence interval for the population mean \( \mu \).

Confidence Interval for Population Mean Using the t Distribution: Example Answered
Confidence level is 95% or .95, with df = n - 1 = 25 - 1 = 24.
Area in each tail = .5 - (.95/2) = .5 - .4750 = .025
The value of t in the right tail is 2.064, and

\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{12}{\sqrt{25}} = 2.40 \]

[Figure: the t distribution with df = 24, with an area of .4750 on each side of the center and .025 in each tail.]

\[ \bar{x} \pm t\,s_{\bar{x}} = 186 \pm 2.064(2.40) = 186 \pm 4.95 = 181.05 \text{ to } 190.95 \]

Thus, we can state with 95% confidence that the mean cholesterol level for all adult men living in Hartford lies between 181.05 and 190.95.

Example 2: (\( \sigma \) unknown case)
The mean flying time for pilots at Continental Airlines is 49 hours per month. This mean was based on a sample of 100 pilots and the sample standard deviation was 8.5 hours.
a. At 95% confidence, what is the margin of error?
b. What is the 95% confidence interval estimate of the population mean flying time?
c. The mean flying time for pilots at United Airlines

Given: n = 100, \( \bar{X} \) = 49, S = 8.5, \( 1 - \alpha \) = .95
Think: What to estimate? Use Z or t?
Answer:
• Sample info (given): n = 100, \( \bar{X} \) = 49, S = 8.5
• t: \( 1 - \alpha = 0.95 \), so \( \alpha/2 = 0.025 \), d.f. = n - 1 = 99
  Table 2: d.f. = 100, \( \alpha/2 = 0.025 \) → t = 1.984
           d.f. = 80, \( \alpha/2 = 0.025 \) → t = 1.990
  *Interpolation:

\[ t = 1.984 + \frac{100 - 99}{100 - 80}\,(1.990 - 1.984) = 1.9843 \]

a. m.o.e.:

\[ \text{m.o.e.} = t_{\alpha/2}\,\frac{S}{\sqrt{n}} = 1.9843 \times \frac{8.5}{\sqrt{100}} = 1.69 \]