<<

This PDF is a selection from an out-of-print volume from the National Bureau of Economic Research

Volume Title: The Smoothing of

Volume Author/Editor: Frederick . Macaulay

Volume Publisher: NBER

Volume ISBN: 0-87014-018-3

Volume URL: http://www.nber.org/books/maca31-1

Publication Date: 1931

Chapter Title: The Smoothing of Economic Time Series, Fitting and Graduation

Chapter Author: Frederick R. Macaulay

Chapter URL: http://www.nber.org/chapters/c9361

Chapter pages in book: (p. 31 - 42) ('HAP'I'ER I

LIE SMOC)TIILNU OFECONOMIC 'lIMESF:RIIs. CURVE IITTING AN I) GiuDt'vrIoN. 'lihe statistical prol)lem offitting a mathemati- cal curve to economic datahas many of the charac- teristics of a problem in theadjustment of physi- cal observations. Historically,the hrst problem of this kind arose in theadjustment of ol)servations made on one varial)IC. After alarge number of observations had been made on onevariable it was desired to obtain from thoseobservations the most probable value of the variable,in other words, the most plaUSil)IC estimateof the true value ofthe variable which could be madefrom the oljserva- tions. The arithmetic meanof the observations was early proposed as a so1 utiOflof this problem.1 The arithmetic mean wasused before any at- tempt wasmade to develop a mathematicalfoun- dation for its use. An attempt at amathematical foundation came with the theoryof . This theory claimed that thedistribution of errors tends to be such that if a valueof the variable be

It is not necessary for this discussion to go intothe question of when the arithmetic mean is the lustvalue to use and when some other type of average would seempreferable. 31 'i'li I:S\lOO'Fi11N OF 'I'I i\I lSlR[ES V. taken which will make tlwsum of the squares of the deviations of the observations from thisvalue a minimum, this value xviI I be the most plausible estimate of the true value which could bemade on the basis of the given observations. Thearithmetic average of the observations is the value of the\rarj. able which makes thesum of the squares of the deviations a minimum. The theory of leastsquares was extended to cover not only the problem of themost probable value of OIIC variable but also theprol)lenl of the most probable rd at ionship betweentwo or fliore variables. If we think ofnleasurernents on one variable being chartedas vertical deviations from a horizontal straight line,we see that the true value (which is free fromerrors of observation) may be represented by a horizontal straightline. We are attemptinga numerical solution of the equation YK, where K is theunknown true value. Whenwe approach the problem of the rela- tions between two variables,we may think of our observations being chartedwith the values ofone variable chartedhorizontaJjy and thevalues of the othervariable chartedvertically. Jf the rela- tionship between thetwo variables isa straight line relationship,the Problem isto find the best values for A and Bin the equation YA + BX. An example ofa problem in two variableswould be to discover a relation between altitudeand TI I I'.S MOUTH Ni; UI''II NI F SEttLES 33 l)arometric pressure. Ihe investigator mightfirst take a large number of observationson barometric pressure at different altitudes. For the purposes of this problem, letus assume that the altitudes have been accurately determinedin otherwords, let us assume that theyare free from ''error.'' Alti- tude is to be the ''independent variable." The in- vestigator may now draw pointson a chart. Each point will represent ameasurement of barometric IJressIlI'e at a particular altitude. Altitudes are shown as horizontal distances measured from the left-hand edge of the chart and barometricpres- sures as vertical distances measured from the base of the chart. The problem isto draw a curve in among the points in such a manner that the ordi- nates (or vertical distances from the base) of the curve shall represent the most probable barometric pressures corresponding to the altitudes repre- sented by the al)scissas (horizontal distances from the left-hand edge). In other words, ifwe go along the base line3inches, which represents 300O feet altitude, the height of the curve at this point isto represent the most probable barometric pressure at 3,000 feet altitude.1 The problem is not a problem of what is the most probable value ofone variable

1 Ofcourse in the above problem, the barometric pressure readings are affected not only by errors hut what is more important by actual variations in pressure which are not caused by altitude. We neglect the latter consideration in the illustration. 'iii F SMOO'I'IIIN(; OF TIME SERIES but whatare the most probable values of one vari- able associated with definite values of the other variable. What is the most probable baronietric pressure associated with any particular altitude'? In the solution of sucha prol)lCnl there are two parts. The first and generally the moreimportant part is to decide what is the nature of the relating the two variablesto one another. l'he theory of least squares willgive us little aid in making this decision, Itmay tell us which of two curvesgives the better fit :hut it cannot draw Our attention to Such facts as that thecurve giving the better fitmay be grossly empirical, may not tie in with other known phenomena,and may l)CCOrnC absurd immediately outsidethe of the data. while thecurve giving the poorer fit to theparticu- lar set of observationsmay he subject to no such theoretical drawbacks. Thesecond part of the problem (if thecurve be represented byan equa- tion) is concerned withfinding thebest" numeri- cal values of theunknownconstants. The theory of least Squares states that these '!)est"numerical values of the constants will be such valuesas xviII make the sum of tile squares of thevertical devia- tions of the observations from thefittedcurve a rninjmum.in otherwords make thesum of the Such mayor may not be simple covering the whole maT Ilefliatical equations range of the data. Forexample, the laws of gases do not necessarilyapply to liquids and solids. '[HE SMOOTHING OF' TIMESERIES 35 squares ofthe deviations of theol)serVatioflal values from the correspondingtheoretical values (on the curve) aminimum.1 Fitting a mathematical curve toeconomic time series is astatistical problem. It differsfrom an ordinary problem inthe adjustment of observa- tioflS inthat the primary ob3ectof the fitting is not usually toehrninatc errors of observation.The population of a countrymight conceivably he known exactly year by yearfor a long period and the statistician mightwish to fit a curve to the whole series. Such a curvewould not be fitted in order to discover what wasthe true population at any particulardate. In other words, itwould not be fitted to eliminate eirors.It would probably be fitted for the entirelydifferent purpose of discov- h ering a law of populationgrowth which would C give a picture of whatthe population would have been if, throughout theperiod, it had been affected only by continuouslyacting and fundamental forces. excluding notonly small and erraticinflu- al ences butalso unusual or disturbinginfluences, iLl such as wars, famines. etc.Assuming some such law is, in practice, not so useful asthe a 'flie nietiiod of least squares above might lead one to expect.Only f the equations relatingthe variables to one another areof a linear or J)arabolic typeisthe appljcation of the method bothdirect and SIITIj)lC. With other types is, when not practicallyimpossible, gener- )flS of equation. computation is not of of ally extremely laborious.Even the method of moments universal application. 36 TIlE SMOOTHING 01" TIME SERIES of population growth, tile investigator might he interested in seeing how closely he could describe his particular population data by some simple growth curve, because he would be interested in speculating to what extent the continuously acting and typical forces were the controlling forces in the particular case lie'as examining.' Trend is the outstanding characteristic of popu- lation curves. Seasonal fluctuations are negligible. Non-seasonal cyclical fluctuations arc relatively small. On the other hand, there are time series in which little or no trend is apparent hut which contain pronounced seasonal or cyclicalmove- ments. Monthly rainfall in the City of Seattle would be an example of a series havinga pro- nounced "seasonal" fluctuation. Now the student of weather might wish to fita curve to the monthly data for rainfall in the City of Seattle insuch a manner as to eliminate seasonal and minor erratic fluctuations, i)ut leave everything else.Even if the resulting curve showedno regular long time trend or regular and definite mathematical cycles,it might show considerably differentamounts of rain- fall at different periods. There are economic timeseries which show both

1 The investigator, ofcourse, would not propose that histitted population curve should, for allpurposes, be suhstitutcd for the raw data. Graduations of,or curves fittedto, economic data can generally be used to replace thedata for sj)eciflcptirposes only. For example, a graduation of monthly (all Money Rates hasno 'iii!: SMOOTIIING OF TIME SERIES 37 definite trend and pronounced and definite cycics The production of eggs in the United States isan example of such a series. Annual production of eggs increases with the growth of the country, but there is, of course, a very pronounced 12-months or "seasonal" cycle. Of the series smoothed in the study of interest rates and security prices, Call Money Rates, Time Money Rates, Commercial Paper Rates and Railroad Bond Yields seem to be of a type showing no definite mathematical trend. At least, it would not seem feasible to propose a law of trend, such as has been proposed for the growth of PoPulation. The series for Railroad Stock Prices is of a type which shows evidences of a rather definite long time trend, though the cycli- cal fluctuations are extremely large. Bank Clear- ings and Pig Iron Production show smaller cyclical fluctuations and still more definite trend. During most of the period covered by the in- terest rate study, Call Money Rates, TimeIoney Rates and Commercial Paper Rates show rather pronounced seasonal movements. Such seasonal movements as exist in the case of Bond Yields and Stock Prices are so small as to be practically negli- gible. It seems extremely improbable thatany of the series contain definitely recurring periodic movements other than seasonal fluctuations.

such general replacement relation to the original data as a mortality curve has to its raw data. I

38 TiLE SMOOT11JN(. 01" '1'I1E SERIES

The type of SnloOtli curve wh icti might heex- pected to appear in any particular time series if the series were unaffected by t lie minor ortempoi'arv factors which give rise to seasonal and erratic fluc- tuations is not necessarily rcpresental)lc through- Out its length by any single simple mathematical equation. Its smoothness may be traceable to the fact that the ordinate of eachpoint on the curve is correlated with the ordinate of theimmediately preceding point. Thecurve may in this respect he similar to a curve resulting fromthe cumulation of a mere chance series. If twelvecoins be thrown time after time and if the numberof heads in each throw l)c chartedas they occur, the resulting graph will not be a smoothcurve nor will it tend to show any particular trend or cvcl iral fluctuations.Each throw stands by itself,uncorrelated with thepre- ceding throws. However, ifthe series of throwsare cumulated and the resultschart-ed, the graph will show considerablesmoothness and 1)0th trendand cycles. The trend willmost probably berepresent- able by the straightline yox. The cyclicalap- pearance will be pronounced butthecycles" will be irregular andnot representable byany single simple mathematicalequation. If, instead ofcumu- lating the number ofheads in each throw.the ex- cess of the number of headsOver (3 thrown each tune be cumulated the trend will vanish butthe cycles vil1 remain.flhc smoothnessof such cumu- TI-IF SMOOTHING OF TIMESERIES 39 lativecurvesis inherent in their very nature. While not only the possiblebut the probable size of those larger movementswhich may be termed cyclical is greater in thecumulative than in the non_cumulative curve, 1)0ththe possible and the probable size of those smaller movementswhich may l)etermed erratic is less. In thenon-cumula- tive distribution ofheads over 6 in number, the possible size of a "c clical" movementis 12 (from 6 to +6) . 'Ihe possible"erratic movement" or change in size fromone value tothe next is also 12 (from+6 to 6 or from 6 to+6). If ten throws (12 COInS throwneach time) be cumulated the possible size of a"cyclical" or major movement is increased from 12 tO6o, while the 1)OsSible "erratic movementor clianqein size from one value to the next hasdecreased from 12 to 6(+6 or 6) . Asimilar though much lesssensational story is toldif probable rather thanpossible move- ments areconsidered. Cumulationsmooths a series and introduces movementssuggestive of trends and cycles. Many economic timeseries seem to be of a type somewhat analogous tosuch cumulated chance series. Some economicseries suggest chanceseries which have been cumulatedtwice. For example, each observation on apopulation series is notonly highly correlated withthe immediatelypreceding observation, but the firstdifferences are highly cor- a

40 TilE SlvlOOTlIING OF TIME SERIES related with the preceding hrst differences.The first differences are the resulant of three factors, births, deaths and migration. Births anddeaths are functions of the size of the populationand hence highly correlated with thesame items for the pre- ceding year. iheeXCeSS of births over (leaths and the amount ofmigration are correlated with simi- lar items in the precedingyear, though tile corre- lation will not necessarily beso high as in thic case of births or deaths alone. The('()Illrnonest tYpe of economic time seriessuggests a cumulated chance series on which has beensuperposed another l)Ut non-cumulated chance series anda more or less regular and unchangingseasonal fluctuation. How should such seriesbe smoothed? \Vhiatsort of procedure wouldseem adapted to eliminate all seasonal and erratic fluctuationsleavinga reason- able picture of thecyclical fluctuationsand any underlying trend'? Whilesuch a seriesas Railroad Stock Prices showssome evidence ofan underlying trend which might be rationally described bysome appropriate mathematicalequation,itscyclical movements do notsuggest any such possihtltr. Such series as Call Money Rates.Time Money Rates, ComniercialPaper Ratesand Railroad Bond Yieldsdo notsuggest any possibility of rationally describing either their longtime trends or cyclical fluctuations by mathematicalequations To fita straight line, a parabola, acompound in- TUE SMOOTITIN(; OF' TIME SERI ES 41 terest curve, a logistic curve, or all)' other trend curve to such a series would seem highly empirical, not to say absurd. Any attempt to fit a shigic mathematical equa- tion to the whole of one of such series would seem quite as absurd if the equation were designed to describe periodjc movements, as if it were designed to describe trend movements. The indiscriminate use of harmonic analysis to describe series which are not provably harmonic in their origin is inde- fensible. The posltion of each point on a fitted har- monic curve is affected by the position of each datum point. If short cycles appear in the first two thirds of the data, the harmonic curve will intro- duce short cycles in the last third even if they do not appear in the data. Any part of a smooth curve assumed to underlie the data curve is a smooth curve only because it is an outgrowth of the imme- diately preceding portion of such smooth curve. It is flO more affected by distant points in the past than the wobbling track left by the rear wheel of a bicycle ridden by an inexperienced rider is af- fected by the position of the track 1 00 yards back. Any part of the track of the rear wheel will be a comparatively smooth curve because of the mari- ner in which it is made. It is an outgrowth of an immediately preceding portion of the track, but this does not mean that it is in any important sense affected by the track 100 yards back. A skillful 42 TIlE SMOOTJII\(; OF'I'JME SERIES rider COU Id approach a large scale rnatheiiiical curve and ride upon it. Ofcourse, he would not be able to do so instantly. He would haveto come out of the particular curve on which hewas travel. ing and gradually approach themathennitj('a I curve on which he wished to travel. Iheattempt to represent theentire bicycle track bya single mathematical equatioii, whether designedto (IC- scribe trend, "cyclical"movements or both, be plainly would quixotic. Harmonicanalysis should be used not to describe dataunless thereare reasons for believing thatcycles ofconstant period herent in the are in- very nature of those data.Ofcourse. the analysisitself may offer strongevidence that this isso, but evidently this could hardly hethe case with any suchbicycle ''data." various Attefllj)ts by investigators to discoverrigidly inathe- matical cycles (otherthan seasonal fluctuations)in economic data of thetype presented in of interest the study rates and securityprices have been not so far sucientIy convifl('ifgto makeone feel that the above"bicycle" illustrationiseither fetchedor illegitinn far-