Forest inventory - a challenge for statistics
Erkki Tomppo
Finnish Forest Research Institute
Unioninkatu 40 A
FIN-00170 Helsinki, Finland
erkki.tomppo@metla.fi
Introduction
Statistically designed forest inventories were introduced simultaneously in three Nordic
countries, Norway, Finland and Sweden, at the beginning of the 1920s (e.g. Ilvessalo 1927).
Estimating the forest area and the volume of growing stock, as well as analysing the increment
and drain of growing stock, have been the original objectives of inventories. The scope of the
first inventories was, however, already much wider, including, e.g., information on site types,
the silvicultural state of forests, the structure of the growing stock, and applied and required
silvicultural and cutting regimes. Later, new parameters related to forest health and forest
biodiversity have appeared, e.g. species abundances and distributions.
The information needs are increasing, especially now that awareness of forest health status
and the loss of biological diversity has arisen, the role of forests in preventing global
warming has been recognised and, at the same time, the pressure to increase timber consumption
is growing. For instance, paper consumption has increased from about 130 million tons at the
beginning of the 1970s to 276 million tons in 1995. It is expected to increase to 420-440
million tons by 2010. On the other hand, one half of the harvested timber is still used for
cooking and heating, causing wide-area deforestation in the dry tropics.
Forest inventories have traditionally provided information related to the biological diversity
of forests, such as the structure of growing stock, areas of site fertility classes and
sometimes the distribution and abundance of plant species. An increasing concern about the loss
of diversity, caused e.g. by deforestation, human-induced environmental and climate changes as
well as the extinction of species, has increased interest in the whole forest ecosystem and its
biological diversity. The composition and structure of the landscape, the fragmentation of
forests or land types, and the areas and spatial distributions of important habitat types are
examples of characteristics which can be measured in the context of large-area inventories, at
least when multi-source information is utilised. Hanski (1999) has presented mathematical
models that connect the dynamics of species to the structure of fragmented landscapes.
Forests have also been seen as having a role in reducing the effects of global warming
by binding the increasing amount of carbon dioxide in the atmosphere. Global forest area is
known reasonably well. However, the annual increment plays an important role in the carbon flux
and is not known globally.
In order to satisfy the increasing and diverse demands for scientifically substantiated
information, efficient methods are needed to measure forest resources, their status
and the components of the whole forest ecosystem.
Examples of spatial variation types present in forests
Forest variables are commonly divided into groups describing an individual tree, a forest
stand (or a sample plot) and a forest region. Each variable usually has its own covariance
structure, which depends on the geographical scale. The stem volume of a single tree is
assessed as an integral of a stem curve, i.e. diameter as a function of height,
$V = \int_0^h d(h)\,dh$. Trees in the same stand are of similar form, while those further apart
from each other differ more. Tree stem form also depends on the tree species and site
variables. The within-stand and between-stand variation can be modelled, e.g., with mixed
models, $d_{ki}(h) = f(x, y) + v_k(h) + e_{ki}(h)$, where $d_{ki}(h)$ is the diameter of tree
$i$ in stand $k$ at height $h$, $f(x, y)$ is a function of tree variables $x$ and stand
variables $y$, $v_k(h)$ is a random stand effect and $e_{ki}(h)$ a random tree effect
(Lappi 1986).
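As an illustration of the within-stand and between-stand variation described above, the
following minimal sketch simulates diameters at a single fixed height from a simplified version
of the mixed model, in which f is replaced by a constant mean and the stand and tree effects
are independent Gaussian scalars. These simplifications, the function names and all numerical
values are assumptions made for the example and are not taken from Lappi's model. The variance
components are then recovered with the standard one-way analysis-of-variance estimator for
balanced data.

```python
import random


def simulate_stands(n_stands, n_trees, mean_d, sd_stand, sd_tree, seed=0):
    """Simulate diameters d_ki = mean_d + v_k + e_ki at one fixed height.

    Assumption: f(x, y) is a constant (mean_d); v_k and e_ki are Gaussian.
    """
    rng = random.Random(seed)
    stands = []
    for _ in range(n_stands):
        v_k = rng.gauss(0.0, sd_stand)  # stand effect shared by all trees in stand k
        stands.append([mean_d + v_k + rng.gauss(0.0, sd_tree) for _ in range(n_trees)])
    return stands


def variance_components(stands):
    """One-way ANOVA (method-of-moments) estimates for balanced data.

    Returns (between-stand variance, within-stand variance).
    """
    k = len(stands)
    n = len(stands[0])
    means = [sum(s) / n for s in stands]
    grand = sum(means) / k
    # Mean square within stands and mean square between stands.
    msw = sum((d - m) ** 2 for s, m in zip(stands, means) for d in s) / (k * (n - 1))
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    return max((msb - msw) / n, 0.0), msw


stands = simulate_stands(n_stands=200, n_trees=10, mean_d=20.0, sd_stand=2.0, sd_tree=1.0)
var_stand, var_tree = variance_components(stands)
icc = var_stand / (var_stand + var_tree)  # intraclass correlation of trees in one stand
print(f"stand variance {var_stand:.2f}, tree variance {var_tree:.2f}, ICC {icc:.2f}")
```

A large intraclass correlation means that trees in the same stand resemble each other, which
is one reason why measuring many trees in the same stand adds little information.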
The relative locations of trees, i.e. the spatial pattern of trees, affect, for instance, the
efficiency of a sampling design and the growth of trees, and can thus be utilised, e.g., in
planning sampling methods and in assessing the naturalness of forests. Spatial point processes,
e.g. Gibbs processes, have been used in modelling spatial patterns of trees (Särkkä and Tomppo
1998).
Variables such as growth factors, site fertility, nutrient availability and the cumulative
temperature sum of the growing season are examples of variation at different scales. These can
be regarded as realisations of stochastic processes on the plane. They, in turn, largely
determine the structure of the growing stock and its variability. Present silvicultural
practice has led to forests which are mosaics of stands of different age classes and tree
species compositions with a specific tree form and spatial patterns of trees. The distribution
of stands can be regarded as an output of mosaic models (Stoyan et al. 1995).
All these variations should be taken into account when planning the sampling design of a
forest inventory, in parameter estimation and in deriving confidence limits of estimators. On
the other hand, practical matters, such as moving between sampling units, affect the costs
and should be considered when minimising standard errors at a given cost.
Parameter estimation with field data
Forest inventories have motivated much of the pioneering work on the general theory
of line surveys, systematic sampling, and spatial statistics. Spatial autocorrelation of tree,
stand and regional variables often leads to multi-stage sampling. An increasing utilisation of
supplementary data, e.g. remote sensing data or other georeferenced data, leads to two-phase
(double) sampling (Cochran 1963).
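The idea of two-phase sampling can be sketched with the classical regression estimator for
double sampling: an auxiliary variable x (for instance a remote sensing feature) is observed on
a large first-phase sample, while the variable of interest y is measured only on a smaller
second-phase sample. The function and variable names and the numbers below are hypothetical;
this is an illustration under those assumptions, not the estimator of any particular inventory.

```python
def double_sampling_regression(x_phase1, x_phase2, y_phase2):
    """Regression estimator of the mean of y in two-phase (double) sampling.

    x_phase1: auxiliary values on the large first-phase sample
    x_phase2, y_phase2: paired values on the smaller second-phase sample
    """
    n2 = len(x_phase2)
    xbar1 = sum(x_phase1) / len(x_phase1)
    xbar2 = sum(x_phase2) / n2
    ybar2 = sum(y_phase2) / n2
    # Least-squares slope of y on x in the second-phase sample.
    sxy = sum((x - xbar2) * (y - ybar2) for x, y in zip(x_phase2, y_phase2))
    sxx = sum((x - xbar2) ** 2 for x in x_phase2)
    slope = sxy / sxx
    # Correct the second-phase mean of y by the difference in auxiliary means.
    return ybar2 + slope * (xbar1 - xbar2)


estimate = double_sampling_regression(
    x_phase1=[1.0, 2.0, 3.0, 4.0, 5.0],
    x_phase2=[1.0, 2.0, 3.0],
    y_phase2=[2.0, 4.0, 6.0],
)
print(estimate)
```

The correction term moves the field-based mean towards the value suggested by the cheaper
first-phase data, which is where the gain in efficiency comes from.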
Multi-stage sampling is commonly used in tree measurements. A few parameters, which
vary considerably between trees and are usually easy to measure, are measured on every
sampled tree. A smaller sub-sample of the first sample is taken for measuring additional
variables, which usually vary less within a stand or a field plot. The statistical questions
are how many stages should be used, what the sizes of the samples should be, which variables
should be measured at each stage, and how to estimate the variables which are not measured.
Tree-level volume and increment estimates are usually derived from the most intensive
measurements. Statistical models, e.g. non-parametric regression analysis, are applied in
estimating the variables for trees of the less intensive samples.
The interest in forest inventory is often in the quantity
$M = \int_A z(t)\,dt \big/ \int_A y(t)\,dt$, where $A$ is the inventory area, $z(t)$,
$t \in \mathbb{R}^2$, is the variable of interest, e.g. an indicator of a land use class or the
volume of a timber assortment, and $y(t)$, $t \in \mathbb{R}^2$, is an indicator function of the
stratum (e.g. forestry land). After estimating all variables for each sampling unit, e.g.
volumes for each tree, estimation of area and volume parameters of a forest region leads to a
ratio estimator $m = \sum_{i=1}^{n} z_i \big/ \sum_{i=1}^{n} y_i = \bar{z}/\bar{y}$. A natural
reliability measure for the estimator is $E(m - M)^2$. An unbiased estimator of it for
systematic sampling is not known. Conservative estimators can be derived by utilising the
properties of second-order stationary processes (Matérn 1960). The parameter estimation of
spatial pattern models with Gibbs processes can be based on the properties of Palm
distributions of the process and a chosen test function (Fiksel 1988). Another possibility is
to utilise approximate maximum likelihood approaches; for a review, see Geyer (1998).
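The ratio estimator m described above can be sketched in a few lines. The plot values are
invented for the example: y_i indicates whether plot i lies on forestry land and z_i is the
volume observed on the plot (zero outside the stratum). The function name is hypothetical.

```python
def ratio_estimator(z, y):
    """Ratio estimator m = sum(z_i) / sum(y_i) = mean(z) / mean(y)."""
    if len(z) != len(y):
        raise ValueError("z and y must have the same number of sampling units")
    total_y = sum(y)
    if total_y == 0:
        raise ValueError("no sampling unit falls in the stratum")
    return sum(z) / total_y


# Four field plots: indicator of forestry land and volume (m^3/ha) on the plot.
y = [1, 1, 0, 1]
z = [120.0, 80.0, 0.0, 100.0]
mean_volume = ratio_estimator(z, y)  # mean volume on forestry land
print(mean_volume)
```

Both the numerator and the denominator are random here, which is why the error of m is not
that of a simple sample mean.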
Estimation of parameters with multi-source inventory
The increasing availability of supplementary data, e.g. remote sensing data, has changed
the requirements for statistical methods in forest inventories. Supplementary data is usually
cheap but much less accurate than field measurements. A proper use of the data can, however,
make the inventories more efficient. Some practical problems, such as limited availability of
data due to, e.g., weather conditions, still prevent the full utilisation of the data. The
estimation with supplementary data could be done in the framework of two-phase (double)
sampling. A non-parametric k-nn method, adopted in the Finnish national forest inventory, can
be considered as an extension of double sampling (Kilkki and Päivinen 1987, Tomppo 1996). An
essential property is that all inventory variables, typically 100 to 400, can be estimated at
the same time.
The procedure utilises a distance measure defined in the feature space of the supplementary
data, denoted here by $d$, and defines new area weights for each field plot by computation
units. The weight of field plot $i$ to pixel $p$ is defined as

$$ w_{i,p} = \frac{1}{d^2_{p_i,p}} \Bigg/ \sum_{j=1}^{k} \frac{1}{d^2_{p_{(j)},p}}, \qquad (1) $$

if pixel $p_i$ is among the $k$ nearest to $p$, otherwise $w_{i,p} = 0$. Here, $k$ is a
predefined fixed number. The weights $w_{i,p}$ are summed over pixels $p$ by computation units
$u$ (for example by municipalities), yielding the weight of plot $i$ to computation unit $u$

$$ c_{i,u} = \sum_{p \in u} w_{i,p}. \qquad (2) $$

The sum $c_{i,u}$ can be interpreted as that area (in pixels) of unit $u$ which is most similar
to sample plot $i$. The plots outside $u$ may also receive positive weights (synthetic
estimation).
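Equations (1) and (2) can be sketched as follows. The sketch assumes that each field plot i is
represented by the feature vector of its pixel p_i, that d is the Euclidean distance in the
feature space, and that a small constant guards against division by zero when a pixel coincides
with a plot; the source only requires some distance measure, so these choices, the function
names and the toy data are assumptions for illustration.

```python
from collections import defaultdict


def knn_weights(pixel_features, plot_features, k, eps=1e-12):
    """Weights w_{i,p} of equation (1) for one pixel p, as {plot index i: weight}.

    Plots that are not among the k nearest get no entry (weight 0).
    """
    # Squared Euclidean distance from the pixel to every field plot (assumption).
    d2 = sorted(
        (sum((a - b) ** 2 for a, b in zip(pixel_features, f)), i)
        for i, f in enumerate(plot_features)
    )
    inverse = [(i, 1.0 / max(dist2, eps)) for dist2, i in d2[:k]]
    total = sum(v for _, v in inverse)
    return {i: v / total for i, v in inverse}


def unit_weights(pixels_by_unit, plot_features, k):
    """Weights c_{i,u} of equation (2): sum of w_{i,p} over the pixels p in unit u."""
    c = {}
    for unit, pixels in pixels_by_unit.items():
        acc = defaultdict(float)
        for features in pixels:
            for i, w in knn_weights(features, plot_features, k).items():
                acc[i] += w
        c[unit] = dict(acc)
    return c


# Toy data: three field plots and one computation unit with two pixels,
# each described by a single spectral feature.
plots = [(0.0,), (1.0,), (3.0,)]
w = knn_weights((2.0,), plots, k=2)
c = unit_weights({"A": [(2.0,), (0.5,)]}, plots, k=2)
print(w, c)
```

Because the weights of each pixel sum to one, the weights c_{i,u} of a unit sum to its number
of pixels, in line with their interpretation as an area.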
The k-nn method is a flexible and practical way to combine field measurements and
supplementary data into an operational inventory system and to obtain much more detailed
information about forests at very low additional cost compared with inventory methods
employing sampling and field measurements only. The method is more statistically oriented than
the old classification-based approach to the use of satellite images. The key feature for
routine operational use is that the image processing phase does not depend on the forest
variables to be estimated. After computing the sample plot weights $c_{i,u}$ for each
computation unit $u$ of interest, the image data is no longer used and, in principle, all
parameters can be estimated as weighted averages of field plot data. Also, because the weights
are the same for all variables, the method preserves the natural dependence structures between
forest parameters. Further advantages are the applicability to very different types of forests
and the possibility to use different kinds of remote sensing material, both with only minor
modifications.
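The estimation step described above, in which every variable is a weighted average of field
plot data with the same weights, can be sketched as follows. The weights, the variable names
and the plot values are invented for the example, and the function name is hypothetical.

```python
def unit_estimates(c_u, plot_data):
    """Estimate all variables for one computation unit as weighted plot averages.

    c_u: {plot index i: weight c_{i,u}}
    plot_data: {plot index i: {variable name: field value}}
    """
    total_weight = sum(c_u.values())
    variables = next(iter(plot_data.values())).keys()
    # The same weights are used for every variable.
    return {
        var: sum(c * plot_data[i][var] for i, c in c_u.items()) / total_weight
        for var in variables
    }


c_u = {0: 0.5, 1: 1.0, 2: 0.5}  # example weights of three plots for unit u
plot_data = {
    0: {"volume": 100.0, "age": 40.0},
    1: {"volume": 200.0, "age": 60.0},
    2: {"volume": 0.0, "age": 0.0},
}
estimates = unit_estimates(c_u, plot_data)
print(estimates)
```

Since no image data enters this step, adding a new variable to the field measurements does not
require the image processing phase to be repeated.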
For the k-nn method, the RMSE of the pixel-level estimates can be statistically assessed by
cross-validation. However, the error in the estimate of a forest parameter in one pixel is
highly dependent on the true value there, and thereby the errors are spatially correlated. The
error structure is made even more complex by the spatial dependencies in the image itself and
the errors of possible other data sources, such as maps. Developing an operationally usable
statistical error assessment technique is a highly challenging task, and a fully satisfactory
solution is yet to be found.
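The pixel-level cross-validation mentioned above can be sketched as a leave-one-out procedure:
each field plot is predicted from the remaining plots with inverse squared distance weights,
and the prediction errors are summarised as an RMSE. Euclidean distance, the function names and
the toy data are assumptions for illustration, and the sketch ignores the spatial correlation
of the errors discussed in the text.

```python
import math


def knn_predict(target, features, values, k, exclude=None, eps=1e-12):
    """k-nn prediction with inverse squared distance weights.

    The plot with index `exclude` is left out of the reference data.
    """
    d2 = sorted(
        (sum((a - b) ** 2 for a, b in zip(target, f)), i)
        for i, f in enumerate(features)
        if i != exclude
    )
    inverse = [(i, 1.0 / max(dist2, eps)) for dist2, i in d2[:k]]
    total = sum(w for _, w in inverse)
    return sum(w * values[i] for i, w in inverse) / total


def loo_rmse(features, values, k):
    """Leave-one-out cross-validated RMSE over the field plots."""
    errors = [
        knn_predict(f, features, values, k, exclude=i) - values[i]
        for i, f in enumerate(features)
    ]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


features = [(0.0,), (1.0,), (2.0,), (3.0,)]  # one spectral feature per plot
volumes = [0.0, 1.0, 2.0, 3.0]               # field-measured variable
rmse = loo_rmse(features, volumes, k=2)
print(round(rmse, 4))
```

In this toy data the two extreme plots are predicted towards the middle of the range, which
illustrates how the error depends on the true value.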
Conclusions
Forest inventories have motivated pioneering work in statistics, especially in spatial
statistics. Global forestry information needs are increasing at a time when there are
requirements both to increase timber production and, at the same time, to preserve forest
ecosystems. The increasing amount of versatile supplementary data makes it possible to
increase the efficiency of inventories. The analysis of temporally and spatially correlated,
multi-temporal, multi-resolution and multi-source data sets of future forest inventories is a
challenging task for statistics.
REFERENCES
Cochran, W. G. (1963). Sampling Techniques. Wiley, New York.
Fiksel, T. (1988). Estimation of interaction potentials of Gibbsian point processes. Statistics
19, 77-86.
Geyer, C. J. (1998). Likelihood inference for spatial point processes. In O. E.
Barndorff-Nielsen, W. S. Kendall and M. N. M. van Lieshout (eds.), Stochastic Geometry,
Likelihood and Computation, no. 80 in Monographs on Statistics and Applied Probability.
Chapman and Hall/CRC.
Hanski, I. (1999). Metapopulation Ecology. Oxford University Press, Oxford.
Ilvessalo, Y. (1927). The forests of Suomi (Finland). Results of the general survey of the
forests of the country carried out during the years 1921-1924. (In Finnish with English
summary). Communicationes ex Instituto Quaestionum Forestalium Finlandiae 11.
Kilkki, P. and Päivinen, R. (1987). Reference sample plots to combine field measurements and
satellite data in forest inventory. University of Helsinki, Department of Forest Mensuration
and Management. Research Notes 19, 209-215.
Lappi, J. (1986). Mixed linear models for analyzing and predicting stem form variation of Scots
pine. Communicationes Instituti Forestalis Fenniae 134.
Matérn, B. (1960). Spatial variation. Medd. Statens Skogsf. Inst. 49(5). Also appeared as
number 36 of Lecture Notes in Statistics. Springer-Verlag, New York, 1986.
Särkkä, A. and Tomppo, E. (1998). Modelling interactions between trees by means of field
observations. Forest Ecology and Management 108, 57-62.
Stoyan, D., Kendall, W. S. and Mecke, J. (1995). Stochastic Geometry and its Applications.
2nd edn. Wiley, New York.
Tomppo, E. (1996). Multi-source national forest inventory of Finland. In R. Päivinen, J.
Vanclay and S. Miina (eds.), New Thrusts in Forest Inventory. Proceedings of IUFRO XX World
Congress, 6-12 Aug. 1995, Tampere, Finland, pp. 27-41.