Data Transformations


CHAPTER 9: Data Transformations

Most data sets benefit from one or more data transformations. The reasons for transforming data can be grouped into statistical and ecological reasons:

Statistical
• improve assumptions of normality, linearity, homogeneity of variance, etc.
• make units of attributes comparable when measured on different scales (for example, if you have elevation ranging from 100 to 2000 meters and slope from 0 to 30 degrees)

Ecological
• make distance measures work better
• reduce the effect of total quantity (sample unit totals) to put the focus on relative quantities
• equalize (or otherwise alter) the relative importance of common and rare species
• emphasize informative species at the expense of uninformative species.

Monotonic transformations are applied to each element of the data matrix, independent of the other elements. They are "monotonic" because they change the values of the data points without changing their rank. Relativizations adjust matrix elements by a row or column standard (e.g., maximum, sum, mean, etc.). One transformation described below, Beals smoothing, is unique in being a probabilistic transformation based on both row and column relationships. In this chapter, we also describe other adjustments to the data matrix, including deleting rare species, combining entities, and calculating first differences for time series data.

It is difficult to overemphasize the potential importance of transformations. They can make the difference between illusion and insight, fog and clarity. To use transformations effectively requires a good understanding of their effects and a clear vision of your goals.

Notation. In all of the transformations described below,

x_ij = the original value in row i and column j of the data matrix
b_ij = the adjusted value that replaces x_ij

Domains and ranges

Bear in mind that some transformations are unreasonable or even impossible for certain types of data. Table 9.1 lists the kinds of data that are potentially usable for each transformation.

Monotonic transformations

Power transformation

b_ij = x_ij^p

Different parameters (exponents) change the effect of the transformation: p = 0 gives presence/absence, p = 0.5 gives the square root, and so on. The smaller the parameter, the more compression is applied to high values (Fig. 9.1).

The square root transformation is similar in effect to, but less drastic than, the log transform. Unlike the log transform, it requires no special treatment of zeros. The square root transformation is commonly used. Less frequent is a higher root, such as a cube root or fourth root (Fig. 9.1). For example, Smith et al. (2001) applied a cube root to count data, a choice supported by an optimization procedure. Roots of a power higher than three nearly transform the data to presence-absence: nonzero values become close to one, while zeros remain at zero.

Figure 9.1. Effect of square root and higher root transformations, b = f(x), shown for exponents 1/2, 1/3, 1/4, and 1/10. Note that roots higher than three are essentially presence-absence transformations, yielding values close to 1 for all nonzero values.
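As an illustration of how the exponent controls compression, the following is a minimal sketch in Python with NumPy; the code and the function name are our own, not part of the chapter or any particular package:

```python
import numpy as np

def power_transform(x, p):
    """Monotonic power transformation b_ij = x_ij ** p, applied to each element independently.
    p = 1 leaves the data unchanged, p = 0.5 is the square root, and p = 0
    reduces the data to presence/absence (any nonzero value becomes 1, zeros stay 0)."""
    x = np.asarray(x, dtype=float)
    if p == 0:
        return (x > 0).astype(float)      # presence/absence
    return x ** p

abundances = np.array([0.0, 1.0, 4.0, 25.0, 100.0])
for p in (1.0, 0.5, 1/3, 0.1):
    print(p, np.round(power_transform(abundances, p), 2))
# The smaller the exponent, the more the high values are compressed toward 1,
# while the ranks of the values are unchanged.
```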
Table 9.1. Domain of input and range of output from transformations.

Transformation                      | Reasonable and acceptable domain of x | Range of f(x)
MONOTONIC TRANSFORMATIONS
x^p (power), p = 0                  | all           | 0 or 1 only
x^p (power), p > 0                  | nonnegative   | nonnegative
log(x)                              | positive      | all
(2/π)·arcsin(x)                     | 0 ≤ x ≤ 1     | 0 to 1 inclusive
(2/π)·arcsin(√x)                    | 0 ≤ x ≤ 1     | 0 to 1 inclusive
SMOOTHING
Beals smoothing                     | 0 or 1 only   | 0 to 1 inclusive
ROW/COLUMN RELATIVIZATIONS
general                             | nonnegative   | 0 to 1 inclusive
by maximum                          | nonnegative   | 0 to 1 inclusive
by mean                             | all           | all
by standard deviates                | all           | generally between -10 and 10
binary by mean                      | all           | 0 or 1 only
rank                                | all           | positive integers
binary by median                    | all           | 0 or 1 only
ubiquity                            | nonnegative   | nonnegative
information function of ubiquity    | nonnegative   | nonnegative

Logarithmic transformation

b_ij = log(x_ij)

The log transformation compresses high values and spreads low values by expressing the values as orders of magnitude. It is often useful when there is a high degree of variation within variables or a high degree of variation among attributes within a sample, both of which are commonly true of count data and biomass data.

Log transformations are extremely useful for many kinds of environmental and habitat variables, the lognormal distribution being one of the most common in nature. See Limpert et al. (2001) for a general introduction to lognormal distributions and applications in various sciences. They claim that the abundance of a species follows a truncated lognormal distribution, citing Sugihara (1980) and Magurran (1988). While the nonzero values of community data sets often resemble a lognormal distribution, excluding zeros often amounts to ignoring half of a data set. The lognormal distribution is fundamentally flawed when applied to community data because a zero value is, more often than not, the most frequent abundance value for a species. Nevertheless, the log transformation is extremely useful in community analysis, provided that one carefully handles the problem of log(0) being undefined.

To log-transform data containing zeros, a small number must be added to all data points. If the lowest nonzero value in the data is one (as in count data), then it is best to add one before applying the transformation:

b_ij = log(x_ij + 1)

If, however, the lowest nonzero value of x differs from one by more than an order of magnitude, then adding one will distort the relationship between zeros and other values in the data set. For example, biomass data often contain many small decimal fractions (values such as 0.00345 and 0.00332) ranging up to fairly large values (in the hundreds). Adding a one to the whole data set will tend to compress the resulting distribution at the low end of the scale. The order-of-magnitude difference between 0.003 and 0.03 is lost if you add a one to both values before log transformation: log(1.003) is about the same as log(1.03).

The following transformation is a generalized procedure that (a) tends to preserve the original order of magnitudes in the data and (b) results in values of zero when the initial value was zero. Given:

Min(x) is the smallest nonzero value in the data
Int(x) is a function that truncates x to an integer by dropping digits after the decimal point
c = order of magnitude constant = Int(log(Min(x)))
d = decimal constant = log^-1(c)

then the transformation is

b_ij = log(x_ij + d) - c

Subtracting the constant c from each element of the data set after the log transformation shifts the values such that the lowest value in the data set will be a zero. For example, if the smallest nonzero value in the data set is 0.00345, then log(min(x)) = -2.46, c = Int(log(min(x))) = -2, and d = log^-1(c) = 0.01. Applying the transformation to some example values: if x = 0, then b = log(0 + 0.01) - (-2), so b = 0; if x = 0.00345, then b = log(0.00345 + 0.01) - (-2), so b = 0.128.
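To make the arithmetic above concrete, here is a minimal sketch in Python with NumPy of both the log(x + 1) transformation and the generalized procedure, using base-10 logs as in the worked example; the function names are illustrative, not from the chapter or any statistical package:

```python
import numpy as np

def log_plus_one(x):
    """b_ij = log10(x_ij + 1); appropriate when the smallest nonzero value is one (e.g., counts)."""
    x = np.asarray(x, dtype=float)
    return np.log10(x + 1.0)

def generalized_log(x):
    """Generalized log transformation: b_ij = log10(x_ij + d) - c,
    where c = Int(log10(Min(x))) for the smallest nonzero value Min(x) and d = 10**c.
    Zeros map to zero and the original orders of magnitude are preserved."""
    x = np.asarray(x, dtype=float)
    min_nonzero = x[x > 0].min()
    c = int(np.log10(min_nonzero))   # int() truncates toward zero, like Int() in the text
    d = 10.0 ** c
    return np.log10(x + d) - c

# Reproducing the worked example: smallest nonzero value 0.00345 gives c = -2 and d = 0.01
x = np.array([0.0, 0.00345, 0.03, 150.0])
print(generalized_log(x))   # first value is exactly 0, second is about 0.128
```

Note that when the smallest nonzero value is one, c = 0 and d = 1, so the generalized procedure reduces to the log(x + 1) transformation above.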
Arcsine squareroot transformation

b_ij = (2/π) · arcsin(√x_ij)

The arcsine-squareroot transformation spreads the ends of the scale for proportion data, while compressing the middle (Fig. 9.2). This transformation is recommended by many statisticians for proportion data, often improving normality (Sokal and Rohlf 1995). The data must range between zero and one, inclusive. The arcsine squareroot is multiplied by 2/π to rescale the result so that it ranges from 0 to 1.

The logit transformation, b = ln(x/(1-x)), is also sometimes used for proportion data (Sokal and Rohlf 1995). However, if x = 0 or x = 1, then the logit is undefined. Often a small constant is added to prevent ln(0) and division by zero. Alternatively, empirical logits may be used (see Sokal and Rohlf 1995, p. 762). Because zeros are so common in community data, it seems reasonable to use the arcsine squareroot or squareroot transformations to avoid this problem.

Arcsine transformation

b_ij = (2/π) · arcsin(x_ij)

The constant 2/π scales the result of arcsin(x) [in radians] to range from 0 to 1, assuming that 0 ≤ x ≤ 1. The function arcsin is the same as sin^-1, the inverse sine. Data must range between zero and one, inclusive; if they do not, you should relativize before selecting this transformation. Unlike the arcsine-squareroot transformation, an arcsine transformation is usually counterproductive in community ecology, because it tends to spread the high values and compress the low values (Fig. 9.2). This might be useful for distributions with negative skew, but community data almost always have positive skew.

Figure 9.2. Effect of several transformations on proportion data, comparing arcsin(√x) and arcsin(x).

Beals smoothing

Beals smoothing replaces each cell in the community matrix with a probability of the target species occurring in that particular sample unit, based on the joint occurrences of the target species with the species that are actually in the sample unit. The purpose of this transformation (also known as the sociological favorability index; Beals 1984) is to relieve the "zero-truncation problem" (Beals 1984). Avoid this transformation if your data are quantitative and you do not want to lose this information.

Beals smoothing can be slow to compute. If you have a large data set and a slow computer, be sure to allocate plenty of time. This transformation is available in PC-ORD but apparently not in other packages for statistical analysis.

Relativizations

"To relativize or not to relativize, that focuses the question." (Shakespeare ????)
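The Relativizations section is only beginning here, but the earlier definition (adjusting each matrix element by a row or column standard such as the maximum, sum, or mean) and the entries of Table 9.1 are enough for a minimal sketch. The Python/NumPy code and function names below are our own illustration under those assumptions, not the chapter's implementation:

```python
import numpy as np

def relativize_by_column_maximum(x):
    """Divide each element by its column (species) maximum, so each species ranges 0 to 1."""
    x = np.asarray(x, dtype=float)
    col_max = x.max(axis=0)
    return np.divide(x, col_max, out=np.zeros_like(x), where=col_max > 0)

def relativize_by_row_total(x):
    """Divide each element by its row (sample unit) total, shifting the focus from
    absolute quantities to relative quantities within each sample unit."""
    x = np.asarray(x, dtype=float)
    row_sum = x.sum(axis=1, keepdims=True)
    return np.divide(x, row_sum, out=np.zeros_like(x), where=row_sum > 0)

community = np.array([[0.0, 5.0, 20.0],
                      [2.0, 0.0, 10.0],
                      [1.0, 1.0,  0.0]])
print(relativize_by_column_maximum(community))
print(relativize_by_row_total(community))
```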