Marine Environmental Research 87-88 (2013) 112e121
Contents lists available at SciVerse ScienceDirect
Marine Environmental Research
journal homepage: www.elsevier.com/locate/marenvrev
Multivariate analysis applied to agglomerated macrobenthic data from an unpolluted estuary
Anxo Conde a,b,*, Júlio M. Novais a,1, Jorge Domínguez b a IBB-Institute for Biotechnology and Bioengineering, Center for Biological and Chemical Engineering, Instituto Superior Técnico (IST), Av. Rovisco Pais, 1049-001 Lisbon, Portugal b Departamento de Ecoloxía e Bioloxía Animal, Universidade de Vigo, Vigo E-36310, Spain article info abstract
Article history: We agglomerated species into higher taxonomic aggregations and functional groups to analyse envi- Received 20 February 2013 ronmental gradients in an unpolluted estuary. We then applied non-metric Multidimensional Scaling Received in revised form and Redundancy Analysis (RDA) for ordination of the agglomerated data matrices. The correlation be- 15 April 2013 tween the ordinations produced by both methods was generally high. However, the performance of the Accepted 19 April 2013 RDA models depended on the data matrix used to fit the model. As a result, salinity and total nitrogen were only found significant when aggregated data matrices were used rather than species data matrix. Keywords: We used the results to select a RDA model that explained a higher percentage of variance in the species Taxonomic sufficiency Functional groups data set than the parsimonious model. We conclude that the use of aggregated matrices may be Environmental gradients considered complementary to the use of species data to obtain a broader insight into the distribution of Ordination methods macrobenthic assemblages in relation to environmental gradients. Model selection Ó 2013 Elsevier Ltd. All rights reserved. Minho estuary Iberian Peninsula
1. Introduction into consideration the environmental problem in question. Similar conclusions were reached in long term monitoring studies that The distribution of macrobenthic assemblages in relation to propose periodical analysis at species level rather than at lower environmental drivers has been often described at the species level taxonomic resolution, such as the family level (Musco et al., 2011). (Rodrigues et al., 2006; Ysebaert et al., 1998; Zajac and Whitlatch, Agglomeration of species into guilds (usually feeding guilds) has 1982). Agglomeration or aggregation of species at low resolution also been considered (Gaudêncio and Cabral, 2007; García-Arberas levels has also been used to describe environmental gradients. Ac- and Rallo, 2002). The use of functional groups (or functional traits) cording to the concept of taxonomic sufficiency, species data are subdivides the aggregated data at a finer functional level. For aggregated into higher taxonomic levels deemed to be sufficient for instance, the carnivore guild may be further subdivided in accor- the purposes of a study. The term ‘taxonomic sufficiency’ (coined by dance with the motile or sessile characteristic of the species, Ellis, 1985) has been used to identify the effects of pollution on reflecting different ecological niches that are not indicated by the marine communities (e.g. Ferraro and Cole, 1990). The family level ‘carnivore’ grouping alone. Similarly, although species in the same has been proposed as an appropriate level of description for pollu- feeding guild commonly compete for the same food resource, the tion studies (Gómez Gesteira et al., 2003), although higher levels interaction is not necessarily reflected in a taxonomic approach may also be used (Warwick, 1988). However, it has been argued that (Rosenberg, 2001). Improvements in computational methods enable the likelihood of detecting a stressor at higher taxonomic levels will more complex analyses that allow species, biological traits and depend on the severity of the stressor (Ferraro and Cole, 1990), environmental matrices to be considered simultaneously (Dolédec and therefore the taxonomic resolution used in a study must take et al., 1996; Legendre et al., 1997). Analysis and visualization of patterns based on species agglomeration matrices in relation to * Corresponding author. IBB-Institute for Biotechnology and Bioengineering, environmental factors may be accomplished by common ordination Center for Biological and Chemical Engineering, Instituto Superior Técnico (IST), Av. techniques such as Non-metric Dimensional Scaling (NMDS, Terlizzi Rovisco Pais, 1049-001 Lisbon, Portugal. Tel.: þ351 218419124; fax: þ351 et al., 2008) and Redundancy Analysis (RDA, Boström et al., 2006). 218419062. E-mail addresses: [email protected], [email protected] (A. Conde). A few studies have shown that the family level provides an 1 Deceased. adequate description of macrobenthic assemblages along natural
0141-1136/$ e see front matter Ó 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.marenvres.2013.04.005 A. Conde et al. / Marine Environmental Research 87-88 (2013) 112e121 113 gradients (De Biasi et al., 2003; Dethier and Schoch, 2006; Moreira et al., 2006). Other authors have recommended using this Wlodarska-Kowalczuk and Kedra, 2007, and references therein). estuary as a pristine reference site for comparison with other metal The family level is a better descriptor than higher taxonomic levels polluted estuaries (Reis et al., 2009). However, Sousa et al. (2008a) or trophic groups for explaining the distribution of the assemblages have identified up to 11 well-established alien species of in- (Dethier and Schoch, 2006), although a lower level of data resolu- vertebrates and fish in the Minho estuary. Among these, the inva- tion may also be useful (Wlodarska-Kowalczuk and Kedra, 2007). sive bivalve Corbicula fluminea (Müller, 1774) has achieved a key Aggregation of species has been underexplored as a means of position in the benthic assemblages because of its high abundance, describing natural gradients (Wlodarska-Kowalczuk and Kedra, biomass and production (Sousa et al., 2008b). 2007). The aim of this study was to assess the usefulness of agglomerated data sets in explaining the distribution of macro- 2.2. Sampling and laboratory procedures benthic assemblages along a natural gradient in an unpolluted estuary. As estuaries may display horizontal, vertical or cross- Sampling was conducted on both banks of the estuary during a sectional spatial gradients, among others (McLusky, 1993), the summer spring tide, between 23 and 26 August 2010. We sampled study was conducted in an area of an estuary between the zones ten sites and assigned them codes in alphabetical order from AM, most affected by marine and freshwater influences. We measured a BM. to KM, excluding IM (Fig. 1). We conducted intertidal sam- number of abiotic variables (redox potential, total nitrogen content pling during low spring tides immediately above the water edge. of the sediments, salinity and others) with the aim of explaining the These sampling surveys included sites along the main axis of the distribution of the macrobenthic assemblages. We compared the estuary, in salt marshes and on an estuarine island (Boega Island, results of applying multivariate analyses to the species dataset and site JM). The length of the sampling area was approximately 13 km, the results of applying the same analyses to agglomerated data extending from the mouth of the estuary to a few kilometres up- grouped at low level of resolution (order and class taxonomic level stream of the village of Vilanova da Cerveira (Fig. 1). and gross feeding guilds), which are rather overlooked in com- We used a standard field probe (WTW 340i) to measure tem- parison with finer aggregation types such as the family taxonomic perature, salinity, oxygen, redox potential in the interstitial water, level (De Biasi et al., 2003) or subdivisions of feeding guilds at a depth of approximate 10 cm in the sediment. We sampled the (Fauchald and Jumars, 1979). The underlying null hypothesis of the top 3 cm of the sediment (making composite sample from three study was that there was no relation between multivariate patterns replicate samples per site) to determine the sediment grain size by in agglomerated data sets and the species data set. the dry sieving method. We defined the grain size fractions as follows: gravel (>2 mm), coarse and medium sand (2e0.250 mm), fi e fi < 2. Material and methods ne sand (0.250 0.063 mm) and nest grains ( 0.063 mm), following the method of Rodrigues et al. (2006) and Silva et al. 2.1. Study site (2006), although these authors considered coarse and medium sand content separately. At each sampling site, we placed samples of the top 3 cm of the sediment in plastic containers and kept them The River Minho is the longest river in the Northwest Iberian Peninsula. The annual average discharge of the River Minho is in a cooler at 4 C for subsequent determination of total organic 13,560 h m3, with a monthly flow average ranging from 100 m3 s 1 carbon and total nitrogen (on dried samples) in an elemental in August to 1000 m3 s 1 in February (deCastro et al., 2006). The analyser (LECO CN2000). estuarine part of the river (Fig. 1) lies between Portugal and Galicia We inserted a corer of inner diameter 95 mm to a depth of 25 cm ¼ 2 (Spain). The Minho estuary has mesotidal features and is partially in the sediment (7 replicates 0.05 m ) to sample infaunal or- mixed, although it tends to be a salt wedge estuary when high ganisms. We sieved all samples through a 1 mm mesh and pre- floods occur (Sousa et al., 2005). Numerous small tributaries drain served the material retained on the mesh in 70% ethanol. We used a into the estuary (Fig. 1). dissecting microscope to sort, count and identify samples of A number of studies have highlighted the low level of anthro- benthic fauna to the lowest possible taxonomic level. pogenic pressure on the Minho estuary (Monteiro et al., 2007; 2.3. Taxa aggregation and categories
We agglomerated the taxa into four aggregation types, each constituted by different categories, as explained below. We first aggregated the taxa into two different taxonomic levels: order and class; the number of categories for both aggregation types depended on the fauna found in each sampling site. We also aggregated the taxa into four categories of trophic guilds: suspension feeders, deposit-feeders (including surface and sub-surface-deposit feeders), carnivores (or predators) and omnivores. This trophic aggregation, with the exception of the omnivore category, follows that used by Chardy and Clavier (1988). We categorized feeding guilds in accor- dance with Ysebaert et al. (1998), Cummins et al. (1989) and Mancinelli et al. (2005). The feeding guilds are hereinafter referred to as guilds. For the final aggregation, we combined class and guilds in the same data matrix, to produce a mixed data matrix. This approach is based on that used by Pagola-Carte and Saiz-Salinas (2001), although in the present study each taxon occurs twice, as a category of both class and guilds. For example, the species Hediste diversicolor occurs in the polychaete category (class) and the omnivore category (guild). We labelled taxa that did not fitanyof Fig. 1. Map of the Minho estuary showing the location of the sampling sites. the categories within a specific aggregation type as other (e.g. we 114 A. Conde et al. / Marine Environmental Research 87-88 (2013) 112e121 defined insects as other). We also considered the species data set We used canonical ordination to determine the spatial structure (including some undetermined taxa) in the analysis. of the species assemblages in response to an environmental explanatory matrix. We chose redundancy Analysis (RDA) because 2.4. Statistical analysis it enables transformation of the species data matrix (Legendre and Gallagher, 2001; Borcard et al., 2011) and posterior projection on a We used multivariate analysis to examine the spatial distribution Euclidean space (Legendre and Gallagher, 2001). We constructed and composition of the assemblages within the estuary. We used the RDA triplots by using site scores based on the weighted sum of Vegan package (Oksanen et al., 2011), which is run in the free R species (wa) as a counterbalance or midpoint between the de- environment for statistical computing (R Development Core Team, scriptors used in the analysis and environmental constraints e 2011), for most of the analyses. We calculated a dissimilarity ma- (following Oksanen, 2011). The so-called species environment ’ trix, based on the Euclidean distance of the Hellinger transformed correlation is also provided (Oksanen, 2011). We used Akaike s In- fi raw data, to overcome acute differences in the numbers of in- formation Criteria (AIC) to measure the goodness of t of the RDA dividuals between sampling stations (Legendre and Gallagher, 2001). models (Oksanen et al., 2011). We assessed linear dependencies fl We then applied non-metric Multidimensional Scaling (NMDS) to between environmental variables by computing variance in ation site centroids (7 replicates). NMDS is an ordination technique that factors (Borcard et al., 2011; Oksanen et al., 2011), so that the values displays the distance between the considered objects in accordance of none of the variables was higher than 10, as recommended by with a previously computed dissimilarity matrix (e.g. Legendre and Borcard et al. (2011). We used permutation tests to assess the sig- fi Legendre, 1998). We identified groups of sites by cluster analysis ni cance of the environmental variables and RDA axes of the con- (Legendre and Legendre,1998), in accordance with their similarity in strained ordination. We used forward selection (Blanchet et al., fi species composition. We formed three groups of sites in each cluster 2008) to determine which environmental variables signi cantly analysis by choosing an arbitrary threshold dissimilarity distance explained the distribution of the assemblages. (Clarke and Warwick, 1994). We used complete linkage agglomera- We used Procrustean superimposition (Gower,1971) to compare tive clustering because it is considered an appropriate approach for constrained and unconstrained ordinations between different detecting discontinuities in data (Borcard et al., 2011). species aggregations. Procrustes rotation minimizes the sum of We used non-parametric permutational multivariate analysis of squared residuals between matrices that are compared. Therefore, variance (PERMANOVA, Anderson, 2001, 2005) to test for differ- the optimal superimposition between matrices is used as a metric fi ences between the assemblages in groups identified by cluster of association (Gower, 1971). The statistical signi cance of the dif- analysis, with sites nested within groups of sites. We mainly used ferences highlighted by Procrustean analysis can be assessed by a the PERMANOVA test to assess the convergence of the results ob- permutational procedure (PROTEST; Jackson, 1995). We calculated fi tained with different data sets, not to test for differences on the the corresponding values on the rst two dimensions of the ordi- groups identified a posteriori by cluster analysis, because in such nations to maintain the same dimensionality between the com- cases circularity makes invalid any inference (Clarke and Warwick, parisons (Peres-Neto and Jackson, 2001). We used the PROTEST 1994). We used the Hellinger distance in the analysis. PERMANOVA approach because it has been shown to be more powerful than the enables post-hoc analysis based on uncorrected permutations for Mantel test for this purpose (Peres-Neto and Jackson, 2001). multiple testing. The Monte Carlo asymptotic P-value (Anderson, We assessed and compared the dispersion within each aggre- 2005) is provided as the P-value (P) for the pairwise comparisons. gation type to identify any intrinsic differences in the multivariate We applied the indicator value index (IndVal; Dufrêne and dispersion among aggregation types. Anderson (2001) pointed out Legendre, 1997) to identify the components of the fauna that that the mean distance to the group centroid could be tested in ’ characterized the groups of sites by comparing their abundance and multivariate data in the same way as by a Levene s test for homo- occurrence. In relation to the type of sites under analysis, the geneity of variances in univariate analysis. The comparison used specificity and fidelity of the fauna yield an IndVal value that is here was based on the mean distance for each replicate (7 per site) higher for the most prominent members of the groups; the sig- in relation to the general centroid within an aggregation type prior nificance of the differences in these values is determined by a to Hellinger transformation. Pairwise comparisons can be con- fi permutation test (Dufrêne and Legendre,1997; Borcard et al., 2011). ducted after an overall signi cance test through the calculation ’ fi In IndVal, specificity (designated as Aij, see below) is calculated by of the Tukey s Honest Signi cant Differences (Oksanen, 2011) considering the mean number of individual of species i (or any between aggregations. category in this case) across sites of group j divided by the sum of Values are shown as means standard deviations ( sd). the mean number of individuals of species i over all groups. Fidelity (designated as Bij, see below) is equal to the number of sites in 3. Results cluster j, where species i occurs, divided by the total number of sites in the cluster. Finally, IndVal is calculated as the product between The environmental data obtained at each sampling site are specificity and fidelity: IndVal ¼ Aij *Bij. shown in Table 1. The porewater temperature ranged from
Table 1 Mean values ( sd) of the environmental variables measured at each site.
Variable Units AM BM CM DM EM FM GM HM JM KM