Sentinel-2 and Sentinel-3 Intersensor Vegetation Estimation Via Constrained Topic Modeling
Total Page:16
File Type:pdf, Size:1020Kb
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 16, NO. 10, OCTOBER 2019 1531 Sentinel-2 and Sentinel-3 Intersensor Vegetation Estimation via Constrained Topic Modeling Ruben Fernandez-Beltran , Filiberto Pla , and Antonio Plaza , Fellow, IEEE Abstract— This letter presents a novel intersensor vegetation on global monitoring services over terrestrial and aquatic estimation framework, which aims at combining Sentinel-2 (S2) surfaces, using for this purpose high-resolution and midres- spatial resolution with Sentinel-3 (S3) spectral characteristics olution multispectral imagery [2]. More specifically, S2 [3] in order to generate fused vegetation maps. On the one is a polar-orbiting mission, which comprises two identical hand, the multispectral instrument (MSI), carried by S2, pro- vides high spatial resolution images. On the other hand, the satellites: S2A, launched on June 23, 2015, and S2B, which Ocean and Land Color Instrument (OLCI), one of the instru- followed on March 7, 2017. Each satellite incorporates a ments of S3, captures the Earth’s surface at a substantially multispectral instrument (MSI), which provides a versatile set coarser spatial resolution but using smaller spectral bandwidths, of 13 spectral bands ranging from the visible and near infrared which makes the OLCI data more convenient to highlight specific (VNIR) to the shortwave infrared (SWIR). Four of these bands spectral features and motivates the development of synergetic (B02-B04, B08) are acquired at a spatial resolution of 10 m, fusion products. In this scenario, the approach presented here takes advantage of the proposed constrained probabilistic latent six bands (B05-B07, B08A, B11, B12) at 20 m and the semantic analysis (CpLSA) model to produce intersensor veg- remaining three bands (B01, B09, B10) at 60 m. Analogously, etation estimations, which aim at synergically exploiting MSI’s S3 [4] includes a pair of satellites, called S3A and S3B, where spatial resolution and OLCI’s spectral characteristics. Initially, the first one was launched on February 16, 2016 and the sec- CpLSA is used to uncover the MSI reflectance patterns, which are ond one was successfully launched on April 25, 2018. Both able to represent the OLCI-derived vegetation. Then, the original satellites carry the Ocean and Land Color Instrument (OLCI), MSI data are projected onto this higher abstraction-level repre- sentation space in order to generate a high-resolution version of which provides 21 bands (Oa01–Oa21) spanning from 390- to the vegetation captured in the OLCI domain. Our experimental 1040-nm VNIR spectral range with bandwidths from 2.5 to comparison, conducted using four data sets, three different 40 nm. Regarding the spatial resolution of the sensor, OLCI regression algorithms, and two vegetation indices, reveals that the has global resolution requirement of 300 m. proposed framework is able to provide a competitive advantage Although S2 and S3 missions have been designed to in terms of quantitative and qualitative vegetation estimation provide global data products of vegetation, soil and water results. cover, inland waterways and coastal areas, the spectral and Index Terms— Constrained probabilistic latent semantic analy- spatial differences between MSI and OLCI sensors make sis (CpLSA), Sentinel-2, Sentinel-3, topic models, vegetation each satellite more suitable for a particular application field. estimation. On the other hand, the higher spatial resolution in S2 enables the use of its products for characterization tasks, with I. INTRODUCTION the requirement of a high level of spatial details such as HE Copernicus program is a joint initiative of the Euro- soil mapping or land use classification [5], S3 is able to Tpean Commission, the European Space Agency, and the capture imagery using smaller spectral bandwidths, which European Environment Agency in order to provide operational makes the OLCI data more convenient to highlight spe- monitoring information from space, useful for environment cific spectral responses that represent different features over and security applications. In this context, five different Sentinel the Earth’s surface. Specifically, vegetation cover can exem- Earth observation missions have been planned to guarantee this plify this point [6]. In general, vegetation indices, such operational provision [1]. Among all the program resources, as the normalized difference vegetation index (NDVI) [7] and Sentinel-2 (S2) and Sentinel-3 (S3) missions are focused the soil-adjusted vegetation index (SAVI) [8], seek to exploit the correlation between the maximum chlorophyll absorption Manuscript received November 6, 2018; revised January 10, 2019; accepted March 2, 2019. Date of publication March 26, 2019; date of current version wavelength and the Red-Edge electromagnetic spectrum. As a September 25, 2019. This work was supported by Generalitat Valenciana result, the smaller VNIR spectral bandwidth of the OLCI (APOSTD/2017/007), the Spanish Ministry of Economy (ESP2016-79503- sensor makes that fewer wavelengths are involved in the NDVI C2-2-P, TIN2015-63646-C5-5-R), Junta de Extremadura (Ref. GR18060) and the European Union under the H2020 EOXPOSURE project (No. 734541). and SAVI computations. This fact generates an enhanced (Corresponding author: Ruben Fernandez-Beltran.) response for plant surfaces, which eventually increases the R. Fernandez-Beltran and F. Pla are with the Institute of New Imaging instrument sensitivity to detect those image areas with certain Technologies, University Jaume I, 12071 Castelló de la Plana, Spain (e-mail: types of vegetation [4]. Precisely, these intersensor differ- [email protected]; [email protected]). A. Plaza is with the Hyperspectral Computing Laboratory, Department of ences motivate the development of fused vegetation prod- Technology of Computers and Communications, Escuela Politécnica, Univer- ucts to exploit MSI spatial resolution and OLCI spectral sity of Extremadura, PC-10003 Cáceres, Spain (e-mail: [email protected]). features. Color versions of one or more of the figures in this letter are available online at http://ieeexplore.ieee.org. In the literature, different kinds of regression algo- Digital Object Identifier 10.1109/LGRS.2019.2903231 rithms have been successfully applied to conduct biophysical 1545-598X © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. 1532 IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 16, NO. 10, OCTOBER 2019 expectation–maximization (EM) algorithm [14], which per- forms two stages: 1) E-step where the likelihood expected values are computed given the current estimation of the parameters and 2) M-step where the new optimal values of the parameters are calculated according to the current settings. The E-step can be computed by using Bayes’ rule and the chain Fig. 1. CpLSA model. rule as shown in (1) and (2). For the M-step, we calculate CpLSA likelihood partial derivatives, set them as equal to zero, parameter estimations within the context of Sentinel mis- and solve the equations in order to obtain (3)–(6) sions. Specifically, Verrelst et al. [9] review several state- ( |w, ) of-the-art machine learning regression algorithms for S2 and p c d p(c,w,d) p(w|c)p(c|d) S3 satellites, and Caicedo et al. [10] assess multiple linear = = (1) and nonlinear regression algorithms with a range of remotely p(w, d) p(w|c)p(c|d) sensed data. Despite the value of these and other related c works, the regression process is often conducted from a single- p(z|w, d) sensor perspective and, usually, they only consider simulated ( ,w, ) (w| ) ( | ) = p z d = p z p z d Sentinel data [11]. This letter is focused on a more general (2) p(w, d) p(w|z)p(z|d) objective, where S2 and S3 operational products are combined z to generate fused vegetation maps with MSI spatial resolution p(w|c) and OLCI spectral characteristics, that is, the objective of this letter is based on exploiting the existing synergy between n(w, d)p(c|w, d) S2 and S3 missions to generate improved vegetation estimates = d (3) of the Earth surface. On the other hand, standard regression n(w, d)p(c|w, d) algorithms are able to generate such estimations by directly w d relying on low-level reflectance values, the proposed approach (w| ) p z takes advantage of a newly proposed constrained probabilistic n(w, d)p(z|w, d) latent semantic analysis (CpLSA) topic model to uncover two d different kinds of discriminating patterns in the S2 MSI spec- = (4) (w, ) ( |w, ) tral domain: 1) constrained-topics that are able to reproduce n d p z d the vegetation detected by the S3 OLCI sensor and 2) standard- w d ( | ) topics that represent the rest of the nonvegetation components p c d in S2 MSI. In this way, image pixels are managed at a higher n(w, d)p(c|w, d) abstraction level as a dual mixture of spectral patterns and, = w hence, it is possible to infer more accurate high-resolution n(w, d)p(c|w, d) + n(w, d)p(z|w, d) vegetation maps at S2 MSI spatial resolution using only the c w z w most S3 vegetation discriminating patterns. Our experiments (5) considering four data sets and two different vegetation indices p(z|d) reveal the advantages of the proposed approach to generate intersensor vegetation estimations when compared to three n(w, d)p(z|w, d) different standard regression algorithms. = w n(w, d)p(z|w, d) + n(w, d)p(c|w, d) z w c w II. METHODOLOGY (6) A. Constrained Probabilistic Latent Semantic Analysis where n(w, d) represents the number of times the word w Based on the incremental formulation of the asymmetric appears in the document d. The EM process is performed probabilistic latent semantic analysis model [12], we define as follows. First, p(w|c), p(w|z), p(c|d),and p(z|d) are a topic model extension, called CpLSA, which is specially randomly initialized.