
Article

Copernicus Sentinel-2A Calibration and Products Validation Status

Ferran Gascon 1,*, Catherine Bouzinac 2, Olivier Thépaut 2, Mathieu Jung 3, Benjamin Francesconi 4, Jérôme Louis 5, Vincent Lonjou 6, Bruno Lafrance 2, Stéphane Massera 7, Angélique Gaudel-Vacaresse 6, Florie Languille 6, Bahjat Alhammoud 8, Françoise Viallefont 9, Bringfried Pflug 10, Jakub Bieniarz 10, Sébastien Clerc 8, Laëtitia Pessiot 2, Thierry Trémas 6, Enrico Cadau 1, Roberto De Bonis 1, Claudia Isola 1, Philippe Martimort 1 and Valérie Fernandez 1

1 ESA (European Space Agency), Paris 75015, France; [email protected] (E.C.); [email protected] (R.D.B.); [email protected] (C.I.); [email protected] (P.M.); [email protected] (V.F.)
2 CS-SI (Communication & Systèmes—Systèmes d'Information), Toulouse 31506, France; [email protected] (C.B.); [email protected] (O.T.); [email protected] (B.L.); [email protected] (L.P.)
3 Airbus Defence and Space, Toulouse 31402, France; [email protected]
4 Thales Alenia Space, Cannes La Bocca 06156, France; [email protected]
5 Telespazio, Toulouse 31023, France; [email protected]
6 CNES (Centre National d'Etudes Spatiales), Toulouse 31401, France; [email protected] (V.L.); [email protected] (A.G.-V.); fl[email protected] (F.L.); [email protected] (T.T.)
7 IGN (Institut Géographique National) Espace, Ramonville-Saint-Agne 31520, France; [email protected]
8 ARGANS, Plymouth PL6 8BX, UK; [email protected] (B.A.); [email protected] (S.C.)
9 ONERA (Office National d'Etudes et Recherches Aérospatiales), Toulouse 31055, France; [email protected]
10 DLR (Deutschen Zentrums für Luft- und Raumfahrt), Berlin 12489, Germany; bringfried.pfl[email protected] (B.P.); [email protected] (J.B.)
* Correspondence: [email protected]; Tel.: +39-06-941-88605

Academic Editor: Prasad S. Thenkabail Received: 15 September 2016; Accepted: 17 May 2017; Published: 10 June 2017

Abstract: As part of the Copernicus programme of the European Commission (EC), the European Space Agency (ESA) has developed and is currently operating the Sentinel-2 mission, which is acquiring high spatial resolution optical imagery. This article provides a description of the calibration activities and the status of the mission products validation activities after one year in orbit. Measured performances, derived from the validation activities, cover both Top-Of-Atmosphere (TOA) and Bottom-Of-Atmosphere (BOA) products. The presented results show the good quality of the mission products, both in terms of radiometry and geometry, and provide an overview of the next mission steps related to data quality aspects.

Keywords: calibration; validation; optical; instrument; processing; imagery; spatial; operational

1. Introduction

As part of the Copernicus programme of the European Commission (EC), the European Space Agency (ESA) has developed and is currently operating the Sentinel-2 mission, acquiring high spatial resolution (10 to 60 m) optical imagery [1]. The Sentinel-2 mission provides enhanced continuity to services monitoring global terrestrial surfaces and coastal waters.


The Sentinel-2 mission offers an unprecedented combination of systematic global coverage of land and coastal areas, a high revisit of five days under the same viewing conditions, high spatial resolution, and a wide field of view (295 km) for multispectral observations in 13 bands in the visible, near-infrared and short-wave infrared range of the electromagnetic spectrum. The frequent revisit of five days at the equator requires two identical Sentinel-2 satellites (the Sentinel-2A and Sentinel-2B units), each carrying a single imaging payload named MSI (Multi-Spectral Instrument). The orbit is Sun-synchronous at 786 km altitude (14 + 3/10 revolutions per day) with a 10:30 a.m. descending node. This local time was selected as the best compromise between minimizing cloud cover and ensuring suitable Sun illumination, and it is aligned with other similar satellites, e.g., Landsat. An overview of the MSI imaging payload is provided in the following section.

The Sentinel-2 satellites systematically acquire observations over land and coastal areas from −56° to 84° latitude, including islands larger than 100 km², EU islands, all other islands less than 20 km from the coastline, the whole Mediterranean Sea, all inland water bodies and all closed seas. Additional observations are made over specific calibration sites, for example DOME-C in Antarctica. The two units will work on opposite sides of the orbit. The Sentinel-2A launch took place in June 2015 and the Sentinel-2B launch is foreseen for the beginning of 2017; this paper therefore focuses only on the performances achieved by Sentinel-2A.

The availability of products with good data quality performances (both in terms of radiometric and geometric accuracy) is of paramount importance for many applications. It is a key enabling factor for easier exploitation of time series, inter-comparison of measurements from different sensors, and detection of changes in the landscape. Calibration and validation (Cal/Val) corresponds to the process of updating and validating on-board and on-ground configuration parameters and algorithms to ensure that the product data quality requirements are met.

This paper provides a description of the calibration activities and of the mission products validation activities one year after the Sentinel-2A launch. Only Sentinel-2A data were available at the time of writing. Performances derived from the validation activities have been estimated for both Top-Of-Atmosphere (TOA) and Bottom-Of-Atmosphere (BOA) products (referred to respectively as Level-1 and Level-2A, and further described in this paper).

2. Multi-Spectral Instrument Overview

This section provides a brief overview of the Sentinel-2 Multi-Spectral Instrument (MSI). It aims at giving the reader the basis required to fully understand the measured performances and the Calibration and Validation (Cal/Val) approach.

2.1. MSI Design

The MSI instrument design has been driven by the large swath requirement together with the demanding geometrical and spectral performances of the measurements. It is based on a push-broom concept, featuring a Three-Mirror Anastigmatic (TMA) telescope feeding two focal planes spectrally separated by a dichroic filter, as shown in Figure 1. In the left part of this figure, the yellow element is the Sun diffuser used for radiometric calibration. One focal plane includes the Visible and Near-Infrared (VNIR) bands and the other one the Short-Wave Infrared (SWIR) bands.


Figure 1. Multi-Spectral Instrument (MSI) internal configuration. Left: full instrument view (diffuser panel in yellow, telescope mirrors in dark blue). Right: optical path construction to the Short-Wave Infrared (SWIR)/visible to near-infrared (VNIR) (see Section 2.2) splitter and focal planes.

2.2. Spectral Bands and Resolution

The Sentinel-2 MSI is a filter-based push-broom imager. It performs measurements in 13 spectral bands spread over the VNIR and SWIR domains with spatial resolutions ranging from 10 to 60 m. These spectral channels include:

• 4 bands at 10 m spatial resolution: blue (490 nm), green (560 nm), red (665 nm) and near infrared (842 nm).
• 6 bands at 20 m spatial resolution: 4 narrow bands mainly used for vegetation characterization in the red edge (705 nm, 740 nm, 783 nm and 865 nm) and 2 wider SWIR bands (1610 nm and 2190 nm) for applications such as snow/ice/cloud detection or vegetation moisture stress assessment.
• 3 bands at 60 m spatial resolution for applications such as cloud screening and atmospheric corrections (443 nm for aerosols, 945 nm for water vapour and 1375 nm for cirrus detection).

The specified spectral band characteristics and resolutions are summarised in Figure 2.

Figure 2. MSI spectral bands vs. spatial resolution with corresponding Full Width at Half Maximum (FWHM).
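For readers who script against MSI data, the band layout summarised above can be captured in a small lookup table. The sketch below (Python) pairs each band with its nominal central wavelength and native spatial resolution; the band-number mapping follows the standard Sentinel-2 naming shown in Figure 2 and is given here only as an illustrative data structure, not as part of the mission products.

```python
# Illustrative lookup of MSI spectral bands: nominal central wavelength (nm)
# and native spatial resolution (m), as quoted in the text above.
MSI_BANDS = {
    "B01": (443, 60),   # coastal aerosol
    "B02": (490, 10),   # blue
    "B03": (560, 10),   # green
    "B04": (665, 10),   # red
    "B05": (705, 20),   # red edge 1
    "B06": (740, 20),   # red edge 2
    "B07": (783, 20),   # red edge 3
    "B08": (842, 10),   # near infrared
    "B8A": (865, 20),   # narrow near infrared
    "B09": (945, 60),   # water vapour
    "B10": (1375, 60),  # cirrus
    "B11": (1610, 20),  # SWIR 1
    "B12": (2190, 20),  # SWIR 2
}

def bands_at_resolution(resolution_m):
    """Return the band names acquired at the requested native resolution."""
    return [name for name, (_, res) in MSI_BANDS.items() if res == resolution_m]

print(bands_at_resolution(10))  # ['B02', 'B03', 'B04', 'B08']
```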



2.3. Focal Plane Layout—Pixels Line of Sight

Due to different thermal regulation constraints, the VNIR and SWIR bands are separated into two distinct focal planes of 12 detector modules each.

The 12 detector modules on each focal plane are stagger-mounted to cover altogether the 20.6° instrument field-of-view, resulting in a compound swath width of 295 km on the ground across track at the satellite reference altitude of 786 km.

As illustrated in Figure 3, due to this staggered design of the detectors on the focal planes (and due to the position of the mirrors of the spectral bands between two successive detectors), a parallax angle between the odd and even clusters of detectors is induced on the measurements, resulting in an inter-detector shift along track from 14 km up to 48 km at maximum. Furthermore, two successive detectors share an overlap area of 2 km across track.

Likewise, the hardware design of both the VNIR and SWIR detectors induces a relative displacement of each spectral channel sensor within the detector, resulting in an inter-band measurement parallax amounting to a maximum along-track displacement of about 17 km between B10 and B12.


Figure 3. Staggered Detector Configuration.

As a consequence of this layout, the landscape is acquired with different viewing angles from one detector to another. This can result in radiometric differences, due to the anisotropy of the atmosphere or surface, as a function of the viewing and Sun directions.

For each pixel p of each detector module, the Line of Sight (LOS) is given in Equation (1) by a vector Z(p) identified by the angles ψX(p) and ψY(p), expressed in a frame linked to the MSI (XLOS, YLOS, ZLOS). Figure 4 illustrates the LOS defined by Equation (1).

$$\vec{Z}(p) = \frac{1}{\sqrt{1 + \tan^2(\psi_X(p)) + \tan^2(\psi_Y(p))}} \begin{pmatrix} \tan(\psi_Y(p)) \\ -\tan(\psi_X(p)) \\ 1 \end{pmatrix} \qquad (1)$$
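As an illustration of Equation (1), the sketch below computes the unit LOS vector from the two viewing angles. The angle values used in the example are hypothetical and do not correspond to calibrated MSI viewing directions.

```python
import numpy as np

def line_of_sight(psi_x_rad, psi_y_rad):
    """Unit Line-of-Sight vector in the MSI frame (X_LOS, Y_LOS, Z_LOS), Equation (1)."""
    tx, ty = np.tan(psi_x_rad), np.tan(psi_y_rad)
    norm = np.sqrt(1.0 + tx**2 + ty**2)
    return np.array([ty, -tx, 1.0]) / norm

# Example with arbitrary small viewing angles (radians); real values come from calibration.
z = line_of_sight(np.deg2rad(0.5), np.deg2rad(-1.2))
print(z, np.linalg.norm(z))  # the vector has unit norm by construction
```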



Figure 4. Line of Sight angles definition.

This expression precisely defines the viewing angle of each pixel and hence the localization of this pixel on the ground, when considering the satellite orbital position, the sampling date, and the model of the Earth including a Digital Elevation Model. Figure 5 presents a flat ground projection of the pixel LOS for the whole focal plane, as well as the materialization of the MSI frame (XLOS, YLOS, ZLOS).


Figure 5. Representation of the pixels Line-Of-Sight (LOS) projection on ground.


2.4. Detector Specificities

The VNIR assembly (built by Astrium-E2V) is based on monolithic silicon CMOS technology [2] and consists of 10 spectral bands integrated on a single detector. The SWIR detector is based on hybrid HgCdTe-CMOS technology, where the Hg1-xCdxTe, or Mercury Cadmium Telluride (MCT), is hybridized to a silicon readout circuit (ROIC). The SWIR assembly has 3 spectral bands combined on a single detector [3].

The photosensitive MCT material and the MCT-CMOS technology make the SWIR bands more sensitive to radiation and variations of temperature [4] than the VNIR bands. This may result in a slightly larger dark signal variation or evolution for the SWIR bands compared to the VNIR ones. To achieve a high radiometric performance, the SWIR detectors are cooled down to 190 K. For each SWIR spectral band, several lines of pixels of 15 × 15 µm² have been designed, allowing an optimal pixel configuration by the selection of the best responding pixel(s) per column and per band [5].

The number of pixels in each detector module is 1296 or 2592, depending on the spectral band. The 60 m bands are acquired with a native resolution of 20 m across track and binning is applied on ground. The sampling time also depends on the band. Table 1 synthesizes these instrument characteristics.

Table 1. Detector Configuration.

Band | Resolution (m) | Integration Time (ms) | Number of Detector Lines | Number of Lines after Selection | Number of Pixels per Detector | X-Size across-track (µm) | Y-Size along-track (µm)
B01 | 60 | 9.396 | 1 | 1 | 1296 | 15 | 45
B02 | 10 | 1.566 | 1 | 1 | 2592 | 7.5 | 7.5
B03 | 10 | 1.566 | 2 | 2 | 2592 | 7.5 | 7.5
B04 | 10 | 1.566 | 2 | 2 | 2592 | 7.5 | 7.5
B05 | 20 | 3.132 | 1 | 1 | 1296 | 15 | 15
B06 | 20 | 3.132 | 1 | 1 | 1296 | 15 | 15
B07 | 20 | 3.132 | 1 | 1 | 1296 | 15 | 15
B08 | 10 | 1.566 | 1 | 1 | 2592 | 7.5 | 7.5
B8A | 20 | 3.132 | 1 | 1 | 1296 | 15 | 15
B09 | 60 | 9.396 | 1 | 1 | 1296 | 15 | 45
B10 | 60 | 9.396 | 3 | 1 | 1296 | 15 | 15
B11 | 20 | 3.132 | 4 | 2 | 1296 | 15 | 15
B12 | 20 | 3.132 | 4 | 2 | 1296 | 15 | 15

As shown in Table 1, SWIR spectral bands are acquired on several detector lines thanks to a Time Delay Integration (TDI) technology. This increases the Signal-to-Noise Ratio (SNR) on the one hand, but on the other hand some reconfiguration is needed, as explained hereafter.

Indeed, the SWIR band detectors are more sensitive to noise and to ageing. For this reason, the detector provides more TDI lines than the nominal number used, and some rearrangements are possible during the mission lifespan. Figure 6 shows the possible configurations.

SWIR pixel health is monitored all along the satellite lifetime and TDI selection rearrangement is performed when required (cf. Section 4.1.4). Any change of the TDI configuration has an impact on the pixel Line-Of-Sight (LOS) that shall also be taken into account by the ground processing system.





Figure 6. Time Delay Integration (TDI) configuration examples for pixels of SWIR bands. The top figure corresponds to the B10 configuration, the bottom figure to the B11 and B12 configuration. Each column corresponds to one pixel. The lines correspond to available TDI lines. A cross corresponds to the TDI line selected for each pixel. For B10, 3 TDI lines are available for only 1 line selected per pixel; therefore 3 configurations per pixel are possible. The actual configuration may be different from one pixel to the other. For B11 and B12, 4 TDI lines are available, and 2 consecutive lines are used for each pixel; therefore, 3 configurations per pixel are possible.
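The bookkeeping described in Figure 6 reduces to a simple validity rule on a per-pixel selection: one line out of three for B10, two consecutive lines out of four for B11 and B12. The helper below is only a sketch of that rule, with a hypothetical representation of the selection (it is not part of the operational ground segment).

```python
def valid_tdi_selection(band, selected_lines):
    """Check a per-pixel TDI line selection against the rules illustrated in Figure 6.

    band           -- "B10", "B11" or "B12"
    selected_lines -- tuple of selected TDI line indices (0-based) for one pixel
    """
    if band == "B10":
        # 3 TDI lines available, exactly 1 selected -> 3 possible configurations.
        return len(selected_lines) == 1 and selected_lines[0] in (0, 1, 2)
    if band in ("B11", "B12"):
        # 4 TDI lines available, 2 consecutive lines selected -> 3 possible configurations.
        if len(selected_lines) != 2:
            return False
        a, b = sorted(selected_lines)
        return b == a + 1 and 0 <= a <= 2
    raise ValueError("TDI only applies to the SWIR bands B10, B11 and B12")

print(valid_tdi_selection("B10", (2,)))    # True
print(valid_tdi_selection("B12", (1, 3)))  # False: the two lines are not consecutive
```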

2.5. On-Board Equalization and Compression

Images are compressed on board to reduce the volume of data to be downlinked to ground. Before compression, they are roughly equalized to minimize the signal entropy and improve the compression quality at a given rate. The equalization consists in applying a two-part piecewise linear model to each pixel response in order to compensate for the non-uniformity between detectors. The component in charge of these operations is called WICOM (Wavelet Image Compression and Memory).

A detailed description of the WICOM is complex and beyond the scope of this paper. Nevertheless, one must be aware of the compression principle as well as the compression rates in order to understand the potential impact on image quality.

WICOM is a high-performance image compression module that implements a wavelet image compression algorithm close to the JPEG2000 standard, consisting of a discrete wavelet transform (DWT). Before compression, a Non-Uniformity Correction (NUC) function is applied to the 12-bit on-board image. This function consists of a two-part linear function, reversible on ground. Of course, saturated pixels are detected before NUC computation and, for such pixels, the function is not applied.

The NUC function as well as the compression function can be bypassed, for example in the case of calibration images.

The nominal compression ratios and data rates are given in Table 2.

Table 2. Nominal Compression Ratios.

Band:           B1   B2    B3    B4  B5    B6    B7    B8  B8a   B9    B10   B11  B12
Comp. ratio:    2.4  3.33  3.19  3   3.13  3.13  2.86  3   3.13  2.14  2.65  2    2.4
Bits per pixel: 5    3.6   3.76  4   3.84  3.84  4.2   4   3.84  5.6   4.52  6    5

As shown in Table 2, the compression ratios are low enough so that the image quality is preserved with little loss.
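The on-board equalization is described above only at the level of principle: a two-part piecewise linear function per pixel, reversible on ground. The minimal sketch below illustrates that principle with a hypothetical breakpoint and per-segment gains and offsets; the actual WICOM parameters and the handling of saturated pixels are not reproduced here.

```python
import numpy as np

def nuc_forward(dn, breakpoint, g1, o1, g2, o2):
    """Two-segment piecewise linear equalization of a raw 12-bit count (illustrative only)."""
    dn = np.asarray(dn, dtype=float)
    return np.where(dn < breakpoint, g1 * dn + o1, g2 * dn + o2)

def nuc_inverse(eq, breakpoint, g1, o1, g2, o2):
    """Invert the equalization on ground; the breakpoint is mapped through the first segment."""
    eq = np.asarray(eq, dtype=float)
    eq_break = g1 * breakpoint + o1
    return np.where(eq < eq_break, (eq - o1) / g1, (eq - o2) / g2)

# Hypothetical per-pixel parameters (not flight values); o2 is chosen so that the
# two segments join exactly at the breakpoint, which keeps the function invertible.
params = dict(breakpoint=1024.0, g1=0.98, o1=3.0, g2=1.02, o2=-37.96)
raw = np.array([100.0, 1024.0, 3000.0])
print(nuc_inverse(nuc_forward(raw, **params), **params))  # recovers the raw counts
```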


2.6. On-Board Sun Diffuser

A full-field (or full-pupil) on-board diffuser is used to perform the radiometric calibration and to guarantee a high quality radiometric performance (Figure 7). The diffuser is mounted on the so-called Calibration and Shutter Mechanism (CSM). This CSM is designed to collect the sunlight after reflection by a diffuser, and to prevent a direct view of the Sun or contamination during launch and early operation phase (LEOP).



Figure 7. Illustration of the sun diffuser acquisition principle.

The advantage of such an on-board device is to provide the instrument with a very uniform and well-known signal, allowing very accurate absolute and relative radiometric calibrations.

This CSM is used periodically (typically every month) to assess the radiometric performance and to compute new calibration parameters (cf. Section 4.1.2).

3. Products Overview

3.1. Processing Levels

Sentinel-2 MSI products comprise the following levels of processing:

• Level-0 (L0) products: Raw instrument data packaged for long-term storage and future reprocessing campaigns. A coarse cloud mask is appended for cataloguing purposes.
• Level-1A (L1A) products: Uncompressed instrument data in sensor geometry and with a coarse registration (i.e., rough pixel alignment between images from different spectral bands and detector modules). No resampling nor radiometric corrections have been applied. These products are used for calibration purposes.
• Level-1B (L1B) products: Top-Of-Atmosphere (TOA) radiances in sensor geometry (same as Level-1A products). Full radiometric corrections have been applied. These products are used for calibration, validation and quality control purposes.
• Level-1C (L1C) products: TOA reflectances in cartographic geometry. These products are publicly disseminated by ESA.
• Level-2A (L2A) products: Bottom-Of-Atmosphere (BOA) reflectances in cartographic geometry (same as Level-1C products). Today, these products can be generated by users with the publicly available Sen2Cor processor. L2A production is now systematic over Europe and dissemination through the Copernicus Open Access Hub [6] started in May 2017.

3.2. Level-1 Processing Steps

The main steps of the processing chain are summarized in Figure 8.


Figure 8. MSI processing chain overview. The gray font indicates optional steps implemented to mitigate risk on instrument performance but currently not activated.

After formatting and decompression, the first processing step for Level-1B is the radiometric correction, which aims at converting instrument counts into physical units (radiances). First, the coarse on-board equalization performed to improve compression levels is inverted to recover raw measurement data. Then, the fine radiometric model using instrument calibration coefficients is applied (including dark signal removal and pixel response non-uniformity correction). A linear correction is applied on SWIR bands to remove the effects of the electronic cross-talk. Blind pixels at the edges of the detector are removed, and defective pixels are interpolated. SWIR pixels are re-aligned consistently with the on-board pixel selection (see Section 2.4). The processing chain implements a contingency deconvolution step to improve image quality. This option has not been activated for Sentinel-2A, as the image quality is compliant with pre-launch expectations. Finally, some radiometric quality masks are generated (for saturated and defective pixels). Binning from 20 m to 60 m resolution is applied on the corresponding spectral bands to improve the signal to noise ratio.

After application of the radiometric correction, geometric refining is applied. This algorithm aims at improving the geolocation performance and especially the multi-temporal geometric stability. It will be activated after the completion of the Global Reference Image (GRI), currently planned for end 2017. The processing involves a smoothing of the instrument viewing model (attitude and position) using the set of accurately localised ground control points of the GRI for the reference band.

Level-1C processing starts with the reprojection of the images in the cartographic reference frame. The processing is performed for each L1C tile intersecting the instrument swath. Each pixel is projected from the cartographic grid onto the instrument frame, using the viewing model appended to the Level-1B product and the reference Digital Elevation Model (DEM) at 90 m spatial sampling. The pixel value for each spectral band is interpolated using B-splines.

Conversion from radiances to reflectances relies on the solar irradiance model of [7]. Finally, quality masks (derived from the L1B masks) as well as opaque and cirrus cloud masks are generated. The opaque cloud mask is based on radiometric thresholds on B01 and B12, while cirrus clouds are detected with band B10. Viewing and solar angles, as well as meteorological data, are computed on a coarse grid for each L1C tile and appended as metadata or auxiliary data.

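For illustration, the radiance-to-reflectance step mentioned above can be sketched with the usual TOA reflectance convention (solar irradiance from the model of [7], Sun zenith angle and Earth-Sun distance). The snippet below is a generic sketch of that conversion, with invented example numbers; it is not the operational Level-1C code.

```python
import numpy as np

def toa_reflectance(radiance, esun, sun_zenith_deg, d_sun_au=1.0):
    """Convert TOA radiance to TOA reflectance with the usual convention.

    radiance       -- measured TOA radiance for one band (e.g., W m-2 sr-1 um-1)
    esun           -- mean solar irradiance for that band, same units
    sun_zenith_deg -- Sun zenith angle in degrees
    d_sun_au       -- Earth-Sun distance in astronomical units
    """
    mu_s = np.cos(np.deg2rad(sun_zenith_deg))
    return np.pi * np.asarray(radiance) * d_sun_au**2 / (esun * mu_s)

# Illustrative numbers only (not actual Sentinel-2 calibration values).
print(toa_reflectance(radiance=120.0, esun=1512.0, sun_zenith_deg=35.0, d_sun_au=1.01))
```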


3.3. Level-2A Processing Steps

The main processing steps of the Sen2Cor processor are presented in Figure 9. Level-2A processing is applied to granules of Top-Of-Atmosphere (TOA) Level-1C orthorectified reflectance products. The processing starts with the Cloud Detection and Scene Classification, followed by the retrieval of the Aerosol Optical Thickness (AOT) and the Water Vapour (WV) content from the Level-1C image. The final step is the TOA to Bottom-Of-Atmosphere (BOA) conversion. Sen2Cor also includes several options that can be activated, such as cirrus correction, terrain correction, adjacency correction and empirical Bidirectional Reflectance Distribution Function (BRDF) corrections.

Figure 9. Level-2A processing chain overview.

Sen2Cor relies on two main auxiliary data sets: the Radiative Transfer Look-Up Tables (LUTs) and the Digital Elevation Model (DEM). The latter is not embedded in the Sen2Cor software, but the default DEM (Shuttle Radar Topography Mission Digital Elevation Model at 90 m spatial resolution) is downloaded automatically if requested. Otherwise, the user can add any other DEM in DTED format. Level-2A outputs are the surface (or BOA) reflectance images, which are provided at different spatial resolutions (60 m, 20 m and 10 m), together with AOT and WV maps. The internal Scene Classification (SCL) image used for atmospheric correction is also provided, together with Quality Indicators for cloud and snow probabilities.

3.4. Level-1C Product Description

L1C products provide TOA normalized reflectances for each spectral band, coded as integers on 15 bits. The physical values range from 1 (minimum reflectance of 10⁻⁴) to 10000 (reflectance of 1), but values higher than 1 can be observed in some cases due to specific angular reflectivity effects. The value 0 is reserved for "No Data". The computation of the reflectance relies on a solar irradiance model and the cosine of the Sun zenith angle with respect to the reference Earth ellipsoid. Both data are available in the product metadata.

The reflectance values are provided in Universal Transverse Mercator (UTM) projection. The Sentinel-2 Level-1C granules, also called "tiles", are based on the Military Grid Reference System (MGRS) tiles with a 5 km extension on all sides to ensure an overlap with neighbouring tiles. The 110 × 110 km² footprint of each tile can be obtained in a Keyhole Mark-up Language (KML) file from the Sentinel-2 internet site [8]. The projection is computed using a global Digital Elevation Model (DEM) with approximately 90 m horizontal resolution and 10 m of elevation uncertainty.
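Following the quantification described above (a reflectance of 1 coded as 10000, with 0 reserved for "No Data"), Level-1C digital numbers can be turned back into reflectance as sketched below. The use of NaN to flag the no-data value is a choice of this example, not part of the product specification.

```python
import numpy as np

L1C_QUANTIFICATION = 10000.0  # DN value corresponding to a reflectance of 1
L1C_NO_DATA = 0               # DN value reserved for "No Data"

def decode_l1c(dn):
    """Convert Level-1C digital numbers to TOA reflectance, masking the no-data value."""
    dn = np.asarray(dn, dtype=float)
    reflectance = dn / L1C_QUANTIFICATION
    reflectance[dn == L1C_NO_DATA] = np.nan
    return reflectance

print(decode_l1c([0, 1, 2500, 10000, 10500]))  # [nan, 1e-4, 0.25, 1.0, 1.05]
```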


Reflectance images are provided with a spatial resolution of 10, 20 or 60 m depending on the spectral band. Sixty meter images are obtained by spatial binning of measurements sensed at 20 m resolution, as recalled in Section 2.4.

Sentinel-2 L1C products take the form of folders gathering geolocalised JPEG2000 images for each spectral band, metadata in XML format and auxiliary data. Images are gathered in folders for each granule, together with quality masks in Geographic Mark-up Language (GML) format and tile metadata. Granule masks include an opaque cloud and cirrus mask, and detector footprint masks for each spectral band. Viewing angles (zenith/azimuth with respect to the North direction) for each spectral band, as well as Sun illumination angles, are provided on a 5 km grid.

Metadata are provided for the data strip (elementary processing unit) and the user product (dissemination unit). Data strip metadata provide information on the processing configuration and parameters, and reports from automatic quality checks. The user product metadata summarize information about the instrument spectral and radiometric response and the processing parameters. Finally, Sentinel-2 products include auxiliary data such as meteorological data (systematic distribution) or processing parameter files (specifically added for expertise).

3.5. Level-2A Products Description

Level-2A surface reflectance products are generated with the Sen2Cor processor, whose main purpose is to correct Sentinel-2 Level-1C products for the effects of the atmosphere. Level-2A processing is applied independently to single granules of TOA Level-1C orthorectified reflectance products. Note that this independent processing of neighbouring granules can lead to visible steps between adjacent granules. The Sen2Cor processor is available as a third-party plugin of the Sentinel-2 Toolbox [9]. L2A products provide:

• Surface (or BOA) reflectance images which are provided at different spatial resolutions (60 m, 20 m and 10 m); • Aerosol Optical Thickness (AOT) and Water Vapour (WV) maps (60 m, 20 m and 10 m); • Scene Classification (SCL) map used internally as input for atmospheric correction together with Quality Indicators for cloud and snow probabilities (60 m and 20 m).

The structure of the Level-2A product [10] is strictly based on the structure of the Level-1C product. The same tiling geometry and projection are used. The main difference is that the IMG_DATA folder contains three directories: one for each resolution at 60 m, 20 m, and 10 m. The Scene Classification map is available at 20 m or 60 m resolution and the cloud and snow probabilities are located in the QI_DATA folder.

The Scene Classification module provides a Scene Classification map divided into 11 classes, presented in Section 5. This map does not constitute a land cover classification map in a strict sense; its main purpose is to be used internally in Sen2Cor, in the atmospheric correction module, to distinguish between cloudy pixels, clear pixels and water pixels. Two quality indicators are also provided: a Cloud confidence map and a Snow confidence map, with values ranging from 0 to 100%.

The Water Vapour content is retrieved from the Level-1C image using a Sentinel-2 adapted APDA (Atmospheric Pre-corrected Differential Absorption) algorithm [11], which uses a ratio between band B8A and band B09. The quantification value to convert Digital Numbers to Water Vapour column in cm is equal to 1000.

The Aerosol Optical Thickness (AOT) at 550 nm is estimated using the Dark Dense Vegetation (DDV) pixel method introduced by [12]. AOT estimation fails when there are no DDV pixels in the image. The fallback solution in that case is to perform the atmospheric correction with a constant AOT, which is specified by the start visibility (VIS) set in the configuration file. The default value of the start visibility is 40 km, which corresponds to an AOT at 550 nm of 0.2 at sea level. The quantification value to convert Digital Numbers to AOT is equal to 1000.

The actual Sen2Cor retrieval method is described in [13]. The method was developed for application over land surfaces. It includes several options that can be activated, such as cirrus correction, terrain correction, adjacency correction and empirical Bidirectional Reflectance Distribution Function (BRDF) corrections. Sen2Cor can also be applied over water surfaces, using the AOT estimated over land pixels in the image. However, the processor does not take into account water surface effects such as sun glint.

The surface reflectance values are coded in JPEG2000 with the same quantification value of 10,000 as for Level-1C products, i.e., a factor of 1/10,000 needs to be applied to Level-2A digital numbers (DN) to retrieve physical surface reflectance values.

The R10m directory of the IMG_DATA folder contains the surface reflectance images at 10 m spatial resolution for the following bands: B02, B03, B04 and B08. In the R20m directory, the 20 m resampled bands B02, B03 and B04 are provided together with the following surface reflectance bands: the "red edge" bands B05, B06 and B07, the NIR band B8A and the two SWIR bands B11 and B12. Note that band B08 is not provided resampled at 20 m, because band B8A, at 20 m native resolution, provides a narrower spectral width more suitable for vegetation monitoring applications. In the R60m directory, band B01 and band B09 are provided at their 60 m native resolution together with all other bands, with the exception of bands B10 and B08. Band B10 is not provided in surface reflectance as it does not provide information on the surface, and band B08 is replaced by band B8A, which provides a narrower spectral width.
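The Level-2A quantification values quoted above (10,000 for surface reflectance, 1000 for the AOT and Water Vapour maps) lead to the same kind of decoding step as for Level-1C; the following sketch simply applies those factors.

```python
import numpy as np

# Quantification values quoted in the text for Level-2A rasters.
SCALE = {
    "reflectance": 10000.0,  # BOA reflectance bands
    "aot": 1000.0,           # Aerosol Optical Thickness at 550 nm
    "wv": 1000.0,            # Water Vapour column in cm
}

def decode_l2a(dn, kind):
    """Convert Level-2A digital numbers into physical values for the given raster kind."""
    return np.asarray(dn, dtype=float) / SCALE[kind]

print(decode_l2a([200, 2000], "aot"))          # AOT of 0.2 and 2.0
print(decode_l2a([1500], "wv"))                # 1.5 cm of precipitable water
print(decode_l2a([500, 10000], "reflectance")) # reflectances of 0.05 and 1.0
```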

3.6. Processing Baseline Evolutions

The public archive of Sentinel-2 products was opened with processing baseline 02.00. The processing baseline evolves to take into account corrections of errors in the processing chain and the introduction of new product features or processing steps (e.g., introduction of the geometric refinement). On the other hand, an evolution of the calibration coefficients does not lead to a change of version. The list of auxiliary files used for processing can be traced in the data-strip metadata of the products. The evolutions of the processing baseline are tracked and justified in the Data Quality Report published on a monthly basis [14]. This document is available from the Sentinel-2 on-line technical guide.

4. Level-1 Calibration and Validation Status

Calibration and Validation (Cal/Val) corresponds to the process of updating and validating on-board and on-ground configuration parameters and algorithms to ensure that the product data quality requirements are met. These activities are performed within the frame of the Mission Performance Centre (MPC). These operations are achieved through product processing and analyses at different processing levels. This section provides an overview of the Cal/Val activities that have been performed on the Sentinel-2A unit.

4.1. Radiometry Calibration Activities

The radiometric calibration activities allow the determination of the Ground Image Processing Parameters (GIPPs) of the radiometric calibration model, which aims at converting the electrical signal measured by the instrument, transformed into digital counts, into the physical incoming radiance illuminating the sensor.

The nominal calibrations are based on the exploitation of the on-board sun diffuser images (relative gains calibration, absolute radiometric calibration) or images acquired over the ocean at night (for dark signal calibration). Some other calibrations are foreseen only in case of contingency (crosstalk, refocusing).

At the moment, the radiometric calibration is very close to what was expected before launch, and the observed discrepancies are weak and under control. In particular, the SWIR focal plane is subject to a predictable electronic crosstalk which is corrected by ground post-processing.

4.1.1. Dark Signal Calibration

The dark signal is the signal delivered by the sensor when it is not illuminated by any light. Its measurement is realized by processing images acquired at night over the ocean, to avoid light coming from human activities. Such acquisitions were performed several times per week up to mid-May 2016 (during the commissioning phase and the beginning of the operational phase). Currently, dark signal acquisitions are performed at least once per 10-day orbit cycle.

Starting from Level-0 products, the images are uncompressed and reversed from the on-board processing to retrieve the native values of the measurements. Dark acquisitions last 40 s. The average of the digital counts over the acquisition lines provides the dark signal coefficients for each pixel of each detector for each band. The standard deviation of the dark signal over the column gives an estimate of the dark signal noise, expressed in digital counts (DC) or Least Significant Bits (LSB).

The MPC monitors the time evolution of the dark signal. No significant dark signal variation has been noticed since the sensor has been in orbit. The dark current is particularly stable for VNIR bands: the variation of the measured dark coefficients between two acquisitions is always smaller than 1 digit (for extreme variations) and the dark signal noise is about 0.5 DC. Table 3 shows the statistics of the differences between dark images obtained on 9 May and 8 June 2016. For SWIR bands, the variations of the dark signal can be larger, mainly for the B12 band, for which the maximal variations can reach about 5 digital counts for a few pixels, and the pixel relative variation is noticeable (the standard deviation of variations over pixels is usually about ten times larger for B12 than for VNIR bands). The dark noise for SWIR bands is also slightly larger than for VNIR bands, with mean values of about 0.7 to 1 digital count. SWIR detectors are known to be more sensitive to variations than VNIR ones, due to a difference in detector technology (see Section 2.4).
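The per-pixel dark coefficients and dark noise described above boil down to a temporal mean and standard deviation over the lines of a night acquisition. A minimal sketch, assuming the decompressed counts are already available as a (lines × pixels) array, is given below; it is not the operational MPC code.

```python
import numpy as np

def dark_signal_stats(night_counts):
    """Per-pixel dark signal level and noise from a night-time acquisition.

    night_counts -- 2-D array of raw digital counts, shape (acquisition_lines, pixels),
                    already decompressed and reversed from the on-board processing.
    Returns (dark_level, dark_noise), both 1-D arrays of length `pixels`, in LSB.
    """
    dark_level = night_counts.mean(axis=0)         # average over the acquisition lines
    dark_noise = night_counts.std(axis=0, ddof=1)  # standard deviation over the lines
    return dark_level, dark_noise

# Synthetic example: a stack of lines for a handful of pixels with ~0.5 LSB of noise.
rng = np.random.default_rng(0)
counts = 60.0 + 0.5 * rng.standard_normal((25000, 6))
level, noise = dark_signal_stats(counts)
print(level.round(2), noise.round(2))
```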

Table 3. On the left, statistics on the dark signal variations between the dark estimate on 8 June 2016 and the operational dark coefficients deduced from an acquisition on 9 May 2016 (i.e., difference of the mean dark signal measured for the two acquisitions). On the right, statistics on the dark noise for the estimate on 8 June 2016 (standard deviation of the dark signal measured per pixel). Measurements are expressed in digital counts (LSB).

Band | Dark signal variation relative to the operational GIPPs of 9 May 2016 (Min / Mean / Max / Std, LSB) | Dark noise on 8 June 2016 (Min / Mean / Max / Std, RMS in LSB)
B01 | −0.1 / 0.0 / 0.1 / 0.02 | 0.33 / 0.51 / 0.78 / 0.05
B02 | −0.1 / 0.0 / 0.1 / 0.02 | 0.37 / 0.51 / 2.41 / 0.06
B03 | −0.1 / 0.0 / 0.2 / 0.02 | 0.21 / 0.42 / 0.75 / 0.08
B04 | −0.1 / 0.0 / 0.3 / 0.02 | 0.22 / 0.43 / 0.66 / 0.08
B05 | −0.1 / 0.0 / 0.2 / 0.02 | 0.37 / 0.53 / 0.75 / 0.05
B06 | −0.2 / 0.0 / 0.2 / 0.02 | 0.36 / 0.52 / 0.77 / 0.05
B07 | −0.1 / 0.0 / 0.2 / 0.03 | 0.35 / 0.53 / 0.83 / 0.05
B08 | −0.1 / 0.0 / 0.3 / 0.02 | 0.36 / 0.52 / 0.86 / 0.05
B8A | −0.1 / 0.0 / 0.6 / 0.03 | 0.29 / 0.53 / 0.77 / 0.05
B09 | −0.2 / 0.0 / 0.7 / 0.03 | 0.29 / 0.50 / 0.75 / 0.06
B10 | −1.0 / 0.0 / 1.1 / 0.12 | 0.91 / 1.06 / 1.37 / 0.03
B11 | −0.9 / 0.0 / 0.5 / 0.08 | 0.54 / 0.67 / 0.82 / 0.03
B12 | −5.0 / 0.0 / 5.5 / 0.34 | 0.55 / 0.73 / 2.02 / 0.05

Figures 10 and 11, respectively for the B01 and B12 bands, give a global overview of the dark signal versus the pixel number, its variation between the two dates, and its noise (the successive detectors are agglomerated for these illustrations). They illustrate both the very good time stability of the dark signal and the weak inter-pixel variations for VNIR bands, while the inter-pixel variations appear larger for SWIR bands (mainly for B12, as plotted in Figure 11).

Figure 10. Variation of the dark signal measured on 8 June 2016, for the B01 band, with respect to the coefficients calculated from the dark acquisition on 9 May 2016. The first plot shows the dark signal level (in digital counts, LSB) as a function of the pixel number, the second plot shows its variation between the two dates, and the third plot shows its noise (in LSB).

Figure 11. Same as Figure 10 but for the B12 band.



4.1.2. Absolute Radiometric Calibration

The Sun-diffuser calibration aims to assess the absolute calibration coefficients and the relative gain coefficients (or equalisation coefficients). The method relies on the comparison of the Sun-diffuser acquisition measurement to a simulation of the reflected radiance.

The simulation takes into account the solar irradiance corresponding to each acquisition line ℓ, i.e., the mean solar irradiance (Esun) corrected for the Earth to Sun distance (dsun) and for the incidence angle (θsun) of the solar beam incoming on the Sun-diffuser for each line (as there is a slight time variation depending on the acquisition line). The model of the reflection of the solar light on the Sun-diffuser considers the non-uniformity and bidirectional reflectivity of the diffuser from a pre-launch characterisation of its BRDF. This characterisation gives, for each pixel, the diffuser reflectance as a function of the incident direction of the solar light (solar zenith angle θsun and solar azimuth angle φsun), accounting for a constant viewing direction for a given pixel p. This leads to a simulation of the radiance illuminating the MSI sensor which is given, for each band b and each detector d, by:

$$Z_{simu}(b, d, \ell, p) = \frac{E_{sun}(b) \times \cos\theta_{sun}(\ell)}{d_{sun}^{2}} \times BRDF_{dif}\big(b, d, \theta_{sun}(\ell), \varphi_{sun}(\ell), p\big) \quad (2)$$

The solar irradiance used as reference for Sentinel-2 comes from [7]. This is the CEOS recommended solar irradiance spectrum for use in Earth Observation applications. The Earth to Sun distance is calculated by the Orekit flight dynamics library [15].

The Level-0 Sun-diffuser acquisition is corrected for the dark signal, then equalised with the current calibration parameters (absolute and relative gain coefficients) to provide the Level-1B equalised measurement, Zmeas(b, d, ℓ, p), by line and by pixel for each detector and band. For each band b, the absolute calibration coefficient, A(b), is then calculated as the average value of the ratio between measurement and simulation, for each pixel of each detector, over the acquisition lines:

$$A(b) = \frac{1}{N_d \, N_\ell \, N_p} \times \sum_{d,\ell,p} \frac{Z_{meas}(b, d, \ell, p)}{Z_{simu}(b, d, \ell, p)} \quad (3)$$

where Nd is the total number of detectors (12), Np is the number of pixels per detector, and Nℓ is the number of acquisition lines selected for the processing (5100 lines for a 10 m-resolution band). The non-valid pixels are not taken into account in the calculation of the average value. SWIR bands are corrected beforehand for the cross-talk effect. A stray-light correction is also performed, by applying a correction factor on the simulations (1.007 whatever the band) taking into account a uniform illumination of the sensor. Note that, for a given spectral band, the absolute calibration coefficient is related to the mean sensitivity of the radiometric response over all the detectors. The inter-detector variation of the sensor sensitivity is described by the relative gain coefficients.

The monitoring of the absolute calibration coefficients is an important output of the calibration activity, as their time variation directly impacts the global level of the measured radiance. The evolution of the absolute calibration coefficients, as included in the processing baseline 02.04, is illustrated in Figure 12 for VNIR and SWIR bands.

The variation of the absolute calibration coefficients for VNIR bands is below 0.85% from 6 July 2015 (reference date) until July 2016. Except for the B01 band, a decrease of the sensitivity is visible from the first Sun-diffuser acquisition to March 2016, by about −0.5% to −0.8% depending on the VNIR spectral band. Part of this decrease is due to the calculation of the Earth to Sun distance, which was slightly improved by using Orekit in April 2016. Since April 2016, the absolute calibration coefficients have remained stable to within about 0.1% for VNIR bands.
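As a minimal numerical sketch of Equations (2) and (3) — not the operational IPF code — the simulation and the absolute-coefficient estimate could be written as follows; the array shapes, the BRDF callable and the valid-pixel mask are hypothetical placeholders:

```python
import numpy as np

def simulate_diffuser_radiance(e_sun, d_sun, theta_sun, phi_sun, brdf_dif):
    """Equation (2): simulated radiance Z_simu for one band and one detector.

    e_sun     : mean solar irradiance E_sun(b) for the band (scalar)
    d_sun     : Earth-Sun distance in AU (scalar)
    theta_sun : solar zenith angle per acquisition line, radians, shape [n_lines]
    phi_sun   : solar azimuth angle per acquisition line, radians, shape [n_lines]
    brdf_dif  : callable (theta, phi) -> diffuser BRDF per pixel, shape [n_lines, n_pixels],
                standing in for the pre-launch BRDF characterisation
    """
    brdf = brdf_dif(theta_sun, phi_sun)                        # [n_lines, n_pixels]
    return (e_sun * np.cos(theta_sun)[:, None] / d_sun**2) * brdf

def absolute_coefficient(z_meas, z_simu, valid):
    """Equation (3): A(b) as the mean of Z_meas / Z_simu over detectors, lines and valid pixels.

    z_meas, z_simu : equalised measurement and simulation, shape [n_det, n_lines, n_pixels]
    valid          : boolean mask of valid pixels, same shape (non-valid pixels are excluded)
    """
    ratio = z_meas[valid] / z_simu[valid]
    return ratio.mean()
```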








Figure 12. Time variation of the absolute calibration coefficients (for the processing baseline version 02.04), normalized to the coefficients estimated from the first Sun-diffuser acquisition performed on 6 July 2015.

The variation of the absolute calibration coefficients for SWIR bands is larger, mainly for the B10 and B11 bands. As mentioned in Section 2, these bands are respectively centred on wavelengths 1375 nm (30 nm width) and 1610 nm (90 nm width). The B10 band was designed to cover a strong water vapour absorption band and is especially sensitive to this gas. During the sensor manufacture, a tiny amount of water vapour was trapped in the optical system. A slight condensation effect occurs and impacts the sensitivity of this band (as well as B11 and B12, even if they are less sensitive to this absorption). This contamination effect was expected before launch and can be reversed by a decontamination operation consisting in increasing the temperature of the sensor during a short time: while its nominal temperature is around 200 K, the SWIR focal plane is heated to around 300 K during 90 min (the MSI is unavailable for nominal acquisitions during around 15.5 h, i.e., 3 h for heating up, 1.5 h with regulated decontamination temperature and 11 h to recover the operational temperature). The resulting effect is clearly noticeable in Figure 12. The trend of decrease of the absolute calibration for the B10 band is about −2.5% per six months (about −2% and −1% respectively for the B11 and B12 bands). This is completely compensated after decontamination (within the range of uncertainty due to the calculation of the Earth to Sun distance before the use of the more accurate computation by Orekit).

4.1.3. Relative Gains Calibration

Taking into account the dark signal variations, the measurements Ymeas(b, d, ℓ, p) in digital counts are equalised by pre-defined functions γ(b, d, p, Y), which were estimated pixel by pixel in pre-launch characterisations and are updated in orbit from Sun-diffuser acquisitions. These functions deal both with the non-linearity of the sensor response and with the inter-pixel differences of response. They are defined by a cubic model for VNIR bands and by a bi-linear model for SWIR bands; a piece-wise linear function has been shown to provide a better fit to the non-linearity measurements of the SWIR detectors than a bi-linear model. The mathematical approach is detailed below for the cubic model; a similar approach is applied for the bi-linear model.



The relation between the equalised measurements Zmeas(b, d, ℓ, p) and the measurements corrected for the dark signal, Ymeas(b, d, ℓ, p), in digital counts, for VNIR bands is:

$$Z_{meas}(b, d, \ell, p) = \gamma\big(b, d, p, Y_{meas}(b, d, \ell, p)\big) = \sum_{n=1}^{3} G_n(b, d, p) \times \left[Y_{meas}(b, d, \ell, p)\right]^{n} \quad (4)$$

In orbit, the relative gains can be adjusted only for the solar radiance level. Again, the method is based on the simulation of the solar radiance illuminating the sensor, as defined by Equation (2). The average value is considered over lines, giving a simulated radiance by pixel:

$$\langle Z_{simu} \rangle(b, d, p) = \frac{1}{N_\ell} \times \sum_{\ell=1}^{N_\ell} Z_{simu}(b, d, \ell, p) \quad (5)$$

where Nℓ is the number of selected lines for the processing (5100 lines for a 10 m-resolution band). The relation between the simulated radiance by pixel, ⟨Zsimu⟩(b, d, p), and the corresponding simulation of the measurement corrected for the dark signal, ⟨Ysimu⟩(b, d, p), is given by:

$$\gamma\big(b, d, p, \langle Y_{simu} \rangle(b, d, p)\big) - A(b) \times \langle Z_{simu} \rangle(b, d, p) = 0 \quad (6)$$

The first positive real root of the polynomial function gives ⟨Ysimu⟩(b, d, p). This value is then compared to the measurement (also averaged over lines):

$$R_a(b, d, p) = \frac{\langle Y_{simu} \rangle(b, d, p)}{\langle Y_{meas} \rangle(b, d, p)} \quad (7)$$

One can deduce the update of the coefficients Gn in Equation (4), for the equalisation functions γ(b, d, p, Y) of VNIR bands, by:

$$G_n^{updated}(b, d, p) = G_n^{current}(b, d, p) \times \left[R_a(b, d, p)\right]^{n} \quad (8)$$

The temporal evolution of the equalisation coefficients is monitored by the MPC. During the operational phase of Sentinel-2A, the sun-diffuser acquisitions are performed on a monthly basis. After each sun-diffuser acquisition, a new set of relative gains calibration parameters is generated. Table 4 shows typical results of the Ra estimate. The time variation of the relative gains is weak for VNIR bands: typically, maximal variations do not exceed 0.2 to 0.4% between two assessments. Variations are higher for SWIR bands, with a change of gain coefficients up to 3% for some pixels and with an inter-pixel variation more pronounced for SWIR bands than for VNIR ones. For instance, Figure 13 illustrates the change of equalisation coefficients for the B10 band as a function of the pixel number. The maximum variations appear on the figure. They induce stripes along track on the images, as explained in Section 4.3.1. The update of the equalisation coefficients eliminates these artefacts.
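To make Equations (5)–(8) concrete, the sketch below updates the cubic equalisation coefficients of one pixel; it is an illustrative reading of the equations, not the MPC implementation, and the input values are hypothetical:

```python
import numpy as np

def update_cubic_gains(g_current, a_b, z_simu_mean, y_meas_mean):
    """Update G_1..G_3 for one pixel (Equations (6)-(8)).

    g_current   : current coefficients [G1, G2, G3] of Equation (4)
    a_b         : absolute calibration coefficient A(b)
    z_simu_mean : <Z_simu>(b, d, p), line-averaged simulated radiance (Equation (5))
    y_meas_mean : <Y_meas>(b, d, p), line-averaged dark-corrected measurement
    """
    g1, g2, g3 = g_current
    # Equation (6): G3*Y^3 + G2*Y^2 + G1*Y - A(b)*<Z_simu> = 0
    roots = np.roots([g3, g2, g1, -a_b * z_simu_mean])
    # keep the first (typically unique) positive real root -> <Y_simu>
    real_pos = [r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 0]
    y_simu_mean = min(real_pos)
    # Equation (7): ratio between simulated and measured counts
    ra = y_simu_mean / y_meas_mean
    # Equation (8): G_n_updated = G_n_current * Ra^n
    return [g * ra**n for n, g in enumerate(g_current, start=1)], ra
```

Over the calibrated dynamic range the positive real root is normally unique, so the root selection above is not ambiguous; the same logic would be applied pixel by pixel for each detector and band.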

Table 4. Statistics on the Ra factor for relative gains variation between the sun-diffuser acquisition on 8 June 2016 and the previous update of GIPPs from the acquisition on 9 May 2016, without cross-talk correction. Minimal, mean and maximal values of Ra and its standard deviation over all pixels are given for each spectral band.

2016-06-08   Min     Mean    Max     STD
B01          0.999   1.000   1.001   0.0004
B02          0.999   1.000   1.001   0.0003
B03          0.999   1.000   1.001   0.0004
B04          0.999   1.000   1.001   0.0004
B05          0.999   1.000   1.002   0.0006
B06          0.999   1.000   1.001   0.0004
B07          0.999   1.000   1.001   0.0005
B08          0.999   1.000   1.002   0.0005
B8A          0.999   1.000   1.002   0.0005
B09          0.999   1.000   1.002   0.0005
B10          0.989   1.000   1.028   0.0014
B11          0.992   1.000   1.027   0.0012
B12          0.994   1.000   1.006   0.0007

Figure 13. Ra factor as a function of the pixel number (over the 12 detectors) for the update of gain functions measured on 8 June 2016, for the B10 band, with respect to the previous operational coefficients calculated from the sun-diffuser acquisition on 9 May 2016.

4.1.4. SWIR Detectors Re-Arrangement Parameters Generation

As presented in Section 2.4, the TDI configuration of individual SWIR pixels is likely to be modified, depending on the radiometric performance of individual pixels. The initial configuration mode has been determined before launch during the calibration of the instrument, with the aim of achieving no defective pixel and the highest SNR. During the flight, the performance may degrade and a new configuration may be decided and uploaded to the satellite.

The relative sensitivity of each pixel varies with the detector physical lines and pixels. There is no dependence on the line number of the detector matrix. It is correct to assume that all detectors are virtually single-line detectors (only one value per pixel will remain) as long as the calibration coefficients are recomputed every time the TDI configuration or the line selection changes.

Before the launch, each pixel of each TDI line has been tested on ground and assigned to one of the following categories:


• 1: valid pixel with SNR compliant to specification, • 2: valid pixel with SNR below specification, • 3: non-valid saturated pixel, • 4: non-valid blind pixel, • 5: non-valid with SNR below specification.

Then, a rearrangement rule has been elaborated to classify the configurations relying on these indicators; the rule aims at maximising the SNR. When a rearrangement is required, the decision is taken with respect to a radiometric performance threshold and the best possible remaining configuration is adopted. A memory of past configurations and associated performances obviously needs to be maintained.

As the reselection is done on board the satellite, the impact on the mission is important. The new selection has to be uploaded and immediately followed by a sun-diffuser calibration as well as a dark signal acquisition, in order to properly calibrate the new equalisation model of the pixel and validate the radiometric performance increase. Moreover, the relevant GIPPs also have to be updated, particularly to take into account the new LOS of the pixel. If none of the TDI configurations is satisfying, the pixel might be declared invalid and interpolated by ground processing.
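One possible reading of this selection rule, sketched in Python, is shown below; the category codes follow the list above, while the SNR values, threshold and data layout are purely hypothetical and do not reproduce the operational rule set:

```python
def select_tdi_line(categories, snrs, snr_threshold):
    """Pick the TDI line to keep for one SWIR pixel (illustrative sketch).

    categories    : per-TDI-line category code (1-5) from the on-ground screening
    snrs          : per-TDI-line estimated SNR
    snr_threshold : radiometric performance threshold triggering a rearrangement
    Returns the index of the selected line, or None if the pixel should be
    declared invalid and interpolated by ground processing.
    """
    # only categories 1 and 2 correspond to valid pixels
    candidates = [i for i, c in enumerate(categories) if c in (1, 2)]
    if not candidates:
        return None
    # among valid lines, prefer the highest SNR (the rule aims at maximising SNR)
    best = max(candidates, key=lambda i: snrs[i])
    return best if snrs[best] >= snr_threshold else None
```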

4.1.5. Crosstalk Correction Calibration

An electrical crosstalk phenomenon has been characterised on ground in the SWIR bands. As shown on Figure 14, when a band is illuminated (only a part of B12 on the figure), the two other bands are affected by a negative signal perfectly proportional to the signal in the illuminated band. The impact on a real landscape is a decrease of the actual signal in the affected bands.

Figure 14. Characterisation of the electrical crosstalk: example of test result from ground measurement CROSSTALK_Det5_right_B12 at Focal Plane Assembly (FPA) level (digital counts vs. pixel number; panels B10, B11 and B12).

Electrical crosstalk affects the 3 SWIR bands and does not evolve over time. This crosstalk slightly varies from one detector to the other. Averaged attenuations in dB are given in Table 5.

Table 5. Averaged SWIR electrical crosstalk.

                         Measurement on Band (dB)
Illuminated Band         B10     B11     B12
B10                      0       −83     −60
B11                      −47     0       −76
B12                      −54     −53     0

The figures from Table 5, expressed in percentage, represent −0.45% for crosstalk from B11 to B10 and −0.2% from B12 to B10, whatever the radiance level. Note again that the figures are negative, which means that the crosstalk signal is subtracted from the original signal.
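For reference, these percentages are consistent with reading the Table 5 attenuations as amplitude ratios, i.e., converting dB to a signal fraction with a factor of 20 (this conversion convention is inferred from the quoted numbers, not stated explicitly in the text):

```latex
% crosstalk fraction corresponding to an attenuation given in dB (amplitude convention)
\mathrm{fraction} = 10^{\mathrm{dB}/20},\qquad
10^{-47/20} \approx 0.45\% \ (\text{B11}\rightarrow\text{B10}),\qquad
10^{-54/20} \approx 0.20\% \ (\text{B12}\rightarrow\text{B10}).
```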




Moreover, the ghost images induced by crosstalk are shifted by a constant number of pixels along track because of the focal plane layout (see Section 2.3). For B10, for example, this number of pixels is exactly 66.6 for the crosstalk from B12. An example of such crosstalk on a real landscape is given in Figure 15. In this figure, the dynamic range has been extremely stretched; indeed, the order of magnitude of the effect is below 2 or 3 DC.

Figure 15. The left image shows an example of crosstalk affecting B10 (contrast strongly enhanced). The right images show the corresponding acquisitions in B11 and B12.

In Figure 15:

• Region 1: the signal is the signal measured in B10 far from the lake, from which is subtracted 0.45% of the signal measured in B11 (around 5 DC) and 0.2% of the signal measured in B12 (around 2 DC), also far from the lake. B10's own signal is a landscape residual that is seen because of the lack of cirrus.
• Region 2: B10 is on the lake. B10's own radiometry is around 0 DC, but B11 and B12 are not yet on the lake and crosstalk induces negative values at focal plane level on B10. At ground level those negative values are truncated to zero.
• Region 3: B10 is still on the lake. B11 is now on the lake but B12 is not yet. Thus the radiometry is higher than the one observed in region 2 because the B11 crosstalk is null (because the B11 radiometry is null).
• Region 4: B10, B11 and B12 are on the lake. Crosstalk is negligible. In this region we observe the real response of B10 on the lake.
• Region 5: B10 is no longer on the lake but B11 and B12 still are. In this region we observe the real response of B10 on the lake border.


• Region 6: B10 and B11 are no longer on the lake; B12 still is.
• Region 7: same as region 1.

The signal level measured is fully consistent with the crosstalk measured on ground. The 3 main ghosts of the lake are exactly shifted by 67 pixels from one another. Thus, this kind of crosstalk is fully predictable and correctable thanks to a simple linear combination of the SWIR bands. This processing had been anticipated on ground and implemented in the image processing chain. It is now systematically applied. Figure 16 shows an example of such a correction.
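A minimal sketch of such a linear correction is given below; the crosstalk fractions come from the percentages quoted above, but the sign convention, the along-track shift values, the use of a simple array roll instead of a proper resampling, and the function name are assumptions, not the operational IPF algorithm:

```python
import numpy as np

def correct_b10_crosstalk(b10, b11, b12, c11=0.0045, c12=0.0020,
                          shift_b11=67, shift_b12=67):
    """Compensate the electrical crosstalk affecting B10 (illustrative sketch only).

    b10, b11, b12        : along-track x across-track arrays in digital counts (B10 geometry)
    c11, c12             : crosstalk fractions from B11 and B12 into B10 (~0.45% and ~0.2%)
    shift_b11, shift_b12 : along-track offsets of the ghosts, in B10 pixels (placeholder values)
    The crosstalk contribution is negative, so the correction adds back the scaled,
    shifted signals of the illuminating bands.
    """
    ghost_b11 = np.roll(b11, shift_b11, axis=0)
    ghost_b12 = np.roll(b12, shift_b12, axis=0)
    return b10 + c11 * ghost_b11 + c12 * ghost_b12
```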

Figure 16. Example of crosstalk correction on band 10 (before/after).

Finally, it is noted that sun-diffuser acquisitions are also affected by crosstalk, and are therefore also corrected before being processed for calibration purposes.

4.1.6. MSI Refocusing

MSI refocusing is only foreseen as a contingency activity in case of Modulation Transfer Function (MTF) degradation (see Section 4.3.7 for more details on MTF assessment). This degradation is not expected, even considering the ageing of the instrument.

Nevertheless, if required, the MSI allows some focus adjustment. Indeed, the M3 mirror (cf. Figure 1) temperature has a direct impact on the instrument focus. It is nominally regulated at 20 °C, the temperature for which the focus was optimised on ground. When modifying this temperature, the consequence is a focus variation in the range of 5.5 µm/°C in the focal plane, which has been pre-validated on ground.

4.2. Geometric Calibration Activities

Geometric calibration activities allow the determination of all the GIPPs of the geometric calibration model, which aims at better ensuring the geometry compliance of Sentinel-2 images to the requirements (orientation of the viewing frames, lines of sight of the detectors of the different focal planes). These parameters have been estimated before launch, and the purpose of the geometric calibration activities is to take into account any update of these parameter values that might occur. Moreover, to meet the multi-temporal registration and absolute geo-location requirements, a Global (to be understood as worldwide) Reference Image called GRI is generated and used for automatic extraction of Ground Control Points (GCP) for systematic refinement of the geometric model within the Image Processing Facility (IPF).




4.2.1. Global Reference Image Generation

4.2.1.1. Definition and Goals of the Global Reference Image

The Global Reference Image (GRI) can be defined as a set of as-cloud-free-as-possible mono-spectral (B4, red channel) L1B products, whose geometrical model has been refined through a dedicated process designed by IGN (French Geographic Institute). The area of interest is worldwide, including most of the isolated islands, so that once the GRI is complete the whole archive of L1B products can benefit from its geolocation. The GRI will actually be used as a ground control reference: homologous points will be found between the GRI and the L1B product to be refined within the processing chain, then a geometric correction will be estimated.

Consequently, the GRI must have an absolute geolocation performance which allows respecting the following two specifications:

• The geolocation of L1C products refined with the GRI is better than 12.5 m CE95 (95th percentile of the circular error).
• The multi-temporal registration performance between refined products is better than 0.3 pixels.

4.2.1.2. Selection and Constitution

Since the GRI has a worldwide extent, the image selection work is quite a long step. It is done continent by continent. The main constraint is to find cloud-free products. Even if the metadata files contain some information about the cloud coverage, it is not sufficient in this case. Indeed, L1B products may be very long (up to 8000 km): consequently, a low cloud coverage does not mean that the part of the image of interest is really cloud free. Conversely, products with high cloud coverage may be partially useful when building up the GRI. Finally, almost none of the products are totally cloud free, which means that only parts of full products shall be selected to produce good quality mosaics.

The selection of products remains a fully manual task. Quick-look images at 320 m resolution are used, their quality being sufficient to select or reject products upon a cloud criterion. By the end of July 2016, almost 500 products had been selected throughout the world, as illustrated in Figure 17.

Figure 17. Overview of the Sentinel-2 GRI selection, July 2016 (European GRI products are not present on this map, since they were produced in the In-Orbit Commissioning Review (IOCR) context, in October 2015).

Equatorial areas are cloudy most of the time, and there are very few chances to get a cloud-free image over these regions. So, the GRI is not made of 100% cloud-free products, but it is only necessary to find enough GCPs in the GRI to refine other L1B products. Since segments are very long, some cloudy parts can be accepted by the registration method. Figure 18 shows the monthly evolution of the selection over South America between the end of April and the end of July 2016: the Amazonian rainforest is always under clouds, and no satisfying images could be found yet. From a Sentinel-2 point of view, it may not be a real problem since there are enough cloud-free images and long data strips to retrieve a satisfying density of GCPs. This also illustrates that the GRI is not an aesthetical view of the world.


Figure 18. Monthly evolution of the image selection for the GRI (April to July 2016).

The complete GRI is expected to comprise around 1000 products, including products over isolated islands. The completion is scheduled for mid 2017.

4.2.1.3. Methods of Refinement

Basically, the method used to refine the L1B products is the block spatio-triangulation: it provides refined values of the attitude of the satellite platform by using observations from ground and images (GCPs and tie points). The full coverage of L1B products used for the GRI will be split into continental or multi-continental blocks in order to achieve the calculation process. Tie points are extracted using SURF (Speeded Up Robust Features): these features are more robust, less sensitive to shadow effects, and provide in general better results than the usual points of interest (e.g., Harris corners) used in a classical correlation process. These homologous points are very numerous and located everywhere there is an overlap: along the same orbit when available, and between two (or more in high latitudes) neighbouring orbits.

GCPs can be selected using various methods and sources of GCP databases. Several sources can be used:

• Identification on ground calibration sites or the ITRF GPS network: the ITRF (International Terrestrial Reference Frame) network is worldwide. When possible, GCPs from these sites are used.
• Identification in an image of reference: several sources can be used as a reference because they are accurate enough for the GRI need. This is the case of the aerial BD Ortho® over France made by IGN, AGRI (Australian Geographic Reference Image) made with ALOS imagery over Australia, aerial imagery over the USA made by the USGS, etc. In this use case, homologous points of interest are extracted between the reference and the GRI to be built. This extraction is done automatically, or manually when the automatic process fails.

In the specific case of small and/or isolated islands, GCPs may not be available at all. Thanks to the excellent initial geolocation performance of Sentinel-2 (around 10–12 m), it is reasonable to only compute a relative spatio-triangulation (using only tie points) of several Sentinel-2 overlapping products. Indeed, experience shows that the geolocation performance increases along with the number of overlapping images. Thanks to this method without any GCP, the specifications are respected.

Using these observations (tie points and GCPs), it is possible to refine the chosen set of geometrical parameters (attitudes, position): it is the spatio-triangulation step itself, i.e., the resolution of the least-squares system. At this step, outlier observations are removed automatically. As the initial geolocation of the satellite is very good, only an order 0 or 1 (according to the situation) correction on yaw, pitch and roll is estimated. Most of the time, the values of adjustment are weak, corresponding to a few meters on ground.

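As a highly simplified sketch of this least-squares adjustment (the operational block spatio-triangulation solves a much larger system mixing tie points, GCPs and automatic outlier rejection), an order-0 or order-1 attitude correction for a single product could be fitted as follows; the residual model and all variable names are assumptions:

```python
import numpy as np

def fit_attitude_correction(t, residual_yaw, residual_pitch, residual_roll, order=1):
    """Fit per-axis polynomial corrections (order 0 or 1 in time) to attitude residuals.

    t          : acquisition time of each observation (GCP or tie point), normalised
    residual_* : angular residuals derived from the image observations, per axis
    Returns a dict {axis: polynomial coefficients, highest degree first}.
    """
    corrections = {}
    for axis, res in [("yaw", residual_yaw), ("pitch", residual_pitch), ("roll", residual_roll)]:
        # np.polyfit solves the least-squares problem for a degree-`order` polynomial
        corrections[axis] = np.polyfit(t, res, deg=order)
    return corrections
```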

4.2.1.4. Internal Controls

As producer of the GRI, IGN is not part of the validation team. However, some internal checks are performed before the GRI is delivered to the validation team. This step is achieved just after the refining task, using a derivative of the Reference 3D® product. This derivative, called BDAmer, is a worldwide set of GCPs, directly extracted from the whole Spot 5 HRS archive, that has been spatio-triangulated before being turned into the Reference 3D® product (a global orthoimage layer). This database has a very high density, which helps to provide trustworthy statistical results in the IGN checking step.

The GCPs of the BDAmer database were automatically extracted and fully qualified. One strong advantage of this product is that every GCP is seen in stereoscopy (thanks to the Spot 5 HRS instrument), sometimes with a higher multiplicity, and over a long time interval. Consequently, a quality index can be computed from these parameters, as well as other indicators such as geometrical accuracy, multiplicity, durability, etc. All available 200 × 200 pixel Spot 5 HRS cropped and centred images are associated to each GCP. As derived from Reference 3D®, the BDAmer database is accurate enough to be considered as a reference for Sentinel-2 images, although the estimated accuracy may locally vary. Since the multiplicity of each BDAmer point is higher than 2, each check point is correlated at least twice to limit the risk of false correlation. It also helps to increase the confidence in the result. These checking operations can be considered as fully independent since the GCPs used to refine the models are not extracted from this product (or another derivative).

To collect homologous points between Sentinel-2 and BDAmer, the method is identical to the extraction of GCPs. Finally, the locations of the homologous points are compared, and statistical results can be extracted. The main advantage of this method is that checks can be done anywhere in the Sentinel-2 products, and not only on local areas. In the case of Europe, during the In-Orbit Commissioning Review (IOCR), more than 5000 check points were used to compute a significant statistical result. Finally, the accuracy of the GRI over Europe was estimated at 8 m CE95 after the IGN checking operations. The CNES team made additional checks with Pleiades imagery in many local areas, as confirmed by Figure 19.


Figure 19. Checks from Reference 3D®/BDAmer (turquoise markers) data over Europe. Red and blue lines represent the error vector of the product before and after refinement.

4.2.1.5. The Example of Australia

This paragraph illustrates the method described above through the example of the processing of 33 products selected for the GRI over Australia, as shown in Figure 20. GRI processing over Australia is relatively straightforward for several reasons: the climate is helpful to find cloud-free images and the territory is very compact. The final selection covers almost the whole territory without any cloud. There are a couple of holes left, but this is not considered as blocking the use of the GRI in the Sentinel-2 context. Indeed, acquisitions are several thousands of kilometres long, so that enough GCPs can be found almost everywhere in the image.

Figure 20. Selection of 33 products over Australia for the GRI.

The extraction of the GCPs for this block was facilitated by the availability of the AGRI database. At first, candidate points were extracted from the orthorectified AGRI database, using SURF methods (see Figure 21). Then these points were resampled at the resolution of Sentinel-2, and a correlation process was applied to find the homologous point in the Sentinel-2 image (orthorectified). About 800 points were found using this method, leading to the homogeneous repartition shown in Figure 22. Most of the points of interest found by the process are dark or light items of the landscape.

Figure 21. Three couples (3 examples over Australia) of homologous points between Sentinel-2 (left) and Australian Geographic Reference Image (AGRI) (right). Each time, the correlation between both images is made at 10 m Ground Sample Distance from ground (orthorectified) images.






Figure 22. Repartition of the Ground Control Points (GCPs) extracted from AGRI.

The identified parameters to refine are an order 1 on yaw, pitch and roll and an order 0 on magnification. If acquisitions are very long, the order 1 is very constrained, to avoid a "leverage effect" along track. In this case, there are seven unknowns per product to determine, with a high redundancy in the observations, since there are 800 GCPs and about 35,000 tie points.

After refining, the accuracy is about 9.50 m CE95 over Australia. This value was estimated thanks to the checking step based on BDAmer. It has still to be confirmed by the validation team, and then integrated into the system, which will be able to deliver refined products to the final users.
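The CE95 figures quoted in this section are 95th percentiles of the circular (2-D) geolocation error over the check points; a minimal way to compute such a statistic from planimetric residuals (hypothetical arrays dx, dy in metres) is:

```python
import numpy as np

def ce95(dx, dy):
    """95th percentile of the circular error from easting/northing residuals in metres."""
    radial_error = np.hypot(dx, dy)      # per-check-point 2-D error magnitude
    return np.percentile(radial_error, 95)
```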

4.2.2. Absolute and Relative Calibration of the Focal Plane

4.2.2.1. Methods

The focal plane calibration consists in re-estimating the lines of sight of the detectors of the various retinas, i.e., estimating distortion and possible discontinuities between the different arrays of the focal plane to improve internal image consistency. Such accurate measurements are not possible on ground before launch.

In flight, an absolute method is used: it is based on the correlation between an image acquired by one of the retinas to be mapped and an absolute reference image (orthorectified database, the so-called BDortho, done by IGN over France). B4 has been chosen as the reference retina for Sentinel-2. Using absolute reference images guarantees planimetric and altimetric accuracy consistency with the requirements because they are geometric references close to ground truth. Moreover, they cover the full swath so that calibration of all detectors can be done.


The absolute reference images are rectified into the geometry of the sensor focal plane to be calibrated (B4 spectral band). Two sets of images are then obtained: the image from the B4 spectral band to be calibrated, which represents the physical reality of the system, and the reference image, rectified in the geometry of the estimated sensor. These rectified images are then correlated. The aim here is to measure the deviation parallel and perpendicular to the track (ALT and ACT respectively). These raw image measurements resulting from correlation are reduced in the focal plane to angular units, e.g., radians, taking into account the aperture angle for each detector: this varies along track as a function of the roll angle. Deviations (in the form of polynomials or other functions) are modelled per detector or per array. This modelling is then added to the current model of lines of sight, in such a way that minimises the row and column deviations between the absolute reference and the image of the sensor being calibrated.

The relative method for focal plane calibration uses the absolute method, but takes a well-calibrated existing band as a reference and uses an appropriate DEM (Digital Elevation Model). This method is used to calibrate the SWIR focal plane from the VNIR focal plane and, more generally, to calibrate the different multispectral bands to ensure consistency between the channel LOS. Calibration parameters are thus estimated by correlating the various bands with each other. But the correlation reaches its limits when the spectral bands are very different. A study has been carried out before the commissioning phase to determine the best pairs for correlation. The following pairs are used for calibration:

• B4 spectral band is used as a reference to calibrate the focal plane of the B2, B3, B5 and B8 spectral bands;
• B5 spectral band is used as a reference to calibrate the focal plane of the B1, B6, B7, B11 and B12 spectral bands;
• B8 spectral band is used as a reference to calibrate the focal plane of the B8a and B9 spectral bands;
• B2 spectral band is used as a reference to calibrate the focal plane of the B10 spectral band.

Figure 23 details all the couples of bands used to perform the multi-spectral registration. The black links refer to the couples used for the calibration.
The blue-links are additional couples used for final checking of the performances.
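As an illustration of the correlation step described above, the sketch below (NumPy only, hypothetical helper names) measures an integer-pixel across-track shift between a reference chip rectified into the B4 geometry and the corresponding B4 chip using phase correlation, then fits a low-order polynomial of the deviation across the detector. It is a minimal sketch of the principle, not the operational processor, and sub-pixel refinement is omitted.

```python
import numpy as np

def integer_shift(ref, img):
    """Integer-pixel (along-track, across-track) shift of img relative to ref via phase correlation."""
    f_ref = np.fft.fft2(ref - ref.mean())
    f_img = np.fft.fft2(img - img.mean())
    cross = f_img * np.conj(f_ref)
    cross /= np.abs(cross) + 1e-12            # normalised cross-power spectrum
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # wrap peaks beyond half the chip size back to negative shifts
    return tuple(p if p <= s // 2 else p - s for p, s in zip(peak, corr.shape))

def fit_act_deviation(ref_band, band_to_calibrate, chip=128, step=128, degree=3):
    """Fit a polynomial of the across-track deviation versus across-track position."""
    cols, devs = [], []
    for c in range(0, ref_band.shape[1] - chip, step):
        _, act = integer_shift(ref_band[:chip, c:c + chip],
                               band_to_calibrate[:chip, c:c + chip])
        cols.append(c + chip // 2)
        devs.append(act)
    return np.polyfit(cols, devs, degree)      # deviation model to be added to the LOS model

# synthetic usage: a pure translation of 3 pixels across track is recovered
rng = np.random.default_rng(0)
ref = rng.random((128, 1280))
img = np.roll(ref, 3, axis=1)
coeffs = fit_act_deviation(ref, img)
print(np.polyval(coeffs, 640))                 # ~3 pixels near the chip-row centre
```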

Figure 23. Couples of bands used to perform the multi-spectral registration. Blue boxes: 10 m bands; green boxes: 20 m bands; red boxes: 60 m bands.

4.2.2.2. Results

A calibration has been performed to correct the geometric model computed on ground. After correction of the geometric model, the residual errors were less than 0.2 pixels.

To evaluate any evolution of the focal plane after this calibration, another method was used, based on correlation (whatever the site) between two products from the same orbit. This is the same method, but:

• products are located anywhere around the world, so cloud-free acquisitions can be chosen;
• correlation is computed between two Sentinel-2 products: a reference image is not needed as the correlation noise is low.

This method is a relative method to determine any evolution of the absolute calibration. There has been no evolution of the focal plane since the beginning of October 2015.

Concerning the relative calibration, a particular case has been observed for band B10. At first, the images used for the calibration of B10, with B01 as reference, were cloudy areas with cumulus clouds visible in both bands. But the results were too noisy, because there is a 0.45 s time lag between the acquisitions by the two bands and the movement of the clouds during this time lag can be up to 1 pixel of 60 m. For clouds at 10 km of altitude, the parallax error between these two bands is 42.5 m. In addition, the cloud speed at this altitude can be around 200 km/h, which means 50 m of error. So the images finally used for calibration are over very high mountains, without clouds. In that case, the integrated water vapour content is sufficiently low to allow a good detection in B10. Therefore, B10 is calibrated on the ground, not at cloud altitude.

Generally speaking, the LOS is well calibrated on ground, with an in-orbit correction of less than 2 pixels for VNIR bands and less than 15 pixels for SWIR bands. Only a few gaps have been corrected (except on the SWIR bands, where the focal plane is not the same as the VNIR one, so the gaps were larger). There are 3 main contributors to the registration error:

• Static LOS calibration residuals;
• Dynamic vibration residuals, which are the main contributor because of on-board oscillations;
• Correlation noise and outliers, which depend on landscapes and spectral couples.

The performances are very good and homogeneous for every couple. The registration residual (positive distance) is better than 0.3 pixels at 3σ in all cases, except for band B10, where it reaches 0.6 pixels at 3σ. This is principally due to the very noisy measurements obtained on this particular band: it is not a bias between bands, but a per-pixel measured gap affected by a large amount of noise. Moreover, as the LOS is measured for this band over high mountains, there is an effect from the DEM: products from different repeat orbits must be used to average out biases coming from the DEM. Viewing directions are very stable; since the beginning of September 2015, no evolution has been observed. It is important to note that if the viewing frame biases have a temporal evolution, this appears as an evolution of the viewing directions. A correction of these biases should correct the focal plane residuals.

4.2.3. Absolute Calibration of the Viewing Frames

4.2.3.1. Methods

Absolute calibration of the viewing frames, or calibration of the absolute alignment biases, consists in determining the absolute orientation of these frames. This calibration is achieved by refining the geometric models on a large set of scenes, using Ground Control Points (GCP) or well-located reference images. Scenes are acquired on various sites (geographical sites which are perfectly known geometrically, also called GIQ sites, used as references for Geometric Image Quality purposes) distributed all over the world; the aim is to cover the maximum number of longitude and latitude ranges to:

• Determine any possible change in alignment biases as a function of criteria such as latitude (analysis of possible thermo-elastic effects),
• Avoid being too dependent on weather conditions.

For all the scenes, the geometric models are then refined by space-triangulation to determine an average bias set per scene (pitch, roll and yaw biases). An analysis of the alignment biases estimated for all scenes and all GIQ sites then makes it possible to:

• Determine a mean bias set to update the GIPP,

• Observe a possible evolution of biases according to such criteria as latitude, date, etc.

For Sentinel-2, this method is used for the calibration of the VNIR viewing frame. Sentinel-2 images are acquired on the various GIQ sites. The B04 spectral band of these Sentinel-2 images is systematically correlated with the ground control points or the well-located reference images of the GIQ sites to refine the geometric model of the VNIR frame.
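As a schematic illustration (not the operational space-triangulation code), the sketch below aggregates hypothetical per-scene bias estimates into a mean bias set and checks for a latitude or date dependence with simple linear fits; the scene values and units are assumptions for the example only.

```python
import numpy as np

# hypothetical per-scene refinement results over GIQ sites:
# latitude (deg), days since launch, pitch/roll/yaw biases (microradians)
scenes = np.array([
    # lat,   day,  pitch,  roll,   yaw
    [ 43.6,   30,  952.8, -80.9,   1.0],
    [-23.7,   45,  953.4, -81.2,  -0.5],
    [ 61.2,   80,  949.6, -77.5,  21.0],
    [  5.1,  120,  949.1, -78.0,  23.5],
    [-33.9,  200,  946.2, -77.9,  46.0],
])

lat, day = scenes[:, 0], scenes[:, 1]
biases = scenes[:, 2:]

# mean bias set (candidate update of the ground spacecraft model)
print("mean pitch/roll/yaw (µrad):", np.round(biases.mean(axis=0), 1))

# possible evolution with latitude (thermo-elastic) and with date (drift)
for name, column in zip(("pitch", "roll", "yaw"), biases.T):
    slope_lat = np.polyfit(lat, column, 1)[0]
    slope_day = np.polyfit(day, column, 1)[0]
    print(f"{name}: {slope_lat:+.3f} µrad/deg latitude, {slope_day:+.3f} µrad/day")
```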

4.2.3.2. Results

The output of the alignment biases calibration of a viewing frame is the mean bias set in pitch, roll, and yaw, and the associated GIPP to be delivered (GIPP_SPAMOD, standing for spacecraft model). Those biases are corrected by on-ground processing parameters, although in the case of an error larger than 2 km just after launch, an update of the on-board model would have been performed. A new ground spacecraft model GIPP would then be associated to these new on-board biases. This has not been necessary. Just after launch, the measured alignment biases were around 2.5 km. After the on-board Star Trackers calibration, on 3 July 2015, the biases decreased to around 700 m. Then, to reach the specification (under 20 m for non-refined products, and under 12.5 m for refined products with GRI), the calibration of the viewing frames is managed via the Image Quality Geometric Model. As those biases evolve over time, several sets of alignment biases have to be applied depending on a validity date indicated in the GIPP spacecraft model. Table 6 shows the values of the pitch, roll and yaw angles applied from launch until July 2016.

Table 6. Viewing frames alignment biases.

GIPP SPAMOD Applicability Date   GIPP DATATI   Roll         Pitch        Yaw
03/07/2015                       −2 ms         −81.0 µrad   953.0 µrad   0
01/09/2015                       −2 ms         −77.8 µrad   949.4 µrad   22 µrad
15/11/2015                       −2 ms         −77.8 µrad   946.0 µrad   47 µrad
01/02/2016                       −2 ms         −78.7 µrad   952.4 µrad   62 µrad

Alignment biases in pitch and roll are very stable. The yaw angle is slowly drifting and is fully monitored.
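Because several alignment bias sets are valid over different periods, a ground processor has to pick the set whose applicability date is the latest one not after the sensing date. A minimal sketch using the Table 6 values (GIPP file handling is out of scope, and the function name is a placeholder):

```python
from datetime import date

# (applicability date, roll, pitch, yaw) in microradians, from Table 6
SPAMOD_BIASES = [
    (date(2015, 7, 3),   -81.0, 953.0,  0.0),
    (date(2015, 9, 1),   -77.8, 949.4, 22.0),
    (date(2015, 11, 15), -77.8, 946.0, 47.0),
    (date(2016, 2, 1),   -78.7, 952.4, 62.0),
]

def biases_for(sensing_date):
    """Return the (roll, pitch, yaw) set applicable at a given sensing date."""
    applicable = [entry for entry in SPAMOD_BIASES if entry[0] <= sensing_date]
    if not applicable:
        raise ValueError("sensing date precedes the first applicability date")
    return applicable[-1][1:]

print(biases_for(date(2015, 12, 25)))   # -> (-77.8, 946.0, 47.0)
```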

4.3. Radiometry Validation Activities

Radiometric validation activities aim at assessing all radiometric performances related to the image quality requirements. The validation activities are performed regularly but with various time scales, from weekly basic checks to yearly in-depth analyses, including long-term statistical and trend analyses. In case of a reported performance degradation, specific calibration activities can be triggered.

4.3.1. Equalisation Validation

The goal of the equalisation validation is to verify that all pixels in a given band have the same response to a uniform radiance level. This is directly linked to the quality of the dark signal calibration and relative gains calibration (see Sections 4.1.1 to 4.1.3) and to the linearity model characterization. In case of an observed degraded performance, a dedicated calibration activity and possible investigations can be triggered. Figure 24 illustrates the effect of equalisation on the image quality.


Figure 24. Illustration of image equalisation impact: (a) Non-equalised diffuser image; (b) Equalised diffuser image.

4.3.1.1. Methods

The equalisation validation is based on the analysis of images over radiometrically uniform scenes that can be:

• On-board Sun-diffuser acquisitions: in this case the true radiance is known;
• "Uniform" natural targets on Earth (vicarious method): deserts, ice, etc.

The method using acquisitions over natural sites is always more pessimistic, because natural targets are never perfectly uniform and the scene non-uniformity thus contributes to the measurement error. The non-uniformity at detector transitions due to the BRDF impact (the viewing angle being different for adjacent detectors) makes this method relevant only inside each detector. Moreover, as it is not possible to find a uniform landscape for the full field of view, this method has to be performed by combining a set of sub-swaths. This makes the operations quite demanding.

For these reasons, the nominal method for the computation of the equalisation criteria is the one using diffuser data. The vicarious method on uniform scenes is used for cross-validation of the parameters provided by the diffuser, and for correction if necessary.

The instrument response non-uniformity is assessed through the Fixed Pattern Noise (FPN) and the Maximum Equalisation Noise (MEN), which quantify local non-uniformities in the responses of physical pixels across the swath. Although the general principle, shown in Figure 25, is applicable both to sun-diffuser images and to uniform scenes, the computation methods are slightly different. Indeed, in the case of a diffuser image, one can use the sun-diffuser characterization to know the true radiance of the image.

Figure 25. General principle for equalisation assessment.

For Sun-diffuser L1B images, FPN and MEN are computed as follows. For each band:

(1) For each detector, correct each pixel for the sun-diffuser BRDF (to avoid local fluctuations due to the spatial response of the sun-diffuser) and for the solar angle: compute the average line of the ratio between the observed radiance and the sun-diffuser simulated radiance (see Equation (3) in Section 4.1.2):

R(b, d, p) = (1/Nℓ) ∑ℓ=1..Nℓ Zmeas(b, d, ℓ, p) / Zsimu(b, d, ℓ, p)   (9)

In practice the average is computed over Nℓ = 5000 lines (at 10 m resolution).
(2) Build a full-swath line concatenating the previous average lines but keeping only the pixels that will be seen in the end-user product: R(b, p′).
(3) Using a sliding window with a step of 1 pixel, compute the mean and standard deviation over a section of 100 pixels and derive the (normalised) FPN and MEN:

• FPN(b, p′) = STD(R)/MEAN(R), where MEAN and STD are respectively the mean and standard deviation over the pixels in the sliding window centred at p′;
• MEN(b) = MAX(STD(R)) over all pixels across-track.

Note: all defective pixels are ignored in the computation.

For uniform L1B images, FPN and MEN are computed as follows. For each Region of Interest in the image (included in one detector) and for each band:

(1) Compute the mean line of the image.
(2) Compute FPN and MEN by performing step (3) of the method using sun-diffuser acquisitions (a minimal sketch of this computation is given below).
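The FPN/MEN computation described above can be sketched as follows (NumPy only; the array names and the simulated-radiance input are placeholders). It follows Equation (9) and step (3): the line-averaged ratio is screened with a 100-pixel sliding window, the FPN is the local standard deviation normalised by the local mean, and the MEN is the maximum of the local standard deviations. Defective pixels would be masked out beforehand.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def fpn_men(z_meas, z_simu, window=100):
    """z_meas, z_simu: 2-D arrays (lines x across-track pixels) for one band.

    Returns the per-window-centre FPN values and the scalar MEN."""
    ratio = z_meas / z_simu                   # integrand of Equation (9)
    r = ratio.mean(axis=0)                    # average line over N_l lines
    windows = sliding_window_view(r, window)  # 100-pixel sliding sections, step 1
    std = windows.std(axis=1)
    mean = windows.mean(axis=1)
    fpn = std / mean                          # normalised Fixed Pattern Noise
    men = std.max()                           # Maximum Equalisation Noise
    return fpn, men

# synthetic check: a well-equalised detector yields FPN close to zero
rng = np.random.default_rng(1)
z_simu = np.full((5000, 1296), 100.0)
z_meas = z_simu * (1.0 + 1e-4 * rng.standard_normal(z_simu.shape))
fpn, men = fpn_men(z_meas, z_simu)
print(fpn.max(), men)
```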

4.3.1.2. Results on Sun-diffuser Acquisitions

This assessment is performed at the same time as the relative gains calibration using sun-diffuser acquisitions. The FPN and MEN are estimated on the sun-diffuser acquisition converted to an L1B product using the current relative gains and dark offsets. This allows assessing the status of the equalisation just before the new calibration (worst case). For instance, Table 7 gives a statistical overview of the FPN values measured for each spectral band on the sun-diffuser acquisition of 4 July 2016, obtained by applying the equalization coefficients estimated from the previous sun-diffuser acquisition of 8 June. About 5000 lines (Nℓ) at 10 m resolution are used to compute the average line in Equation (9). For all the VNIR bands, the maximal value of the FPN is clearly below the specified limit (0.2% for all bands, except for B09, for which the specification is 0.3%), typically by an order of magnitude. For the SWIR bands, this is not the case for B10 and B11: for these bands, the maximal FPN value can be higher than the specified limit (0.3% for B10, 0.2% for B11), but the FPN exceeds the limit only for a limited number of pixels, as the FPN values for the 98% quantile are significantly below the specified acceptable limits, even for these bands. The update of the equalization coefficients resets the FPN to zero, and ensures that the requirements will be met until the next calibration.

4.3.1.3. Results on Uniform Scenes

Here are presented the results obtained on a Greenland snow image. Comparable results have been obtained on other similar images. The first step is to select one or more uniform areas over which to compute the FPN. At first glance, that seems easy, since snowy landscapes are known to have a high and quite spectrally flat radiance in the VNIR and to be spatially uniform. However, very small effects are investigated, so the zone selection has to be performed carefully, avoiding variations in snow properties and any patterns caused by wind or elevation variations. As Sentinel-2 has a very wide field of view, only subparts of the swath are considered. Once this is done, it is possible to compute, for each zone, statistical quantities on the FPN, as depicted in Figure 26.

Table 7. Statistics on the Fixed Pattern Noise (FPN) (in %) estimated on the sun-diffuser acquisition of 04 July 2016 by applying the operational equalisation coefficients calculated from the sun-diffuser acquisition of 08 June 2016. Values are given for the minimal, average and maximal values, and for the quantile at 98 %. The specified maximum acceptable (requirement) values are recalled.

Band   Req.   Min FPN   Avg FPN   Q. 98% FPN   Max FPN
B01    0.2    0.00      0.01      0.02         0.03
B02    0.2    0.01      0.01      0.02         0.02
B03    0.2    0.00      0.01      0.01         0.03
B04    0.2    0.01      0.01      0.01         0.03
B05    0.2    0.01      0.01      0.02         0.04
B06    0.2    0.01      0.01      0.01         0.04
B07    0.2    0.01      0.01      0.02         0.05
B08    0.2    0.01      0.01      0.02         0.03
B8A    0.2    0.01      0.01      0.02         0.03
B09    0.3    0.01      0.01      0.02         0.06
B10    0.3    0.00      0.11      0.24         0.95
B11    0.2    0.03      0.07      0.17         0.35
B12    0.2    0.00      0.03      0.07         0.10


Figure 26. Left: Sentinel-2 Greenland image from 2015/09/04 together with the 2 uniform zones that are used to estimate the equalization quality. Right: FPN derived independently from the 2 zones.

In the VNIR range, the FPN is almost systematically lower than the specification (0.3% of the input radiance for B09 and B10; 0.2% for the other bands). Even in the worst case, the equalization quality meets the requirement. This is a very strong confirmation of the quality of the non-uniformity coefficients derived from the diffuser. FPN values derived on diffuser images are of course even better, because the diffuser is a purely uniform target. But it is conservative to claim that the VNIR equalization quality is very good, since the FPN values derived here, which are certainly tainted by landscape residuals, are within specification.

However, for all bands of the SWIR domain, the FPN in this image is higher than the requirement. Looking at the selected zones for those specific spectral bands, it is clear that they are less uniform than the VNIR ones and that this non-uniformity is caused by the landscape itself. Indeed, the reflectance is very sensitive to the snow properties (snow/ice ratio, dust ratio, etc.) in that spectral range. Consequently, we consider that the SWIR results obtained here are not relevant. The use of other landscapes, like deserts or the ocean, is on-going to remove this restriction.

4.3.2. Absolute Radiometry Vicarious Validation

The absolute radiometric calibration relies on the on-board solar diffuser. This calibration has been validated using vicarious methods. As the sensor ages, the vicarious calibration aims at distinguishing between the sensor aging and the diffuser aging. Vicarious calibration consists in computing an equivalent TOA radiance for a known surface reflectance and a known atmosphere. For a spectral band b, the ratio between the observed TOA reflectance and the computed equivalent TOA reflectance gives the absolute calibration coefficient Ab. Except for band B10, the vicarious calibration relies on three methods: the Rayleigh scattering method, the Deep Convective Clouds (DCC) method and the ground-based reflectance method.

4.3.2.1. Methods

Rayleigh Scattering Over Ocean Surface

The TOA signal measured by the satellite sensor over ocean targets is mainly due to the scattering by atmospheric components in the visible spectral range [16]. In ideal conditions—a stable oceanic region, with low concentrations of phytoplankton and sediment, and far from land to ensure a purely maritime aerosol model [17]—the molecular scattering (so-called Rayleigh scattering) constitutes about 90% of the TOA signal. The assumption holds for Case 1 waters with a low chlorophyll concentration, where phytoplankton is the only optically significant water column contributor [18]. Thus, the Rayleigh scattering can accurately be calculated based on the surface pressure and the viewing angles. In this study, the Rayleigh scattering is based on the methodology of [19,20]: the molecular (Rayleigh) scattering is simulated in the visible and compared against the observed TOA reflectance to derive a calibration gain coefficient.

To ensure a proper computation of the vicarious coefficients over Rayleigh scattering, the following conditions need to be satisfied: (1) 0% cloud coverage on the ROI; (2) low wind speed, typically less than 5 m/s; (3) low aerosol content. Following [19], the Rayleigh-Corrected Normalized Radiance (RCNR) at 865 nm is used for screening the aerosol content. The RCNR is defined as:

RCNR = (ρTOA − ρRayleigh) cos(SZA),   (10)

where ρTOA is the TOA reflectance, ρRayleigh is the Rayleigh reflectance, and SZA is the Sun Zenith Angle. This coefficient is dimensionless. Images are processed only if the RCNR is lower than 0.008. This threshold also avoids the need for further data screening for sun glint.

A marine model following [21] provides an estimate of the irradiance reflectance at null depth from 350 to 700 nm, as a function of the chlorophyll concentration and the sun zenith angle. The conversion from irradiance reflectance to marine reflectance above the sea surface (ρw) follows [22]. To ensure 0% cloud coverage over the region of interest (ROI), the target dataset is automatically cloud-screened using the DIMITRI (Database for Imaging Multi-spectral Instruments and Tools for Radiometric Inter-comparison) package (https://dimitri.argans.co.uk). To ensure a low concentration of phytoplankton and sediment, and a pure marine aerosol, six Cal/Val sites were selected following [23] (Figure 27 and Table 8).
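A minimal sketch of the aerosol screening based on Equation (10) is given below (variable names are placeholders; ρRayleigh would come from the radiative transfer simulation, and the cloud and wind criteria are those listed above):

```python
import numpy as np

def rcnr(rho_toa_865, rho_rayleigh_865, sza_deg):
    """Rayleigh-Corrected Normalized Radiance at 865 nm, Equation (10)."""
    return (rho_toa_865 - rho_rayleigh_865) * np.cos(np.deg2rad(sza_deg))

def keep_for_rayleigh_calibration(rho_toa_865, rho_rayleigh_865, sza_deg,
                                  cloud_fraction, wind_speed_ms):
    """Screening criteria: cloud-free ROI, low wind, low aerosol load (RCNR < 0.008)."""
    return (cloud_fraction == 0.0) and (wind_speed_ms < 5.0) and \
           (rcnr(rho_toa_865, rho_rayleigh_865, sza_deg) < 0.008)

print(keep_for_rayleigh_calibration(0.021, 0.016, 35.0, 0.0, 3.2))   # True for this example
```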

Figure 27. Cal/Val test sites location for Rayleigh scattering, Deep Convective Clouds (DCC), and ground-based methods chosen for Sentinel-2A commissioning activities. ANWO: Atlantic NW-Optimum, ASWO: Atlantic SW-Optimum, PNEO: Pacific NE-Optimum, PNWO: Pacific NW-Optimum, PSGO: Pacific Southern-Gyre-Optimum, SIOO: South-Indian Ocean-Optimum, MLDV: Maldives and RRVP: Rail-Road Valley Playa. (world image is taken from http://visibleearth.nasa.gov).

Table 8. Definition of the absolute radiometry vicarious validation sites used in this analysis.

Name                             Latitude (°) Min   Latitude (°) Max   Longitude (°) Min   Longitude (°) Max
Atlantic-SW-Optimum              −14.5              −13.5              −24.5               −23.5
Atlantic-NW-Optimum              22.5               23.5               −67.5               −66.5
Pacific-NE-Optimum               17.5               18.5               −152.5              −151.5
Pacific-NW-Optimum               17.5               18.5               156.5               157.5
Pacific-Southern-Gyre-Optimum    −26.5              −25.5              −121.5              −119.5
Southern-Indian-Ocean-Optimum    −27.5              −26.5              77.8                78.5
Maldives                         −10.0              10.0               60.0                90.0
Rail-Road Valley Playa           38.495             38.505             −115.685            −115.695

After the selection of the L1C tiles that are within the ROI, the radiometric and geometric measurements of the 13 bands are re-sampled onto the B01 grid (60 m spatial resolution). This option has been motivated by the large size of S2 products at full resolution (10 m). To minimise the cloud contamination and the processed product size, sub-ROIs of 0.1° × 0.1° up to 0.3° longitude × 0.3° latitude are chosen. Then the TOA reflectance, the sun and viewing angles, the cloud mask and auxiliary variables are stored for each pixel. Quick-looks for each tile are generated.

Inter-Band Calibration over Deep Convective Clouds
The inter-band calibration method is usually used for lower-resolution sensors (at least hectometric) [24], and has been tested as a validation method for Sentinel-2 to demonstrate its potential for higher-resolution sensors. It is an inter-band calibration method, which means it only gives relative measurements of the calibration coefficient accuracy with respect to a reference band. It is useful to verify the inter-band calibration or to be used in the synthesis of all calibration methods. Any spectral band in the VNIR range can be used, but usually the red spectral band is chosen (670 nm, or B04 for Sentinel-2), since the Rayleigh calibration is supposed to be very efficient at this wavelength and can provide an absolute calibration value for this band.

The favourable imaging zones for this method are warm ocean sites at tropical latitudes where cumulonimbus clouds develop, such as over the Maldives or in the Gulf of Guinea. Since this method has been experimental for Sentinel-2A and there were operational constraints on acquiring additional images, only a small zone (600 × 700 km²) has been chosen over the Maldives during the In-Orbit Commissioning Phase (Figure 27 and Table 8).

Cumulonimbus clouds, or Deep Convective Clouds (DCC), in this area are high and dense, and strongly diffuse the incoming solar radiance with a spectrally flat and Lambertian reflectance. Moreover, they reach a very high altitude (8–12 km), and there is almost no atmospheric perturbation of the signal over the cloud. The measured signal at satellite level can be written as:

LDCC(λ, θv, θs, ∆φ) = [Lcloud(τcloud, τaer, τmol) + LRay + Laer] × Tg(λ, θs, θv)   (11)

where:
Lcloud is the radiance at the top of the cloud, which depends on the geometrical conditions, the cloud particle type and, less significantly, on what is under the cloud (aerosols, tropospheric molecular signal, etc.). The clouds are so thick that the surface has zero influence on the result.
LRay is the residual molecular signal over the cloud.
Laer is the residual aerosol signal over the cloud.
Tg is the gaseous transmission above the cloud.

The different radiances are computed in pre-calculated Look-Up Tables (LUTs), and the typical relative importance of each contribution is given in Table 9. The main difficulty of this method is the characterization of the DCC and in particular the components of the top layer, usually constituted of ice particles. The reflectance of the cloud becomes sensitive to the type of particle for wavelengths greater than 850 nm, which is why this method is only used in the VNIR range.
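To make Equation (11) concrete, the short sketch below composes a TOA radiance over a DCC from its components and reports each term's relative contribution, in the spirit of Table 9; the numerical values are illustrative placeholders, not LUT outputs.

```python
def dcc_toa_radiance(l_cloud, l_rayleigh, l_aerosol, t_gas):
    """Equation (11): TOA radiance over a Deep Convective Cloud."""
    return (l_cloud + l_rayleigh + l_aerosol) * t_gas

# illustrative component radiances for one band (placeholder numbers)
l_cloud, l_ray, l_aer, t_gas = 420.0, 12.0, 0.5, 0.98

l_toa = dcc_toa_radiance(l_cloud, l_ray, l_aer, t_gas)
total = l_cloud + l_ray + l_aer
for name, value in (("cloud", l_cloud), ("molecular", l_ray), ("aerosol", l_aer)):
    print(f"{name}: {100 * value / total:.1f} % of the signal (excluding absorption)")
print(f"TOA radiance: {l_toa:.1f}")
```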

Table 9. Estimation of the relative importance of all contributors to the Top-Of-Atmosphere (TOA) radiance over DCC (average case).

Band (nm)   Cloud    Molecular Signal   Aerosols   Gaseous Transmission
443         95.7%    4.3%               0.0%       0.1%
490         97.2%    2.8%               0.0%       1.2%
560         98.4%    1.6%               0.0%       5.6%
665         99.2%    0.8%               0.0%       2.4%
775         99.5%    0.5%               0.0%       4.9%
865         99.7%    0.3%               0.0%       0.0%
(Cloud, Molecular Signal and Aerosols give the contribution to the signal, excluding absorption.)

Ground-Based Reflectance Measurements

The ground-reflectance-based approach has been used over the Radiometric Calibration Test Site (RadCaTS; Figure 27) to perform absolute radiometric vicarious calibration for two decades [25] and, recently, operationally for Landsat-8 [26–28]. In this study, the ground-based measurements provided by NASA (Landsat Cal/Val Team) via ESA, as part of the ESA-NASA agreement, are used. Five cloud-free S2A overpasses were obtained over the Railroad Valley Playa site (RRVP), but only one overpass coincides with the S2A time series from the processing baseline 02.01 used here. The chosen reference is the TOA normalized reflectance reconstructed by the University of Arizona team using ground and atmosphere radiometric measurements. Note that the dataset has been specifically generated for the spectral bands of S2A using hyperspectral reference profiles of the site.

4.3.2.2. Results and Analysis

Rayleigh Scattering Results

The assessment of a given sensor calibration consists in comparing the TOA reflectance observed by the satellite sensor (ρobs) to the TOA reflectance simulated by the radiative transfer model (ρsim). The resulting ratio Ab, defined as ρobs/ρsim, provides a measure of the quality of the calibration of the instrument. The Rayleigh method in DIMITRI is performed over the valid S2A L1C products (11 acquisitions) over the commissioning period, 23 June 2015 to 1 April 2016. The results seem sufficiently reliable to provide a first estimate of the absolute vicarious calibration coefficients in the visible spectral range (Figure 28 and Table 10). The results are consistent over all test sites (not shown) and in good agreement with independent results obtained by CNES using the on-board diffuser calibration over the commissioning period of S2A (Sentinel-2 Image Quality Team, 2016). In addition, these results are in good agreement with the results of [29] obtained over a shorter period, which attests to the temporal stability of the sensor measurement. The estimated measurement uncertainty associated with each site is less than ±5%, and it is less than ±2% when taking into account all the test sites. This suggests that the accuracy of the method is compliant with the mission requirement.


Figure 28. Calibration results over (a) Rayleigh Scattering; (b) ground-based measurements; (c,d) Deep Convective Clouds. The dashed (resp. solid) red lines represent the 3% goal specification (resp. 5% threshold specification). The reference band is B04 and B07 for (c,d) respectively. Error bars indicate the estimated uncertainty for Rayleigh and ground-based measurements and the standard deviation for the DCC method.


Table 10. Vicarious calibration coefficients estimated from Rayleigh and ground-based reflectance methodology and the associated uncertainty.

S2A/MSI Bands   Wavelength (nm)   Rayleigh Vic. Cal. Coeff.   Rayleigh Uncert. (%)   DCC Vic. Cal. Coeff.   DCC Std. Dev. (%)   In-Situ Vic. Cal. Coeff.   In-Situ Uncert. (%)
B01             443               1.028                       1.8                    1.000                  1                   1.048                      6
B02             490               1.024                       1.8                    0.990                  1                   1.028                      5
B03             560               1.023                       1.8                    0.980                  <1                  1.020                      3
B04             665               1.021                       1.8                    –                      –                   1.046                      3
B05             705               NA                          NA                     1.000                  1                   1.034                      3
B06             740               NA                          NA                     1.010                  1                   1.023                      3
B07             783               NA                          NA                     1.010                  1                   1.029                      3
B08             842               NA                          NA                     1.020                  2                   1.011                      3
B8A             865               NA                          NA                     1.030                  1                   1.031                      3
B09             945               NA                          NA                     NA                     NA                  0.994                      3
B10             1375              NA                          NA                     NA                     NA                  NA                         3
B11             1610              NA                          NA                     NA                     NA                  1.053                      3
B12             2190              NA                          NA                     NA                     NA                  1.091                      3

Deep Convective Clouds Method Results

This method was used for the first time for a high-resolution satellite with Sentinel-2A. Transposing the method from low to high resolution has required extensive tests, especially regarding the extraction of valid products. Only 22 images were considered for this analysis, since the acquisitions over the Maldives only started at the end of August 2015 and over a limited area. Out of these 22 images acquired by Sentinel-2A, 7 presented cloud features, but a single image accounted for 68% of the total valid areas. The results presented here might therefore contain some natural variability, and more cloudy products could help confirm them. However, despite so few available products, the results presented in Table 10 are consistent, with all VNIR bands within 3% of the reference band. This is compliant with the mission requirements. To make sure there is no artefact in the method, a different reference band has been chosen (B04 or B07): the results show the same inter-band pattern, so they are consistent.

Note that B03 may be offset from the other bands because it is very sensitive to the ozone content, which is derived from exogenous meteorological data, so having only one major DCC product may create a bias there. Bands B08 and B8A are also in the limit region of the method, because the reflectance of the cloud starts to become sensitive to the cloud particle type, so their relatively different behaviour is not surprising.

The method is also less limited by geometric conditions than the calibration method over molecular scattering, so results can be derived for the whole field of view of the instrument. The distribution of valid measurements presented in Figure 29 shows an increased number of points in the centre of the FOV. This is only due to the cloud distribution within the selected images. More acquisitions would increase the number of measurements per detector and consolidate the behaviour of the calibration coefficients within the FOV. Considering that Detector 1 corresponds to the West edge of the swath, Figure 30 shows a slight relative decrease on the right (East) side of the images for B01 (~1–2%) and B02 (<1%). If this tendency were confirmed, it would help improve the consistency of the Rayleigh method with the other methods, since the distribution of valid measurements over the oceans is biased towards the left (West) side of the FOV. In conclusion, the 3% goal inter-band specification is confirmed with only a few products, which confirms the power of this method.

Figure 29. Distribution of valid measurements over DCC sites as a function of detector number. Detector 1 corresponds to the West edge of the swath.

Figure 30. B01 (left) and B02 (right) inter-band calibration coefficients with respect to B04 as a reference band, as a function of the detector number, based on Sentinel-2 images calibrated using the diffuser.

Ground-Based Reflectance Measurements Results

In this analysis, the reference is the TOA normalized reflectance reconstructed by the University of Arizona team using ground and atmosphere radiometric measurements. It is worth noting that the dataset has been specifically generated for the spectral bands of S2A/MSI using hyperspectral reference profiles of the site. Table 10 displays the vicarious calibration coefficients (Ab) estimated for S2A/MSI. Almost all the VNIR bands show a gain within 3%, except B01 and B04, where the gain is about 5%. Both SWIR bands display gain coefficients higher than 5% (not shown). The uncertainty of the method estimated by the University of Arizona team is of the order of 3% for most bands, and up to 6% for B01 and B02, due to a higher impact of atmospheric effects [27,28]. The results from the three methods show high consistency and good agreement.

4.3.3. Multi-Temporal Relative Radiometry Vicarious Validation

This section reports the assessment of the S2A/MSI sensor performance in terms of radiometric measurements, which aims to identify relative biases in the radiometric calibration and, thereby, to correct for biases if and where necessary. Moreover, the long-term trends in the S2A/MSI sensor measurements are assessed to monitor the sun-diffuser aging over the mission lifetime. The method uses pseudo-invariant calibration sites (PICS) following [30], and is implemented in DIMITRI (dimitri.argans.co.uk) [29].

4.3.3.1. Pseudo-Invariant Calibration Site Methodology

Desert PICS have been exploited for decades for multi-temporal monitoring and cross-calibration in the solar spectral range [31]. The PICS algorithm aims to simulate the TOA reflectance in the visible to near-infrared (VNIR) spectral range over pre-defined desert sites (e.g., the CEOS PICS sites [32]). The first step of this method consists in building a reference reflectance model for the selected site and calibrating the model using TOA measurements from a reference sensor (MERIS in [29]). The TOA measurements are propagated to the surface using an inverse radiative transfer simulation. A database of BOA measurements with various acquisition geometries (sun and viewing zenith angles and relative azimuth angle) is built. The database is used to fit a four-parameter BRDF model (for each spectral band). A hyperspectral model is constructed using spectral interpolation. The second step consists in simulating the observed TOA measurements (e.g., S2A/MSI) using the reference BRDF model and considering the observation geometry. The ozone and water vapour content is retrieved from the European Centre for Medium-Range Weather Forecasts (ECMWF), while a constant aerosol optical thickness is assumed. The resulting hyperspectral signal is then convolved with the spectral response of the instrument to produce the simulated TOA reflectance of the sensor under test. The model is 'calibrated' over each PICS site using 4 years of MERIS observations between 2006 and 2009 inclusive. It has then been validated over the whole archive of the MERIS 3rd reprocessing campaign (2002–2012) [29]. The method allows performing multi-temporal analyses and comparing multiple sensors over the same site.
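A minimal sketch of the last step, assuming a hyperspectral simulated TOA reflectance and a band spectral response function (SRF) are available as arrays; the simple SRF-weighted average below stands in for the full convolution, and the Gaussian SRF is only a stand-in for the real B03 response.

```python
import numpy as np

def band_simulated_reflectance(rho_toa_hyper, srf):
    """SRF-weighted average of a hyperspectral simulated TOA reflectance."""
    weights = srf / srf.sum()
    return float(np.sum(weights * rho_toa_hyper))

def pics_ratio(rho_observed_band, rho_toa_hyper, srf):
    """Ratio R_b = observed / simulated TOA reflectance for one band."""
    return rho_observed_band / band_simulated_reflectance(rho_toa_hyper, srf)

# synthetic example: flat simulated spectrum of 0.40, observed band reflectance of 0.41
wl = np.linspace(500, 600, 101)                       # nm, uniform grid
srf = np.exp(-0.5 * ((wl - 560.0) / 18.0) ** 2)       # Gaussian stand-in for the B03 SRF
print(pics_ratio(0.41, np.full_like(wl, 0.40), srf))  # ~1.025
```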

4.3.3.2. Sites Selection

Reference [33] identified 20 desert sites in Africa and Arabia which are characterized as homogeneous and suitable for vicarious calibration. The authors selected a site size of 100 km × 100 km with a relative spatial heterogeneity of about 3%. Six of these sites were recognized by the Infrared and Visible Optical Sensors (IVOS) working group of the Committee on Earth Observation Satellites (CEOS). The six CEOS-PICS sites (Table 11) are used for the PICS calibration. It is worth noting that, due to the high spatial resolution of the MSI L1C products, and for practical reasons, these sites are subsampled to 20 km × 20 km and 30 km × 30 km (Figure 31). To keep the properties of the sites, the small ones are selected as subsets of the standard ones [34]. Figure 31 displays RGB quick-looks of the six desert test sites used for this study.

Table 11. Definition of the desert sites used for this analysis.

N°     Name                     Latitude (°) Min   Latitude (°) Max   Longitude (°) Min   Longitude (°) Max
003b   Algeria 3                29.82              30.82              7.16                8.16
005b   Algeria 5                30.52              31.52              1.73                2.73
031b   Libya 1                  23.92              24.92              12.85               13.85
034b   Libya 4                  28.05              29.05              22.89               23.89
038b   Mauritania 1             −9.8               −8.8               18.8                19.9
039b   Mauritania 2             −9.28              −8.28              20.35               21.35
031a   Railroad Valley Playa    38.495             38.505             −115.685            −115.695



Figure 31. RGB Quick-Looks from S2A/MSI over (a) Algeria3, (b) Algeria5, (c) Libya1, (d) Libya4, (e) Mauritania1, and (f) Mauritania2 sites. Red squares indicate the region of interest (ROI).

Note that the surface BRDF is computed considering the full size of the test site. Thus the impact of the site size on the BRDF model has been computed over the six test sites and correction is then applied when needed over the VNIR spectral range.

4.3.3.3. Results and Analysis

As mentioned above, this method considers the VNIR spectral range only [29,30]. Moreover, the gain coefficients are, in fact, relative calibration coefficients, as MERIS is considered as the reference to compute the surface BRDF of the test site. After the ingestion of the whole dataset from S2A/MSI, the ratio Rb, defined as the observed/simulated (ρobs/ρsim) reflectance, is computed over the six desert sites (Table 11). Figure 32 shows time series of Rb from S2A/MSI bands B01, B03, B07 and B8A over the six PICS sites over the period June 2015–July 2016. The gap in the time series is mainly due to the on-going reprocessing campaign; consequently, the computed trends are clearly impacted over the first half-year of data acquisition. In general, all the ratios are close to 1 and within ±5%, which attests to a clear consistency between the six PICS. One may observe that short wavelengths (e.g., B01) show more scattered ratios than the long ones. This can easily be seen on the standard deviation given at the top of each plot. This effect has been observed by [29] and could be attributed to the low surface reflectance and the high contribution of the atmospheric signal. However, a stable temporal evolution of the sensor can be seen over these time series, where the trend values are less than 1 ± 0.1% per year, within the mission requirements. Any reliable trend of the sensor would need more than two years of acquisitions (J. Barsi, personal comm.). While the uncertainty of the method is estimated to be about 5% (except for the absorption band B05), the standard deviations over all sites (Figure 32) represent less than 3% of the ratios, attesting the good accuracy of the sensor calibration. Even if these ratios are in good agreement with the previous results over a shorter period [30], the trend values are clearer now due to the longer time series of S2A/MSI acquisitions. This confirms the statistical nature of the PICS methodology.



Figure 32. Time series of the TOA reflectance ratio Rb from S2A/MSI over (red) Algeria3, (yellow) Algeria5, (green) Libya1, (light-blue) Libya4, (dark-blue) Mauritania1 and (purple) Mauritania2 for (from top to bottom) B01, B03, B04 and B8A over the commissioning period. The green (resp. red) dashed lines represent the 5% (resp. 10%) threshold specification. Error bars indicate the estimated uncertainty for the pseudo-invariant calibration sites (PICS) method.


The temporal average of the ratios for each band over all the four test sites is summarized in Figure 33. Again, the ratio values are close to one over all the VNIR spectral range, and the biases are found to be better than ±3%, except for B05, which shows a high ratio (about 5% error and 11% uncertainty); this is due to the limitation of the methodology in accurately simulating the TOA reflectance for bands with significant oxygen and water vapour absorption. However, these results illustrate the good quality of the S2A/MSI calibration, in agreement with [35]. The results compare well with those obtained by the French Space Studies Centre (CNES) over the same PICS and the same commissioning period, but using an independent desert-calibration methodology [36].
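The per-year trend quoted above can be estimated from the Rb time series with a simple linear fit; a sketch with synthetic placeholder dates and ratios:

```python
import numpy as np

def trend_percent_per_year(days_since_first, ratios):
    """Linear trend of the observed/simulated ratio, expressed in percent per year."""
    slope_per_day = np.polyfit(days_since_first, ratios, 1)[0]
    return 100.0 * slope_per_day * 365.25 / np.mean(ratios)

days = np.array([0, 40, 95, 160, 230, 300, 365])
ratios = np.array([1.012, 1.010, 1.013, 1.011, 1.012, 1.010, 1.011])
print(f"{trend_percent_per_year(days, ratios):+.2f} % per year")
```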

Figure 33. Temporal average of the ratio of observed TOA reflectance to simulated one from S2A/MSI over the four PICS as a function of wavelength. The B05 result is shown with a black dashed line due to significant gaseous absorption. The green (resp. red) dashed lines represent the 5% (resp. 10%) threshold specification. Error bars indicate the estimated uncertainty for the PICS method.

4.3.4. Absolute Radiometry Cross-Mission Inter-Comparison

The goal of this section is to assess the performance of S2A/MSI versus a reference sensor, the Operational Land Imager (OLI) on board LANDSAT-8. The aim is to identify radiometric calibration differences between those two sensors by comparing their observations made over the same natural target, at TOA, at the same time, and under the same geometrical configuration. The so-called sensor-to-sensor inter-calibration methodology implemented in DIMITRI is used for this task. In addition, the PICS methodology is used to monitor the performance and the temporal evolution of the radiometry measurements of both sensors and then to relate S2A/MSI to LANDSAT8/OLI (taken as reference) for similar spectral bands over the commissioning period.

4.3.4.1. Methods

4.3.4.1.1. Sensor-to-Sensor Inter-Calibration

The methodology is based on the identification of directly comparable near-simultaneous TOA reflectance measurements from two space sensors made under similar observational geometries. The methodology requires neither atmospheric nor BRDF correction to radiometrically compare two sensors. It assumes the TOA reflectance angular distribution (1) obeys the principle of reciprocity and (2) is symmetrical with respect to the principal plane [37]. Temporal and angular match-up criteria, as well as the allowable cloud cover, can be defined by the user.
The strictness of the angular match between two concomitant sensor observations, 1 and 2, is controlled through the Angular Matching Criteria (AMC) parameter, which is defined as:

AMC = √( (θs1 − θs2)² + (θv1 − θv2)² + ¼ (|RAA1| − |RAA2|)² )   (12)

where θs and θv are the solar and viewing zenith angles respectively, and RAA is the Relative Azimuth Angle, for sensors 1 and 2.
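A minimal sketch of Equation (12), assuming all angles are given in degrees (the function name is illustrative and not part of DIMITRI):

```python
# Minimal sketch of Equation (12); angles in degrees.
import numpy as np

def angular_matching_criteria(sza1, vza1, raa1, sza2, vza2, raa2):
    """AMC between two concomitant observations (sensors 1 and 2)."""
    return np.sqrt((sza1 - sza2) ** 2
                   + (vza1 - vza2) ** 2
                   + 0.25 * (abs(raa1) - abs(raa2)) ** 2)

# e.g. an S2A/MSI vs LANDSAT-8/OLI pair over a desert site (illustrative angles)
print(angular_matching_criteria(32.1, 5.2, 104.0, 35.6, 1.3, 98.5))  # ~5.9
```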

Comparisons are done between similar spectral bands of the two sensors. Then the ratio of the TOA reflectance from the sensor under test to that from the reference sensor is computed. The comparison is done over three desert sites: Algeria3, Libya4 and Railroad Valley Playa (Table 11).

After the ingestion of the L1C and L1T products from MSI and OLI respectively into DIMITRI, the dataset is automatically cloud-screened and the following parameters are archived: the reflectances in all spectral bands, the Sun azimuth and zenith angles (SAA, SZA) and the viewing azimuth and zenith angles (VAA, VZA). These parameters are averaged over all pixels included in the considered site. To maximize the number of match-ups (so-called doublets), all acquisitions are selected that satisfy the condition AMC < 15, < 45 and < 55 for the Algeria3, Libya4 and Railroad Valley Playa sites respectively, and a maximum time lag (time window) between sensing times of 11 days. The spectral response of the seven reference bands used for the intercomparison is depicted in Figure 34. Except for bands B02 and B04, which display a shift of about 10 nm, all the considered bands show a rather comparable relative spectral response (RSR), particularly in terms of band centre. Hence no correction of the shift is considered in this analysis; nevertheless, its impact has been computed and added to the uncertainty analysis following [38].
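The doublet selection described above can be sketched as follows, assuming the AMC values between all acquisition pairs and the sensing times are already available (the thresholds and the 11-day window are those quoted in the text; names are illustrative):

```python
# Illustrative doublet selection; amc[i, j] is the Angular Matching Criteria between
# MSI acquisition i and OLI acquisition j (e.g., from the function sketched above).
import numpy as np

AMC_THRESHOLD = {"Algeria3": 15.0, "Libya4": 45.0, "RailroadValleyPlaya": 55.0}
MAX_TIME_LAG = np.timedelta64(11, 'D')

def select_doublets(site, times_msi, times_oli, amc):
    """Index pairs (i, j) of MSI/OLI acquisitions forming valid doublets."""
    doublets = []
    for i, t_msi in enumerate(times_msi):
        for j, t_oli in enumerate(times_oli):
            if abs(t_msi - t_oli) <= MAX_TIME_LAG and amc[i, j] < AMC_THRESHOLD[site]:
                doublets.append((i, j))
    return doublets
```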


Figure 34. Relative spectral response of (solid-black) S2A/MSI and (dashed-blue) LANDSAT-8/OLI for bands (a–g) 443, 490, 560, 665, 865, 1610 and 2190 nm respectively.


4.3.4.1.2. Pseudo-Invariant Calibration Site Methodology

In addition to the sensor-to-sensor inter-calibration method, the PICS method described above (Section 4.3.3.1) is used. As mentioned above, in the cross-calibration of two sensors, simultaneous acquisitions are required in order to minimize the geometric effect. Therefore, the use of the PICS method allows a wider applicability of such a comparison through a bidirectional and spectral characterization of the surface reflectance of the site. Nevertheless, the PICS method is limited to the visible to near-infrared (VNIR) spectral range. It is worth keeping in mind that the gain coefficients obtained from the PICS method are relative, whereas the previous method gives absolute ones.

4.3.4.2. Results and Analysis

4.3.4.2.1. Sensor-to-Sensor Inter-Calibration Results

By applying the previously described geometrical and temporal criteria to identify doublets of reflectance from S2A/MSI and LANDSAT8/OLI, 12, 4 and 5 doublets have been selected over 40, 36 and 15 acquisitions for the Algeria3, Libya4 and RRVP sites respectively. Figure 35 displays the ratios between the observed TOA reflectance from the sensor under calibration (ρcal) and the TOA reflectance from the reference sensor (ρref). In general, the comparison shows remarkable consistency over all the test sites for all the spectral ranges. The maximum discrepancy over the average gain coefficients is observed for bands B02 and B04. This discrepancy could be related to the shift in the RSR shown in Figure 34. The Libya4 results show a larger bias over all the spectral ranges, which could be explained by the small number of matchups. However, the averaged ratios show a good agreement between both sensors, with gain coefficients better than 2.5%. These results are of the same order of magnitude as [36], where similar bands from MODIS-A and MERIS were compared over about 20 PICS sites.
The estimated uncertainty over all the considered doublets is about 3% for bands B01 and B02, while the uncertainties are found to be less than 2.5% for the other bands (B03, B04, B8A, B11 and B12). The uncertainty estimation is in good agreement with the uncertainty of about 3% estimated by [37] for the MODIS-A, AATSR and MERIS intercomparison. In fact, the uncertainty difference between the visible bands and the SWIR bands is most likely related to the decrease of the surface reflectance and the increase of the atmospheric reflectance in the visible domain, as the atmospheric reflectance can reach about 50% of the TOA reflectance in the 443 nm band. These results support the good quality of the S2A/MSI radiometric calibration, given the high quality of the radiometric calibration of LANDSAT8/OLI [27,28,39]. It is worthwhile noting that several factors contribute to the discrepancy between both sensors: (1) the maximum time lag of 11 days induces day-to-day variability; (2) the assumption of symmetry of the TOA BRDF with respect to the principal plane is valid only as far as there is no azimuth-dependent structure on the surface; (3) the difference in the RSR of both sensors induces a difference in TOA reflectance of about 1%, see [37].

Figure 35. TOA reflectance ratio defined as S2A-MSI/LS8-OLI measurements derived from extracted doublets over (red asterisk) Algeria3, (orange triangle) Libya4, (blue square) Railroad Valley Playa site (RRVP) and (black diamond) the average over the three test sites, as a function of the wavelength. The green (resp. red) dashed lines represent the 5% (resp. 10%) threshold specification. Error bars indicate the estimated uncertainty.

4.3.4.2.2. PICS Methodology Results

As mentioned above, this method considers all the time-series of both sensors over a predefined period. In other words, it is not restricted to the matchups only, but it covers the VNIR spectral range only. Thus the gain coefficients are, in fact, relative calibration coefficients, as MERIS is considered as a reference to compute the surface BRDF of the test site. After the ingestion of the whole datasets from S2A/MSI and LANDSAT-8/OLI, the ratio Rb, defined as the observed/simulated (ρobs/ρsim) reflectance, is then computed over the six desert sites. Figure 36 shows time series of Rb from both S2A/MSI bands (B03, B04 and B8A) and LANDSAT-8/OLI bands (L03, L04 and L05) over the Libya-1 site.

Both sensors display nearly the same temporal evolution, scatter and variability of the resulting Rb. In particular, the concomitant acquisitions show similar responses of both sensors (Figure 36). The results show good consistency with the sensor-to-sensor method and with the conclusions of [40]. OLI ratios exhibit slightly higher variability in the blue and red bands than MSI (the standard deviation is larger by 0.2%), while the NIR band of OLI shows lower variability than MSI band B8A (the standard deviation is smaller by 0.3%). In spite of the acquisition-time difference and the RSR differences of both sensors, the results exhibit a good correspondence of the TOA reflectance between both sensors over the VNIR and SWIR spectral ranges, in agreement with [40].



Figure 36. Time series of TOA reflectance ratio Rb from (black) S2A/MSI and (blue) LS8/OLI over Libya1 for (from top to bottom) B03, B04 and B8A over the commissioning period. The green (resp. red) dashed lines represent the 5% (resp. 10%) threshold specification. Error bars indicate the estimated uncertainty for the PICS method.

Figure 37 displays the cross-calibration results over the Algeria3, Libya1, Libya4 and Mauritania2 desert sites. The Algeria5 and Mauritania1 sites have been excluded from this comparison due to their short time series of acquisitions, which leads to less reliable results. Except for a slight bias of 2 to 3% in band B01 over Algeria3 and Mauritania2 and in band B02 over Libya4, the results are consistent over the VNIR spectral range, and no significant discrepancy can be observed. Except for B05 from MSI, all the ratio values are within ±5%. These results demonstrate the complementarity between MSI and OLI, and show that MSI products can provide data continuity for OLI, in agreement with [40].




Figure 37. Ratio of observed TOA reflectance to simulated one for each sensor, (black) S2A/MSI and (blue) LANDSAT-8/Operational Land Imager (OLI), over the Algeria3, Libya1, Libya4 and Mauritania2 sites as a function of wavelength. The green (resp. red) dashed lines represent the 5% (resp. 10%) threshold specification. Error bars indicate the estimated uncertainty for the PICS method.

4.3.5. Inter-band Relative Radiometric Uncertainty Validation

The sensor calibration provides absolute calibration coefficients, one per spectral band, which enable the conversion of the digital counts at the output of the sensor into TOA radiance. For each spectral band, the TOA reflectance of the Level-1C products is directly proportional to the TOA radiance. An error on the absolute calibration coefficients may have various effects depending on its type. The Inter-band Relative Radiometric Uncertainty Validation (inter-band validation for short) is devoted to assessing the impact of calibration coefficient errors between spectral bands. For a reflectance ratio, any common multiplicative error on the calibration coefficients does not affect the performance. On the contrary, errors on individual spectral bands may "cumulate" when the ratio is used.

Thus, the inter-band validation compares the reflectance ratio between two different spectral bands computed from an L1C product to the expected TOA reflectance ratio. The key point of such an assessment is the knowledge of the spectral shape of the observed landscape.

The inter-band validation was initially envisaged over snowy areas like Antarctica and Greenland. But the snow spectrum varies according to the characteristics of the snow (thickness, wet or dry, size of the snow crystals, etc.), so the inter-band validation now relies on desert sites. It will be performed twice a year.

4.3.5.1. Method and Outlook

The method relies on spectral BRDFs of desert sand samples measured in a laboratory. The available samples correspond to the sites named Algeria-3, Algeria-4 and Negev.

For each acquisition over these sites, a tool extracts from the L1C product, for each spectral band, the mean value of the TOA reflectance, the solar angles and the viewing angles.

The site coordinates, the date, the viewing angles, climatology data and the available spectral BRDF feed a radiative transfer code, such as 6S [41] or MODTRAN [42], which provides the expected TOA reflectance for each spectral band.

The reflectance ratios of pairs of spectral bands are computed for both cases and compared. The implementation of the method is in progress and results are expected in the coming months.

4.3.6. Signal-to-Noise Validation

The objective of this activity is to measure the temporal noise (or column noise) standard deviation and the Signal-to-Noise Ratio (SNR) of the instrument pixels, in order to compare them to the data quality baseline and identify potential instrument performance degradations. The method used for this assessment relies on the on-board sun-diffuser device and on dark product acquisitions.

4.3.6.1. Method The SNR is a function of the mean radiance of the landscape, generally expressed as:

SNR = <L> / σL   (13)

where <L> is the mean of a set of radiances over a uniform landscape (the signal) and σL is the standard deviation of this set of radiances (i.e., the noise). The SNR depends on the radiance level. It is usually lower for low values of radiance (dark landscape) because the relative influence of the noise is larger. For large radiances, the SNR increases as the relative influence of the noise decreases. Therefore, the SNR should be known at different radiance levels. For that purpose a 2-parameter instrument noise model (per pixel) is established:

σL(p, d, b) = √( αL(p, d, b)² + βL(p, d, b) · L(p, d, b) )   (14)

where

• σL is the radiometric noise standard deviation (in W/m²/sr/µm).
• L is the radiance (in W/m²/sr/µm).
• αL and βL are the noise model parameters, both in W/m²/sr/µm.

◦ αL represents the dark noise standard-deviation (at radiance 0). ◦ βL·L represents the variance of the shot noise (due to the particle nature of light) at radiance L.

The two noise model parameters (per pixel/band/detector) are estimated from one dark acquisition and one sun-diffuser acquisition (L0 products processed up to L1B):

• αL is directly determined from the measurement of the noise standard deviation on a dark image (L ≈ 0).
• βL is determined from the measurement of the noise standard deviation in a sun-diffuser image and by inversion of Equation (14), using the knowledge (from the diffuser model) of the diffuser radiance Ldiff(p, d, b) and the current estimation of αL.

Then the SNR at any radiance L for each instrument pixel is given by Equation (15) below:

SNR(p, d, b) = L / √( αL(p, d, b)² + βL(p, d, b) · L )   (15)

The method for assessing the temporal (or column) noise standard deviation on the two L1B products is as follows:

• On dark image:

◦ σL(p, d, b) on dark image is simply the standard-deviation of the radiance levels in one column. • On sun-diffuser image:

◦ σL(p, d, b) on the sun-diffuser image is the standard deviation of the radiance levels in one column, corrected for the sun-diffuser non-uniformity (BRDF and solar angle effects) known from the sun-diffuser simulated radiance (see Equation (2) in Section 4.1.2). A minimal sketch of this estimation is given below.
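The following sketch illustrates one possible way to estimate the noise model of Equations (14) and (15), assuming 'dark' and 'diffuser' are L1B radiance images (lines x columns) for one band and detector and 'L_diff' is the simulated (noise-free) diffuser radiance with the same shape; the non-uniformity correction shown here is only an illustrative implementation, not the operational one:

```python
# Illustrative estimation of the per-column noise model parameters and of the SNR.
import numpy as np

def estimate_noise_model(dark, diffuser, L_diff):
    """Per-column alpha (dark noise std) and beta (shot-noise term)."""
    alpha = dark.std(axis=0)                        # column noise std at L ~ 0
    flat = diffuser / L_diff                        # remove BRDF / solar-angle non-uniformity
    L_mean = L_diff.mean(axis=0)                    # mean diffuser radiance per column
    sigma_diff = flat.std(axis=0) * L_mean          # column noise std back in radiance units
    beta = (sigma_diff ** 2 - alpha ** 2) / L_mean  # inversion of Equation (14)
    return alpha, beta

def snr(L, alpha, beta):
    """Signal-to-noise ratio at radiance L, per Equation (15)."""
    return L / np.sqrt(alpha ** 2 + beta * L)

# e.g. snr(129.0, alpha, beta) gives the per-column SNR at the B1 reference radiance
```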

4.3.6.2. Results

S2A SNR is well inside the requirements (>20% margin) for all bands and appears to be very stable over the period analysed (since January 2016), as shown in Figures 38 and 39.

Spectral Band    Lref (W/m²/sr/µm)    SNR requirement at Lref    Measured SNR at Lref
B1               129                  129                        1361
B2               128                  154                        214
B3               128                  168                        249
B4               108                  142                        230
B5               74.5                 117                        253
B6               68                   89                         220
B7               67                   105                        227
B8               103                  174                        221
B8A              52.5                 72                         161
B9               9                    114                        227
B10              6                    50                         387
B11              4                    100                        158
B12              1.5                  100                        166

Figure 38. Sentinel-2 reference radiance Lref (W/m²/sr/µm), Signal-to-Noise Ratio (SNR) requirement at reference radiance and current SNR measurement on the sun-diffuser at reference radiance.

FigureFigure 39 39.39.. AverageAverage SNR SNR atat referencereference radianceradiance LLrefrefref measurementsmeasurements (per (per band) band) since since January January 2016. 2016.

4.3.7. Modulation Transfer Function Validation

Modulation Transfer Function (MTF) measurement aims at quantifying the sensor spatial resolution. The MTF may be affected by launch vibrations, the transition from air to vacuum and the thermal state. This leads to checking the MTF value in orbit. Depending on the origin, an MTF loss can be compensated thanks to the refocusing mechanism. During the first phase of the post-launch calibration, two methods have been used for MTF assessment: the mean square method and the linear object method. For the routine period, the well-known edge method is used. The MTF measurement is an activity performed once a year.

4.3.7.1. Methods

From [43], the sensor model for spatial resolution is:

Image(x,y) = Landscape(x,y) * h(x,y) . DiracComb(x,y)   (16)


x varies along the lines of the image, y varies along the rows of the image, and h is the impulse response, in other words the Point Spread Function (PSF). The DiracComb corresponds to the sampling by the detectors in the focal plane, * is the convolution product and . is the multiplication. A Fourier transform gives access to the frequency domain:

ImageSpectrum(fx, fy) = LandscapeSpectrum(fx, fy).H(fx, fy)* DiracComb(fx, fy) (17)

fx is the spatial frequency for the line (or across-track) direction, fy is the spatial frequency for the row (or along-track) direction, H is the Transfer Function. The MTF is the modulus of the transfer function H. The convolution by the DiracComb in the Fourier domain corresponds to a replication of the pattern LandscapeSpectrum(fx, fy).H(fx, fy). MTF measurement methods aim at:

• Choosing a landscape so that the term LandscapeSpectrum can be known, • Managing the replication in order to avoid aliasing effect that corresponds to mixing, for some frequency range, of the pattern with its replica.

Reference Image Method

Unlike the other two methods described hereafter, the "reference image method" provides MTF measurements in all directions and in many parts of the field of view. Given a Sentinel-2 target image, it uses an acquisition of another instrument over the same area, but at higher resolution. This reference image is resampled many times at S2 resolution, applying a different MTF each time and searching for the one that gives the best resemblance with the S2 product. Five Pleiades images over Toulouse, Albuquerque, Las Vegas, Los Angeles and Dallas were selected as high-resolution references. This method requires reference and target images that superimpose well geometrically. The reference product is thus resampled in S2 Level-1B geometry as a pre-processing step. Finally, a least squares algorithm is used to compare the images and determine the most probable MTF.
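The principle can be sketched as follows, under several simplifying assumptions (a Gaussian MTF model, an integer zoom factor between the co-registered reference and S2 images covering the same extent, and a neglected reference-image MTF); all names are illustrative:

```python
# Illustrative least squares search for the most probable MTF at Nyquist frequency.
import numpy as np
from scipy.ndimage import gaussian_filter

def best_mtf(reference_hr, s2_image, zoom, candidates):
    """Candidate MTF-at-Nyquist value minimising the residual with the S2 image."""
    errors = []
    for mtf_nyq in candidates:
        # Gaussian model: MTF(f) = exp(-2 (pi sigma f)^2); inverted at f = 0.5 cycles/sample
        sigma = np.sqrt(-2.0 * np.log(mtf_nyq)) / np.pi          # sigma in S2 pixels
        blurred = gaussian_filter(reference_hr, sigma * zoom)    # sigma expressed in HR pixels
        simulated = blurred[::zoom, ::zoom]                      # decimate to S2 sampling
        errors.append(np.sum((simulated - s2_image) ** 2))
    return candidates[int(np.argmin(errors))]

# e.g. best_mtf(ref_img, s2_img, zoom=8, candidates=np.linspace(0.1, 0.4, 31))
```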

Bridge Method

This method uses images of bridges for MTF verification. It is inspired by the method based on slanted edge targets. As such, it uses an object whose radiometric response is easy to model and whose response after transmission through the system is compared to its theoretical initial form. The observed bridge has to be straight, of uniform radiometry (both the bridge and its background), and its orientation slightly tilted with respect to the X or Y axis. This inclination is used to build an oversampled profile of the bridge acquired by the instrument, which is the profile of the real bridge after application of the system MTF. This real bridge is itself simply modelled as a rectangular function. The MTF of the system is then obtained as the ratio between the Discrete Fourier Transforms (DFT) of the oversampled profile and of this rectangular function:

MTF = DFT(Profile_measured) / DFT(Profile_theoretical)   (18)

Three bridges were selected thanks to their orientations with respect to the Sentinel-2 swath and to their widths: King Fahd Causeway (Saudi Arabia), Albermarle Bay Bridges (USA), Rikers Island Bridge (New York).

Edge Method

For the edge method, the landscape is chosen so that it can stand for a Heaviside function. This can be done thanks to a large mosaic of fields. To manage the replication, an oversampling is built by using an edge slightly inclined relative to the lines or rows of the image. A 1D oversampled edge response is created from the 2D under-sampled image. Thanks to this oversampling in the direction perpendicular to the edge, Equation (17) becomes:

ImageSpectrum(f) = Heaviside(f) . H(f)   (19)

which leads to:

MTF(f) = |H(f)| = |ImageSpectrum(f) / Heaviside(f)|   (20)
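A minimal sketch of Equations (19) and (20), assuming the 1D oversampled edge response (ESF) has already been built from the 2D image; differentiating the ESF is used here as the usual equivalent of dividing the image spectrum by the spectrum of the Heaviside function:

```python
# Illustrative slanted-edge computation of the MTF from an oversampled ESF.
import numpy as np

def mtf_from_esf(esf, oversampling=4):
    """Return (normalized frequencies f/fs, MTF) from an oversampled ESF."""
    lsf = np.gradient(esf)                  # line-spread function
    lsf = lsf / lsf.sum()                   # normalise so that MTF(0) = 1
    mtf = np.abs(np.fft.rfft(lsf))
    freqs = np.fft.rfftfreq(lsf.size) * oversampling   # back to native pixel sampling
    return freqs, mtf

# MTF at Nyquist (f/fs = 0.5) is then read by interpolation:
# freqs, mtf = mtf_from_esf(esf); mtf_nyquist = np.interp(0.5, freqs, mtf)
```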

4.3.7.2. Results

For the edge method, the MTF was deduced from an image of the Maricopa field and from an image of the Ross Ice Shelf in Antarctica. Due to the noise, mainly caused by the non-uniformity of the sides of the natural edges, a model was fitted between 0.1 fs and 0.5 fs, fs being the sampling frequency and 0.5 fs the Nyquist frequency. An example of MTF curves is given in Figure 40.


Figure 40. Modulation Transfer Function (MTF) curve for band B02 (centred at 490 nm) for the across-track direction.

The MTF values at Nyquist frequency obtained with the various methods are mixed and compared to the requirements and to the expectations deduced from ground measurements in Figures 41 and 42.


Figure 41. MTF results for the across-track direction, and MTF requirements (min value, spec min and max value, spec max).



Figure 42. MTF results for the along-track direction, and MTF requirements (min value, spec min and max value, spec max).

Most validation measurements are in agreement. Globally, the across-track values are lower than those expected from ground measurements. In December 2015, ESA accepted a waiver on the MTF maximum value requirement since the impact on image quality was considered acceptable. Therefore, the measured across-track MTF performance for B05 (705 nm), B06 (740 nm), B07 (783 nm) and B8A (865 nm) is not a concern. As there is no MTF below the specified minimum value, the spatial resolution related to the MTF [44] is as expected or even better.

4.4. Geometry Validation Activities

Geometric validation activities aim at assessing all geometric performances related to image quality requirements. The validation activities are performed regularly but with various time frequencies, from weekly basic checks to yearly in-depth analyses, including long-term statistical and trend analysis. In case of observed performance degradation, specific calibration activities could be triggered.

4.4.1. Geolocation Uncertainty Validation

Geolocation performance is assessed both at Level-1B and at Level-1C, with or without geometric refinement using the GRI. All results are presented in the following paragraphs. Geometric refinement not being activated in the operational processing chain at the time of writing, the results related to refined products were obtained off-line by CNES using the Ground Processor Prototype (GPP) and should be representative of the operational product performance once refinement is activated. In practice, the geolocation performances at L1C should be similar to those at L1B level, except for the small error introduced by the resampling.

4.4.1.1. General Methods and Data

Geolocation assessment is based on the detection of Ground Control Points (GCP) in Sentinel-2 images by correlation with a database of accurately localised images spread over the world. The general principle is shown in Figure 43 and is composed of the following steps:

(1) Find the approximate GCP location in the Sentinel-2 image using the product geolocation metadata (L1B physical geolocation model or L1C georeferencing metadata).
(2) Resample both images in the same geometry: either the S2 image is resampled to the GCP image or the GCP image is resampled to the Sentinel-2 image. This is based on the L1B physical geolocation model or the L1C georeferencing metadata.
(3) Assess the GCP geolocation error (shift between the GCP image and the Sentinel-2 image) by a correlation technique (see the sketch after this list).
(4) Repeat the previous steps for several GCPs in one product and for several products.
(5) Filter out bad correlation points based on correlation quality criteria.
(6) Compute statistics to assess the performance (e.g., at the 95.45% confidence level).

The process is performed using Sentinel-2 band B04 as the reference band. The 10 m bands obviously provide a better correlation accuracy.
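Steps (3) to (5) can be sketched with a standard phase-correlation call; scikit-image is used here for convenience only, the operational chain relies on its own correlator, and the quality threshold below is an arbitrary example value:

```python
# Illustrative sub-pixel shift estimation between a GCP chip and an S2 chip.
from skimage.registration import phase_cross_correlation

def gcp_shift_metres(gcp_chip, s2_chip, pixel_size=10.0, max_error=0.3):
    """Sub-pixel (row, col) shift in metres, or None if the correlation quality is poor."""
    shift, error, _ = phase_cross_correlation(gcp_chip, s2_chip, upsample_factor=20)
    if error > max_error:              # step (5): reject poor correlations
        return None
    return shift * pixel_size          # band B04 pixels are 10 m
```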

Figure 43. General principle of geolocation uncertainty validation using ground control points. Mean shift computation is obtained by a correlation technique.

Two different GCP databases are used for the various geolocation validations:

• A dedicated high-resolution ortho-image database with geolocation accuracy better than 5 m is used by Thales Alenia Space in the frame of the MPC; see the world distribution in Figure 44.
• A Pleiades HR database including more than 500 images with accuracy better than 5 m is used by CNES.

Figure 44. Map of Thales Alenia Space GCP database sites (red dots are GCP sites currently available and used as baseline for S2 validation activities).






These ground control points have to be as accurate as possible and on flat areas to limit parallax effects, shadows, etc., that would impact measurement accuracy. The quality of the measurement also depends on:

• the accuracy of matching algorithm, given the difference in acquisition date, sensors characteristics and acquisition conditions between Sentinel-2 images and the reference image, • the number of GCPs and their distribution over the scene.

For the assessment of geolocation on refined products, special care shall be taken to use GCPs independent from the ones used for the GRI construction. The general performance assessment principle comes in several variants depending on the product level processed:

Validation on L1C products (refined or not):

• In step (2) GCPs ortho-images (in any cartographic projection) are resampled onto Sentinel-2 L1C tile cartographic projection. • The error shifts are thus observed in fraction of L1C pixels directly convertible into meters or X/Y UTM coordinates. The shifts can be converted into across-track and along-track directions using knowledge of satellite acquisition direction on ground.

Validation on L1B products (refined or not): Method 1: In step (2) Sentinel-2 L1B image is resampled on the GCPs image (in cartographic projection). The estimation of performance is then directly in metres and accounts for all contributing factors. Method 2: In step (2) the GCP image is resampled on Sentinel-2 L1B image (in sensor geometry). Thus the geolocation performance is not estimated in metres on the ground, but in the focal plane. In this case, the inverse location function is used: from the GCP ground coordinates, the image coordinates are estimated by means of the inverse geometric model and then compared with measured image coordinates. This provides the location performance along rows (in pixels), also known as column location performance, and the location performance along columns (in pixels), also known as row location performance. The benefit of this method lies in obtaining the performance in the focal plane and therefore in better understanding the physical phenomena. Hence pitch, roll or yaw bias, magnification or even drift in pitch or roll, can be shown. The performance indicator to be compared to the requirement is computed as follows:

(1) For each processed S2 product, compute the mean circular error on all GCPs inside the product.

(2) Compute the quantile at 95.45% of the mean circular errors over all the products.
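A minimal sketch of this two-step indicator, assuming the per-GCP shifts (in metres) have already been grouped by product ('shifts_per_product' is a hypothetical placeholder):

```python
# Illustrative computation of the geolocation performance indicator.
import numpy as np

def geolocation_indicator(shifts_per_product, confidence=95.45):
    mean_ce = [np.mean(np.hypot(dx, dy)) for dx, dy in shifts_per_product]  # step (1)
    return np.percentile(mean_ce, confidence)                               # step (2)
```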

The Sentinel-2 acquisitions needed are segments of L1B or L1C products (refined or not), depending on the product level analysed. They shall be distributed over the world, covering various latitudes and various seasons, since the accuracy of the image geolocation depends on these two parameters. The geolocation performance assessment is in essence a long-term activity requiring at least one year of product acquisitions.

4.4.1.1.1. Results for Non-Refined Products

Geolocation performance on non-refined products represents the absolute location performance as provided by the system, and depends only on the calibrated imaging parameters, without any ground-processing improvement based on external information (such as Ground Control Points, GCPs, for example). Performances before production baseline 2.04 suffered from a yaw bias correction anomaly, particularly visible at the edge of the swath, leading to a performance of about 14.5 m CE at the 95.5% confidence level on L1C products.

Since production baseline 2.04, the performances are very good on both types of products, L1B and L1C, and considerably under the limit fixed by the specification of 20 m CE at "2σ". Table 12 and Figures 45 and 46 present the statistics obtained on system geolocation performances for L1 non-refined products.

Table 12. Statistics on geolocation for non-refined L1B and L1C products.

System Geolocation Performance (Circular Error)    L1B Non-Refined    L1C Non-Refined
95.45% conf. level                                 10 m               10 m
Mean value                                         5 m                5 m
Requirement (for non-refined products)             20 m               20 m

Figure 45. System Geolocation Performances in metres (Band 04) for L1B non-refined products. Red (resp. orange) circle is the requirement without (resp. with) geometric refinement on GCPs.


Figure 46. System Geolocation Performances in metres for L1C non-refined products. Reference band is B04. (a) Geolocation error (in metres) along-track and across-track measured on each GCP; (b) Mean geolocation error in each processed product as a function of the acquisition date of the product. The two outliers seen in both figures correspond to products from orbit 5601 and are due to contingencies (star-tracker outage).

The 95.45% confidence level (“2σ”) system geolocation performance value obtained on analysed products is 10 m for non-refined L1B and L1C products. It is compliant with the specification of 20 m at 2σ.


4.4.1.1.2. Results for Refined Products

The objective is to characterize the geolocation performances of an L1C product whose geometric model has been refined by matching with a well-geolocalised Global Reference Image (GRI). The statistical results obtained on refined L1C product geolocation performances are summarised in Table 13 and represented in Figure 47.

Table 13. Statistics on geolocation for refined L1C products.

System Geolocation Performance (Circular Error) | L1C Refined Product
95.45% conf. level | 8 m
Mean value | 4.5 m
Max value | 11 m

Figure 47. Product Geolocation Performances in metres (Band 04) for L1C refined tiles. Orange circle is the requirement with geometric refinement on GCPs.

The 95.45% confidence level (“2σ”) product geolocation performance value obtained on analysed products is 8 m for L1C refined products. It is compliant with the specification of 12.5 m at 2σ.

4.4.2. Multi-spectral Registration Uncertainty Validation

The goal of the multi-spectral registration assessment is to verify that the spatial registration of spectral bands is within the specifications, with a special focus on the registration performance between the focal planes. Multi-spectral registration assessment can be performed at:

• System level, on L1B products without any improvement of the VIS/SWIR focal planes registration by the processing task.
• Product level, on L1B products with the focal plane registration processing between the VNIR and SWIR focal planes enabled. In that case the registration within each focal plane is not refined and the performance is the same as the system-level performance.

At the manuscript-writing time, the focal plane registration being stable and excellent, the registration processing task has not been activated.

4.4.2.1. Methods


The nominal method to assess multi-spectral registration performance, within each focal plane and between the VIS and SWIR focal planes, is based on the correlation of spectral band couples (Figure 48).

Figure 48. High-level principle of the multi-spectral registration uncertainty assessment.

For any band couple and detector in one L1B product:

(1) Resample the image with the finest resolution (secondary image) in the geometry of the image with the coarsest resolution (reference image). For that purpose a resampling grid establishing the correspondence between pixels of the reference band and their position in the secondary band is computed using the line-of-sight physical model. Then a B-spline interpolation method is used to interpolate the radiometric values of the secondary band at the locations defined by the resampling grid.
(2) Select a list of well-matching tie-points in the images where the correlations are performed (avoiding clouds, water, etc.) and extract small image chips around the tie-points in both images.
(3) This process is applied to cloud-free products acquired on several areas at different latitudes and with various kinds of landscapes, to estimate the registration uncertainty (along-track and across-track shifts) at each tie point by correlation of the reference and secondary image chips.
(4) Filter out outliers with poor correlation, based on correlation quality criteria.
(5) Compute the registration uncertainty parameter: the 99.73% quantile among the shifts measured on all tie-points.
(6) Compare the results against the requirements.

Several products on various areas are used to decrease the influence of the landscape radiometric content on the correlation measurements. Indeed, on some landscapes, the correlation quality between certain band couples can be poor due to very different spectral responses. In particular, band B10 is especially difficult to correlate with other bands, since it is radiometrically very different from the other bands. The spectral band couples used to perform the multi-spectral registration assessment are shown in Figure 23. They have been chosen in order to optimize the correlation quality.

When the registration processing task is enabled, the system-level registration uncertainty between the VIS and SWIR focal planes can also be assessed by a complementary analysis of the registration processing outputs (registration residuals).
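The correlation and quantile computation of steps (3)–(5) above can be sketched as follows (Python; a minimal illustration assuming the chips have already been co-resampled and extracted as 2-D arrays, and using scikit-image phase correlation in place of the operational correlator; the quality filter is a simplistic placeholder):

```python
import numpy as np
from skimage.registration import phase_cross_correlation

def multispectral_registration_uncertainty(chip_pairs, quality_threshold=0.1):
    """99.73% quantile of the inter-band mis-registration (in pixels of the
    coarser band), estimated from pairs of (reference_chip, secondary_chip)."""
    shifts = []
    for ref_chip, sec_chip in chip_pairs:
        # Steps (2)-(3): sub-pixel shift between the two chips.
        shift, error, _ = phase_cross_correlation(ref_chip, sec_chip, upsample_factor=10)
        if error < quality_threshold:        # step (4): reject poorly correlated tie-points
            shifts.append(np.hypot(shift[0], shift[1]))
    # Step (5): "3 sigma" registration uncertainty over all retained tie-points.
    return float(np.percentile(np.asarray(shifts), 99.73))
```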

4.4.2.2. Results

The system-level multi-spectral registration performance has been very good since the first calibrations during the commissioning phase: it is within the requirement of 0.3 pixels of the band with the coarser resolution, for any correlated band couple.
Figure 49 shows an example of the measured error clouds in one detector for two band couples. The production baseline of the products used is version 2.04.

Figure 49. Example of global mis-registration shifts measured on 'Central Australia' for B04/B03-Detector 08 and B05/B12-Detector 02. Black points are tie points rejected based on correlation quality criteria (step 4). Units are pixels of the coarser band in the couple.

Due to the very good and stable multi-spectral registration performance, the VIS/SWIR registration processing task has not been activated so far.
The 99.73% confidence level ("3σ") multi-spectral registration performance (without the registration processing task) obtained on the analysed products is always less than 0.3 Spatial Sampling Distance (SSD) of the coarser correlated band.



4.4.3. Multi-Temporal (XT) Registration Uncertainty Validation

The objective of the multi-temporal registration uncertainty validation is to assess the performance of the co-registration between L1C tiles acquired on the same site at different dates (superimposable time-series). This relative measurement is important to appreciate the stability of the instrument, in order to distinguish year-to-year and season-to-season variations due to global change from technical issues that may arise during operations. The achievement of the multi-temporal registration performance requirement nominally relies on the geometric model refinement processing based on the Global Reference Image (GRI). This processing is not yet activated in the operational chain, thus in this section we present:

• The current multi-temporal registration performance for non-refined products, which is representative of the current situation.
• A preliminary assessment of the performance with the refinement activated, based on products processed with the ground-processor prototype and the available European part of the GRI.

4.4.3.1. Method

The assessment method, for both refined and non-refined products, is based on correlation between L1C tiles as follows, and is illustrated in Figure 50:

(1) Select two cloud-free L1C tiles acquired on the same area on Earth at different dates. Select also the reference band to be processed. In practice, for a given tile, the older L1C image is taken as reference and all new cloud-free acquisitions of the same tile are compared to the reference tile.
(2) Select a list of well-matching tie-points in the images where the correlations are performed (avoiding clouds, water, etc.) and extract small image chips around the tie-points in both images.
(3) Estimate the registration uncertainty (shifts along the X and Y axes) at each tie point by correlation of the reference and secondary image chips.
(4) Filter out outliers with poor correlation, based on correlation quality criteria.
(5) Compute the registration uncertainty metric at tile level: the average circular error over all non-rejected tie-points of the tile.
(6) Repeat steps (1) to (5) for as many tile couples as possible.
(7) Compute the global registration uncertainty metric, to be compared to the requirement: the 95.45% quantile among the average shifts measured on each L1C tile couple.

Figure 50. High-level principle of the multi-temporal registration uncertainty assessment.
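Steps (5)–(7) of the method aggregate the tie-point shifts first per tile and then over all tile couples; a minimal sketch (Python/NumPy, with an illustrative data structure for the measured shifts) is:

```python
import numpy as np

def multitemporal_registration_metric(tile_couples):
    """tile_couples: list of 2-D arrays, one per L1C tile couple, each of shape
    (n_tie_points, 2), holding the (x, y) shifts in pixels measured by
    correlation after outlier filtering (illustrative structure only)."""
    per_tile_ce = []
    for shifts in tile_couples:
        ce = np.hypot(shifts[:, 0], shifts[:, 1])   # circular error per tie point
        per_tile_ce.append(ce.mean())               # step (5): average CE of the tile
    # Step (7): 95.45% quantile of the per-tile averages, to be compared to 0.3 SSD.
    return float(np.percentile(per_tile_ce, 95.45))

# Example with two tile couples of synthetic shifts (pixels):
couples = [np.array([[0.10, -0.20], [0.05, 0.10]]),
           np.array([[0.30, 0.10], [0.20, -0.10]])]
print(multitemporal_registration_metric(couples))
```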




Various factors influence the multi-temporal matching accuracy, some of which are listed hereafter:

• The quality of the images,
• The seasonal scene variations and the meteorological/atmospheric properties (water vapour and aerosols should be limited in the processed scenes),
• The properties of the scene: relief, surface reflectance, and image content,
• When the two correlated tiles have been acquired from two adjacent orbits (i.e., covering the overlapping areas), the viewing angles are different. In areas with relief this causes a parallax between the two images that can degrade the correlation accuracy.

These factors imply that the multi-temporal performance assessment by correlation is always pessimistic. It is highlighted here that the performances presented in the following paragraphs are assessed only on products acquired from the same relative orbit in the repeat cycle. Indeed, products acquired on the same site from adjacent orbits, and therefore with different viewing angles, cannot meet the stringent requirement, since the system error due to the operational DEM accuracy is higher than 0.3 SSD at 2σ. The elevation accuracy of the PlanetDEM90 used in the operational processing being 16 m at 2σ, the resulting error is about 16 m × tan(10°) × 2 ≈ 5.6 m = 0.56 SSD (for the 10 m bands) of co-registration error.
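The order of magnitude quoted above can be reproduced directly; a small sketch (Python, where the ±10° viewing-angle difference between adjacent orbits is taken from the text and the function name is illustrative):

```python
import math

def dem_induced_coregistration_error(dem_accuracy_2sigma_m=16.0,
                                     viewing_angle_deg=10.0,
                                     ssd_m=10.0):
    """Planimetric error caused by the DEM elevation error when the two images
    are acquired from adjacent orbits with opposite viewing angles."""
    error_m = dem_accuracy_2sigma_m * math.tan(math.radians(viewing_angle_deg)) * 2.0
    return error_m, error_m / ssd_m     # in metres and in SSD of the 10 m bands

print(dem_induced_coregistration_error())   # roughly 5.6 m, i.e. 0.56 SSD
```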

4.4.3.2. Results for Non-Refined Products

The assessment of the multi-temporal registration uncertainty on non-refined L1C products has been performed on operational products, using band B04 as the reference band for correlation. Performances for product baselines 2.00 to 2.03 suffered from a yaw bias correction anomaly, particularly visible at the edge of the swath, leading to a performance of about 1.83 SSD CE at the 95.45% confidence level on L1C products. The analysis was based on 933 L1C tiles included in 184 cloud-free products correlated against reference tiles taken in 55 older products with the same relative orbit. The reference and secondary tiles were acquired between mid-December 2015 and the first of June 2016. They cover 195 MGRS tiles.
The multi-temporal registration performance in product baseline 2.04 is improved and has been assessed to 1 SSD CE at the 95.45% confidence level on L1C products, as synthesized in Table 14. The analysis was based on 149 L1C tiles included in 46 cloud-free products correlated against reference tiles taken in 19 older products with the same relative orbit. The reference and secondary tiles were acquired between 15/06/2016 and 10/08/2016. They cover 55 MGRS tiles. The sites selected for this analysis are shown in Figure 51. They include the geolocation validation sites and additional sites specific for the multi-temporal registration assessment. Each site includes at least one MGRS tile. Figure 52 shows the registration error in each correlation tile with respect to the reference tile, from which the 95.5% confidence level performance has been estimated. The performance is still not compliant with the requirement, but this is expected because the refinement process with the GRI is not activated.

Table 14. Multi-temporal registration performance for non-refined L1C products.

Multi-Temporal Registration Performance (PB 2.04) | Non-Refined L1C Results (Circular Error)
95.45% conf. level | 1 pixel at 10 m
Mean value | 0.36 pixel at 10 m
Max value | 1.5 pixel at 10 m


Figure 51. Sites used for multi-temporal registration uncertainty validation on non-refined products (red points). They include the geolocation validation sites (in blue) and other sites specific for multi-temporal registration assessment.

Figure 52. Multi-temporal registration uncertainty. Each point represents the average X shift and Y shift measured in one secondary tile with respect to the reference tile.



4.4.3.3. Results for Refined Products over European GRI

The objective is to characterise the multi-temporal registration of the L1C tiles, after geometric processing refining Sentinel-2 products over the European GRI. This preliminary analysis was performed by CNES during the Sentinel-2 commissioning phase, with the Ground Processor Prototype. Correlation and refining processing parameters have been tuned to comply with the features of Sentinel-2 products (cloud-free, with a lot of water, cloudy, very cloudy and desert). The B04 and B11 bands are the reference bands for this analysis, considering that they give respectively the best and the worst performances (as shown by studies performed before launch).
The 95.45% confidence level multi-temporal registration performance value obtained from 505 analysed L1C products (see data strips in Figure 53) is 0.22 pixel for the B04 band (see Figure 54, with a maximum mean value per tile of 0.36 pixel) and 0.17 pixel for the B11 band (see Figure 55, with a maximum mean value per tile of 0.26 pixel). However, these figures must be consolidated in the future because the number of products processed is not very large, leading to possible statistical estimation errors. Indeed, the analysis was based on 505 L1C tiles included in 29 cloud-free products dispatched on 8 different relative orbits and correlated against reference tiles in 9 older products with the same relative orbit. The reference and secondary tiles were acquired between July and September 2015.


• Orbit 51: 6 dates, 312 tiles (France).
• Orbit 50: 6 dates, 76 tiles (East Europe).
• Orbit 35: 4 dates, 48 tiles (Middle East).
• Orbit 137: 2 dates, 12 tiles (UK).
• Orbit 122: 2 dates, 10 tiles (Austria).
• Orbit 36: 3 dates, 12 tiles (Hungary).
• Orbit 22: 4 dates, 8 tiles (Sweden).
• Orbit 64: 2 dates, 27 tiles (Ukraine).

Figure 53. Areas used for multi-temporal registration uncertainty validation on refined products.

Figure 54. Multi-temporal (XT) registration uncertainty for B04 band. Each point represents the average shift measured in one secondary tile with respect to the reference tile.





Figure 55. Multi-temporal (XT) registration uncertainty for B11 band. Each point represents the average shift measured in one secondary tile with respect to the reference tile.

The requirement of 0.3 pixel multi-temporal registration performance at the 95.45% confidence level is achieved for products refined on the GRI and acquired from the same relative orbit in the cycle.

4.4.4. Global Reference Images Validation

This activity consists in validating the completeness (coverage) and the geolocation performance of the Global Reference Image that will be used in the operational ground processor to refine the geometric model of images by image matching and spatio-triangulation techniques.
Note that at the manuscript-writing time, only the European block of the GRI has been completely validated by CNES in the frame of the commissioning phase. This is presented in paragraph 4.4.4.2. The validation of the Australian block is in progress, while the validation of the other blocks will start when they become available.

4.4.4.1. Methods

The GRI being a collection of L1B products, the nominal method for assessing its geolocation performance is exactly the same as the one used for geolocation validation on L1B products (see paragraph 4.4.1.1). The only difference is that the geometric model of the GRI has been refined.
The most important constraint of this method, in view of a global validation, is that it requires a lot of GCPs or reference images worldwide, independent from those used for the GRI construction. A specific procurement is under consideration in order to be able to assess the geolocation performance in areas where the GRI geometric accuracy is potentially weak (e.g., far from the GCPs used during the GRI construction).

4.4.4.2. Results for European Block of Global Reference Image

The coverage percentage of each orbit of the final GRI over Europe is given in Figure 56.





Figure 56. Coverage percentage of the Europe GRI versus relative orbit number.

The final GRI over Europe covers 95% of the European area. Missing images are located on:

• Orbit 021: 1240 km missing in the North.
• Orbit 035: 1600 km missing in the North.
• Orbit 078: 1100 km missing in the North.

Moreover, 66 km of image are missing on orbit 123, due to the acquisition of 29/09/2015 being split into two segments. Those parts of orbits should be filled later to complete the GRI over Europe, when cloud-free products become available.
The geometrical model of the 65 European GRI products was refined by space triangulation using tie points between images and GCPs. The location precision of the final GRI was validated using the Pleiades database (references with a location precision better than 5 m). Figure 57 presents the area of each European GRI product according to the number of Pleiades references available, with the following legend of colours:

• Green: more than 6 Pleiades references crossing. The computed location precision will be very reliable.
• Orange: 3 to 5 Pleiades references crossing.
• Red: 1 or 2 Pleiades references crossing.
• Black: no Pleiades references crossing. The location performance cannot be computed for these products.



Figure 57. European GRI product strips according to Pleiades references.

If all products were crossing other products (that is to say, if one block were available for the whole of Europe), it would not be necessary to validate the geolocation on each segment separately. When the refinement is performed with all segments together, the geolocation is homogeneous over the block. Figure 58 represents the location precision of each European GRI product after refinement, with the following legend of colours:

• Green: geolocation better than 7 m.
• Orange: geolocation between 7 m and 10 m.
• Red: geolocation worse than 10 m.
• Black: no reference to estimate the geolocation.
• White: missing products in the final GRI.


Figure 58. European GRI product geolocation performance.

Red products with bad geolocation performance are cloudy and not linked to other products in the GRI. They were therefore not linked to the Europe block in the refinement, and their geolocation is consequently not homogeneous with the other products. It is thus recommended to add new products on the red and white strips, when available, to complete the European GRI.
The global performance of the European GRI is assessed to be 8 m at the 95.45% confidence level.

5. Level-2A Calibration and Validation Status

Level-2A products are generated by the Atmospheric Correction (AC) processor Sen2Cor, which generates BOA reflectance, i.e., Surface Reflectance (SR), from a single-date Level-1C TOA product. Sen2Cor outputs can be classified in two main domains: (1) Radiometry, which concerns the Atmospheric Correction outputs of Sen2Cor: Surface Reflectance (SR) products, Aerosol Optical Thickness (AOT) and Water Vapour (WV) maps; and (2) Cloud Screening and Scene Classification (SCL) outputs of Sen2Cor: a scene classification map which assigns a class to each pixel (Vegetation, Not-vegetated, Water, 2 types of Clouds, Thin Cirrus, Snow, etc.), a Cloud probabilistic mask and a Snow probabilistic mask. The scene classification map does not constitute a land cover classification map in a strict sense. Its main purpose is to be used internally in Sen2Cor, in the atmospheric correction module, to distinguish between cloudy pixels, clear pixels and water pixels.


Figure 59, with the associated legends and colour tables, gives an example of the outputs of Sen2Cor for one selected test site over Easton-MDE (USA), processed with the default configuration. The Level-2A surface reflectance images are much clearer than the uncorrected Level-1C image, showing no visible difference between the 2 different spatial resolutions. The SCL correctly classified water, vegetation and not-vegetated pixels. The Sentinel-2 product example is composed of 4 granules, for which Sen2Cor retrieves the WV and AOT maps in an independent manner (tile-by-tile processing). The WV and AOT values retrieved are very close and, in this example, no border effect at the boundaries is visible for the WV, AOT and RGB composites of surface reflectance. The WV and AOT images appear to be quite homogeneous because the colour scale used for their visualisation is large with respect to the internal variations within the image.

Figure 59. (a) Level-2A product example. Aeronet site Easton-Maryland-Department-Environment (USA), acquired on October 18, 2016, characterized by MidlatitudeN, Flat terrain, Forest, Croplands, Water, Urban. (Top) RGB compositions (B04, B03, B02) for L1C TOA, L2A Surface Reflectance (SR) at 10 m, L2A SR at 60 m; (Bottom) Scene Classification, L2A Water Vapour at 20 m, L2A Aerosol Optical Thickness (AOT) at 20 m. (b) Cloud Screening and Scene Classification (SCL) pixel legend; (c) Colour tables for Total Water Vapour Column and Aerosol Optical Thickness at 550 nm; (d) Scale, north arrow and coordinates of the four product granules.


5.1. Level-2A Calibration Activities

5.1.1. Level-2A Calibration Dataset

A calibration dataset of Sentinel-2 Level-1C products is constituted and regularly updated, covering different land cover types (e.g., snow, rocks, desert, urban, vegetation, grass, forest, cropland, vineyard, irrigated crops, rivers, lakes, sand, coastal area, wetlands, ocean water) and different atmospheric conditions (e.g., cloud cover, aerosol optical thickness, water vapour content). The calibration dataset is worldwide, including different latitudes to cover various solar angles and seasons. The 24 Level-2A calibration sites were selected over an active AERONET [45] site. Their geographical locations are shown in Figure 60. Their characteristics are further defined in [46].

Figure 60. Geographical distribution of the 24 selected AERONET test sites for the Level-2A Calibration.

5.1.2. Atmospheric Correction Parameterization

The objective of this task is to perform a regular check and calibration of the AC parameters of the Atmospheric Correction processor, using the Level-2A calibration dataset covering different land cover types, atmospheric conditions, solar and viewing conditions. Updated calibration parameters are delivered in the form of an updated configuration file of the Sen2Cor processor.

5.1.2.1. Method

The following main activities take place during this calibration task:

• The sensitivity of the Sen2Cor processor to the parameters stored in the processor configuration files "L2A_CAL_AC_GIPP.xml" and "L2A_GIPP.xml" is investigated by performing several runs of Sen2Cor with modification of the (single) calibration parameter of interest: Min Dark Dense Vegetation (DDV) area, SWIR reflectance lower threshold, DDV 1.6 µm reflectance threshold, SWIR 2.2 µm-red reflectance ratio, red-blue reflectance ratio, cut-off for AOT iterations (max percentage of negative reflectance vegetation pixels, B4), cut-off for AOT iterations (max percentage of negative reflectance water pixels, B8), aerosol type ratio threshold, topographic correction threshold, slope threshold, water vapour map box size, cirrus correction threshold, BRDF lower bound. The impact of these individual parameter variations on the different Level-2A products (AOT, WV and BOA reflectance) is analysed and compared to the in-situ data of AOT

and Water Vapour measurements. For "continuously varying" calibration parameters, a best value is retained to be the default configuration in the L2A_CAL_AC_GIPP.xml and L2A_GIPP.xml files. This activity will be performed once the internal atmospheric calibration parameters have been moved to the L2A_CAL_AC_GIPP.xml configuration file.
• Concerning yes/no calibration parameters (e.g., cirrus correction, BRDF correction), a qualitative analysis of the impact of the activation/deactivation of each parameter is undertaken. (These choices usually depend on the particular kind of user application or scene landscape.) The individual impact of each parameter is assessed on a variety of landscapes and weather conditions.

The outputs of this calibration activity are expected to provide (1) advice to the Sen2Cor user, based on these previous assessments, when Sen2Cor is used within the Sentinel-2 Toolbox and (2) a default configuration (more generic) to cover the needs of the systematic Level-2A production. A minimal sketch of such a single-parameter sensitivity run is given below.
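The sketch below illustrates such a single-parameter sensitivity run (Python): it varies one threshold in a copy of a GIPP file and compares the retrieved AOT with the AERONET in-situ value. The XML tag name and the run_sen2cor()/read_aot() wrappers are hypothetical placeholders; the actual GIPP tag names and the Sen2Cor invocation depend on the installed Sen2Cor version and are not taken from this paper.

```python
import shutil
import xml.etree.ElementTree as ET

def sweep_parameter(gipp_path, tag_name, values, l1c_product, aeronet_aot,
                    run_sen2cor, read_aot):
    """Run Sen2Cor once per candidate value of a single calibration parameter
    and report the AOT bias against the AERONET measurement.

    tag_name, run_sen2cor and read_aot are hypothetical: tag_name must match a
    real element of the GIPP file, run_sen2cor(l1c_product, gipp_file) must
    invoke the processor, and read_aot(l2a_product) must extract the retrieved
    AOT at the sun-photometer location.
    """
    results = {}
    for value in values:
        work_gipp = gipp_path + ".sweep.xml"
        shutil.copy(gipp_path, work_gipp)
        tree = ET.parse(work_gipp)
        node = tree.getroot().find(".//" + tag_name)       # locate the parameter
        node.text = str(value)
        tree.write(work_gipp)
        l2a_product = run_sen2cor(l1c_product, work_gipp)  # one Sen2Cor run per value
        results[value] = read_aot(l2a_product) - aeronet_aot  # AOT bias for this setting
    return results
```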

5.1.2.2. Results and Outlook

The main result to date of this calibration activity is the Sen2Cor version 2.2.1, issued on May 4, 2016. This version includes an updated internal resampling method that preserves the geolocation information when spectral bands are resampled to another resolution. This update has improved the quality of the WV map retrieved with the Atmospheric Pre-corrected Differential Absorption (APDA) algorithm [11], which for Sen2Cor relies on spectral bands B8A and B09 that have different native resolutions, respectively 20 m and 60 m.
A thorough investigation has been performed on the Sen2Cor handling of two different DEMs, namely SRTM v4.1 in GeoTIFF format, the default DEM used in Sen2Cor, and Planet DEM as a user-provided DEM in DTED format used in the Level-1C processing chain. Several issues have been identified, regarding the SRTM geolocation itself, the DTED processing and the handling of DEM NoData values. For each of these issues a recommendation has been formulated and a solution proposed for implementation in the next release of Sen2Cor, version 2.3.
A number of algorithm evolutions are foreseen for the atmospheric correction module of Sen2Cor: improvements of the cirrus correction and AOT retrieval, as well as the implementation of an AOT fallback estimation based on ECMWF aerosol information when the AOT retrieval fails in case of missing DDV pixels.

5.1.3. Cloud Screening and Scene Classification Calibration

The objective of this task is to perform a regular check and calibration of the threshold parameters of the Cloud Screening and Scene Classification (SC) module, using the Level-2A calibration dataset covering different land cover types, atmospheric conditions, solar and viewing conditions. Updated calibration parameters are delivered in the form of an updated configuration file of the Sen2Cor processor.

5.1.3.1. Method

The procedure for the calibration of the Cloud Screening and Scene Classification algorithm is performed as follows:

1. The Sen2Cor processor (with the option for scene classification only) runs on the Level-2A Calibration dataset using all thresholds by default from the current Level-2A Processing Baseline, producing for each S2 scene (100 km × 100 km) a scene classification map and 2 quality indicator maps (Cloud confidence and Snow confidence). These outputs are stored with the objective to be used later in the calibration procedure as reference.
2. For each class of the Scene Classification map, the pixel classification is verified by manual inspection, by superimposing the scene classification map and the Level-1C spectral bands. (At this stage the work performed by DLR on Level-2A Product Validation is also used as input when available.) The outputs of this step are a performance assessment of the classification for each class, reporting on: over-detection, under-detection, misclassification (indicating the wrong class assigned) and how the edges/boundaries between classes are handled by the processor.
3. Based on this first assessment, a list of potential areas of improvement is identified with their corresponding thresholds. Some thresholds are more critical than others, bearing in mind that the principal objective of the scene classification algorithm is to distinguish between cloudy and clear pixels. Some of the most difficult optimal thresholds to find are the thresholds linked to thin clouds (e.g., thin cloud over desert areas), because in some cases their tuning will lead to over-detection (some clear urban landscape being classified as medium probability clouds) and in other cases to under-detection (missed thin clouds). On the other hand, some other thresholds have less importance for the atmospheric correction algorithm, e.g., some NDVI thresholds have an influence on the number of pixels classified as vegetation versus the number of pixels classified as not-vegetated.
4. The proper tuning activity consists in manually slightly varying the SC thresholds to improve the Scene Classification.
5. When the tuning activity is finished, a new processing baseline is provided with a new set of updated SC parameters delivered in an updated Sen2Cor calibration file "L2A_CAL_SC_GIPP.xml".
6. The Sen2Cor processor (with the option for scene classification only) runs again on the Level-2A Calibration dataset using the updated Level-2A Processing Baseline, producing for each S2 scene (100 km × 100 km) a new scene classification map and 2 new quality indicator maps (Cloud confidence and Snow confidence).
7. A comparison exercise is performed between the SC results with the updated thresholds and the SC results with the standard baseline to assess the impact of the changes. Through visual assessment, the pixel class transitions are evaluated on a large range of land cover types, atmospheric conditions, solar and viewing conditions.

Based on this comparison exercise, if it seems possible to further improve the SC parameters through the threshold tuning process, the decision is taken to start another tuning iteration (step 4). However, in other cases the tuning activity is not sufficient and the improvement of the scene classification can only be achieved by a further evolution of the SC algorithm, e.g., including an additional test on a Sentinel-2 band or including DEM auxiliary information.
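The comparison exercise of step 7 can be partly automated before the visual assessment; the sketch below (Python/NumPy, assuming the two scene classification maps have been read into integer arrays of identical shape) reports the per-class pixel counts and the fraction of unchanged pixels between the baseline and the tuned thresholds:

```python
import numpy as np

def compare_scl_maps(scl_baseline, scl_tuned, class_ids):
    """Per-class pixel counts and overall agreement between two SCL maps.

    scl_baseline, scl_tuned: 2-D integer arrays of class codes (same shape).
    class_ids: iterable of class codes to report (e.g., the SCL legend values).
    """
    report = {}
    for cid in class_ids:
        report[cid] = {
            "baseline_px": int((scl_baseline == cid).sum()),
            "tuned_px": int((scl_tuned == cid).sum()),
        }
    agreement = float((scl_baseline == scl_tuned).mean())  # fraction of unchanged pixels
    return report, agreement
```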

5.1.3.2. Results and Outlook

The main result of this calibration activity is the Sen2Cor version 2.2.1, issued on May 4, 2016. This version includes an updated set of thresholds for Cloud Screening and Scene Classification (SCL) which improves the overall classification accuracy. In addition to the threshold tuning process, several evolutions have been introduced in the SCL algorithm. They are:

• Cloud shadow detection evolution.
• Topographic shadow evolutions.
• Snow and water classification improvements.
• Cirrus detection improvements.

The implementation of the cloud shadow algorithm has been reviewed to allow a proper computation of the cloud shadows for all configurations of solar angles and viewing angles in northern and southern hemispheres. In addition, the dark features identification used as input for the cloud shadow computation has been calibrated to improve the resulting cloud shadow mask. Figure 61a gives an example of the cloud shadow class appearance in the Scene Classification map.


Figure 61. (a) Cloud shadows classification; (b) Topographic shadows classification.

In version 2.2.1 the topographic shadows can now be identified by using an illumination map derived from the solar position and a DEM. Additionally the DEM slope information is used to filter out water pixels detected on steep slopes. Figure 61b illustrates this new feature. The issue of snow and water detection in the clouds has been partially solved by adjusting the calibration parameters and adding threshold conditions on TOA reflectance. There are still areas of improvement for this algorithm and several ideas of investigation using auxiliary data (DEM altitude, Snow Climatology, Water Bodies) have been proposed for evolutions.
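As a rough illustration of the illumination map mentioned above, the following sketch (Python/NumPy) computes the cosine of the local solar incidence angle from a DEM using a standard hillshade-style formulation; low or negative values flag candidate topographic shadows. This is a simplified stand-in, not the Sen2Cor implementation, and it ignores cast shadows (no ray tracing along the sun direction):

```python
import numpy as np

def illumination_map(dem, pixel_size_m, sun_zenith_deg, sun_azimuth_deg):
    """Cosine of the local solar incidence angle for each DEM cell."""
    dzdy, dzdx = np.gradient(dem, pixel_size_m)      # terrain gradients (m/m)
    slope = np.arctan(np.hypot(dzdx, dzdy))          # terrain slope angle
    aspect = np.arctan2(-dzdx, dzdy)                 # one common aspect convention
    sz = np.radians(sun_zenith_deg)
    sa = np.radians(sun_azimuth_deg)
    return (np.cos(sz) * np.cos(slope)
            + np.sin(sz) * np.sin(slope) * np.cos(sa - aspect))
```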

5.2. Level-2A Validation Activities

Validation includes an accuracy assessment of the SCL product and a radiometric validation of the WV, AOT and SR products. The validation data sets differ for the SCL validation and for the radiometric validation. The SCL validation is performed on full granules with an area of 110 km × 110 km, whereas the radiometric validation works on an area of only 9 km × 9 km around the sun photometer location. Several AERONET sun photometer sites have been selected for the radiometric validation. The radiometric validation also relies on fiducial reference data gathered on the ground during ad-hoc campaigns. However, the analysis of data from ad-hoc campaigns is not part of the present paper. Only first validation results for a limited number of test cases and test sites will be presented in the following sections.

5.2.1. Cloud Screening and Scene Classification Validation

The scene classification algorithm is designed to detect clouds, snow and cloud shadows and to generate a classification map, which consists of 3 different classes for clouds (including cirrus), together with six different classes for shadows, cloud shadows, vegetation, not-vegetated, water and snow, at 60 m and 20 m resolution. The goal of this section is to assess the quality of the cloud screening and classification provided by Sen2Cor.

5.2.1.1. Method

The SCL validation is limited due to the lack of "ground truth" data sets. The applied validation method performs stratified random sampling, followed by the establishment of a reference database by visual inspection of the samples, and finally the establishment of an error matrix allowing the calculation of accuracy statistics.

Each scene classification product from the selected test sites contains more than 30 million pixels. As the validation is performed manually in a pixel-based manner, a representative subset of these data has to be selected. Stratified random sampling guarantees statistical consistency (validity) and avoids the exclusion of spatially limited classes from the validation.
Visual, pixel-based manual inspection of the samples is supported by the RGB band composition image (Bands 4,3,2), a Colour Infra-Red composition image to highlight vegetation features (bands 8A,4,3) or snow (bands 12,11,8A), the shape of the spectral curve, the snow confidence Quality Image and the cloud confidence Quality Image. After the Scene Classification results are compared against the reference database, an error matrix (confusion matrix) is calculated. Overall, user's and producer's accuracies are then computed as validation measures.
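A minimal sketch of the stratified random sampling step (Python/NumPy; the sample size per class is an arbitrary illustration, not the number used in the study):

```python
import numpy as np

def stratified_sample(scl_map, samples_per_class=100, seed=0):
    """Draw up to `samples_per_class` random pixel positions from every class
    present in the scene classification map, so that spatially limited classes
    are not excluded from the validation."""
    rng = np.random.default_rng(seed)
    samples = {}
    for cid in np.unique(scl_map):
        rows, cols = np.nonzero(scl_map == cid)
        n = min(samples_per_class, rows.size)
        idx = rng.choice(rows.size, size=n, replace=False)
        samples[int(cid)] = list(zip(rows[idx].tolist(), cols[idx].tolist()))
    return samples  # {class_id: [(row, col), ...]} to be labelled by visual inspection
```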

Figure 62. Example images for Cloud Screening and Classification Validation analysis, Site Potsdam (Germany), acquired on April 22, 2016, characterized by a temperate climate zone and flat terrain. Whole granule at the top and zoomed area at the bottom.

The image contains clouds, cloud shadows, vegetation, not-vegetated and water pixels. Table 15 contains the corresponding confusion matrix. The diagonal shows the number of correctly classified pixels in the validation set. In this example, the vegetation, not-vegetated, water, high-probability cloud and cloud shadow classes have been correctly classified for a very large portion of the validation data. The resulting user's accuracy (UA) ranges from 18.3% for the class "unclassified" to 97.9% for the class "vegetation". UA indicates the probability that a pixel was classified correctly. Other measures for characterizing the classification performance are the producer's accuracy (PA), the overall accuracy (OA) and Cohen's Kappa coefficient. PA indicates the probability that a pixel belongs to the class assigned by Sen2Cor; PA per class is reported in Table 15. OA is the number of correctly classified pixels relative to the total number of pixels, whereas Cohen's Kappa indicates the inter-rater agreement. For the example presented, OA and Cohen's Kappa are 84.0% and 0.796, respectively.

Table 15. Example of a confusion matrix for classification validation. The classes are numbered as in Figure 60. Note that classes 2 and 3 are merged.

Sen2Cor Class                      Ground-Truth Class                                                           UA (%)
                                    (1)   (2+3)     (4)      (5)      (6)    (7)    (8)    (9)    (10)   (11)
(1) Saturated or defective            0       0       0        5        0      0      0      9       7     16      0
(2+3) Dark area, cloud shadows        0    6785     880        0        0      5      0      0       0      0     88.5
(4) Vegetation                        0      31  15,398       52        0     46      0      0     198      0     97.9
(5) Not-vegetated                     0     441     406   38,063        0    177     34    460       0      0     96.2
(6) Water                             0    3223       3        0   42,351    110      0      0      89      0     92.5
(7) Unclassified                      0       1     154      354        0    205    163    212      33      0     18.3
(8) Cloud medium probability          0      21      75      599        0    517    823    347     327      0     30.4
(9) Cloud high probability            0       0       0       24        0     14    533   4197     148      0     85.4
(10) Thin cirrus                      0    3234    6444     2205        0    934   7010    187  48,308      0     70.7
(11) Snow                             0       0       0        0        0      0      0      0       0      0      0
Producer Accuracy (%)                 0    49.4    65.9     92.2    100.0   10.2    9.6   77.5    98.4      0
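The accuracy measures used here can be reproduced directly from the confusion matrix. The minimal numpy sketch below computes user's and producer's accuracies, overall accuracy and Cohen's Kappa from the entries of Table 15 (rows: Sen2Cor classes, columns: ground truth) and recovers the OA and Kappa values quoted above.

```python
import numpy as np

# Confusion matrix from Table 15 (rows: Sen2Cor classes, columns: ground-truth classes)
cm = np.array([
    [0,    0,     0,     5,     0,    0,    0,    9,     7, 16],  # (1) saturated or defective
    [0, 6785,   880,     0,     0,    5,    0,    0,     0,  0],  # (2+3) dark area, cloud shadows
    [0,   31, 15398,    52,     0,   46,    0,    0,   198,  0],  # (4) vegetation
    [0,  441,   406, 38063,     0,  177,   34,  460,     0,  0],  # (5) not-vegetated
    [0, 3223,     3,     0, 42351,  110,    0,    0,    89,  0],  # (6) water
    [0,    1,   154,   354,     0,  205,  163,  212,    33,  0],  # (7) unclassified
    [0,   21,    75,   599,     0,  517,  823,  347,   327,  0],  # (8) cloud medium probability
    [0,    0,     0,    24,     0,   14,  533, 4197,   148,  0],  # (9) cloud high probability
    [0, 3234,  6444,  2205,     0,  934, 7010,  187, 48308,  0],  # (10) thin cirrus
    [0,    0,     0,     0,     0,    0,    0,    0,     0,  0],  # (11) snow
], dtype=float)

n = cm.sum()
row_tot, col_tot = cm.sum(axis=1), cm.sum(axis=0)
ua = np.divide(np.diag(cm), row_tot, out=np.zeros(len(cm)), where=row_tot > 0)  # user's accuracy
pa = np.divide(np.diag(cm), col_tot, out=np.zeros(len(cm)), where=col_tot > 0)  # producer's accuracy
oa = np.trace(cm) / n                                   # overall accuracy
pe = (row_tot * col_tot).sum() / n**2                   # chance agreement
kappa = (oa - pe) / (1 - pe)
print(f"OA = {oa:.1%}, Kappa = {kappa:.3f}")            # OA = 84.0%, Kappa = 0.796
```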

The analysis was performed for the four example scenes located in Madrid (Spain), Manila (Philippines), the Tatra Mountains (Poland) and Potsdam (Germany). The overall accuracy of all scenes processed is relatively stable, with a mean of 80.8% and a standard deviation of 3.4%. This is a satisfactory result in comparison with those of other studies. Crop and tree species classifications on single-date Sentinel-2 acquisitions have been studied with a supervised Random Forest classifier and a classical pixel-based classification [47]; the achieved overall accuracies ranged between 65% and 76%. Water (if present in the image) and high-probability clouds are classified with similar UA for all examples investigated. The UA for the vegetation and not-vegetated classes varies from scene to scene due to the different types of landscape included. The not-vegetated class covers a relatively broad set of possible objects: it contains not only soil but also urban objects, roads and railroads. Likewise, the classification of dark areas and cloud shadows varies due to the different terrains and to cloud types at different altitudes in the atmosphere.

5.2.2. Validation of AOT and WV Products

The objective of this study is to validate the AOT and WV products estimated by the Level-2A processing chain in Sen2Cor. AOT and WV are the key parameters for atmospheric correction and have a large influence on the accuracy of the BOA-SR product.

5.2.2.1. Method

The principle of the validation of the AOT and WV products is a direct comparison of Sen2Cor outputs with ground truth from AERONET [45] sunphotometer measurements. For WV validation, the average Sen2Cor output over all soil and vegetation pixels within a 9 km × 9 km area around the sunphotometer location is computed. AOT validation also includes water pixels in the same area.
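A minimal sketch of this spatial averaging is given below, assuming the Sen2Cor AOT or WV map and the scene classification (SCL) have already been read as numpy arrays and that the pixel position of the sunphotometer is known. The function name, the 20 m pixel size and the array variables are illustrative assumptions, not part of the validation software itself.

```python
import numpy as np

# SCL codes used for the masks (Sen2Cor scene classification):
# 4 = vegetation, 5 = not-vegetated, 6 = water
def window_mean(raster, scl, center_rc, pixel_size_m=20.0, half_width_m=4500.0,
                keep_classes=(4, 5)):
    """Average a Sen2Cor output (AOT or WV map) over a 9 km x 9 km window
    centred on the sunphotometer pixel, restricted to the given SCL classes."""
    half = int(round(half_width_m / pixel_size_m))
    r, c = center_rc
    sub = raster[max(r - half, 0):r + half, max(c - half, 0):c + half]
    cls = scl[max(r - half, 0):r + half, max(c - half, 0):c + half]
    mask = np.isin(cls, keep_classes)
    return float(sub[mask].mean()) if mask.any() else float("nan")

# WV: soil and vegetation pixels only; AOT: water pixels included as well
# wv_mean  = window_mean(wv_map,  scl, aeronet_rc, keep_classes=(4, 5))
# aot_mean = window_mean(aot_map, scl, aeronet_rc, keep_classes=(4, 5, 6))
```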

5.2.2.2. Results and Analysis

The Level-1C products were processed to Level-2A using the default Sen2Cor configuration (e.g., cirrus correction deactivated) and with the SRTM DEM activated. Processing results with 60 m, 20 m or 10 m spatial resolution are consistent. AOT validation results are shown in Figure 63. Note that only a small number of products has been included in the analysis so far. Sen2Cor produced some very good AOT estimates, but also some samples with large differences between Sen2Cor and ground truth. Most of the poor samples suffer from a lack of DDV pixels, which are required by the AOT-retrieval algorithm implemented in Sen2Cor. AOT is set to a default value of 0.20 when there are not sufficient DDV pixels in the image. Limiting the statistical analysis to samples with enough DDV pixels in the image and with cloudiness less than 5% gives a mean difference between Sen2Cor and ground truth of 0.03 ± 0.02 with a maximum difference of 0.05. This result is in agreement with experience from earlier validation studies for the Landsat and RapidEye sensors [48]. The present validation results for Sen2Cor are aligned with validation results for other atmospheric correction algorithms found in the literature [49].
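The summary statistics reported here (mean, standard deviation and maximum of the Sen2Cor minus ground-truth differences for low-cloudiness samples) can be computed with a few lines, sketched below. The sample values are placeholders for illustration, not the actual validation data set.

```python
import numpy as np

def difference_stats(sen2cor, aeronet, cloudiness, max_cloud=0.05):
    """Mean, standard deviation and maximum absolute difference between Sen2Cor
    retrievals and AERONET ground truth, keeping only samples with cloudiness
    below the given threshold (5% in the analysis above)."""
    s, a, c = map(np.asarray, (sen2cor, aeronet, cloudiness))
    d = (s - a)[c < max_cloud]
    return d.mean(), d.std(ddof=1), np.abs(d).max()

# Placeholder values for illustration only (not the validation samples)
mean_d, std_d, max_d = difference_stats(
    sen2cor=[0.12, 0.08, 0.21], aeronet=[0.10, 0.06, 0.16], cloudiness=[0.02, 0.01, 0.03])
print(f"AOT difference: {mean_d:.2f} +/- {std_d:.2f}, max {max_d:.2f}")
```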

Figure 63. AOT and Water Vapour (WV) Validation. Direct comparison of Sen2Cor with ground reference from AERONET. Sen2Cor is represented by the average over a 9 km × 9 km area with the AERONET sunphotometer in the centre. Filled symbols represent samples with cloudiness >5% and unfilled symbols samples with cloudiness <5%.

Water vapour retrieval gives better results than AOT (Figure 63). The retrieval accuracy is less influenced by cloud cover and missing DDV pixels. Limiting the statistical analysis to samples with cloudiness less than 5% gives a mean difference between Sen2Cor and ground truth of (0.29 ± 0.11) g/cm² with a maximum difference of 0.42 g/cm².

5.2.3. Validation of BOA Reflectance Products

The objective of this investigation is to estimate the uncertainty of the BOA SR product resulting from processing Sentinel-2 data with the Sen2Cor Level-2A processor. The validation uses the same test sites and in-situ data sets as those used for the validation of the AOT and WV products.

5.2.3.1. Method

The validation is performed by comparing Sen2Cor outputs with surface reflectance reference data for a 9 km × 9 km subset around the sunphotometer location. The reference data are computed by a second run of Sen2Cor using the AOT550 information provided by collocated sunphotometer measurements as input. The resulting "sunphotometer-corrected" surface reflectance data are considered to provide the surface reflectance "truth", since the greatest uncertainty in atmospheric correction comes from the aerosol characterization.

5.2.3.2. Results and Analysis

Processing results with 60 m, 20 m or 10 m spatial resolution are consistent. The BOA-SR validation example for the test site Belsk (Poland) is typical for locations with enough DDV pixels in the image (Figure 64). Example spectra were extracted at different locations in the reference image and in the image to validate. Example spectra for dark and bright soils, for forest and for various other vegetated locations show the expected spectral dependency and agree within the target accuracy of 5% relative for BOA reflectance. The SR differences of up to 0.04 between the Sen2Cor results and the reference lead to a Normalized Difference Vegetation Index (NDVI) uncertainty of up to 0.06.
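A hedged sketch of this per-pixel comparison is given below: it takes the Sen2Cor and "sunphotometer-corrected" reference reflectances for the red and NIR bands, reports the maximum surface reflectance difference per band and the resulting NDVI difference. The band dictionaries and synthetic values are illustrative assumptions, not the Belsk data.

```python
import numpy as np

def ndvi(red, nir):
    return (nir - red) / (nir + red)

def sr_and_ndvi_differences(sen2cor, reference):
    """sen2cor / reference: dicts of band name -> BOA reflectance array for the
    same 9 km x 9 km subset. Returns the max SR difference per band and the max
    NDVI difference induced by the SR differences."""
    sr_diff = {b: float(np.abs(sen2cor[b] - reference[b]).max()) for b in sen2cor}
    ndvi_diff = np.abs(ndvi(sen2cor["B04"], sen2cor["B08"]) -
                       ndvi(reference["B04"], reference["B08"]))
    return sr_diff, float(ndvi_diff.max())

# Illustration with synthetic vegetation-like red (B04) and NIR (B08) reflectances
rng = np.random.default_rng(2)
ref = {"B04": rng.uniform(0.03, 0.08, 900), "B08": rng.uniform(0.30, 0.45, 900)}
s2c = {b: v + rng.normal(0.0, 0.01, v.shape) for b, v in ref.items()}
sr_diff, ndvi_diff = sr_and_ndvi_differences(s2c, ref)
print(sr_diff, round(ndvi_diff, 3))
```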

Figure 64. Validation example for Bottom-Of-Atmosphere (BOA) reflectance, Site Belsk (Poland), acquired on 14th of August 2015, characterized by flat terrain. (Left): BOA-RGB and scene classification image of the 9 × 9 km² area around the AERONET sunphotometer (longitude 20.79 E, latitude 51.84 N). (Right): Example spectra and relative difference to reference for soil and vegetated pixels at indicated locations.

These results are consistent with other investigations. Sentinel-2 Level-2A data processed with Sen2Cor were compared to atmospherically-corrected Landsat-8 data, achieving an overall Root Mean Square Error (RMSE) of 0.031 over all six bands used for comparison [50]. The RMSE ranges from 0.023 for the green to 0.043 for the near-infrared band. The pixel-based comparison was conducted on acquisitions of Sentinel-2 and Landsat-8 on the same day over six test sites in Europe. The atmospherically-corrected Landsat Surface Reflectance Climate Data Record (CDR), which is in very good agreement with MODIS data and AERONET measurements [51], was used for the comparison. The same study also found very good agreement between Sentinel-2 surface reflectance data and reflectance measured on the ground during two field campaigns.

The Sentinel-2 Level-2A product also has potential for applications over water surfaces [52], even though Sen2Cor was developed for atmospheric correction over land surfaces. Results of processing Sentinel-2 data with three different atmospheric corrections, Sen2Cor, ACOLITE and MIP (Modular Inversion and Processing System), have been compared with spectra measured on a lake. Both ACOLITE and MIP are algorithms developed for application over water bodies. Comparison at seven test points showed that all three algorithms give comparable results. On average over all test points, MIP performed slightly better than Sen2Cor and ACOLITE considering both the shape and the intensity of the resulting spectra. The RMSE between Sen2Cor surface reflectance spectra and in-situ measurements ranges from 0.002 to 0.005. Another study could not rely on reflectance measurements on the ground in parallel to Sentinel-2 overpasses [53]. It found that the shape and magnitude of spectra over lakes generated with Sen2Cor are consistent with field measurements from previous years. The agreement is better for more turbid water giving higher reflectance. This may be due to sun-glint effects not considered in Sen2Cor, since sun glint has a larger influence on the lower signals coming from the water. Another result is that water parameter retrievals based on band-ratio algorithms performed better with TOA data than with BOA data. Band-ratio algorithms are very sensitive to small errors in the signals. They relate signals from the NIR bands to the green band, which is much more affected by the atmosphere than the NIR. Consequently, atmospheric correction uncertainty has a larger effect on the green band, leading to error magnification in the band ratio. Bio-optical modelling is less sensitive to atmospheric correction uncertainty [52].
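To make the error-magnification argument concrete, the small example below propagates assumed residual atmospheric-correction errors on the green and NIR bands into a NIR/green band ratio using first-order relative errors. The reflectance and error values are illustrative assumptions, not numbers from the cited studies.

```python
# First-order error propagation for a band ratio R = NIR / green:
#   dR / R = dNIR / NIR - dgreen / green
# A residual correction error on the green band, which carries most of the
# atmospheric signal, dominates the relative error of the ratio.
green, nir = 0.04, 0.02            # water-leaving reflectances (illustrative)
err_green, err_nir = 0.005, 0.001  # assumed residual atmospheric-correction errors

ratio = nir / green
rel_err_ratio = abs(err_nir / nir) + abs(err_green / green)  # worst-case combination
print(f"ratio = {ratio:.2f}, worst-case relative error ~ {rel_err_ratio:.1%}")
# -> ratio = 0.50, worst-case relative error ~ 17.5%, dominated by the green-band term
```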

6. Conclusions and Perspectives

The main objective of the Sentinel-2 MSI mission is to provide stable time series of images at medium-high spatial resolution. After one year in orbit, this objective is largely achieved. Sentinel-2 time series are now consistent and comparable with those of other missions such as Landsat-8 OLI.

This paper provided a description of the calibration activities and the current status, one year after the Sentinel-2A launch, of the mission products validation activities. Measured performances, derived from the validation activities, have been estimated and presented for the different mission product levels. The results obtained show the very good performance of the mission products both in terms of radiometry and geometry.

Thanks to a robust in-flight calibration strategy, the radiometry is both accurate (<5% absolute uncertainty) and stable (<1%/year estimated variation). Cancelling seasonal effects on diffuser acquisitions is the key to this performance: this involves an accurate model of the Sun-Earth distance and of the diffuser bi-directional reflectance function. Further progress on the latter point should lead to improved pixel response stability (i.e., reduced Fixed Pattern Noise) in the near future.

The geometric accuracy is also very satisfactory, especially after the correction of the LOS yaw bias at the beginning of June 2016. The typical multi-temporal co-registration between two acquisitions is now better than one pixel. The full performance (better than 0.3 pixel at the 2σ confidence level) will however be reached later, after the activation of the geometric refinement based on the GRI. At this point, the accuracy of the DEM becomes the main limiting factor for the geometric performance.

The Level-2A Sen2Cor processor provides cloud detection and scene classification with an overall accuracy of (81 ± 3)% for the products investigated. The uncertainty of the Aerosol Optical Thickness retrieval is 0.03 ± 0.02 if the image contains Dense Dark Vegetation pixels; if there are no Dense Dark Vegetation pixels in the image, the Aerosol Optical Thickness estimation fails. A processor evolution is being developed using aerosol information from ECMWF. Water Vapour content is retrieved from the Level-1C image with an uncertainty of (0.29 ± 0.11) g/cm². Top-Of-Atmosphere to Bottom-Of-Atmosphere conversion gives uncertainties in surface reflectance of up to 0.04 in products with Dense Dark Vegetation pixels.

Finally, the Sentinel-2 time series need to meet the high availability, reliability and timeliness performance requirements to support the Copernicus services. These aspects are particularly challenging for the Sentinel-2 mission given the amount of data produced by the satellite. After one year of ramp-up, the processing baseline (currently 02.04) is essentially stabilized, with few anomalous products. New challenges will be faced in the coming months with the launch of Sentinel-2B and the start of the production with geometric refinement. Operational procedures and monitoring tools are being optimized in order to meet this challenge, taking into account lessons learnt during the ramp-up phase. Innovative solutions could further improve the level of service of Sentinel-2: for instance, a machine-to-machine Application Programming Interface (API) could provide real-time information about product anomalies or unavailabilities.

Acknowledgments: The RADCATS dataset has been provided by the NASA Landsat Cal/Val Team as part of the ESA expert users effort. The authors thank NASA-USGS for providing the Landsat-8 dataset and the DIMITRI team for their technical support.

Author Contributions: Jerome Louis and Bringfried Pflug designed and prepared the L2A-part of the paper. Jerome Louis concentrated on Sen2Cor calibration and Bringfried Pflug on L2A validation. Jakub Bieniarz worked on scene classification validation. Ferran Gascon is the ESA technical officer of the MPC activity and the Sentinel-2 data quality manager. Catherine Bouzinac is the coordinator of the MPC Calibration and Validation activities, replacing Olivier Thepaut. Sebastien Clerc is the technical manager of the MPC. Stephane Massera is responsible for the GRI generation. Mathieu Jung is the leader of the L1 calibration activities. Bruno Lafrance is responsible for the radiometric calibration analyses. Benjamin Francesconi is the leader of the L1 validation activities. Francoise Viallefont is responsible for the MTF validation. Bahjat Alhammoud is responsible for the vicarious validation. Laetitia Pessiot is the service manager of the MPC. Vincent Lonjou is the CNES radiometry expert for Sentinel-2. Angélique Gaudel-Vacaresse and Florie Languille are the CNES geometry experts for Sentinel-2. Thierry Tremas is the CNES leader for Sentinel-2 image quality. Enrico Cadau is the ESA Sentinel-2 Mission Management Support Engineer. Roberto De Bonis is the ESA Sentinel-2 Data Processing Support Engineer. Valérie Fernandez is the ESA Sentinel-2 Mission Engineering and Payload Manager. Philippe Martimort was, up to 2015, the Sentinel-2 ESA Mission Engineering and Payload Manager and is currently with the ESA Future Missions Division. Claudia Isola is the ESA Sentinel-2 Cal/Val and Instrument Performance Engineer.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Drusch, M.; del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P.; et al. Sentinel-2: ESA's Optical High-Resolution Mission for GMES Operational Services. Remote Sens. Environ. 2012, 120, 25–36. [CrossRef]
2. Martin-Gonthier, P.; Magnan, P.; Corbiere, F.; Rolando, S.; Saint-Pe, O.; Breart de Boisanger, M.; Larnaudie, F. CMOS detector for space applications: From R & D to operational program with large volume foundry. Proc. SPIE 2010, 7826. [CrossRef]
3. Chorvalli, V.; Cazaubiel, V.; Bursch, S.; Welsch, M.; Sontag, H.; Martimort, P.; del Bello, U.; Sy, O.; Laberinti, P.; Spoto, F. Design and development of the Sentinel-2 Multi Spectral Instrument and satellite system. Sens. Syst. Next-Gener. Satell. XIV 2010, 7826. [CrossRef]
4. Dariel, A.; Chorier, P.; Leroy, C.; Maltère, A.; Bourrillon, V.; Terrier, B.; Molina, M.; Martino, F. Development of a SWIR multi-spectral detector for GMES/Sentinel-2. Proc. SPIE 2009, 7474. [CrossRef]
5. Chorvalli, V.; Espuche, S.; Delbru, F.; Haas, C.; Astrium, G.; Martimort, P.; Fernandez, V.; Kirchner, V. The multispectral instrument of the Sentinel 2 EM program results. Available online: http://esaconferencebureau.com/custom/icso/2012/papers/FP_ICSO-023.pdf (accessed on 22 May 2017).
6. Copernicus Open Access Hub. Available online: https://scihub.copernicus.eu/ (accessed on 22 May 2017).
7. Thuillier, G.; Hersé, M.; Labs, D.; Foujols, T.; Peetermans, W.; Gillotay, D.; Simon, P.C.; Mandel, H. The solar spectral irradiance from 200 to 2400 nm measured by the SOLSPEC spectrometer from the ATLAS and EURECA missions. Sol. Phys. 2003, 214, 1–22. [CrossRef]
8. Keyhole Mark-up Language (KML). Available online: https://sentinels.copernicus.eu/documents/247904/1955685/S2A_OPER_GIP_TILPAR_MPC__20151209T095117_V20150622T000000_21000101T000000_B00.kml (accessed on 22 May 2017).
9. Sentinel-2 Toolbox. Available online: http://step.esa.int/main/toolboxes/sentinel-2-toolbox (accessed on 6 December 2016).
10. Louis, J. Sentinel 2 MSI–Level 2A Product Definition. Available online: https://sentinel.esa.int/documents/247904/1848117/Sentinel-2-Level-2A-Product-Definition-Document.pdf (accessed on 7 September 2016).
11. Schläpfer, D.; Borel, C.C.; Keller, J.; Itten, K.I. Atmospheric precorrected differential absorption technique to retrieve columnar water vapour. Remote Sens. Environ. 1998, 65, 353–366. [CrossRef]
12. Kaufman, Y.J.; Sendra, C. Algorithm for automatic atmospheric corrections to visible and near-IR satellite imagery. Int. J. Remote Sens. 1988, 9, 1357–1381. [CrossRef]
13. Richter, R.; Louis, J.; Müller-Wilm, U. Sentinel-2 MSI—Level 2A Products Algorithm Theoretical Basis Document; S2PAD-ATBD-0001, Issue 2.0; Telespazio VEGA Deutschland GmbH: Darmstadt, Germany, 2012.
14. Data Quality Report. Available online: https://sentinel.esa.int/web/sentinel/data-product-quality-reports (accessed on 22 May 2017).
15. Maisonobe, L.; Pommier-Maurussane, V. Orekit: An Open-source Library for Operational Flight Dynamics Applications. In Proceedings of the 4th ICATT, Madrid, Spain, 3–6 May 2010.
16. Gordon, H.R.; Wang, M. Retrieval of Water-Leaving Radiance and Aerosol Optical Thickness over the Oceans with SeaWiFS: A Preliminary Algorithm. Appl. Opt. 1994, 33, 443–452. [CrossRef] [PubMed]

17. Morel, A. Optical modeling of the upper ocean in relation to its biogenous matter content (case 1 water). J. Geophys. Res. 1988, 93, 10749–10768. [CrossRef]
18. Morel, A.; Prieur, L. Analysis of Variations in Ocean Color. Limnol. Oceanogr. 1977, 22, 709–722. [CrossRef]
19. Hagolle, O.; Goloub, P.; Deschamps, P.Y.; Cosnefroy, H.; Briottet, X.; Bailleul, T.; Nicolas, J.-M.; Parol, F.; Lafrance, B.; Herman, M. Results of POLDER in-flight Calibration. IEEE Trans. Geosci. Remote Sens. 1999, 37, 1550–1566. [CrossRef]
20. Vermote, E.; Santer, R.; Deschamps, P.Y.; Herman, M. In-flight Calibration of Large Field-of-View Sensors at Short Wavelengths using Rayleigh Scattering. Int. J. Remote Sens. 1992, 13, 3409–3429. [CrossRef]
21. Morel, A.; Maritorena, S. Bio-optical properties of oceanic waters: A reappraisal. J. Geophys. Res. 2001, 106, 7763–7780. [CrossRef]
22. Morel, A.; Gentili, B. Diffuse reflectance of oceanic waters. III. Implication of bidirectionality for the remote-sensing problem. Appl. Opt. 1996, 35, 4850–4862. [CrossRef] [PubMed]
23. Fougnie, B.; Henry, P.; Morel, A.; Antoine, D.; Montagner, F. Identification and characterization of stable homogenous oceanic zones: Climatology and impact on in-flight calibration of space sensors over Rayleigh scattering. In Proceedings of Ocean Optics XVI, Santa Fe, NM, USA, 18–22 November 2002.
24. Fougnie, B.; Bach, R. Monitoring of Radiometric Sensitivity Changes of Space Sensors Using Deep Convective Clouds: Operational Application to PARASOL. IEEE Trans. Geosci. Remote Sens. 2009, 47, 851–861. [CrossRef]
25. Slater, P.N.; Biggar, S.F.; Holm, R.G.; Jackson, R.D.; Mao, Y.; Moran, M.S.; Palmer, J.M.; Yuan, B. Reflectance- and radiance-based methods for the in-flight absolute calibration of multispectral sensors. Remote Sens. Environ. 1987, 22, 11–37. [CrossRef]
26. Thome, K.J.; Helder, D.L.; Aaron, D.; Dewald, J.D. Landsat-5 TM and Landsat-7 ETM+ absolute radiometric calibration using the reflectance-based method. IEEE Trans. Geosci. Remote Sens. 2004, 42, 2777–2785. [CrossRef]
27. Czapla-Myers, J.; McCorkel, J.; Anderson, N.; Thome, K.; Biggar, S.; Helder, D.; Aaron, D.; Leigh, L.; Mishra, N. The Ground-Based Absolute Radiometric Calibration of Landsat 8 OLI. Remote Sens. 2015, 7, 600–626. [CrossRef]
28. Markham, B.; Barsi, J.; Kvaran, G.; Ong, L.; Kaita, E.; Biggar, S.; Czapla-Myers, J.; Mishra, N.; Helder, D. Landsat-8 operational land imager radiometric calibration and stability. Remote Sens. 2014, 6, 12275–12308. [CrossRef]
29. Bouvet, M. Radiometric comparison of multispectral imagers over a pseudo-invariant calibration site using a reference radiometric model. Remote Sens. Environ. 2014, 140, 141–154. [CrossRef]
30. Alhammoud, B.; Bouvet, M.; Jackson, J.; Arias, M.; Thepaut, O.; Lafrance, B.; Gascon, F.; Cadau, E.; Berthelot, B.; Francesconi, B. On the vicarious calibration methodologies in DIMITRI: Applications on Sentinel-2 and Landsat-8 products and comparison with in-situ measurements. In Proceedings of the ESA Living Planet Symposium, Prague, Czech Republic, 9–13 May 2016.
31. Holben, B.N.; Kaufman, Y.J.; Kendall, J.D. NOAA-11 AVHRR visible and near-IR in-flight calibration. Int. J. Remote Sens. 1990, 11, 1511–1519. [CrossRef]
32. CEOS-PICS. Available online: http://calvalportal.ceos.org (accessed on 22 May 2017).
33. Cosnefroy, H.; Leroy, M.; Briottet, M. Selection and characterisation of Saharan and Arabian desert sites for the calibration of optical satellite sensors. Remote Sens. Environ. 1996, 58, 2713–2715. [CrossRef]
34. International Ocean-Colour Coordinating Group (IOCCG). In-Flight Calibration of Satellite Ocean-Colour Sensors; Frouin, R., Ed.; IOCCG: Dartmouth, Canada, 2013.
35. Kääb, A.; Winsvold, S.H.; Altena, B.; Nuth, C.; Nagler, T.; Wuite, J. Glacier Remote Sensing Using Sentinel-2. Part I: Radiometric and Geometric Performance, and Application to Ice Velocity. Remote Sens. 2016, 8, 598. [CrossRef]
36. Lacherade, S.; Fougnie, B.; Henry, P.; Gamet, P. Cross calibration over desert sites: Description, methodology, and operational implementation. IEEE Trans. Geosci. Remote Sens. 2013, 51, 1098–1113. [CrossRef]
37. Bouvet, M. Intercomparison of imaging spectrometer over the Salar de Uyuni (Bolivia). In Proceedings of the 2006 MERIS AATSR Validation Team Workshop, Frascati, Italy, 20–24 March 2006.
38. Adriaensen, S.; Barker, K.; Bourg, L.; Bouvet, M.; Fougnie, B.; Govaerts, Y.; Henry, P.; Kent, C.; Smith, D.; Sterckx, S. CEOS IVOS Working Group 4: Intercomparison of vicarious calibration methodologies and radiometric comparison methodologies over pseudo-invariant calibration sites. Available online: http://calvalportal.ceos.org/ceos-wgcv/ivos/wg4/final-report (accessed on 22 May 2017).

39. Barsi, J.A.; Schott, J.R.; Hook, S.J.; Raqueno, N.G.; Markham, B.L.; Radocinski, R.G. Landsat-8 Thermal Infrared Sensor (TIRS) Vicarious Radiometric Calibration. Remote Sens. 2014, 6, 11607–11626. [CrossRef]
40. Van der Werff, H.; van der Meer, F. Sentinel-2A MSI and Landsat 8 OLI Provide Data Continuity for Geological Remote Sensing. Remote Sens. 2016, 8, 883. [CrossRef]
41. Vermote, E.F.; Tanré, D.; Deuzé, J.L.; Herman, M.; Morcrette, J.-J. Second Simulation of the Satellite Signal in the Solar Spectrum, 6S: An Overview. IEEE Trans. Geosci. Remote Sens. 1997, 35, 675–686. [CrossRef]
42. Berk, A.; Anderson, G.P.; Acharya, P.K.; Bernstein, L.S.; Muratov, L.; Lee, J.; Fox, M.; Adler-Golden, S.M.; Chetwynd, J.H.; Hoke, M.L.; et al. MODTRAN™ 5, A Reformulated Atmospheric Band Model with Auxiliary Species and Practical Multiple Scattering Options: Update. Algorithms Technol. Multispectr. Hyperspectr. Ultraspectr. Imag. 2005, 5655. [CrossRef]
43. Viallefont-Robinet, F.; Léger, D. Improvement of the edge method for on-orbit MTF measurement. Opt. Express 2010, 18, 3531–3545. [CrossRef] [PubMed]
44. Fourest, S.; Briottet, X.; Lier, P.; Valorge, C. Satellite Imagery From Acquisition Principle to Processing of Optical Images for Observing the Earth; CEPADUES Editions: Toulouse, France, 2012.
45. Holben, B.N.; Eck, T.F.; Slutker, I.; Tanré, D.; Buis, J.P.; Setzer, A.; Vermote, E.; Reagan, J.A.; Kaufman, Y.J.; Nakajima, T.; et al. AERONET—A federated instrument network and data archive for aerosol characterization. Remote Sens. Environ. 1998, 66, 1–16. [CrossRef]
46. S2MPC—Calibration and Validation Plan for the MPC Routine Operation Phase; S2-PDGS-MPC-CCVP, Issue 12; Telespazio VEGA Deutschland GmbH: Darmstadt, Germany, 2017.
47. Immitzer, M.; Vuolo, F.; Atzberger, C. First Experience with Sentinel-2 Data for Crop and Tree Species Classifications in Central Europe. Remote Sens. 2016, 8, 166. [CrossRef]
48. Pflug, B.; Main-Knorn, M.; Makarau, A.; Richter, R. Validation of aerosol estimation in atmospheric correction algorithm ATCOR. In Proceedings of the 36th International Symposium on Remote Sensing of Environment (ISRSE), Berlin, Germany, 11–15 May 2015. [CrossRef]
49. Breon, F.M.; Vermeulen, A.; Descloitres, J. An evaluation of satellite aerosol products against sunphotometer measurements. Remote Sens. Environ. 2011, 115, 3102–3111. [CrossRef]
50. Vuolo, F.; Zółtak, M.; Pipitone, C.; Zappa, L.; Wenng, H.; Immitzer, M.; Weiss, M.; Baret, F.; Atzberger, C. Data Service Platform for Sentinel-2 Surface Reflectance and Value-Added Products: System Use and Examples. Remote Sens. 2016, 8, 938. [CrossRef]
51. Vermote, E.; Justice, C.; Claverie, M.; Franch, B. Preliminary Analysis of the Performance of the Landsat 8/OLI Land Surface Reflectance Product. Remote Sens. Environ. 2015, 185, 46–56. [CrossRef]
52. Dörnhöfer, K.; Göritz, A.; Gege, P.; Pflug, B.; Oppelt, N. Water Constituents and Water Depth Retrieval from Sentinel-2A—A First Evaluation in an Oligotrophic Lake. Remote Sens. 2016. [CrossRef]
53. Toming, K.; Kutser, T.; Laas, A.; Sepp, M.; Paavel, B.; Nõges, T. First Experiences in Mapping Lake Water Quality Parameters with Sentinel-2 MSI Imagery. Remote Sens. 2016. [CrossRef]

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).