<<

Article Dynamic Water Surface Detection Algorithm Applied on PROBA-V Multispectral Data

Luc Bertels *, Bruno Smets and Davy Wolfs

Center for Remote Sensing and Observation Processes, Flemish Institute for Technological Research (VITO), Boeretang 200, B-2400 Mol, Belgium; [email protected] (B.S.); [email protected] (D.W.) * Correspondence: [email protected]; Tel.: +32-14-336-861

Academic Editors: Clement Atzberger, Magda Chelfaoui and Prasad S. Thenkabail Received: 10 August 2016; Accepted: 1 December 2016; Published: 10 December 2016

Abstract: Water body detection worldwide using spaceborne remote sensing is a challenging task. A global scale multi-temporal and multi-spectral image analysis method for water body detection was developed. The PROBA-V microsatellite has been fully operational since December 2013 and delivers daily near-global synthesis with a spatial resolution of 1 km and 333 m. The Red, Near-InfRared (NIR) and Short Wave InfRared (SWIR) bands of the atmospherically corrected 10-day synthesis images are first Hue, Saturation and Value (HSV) color transformed and subsequently used in a decision tree classification for water body detection. To minimize commission errors four additional data layers are used: the Normalized Difference Vegetation Index (NDVI), Water Body Potential Mask (WBPM), Permanent Glacier Mask (PGM) and Volcanic Soil Mask (VSM). Threshold values on the hue and value bands, expressed by a parabolic function, are used to detect the water bodies. Beside the water bodies layer, a quality layer, based on the water bodies occurrences, is available in the output product. The performance of the Water Bodies Detection Algorithm (WBDA) was assessed using Landsat 8 scenes over 15 regions selected worldwide. A mean Commission Error (CE) of 1.5% was obtained while a mean Omission Error (OE) of 15.4% was obtained for minimum Water Surface Ratio (WSR) = 0.5 and drops to 9.8% for minimum WSR = 0.6. Here, WSR is defined as the fraction of the PROBA-V pixel covered by water as derived from high spatial resolution images, e.g., Landsat 8. Both the CE = 1.5% and OE = 9.8% (WSR = 0.6) fall within the user requirements of 15%. The WBDA is fully operational in the Copernicus Global Land Service and products are freely available.

Keywords: PROBA-V; water body detection; color space transformation; HSV; decision tree classification; occurrence estimation

1. Introduction Inland water surface mapping is important to budget freshwater supply for human and animals [1–3], and agriculture [4–6], but also to monitor ecological issues and to perform ecosystem management [7–10]. However, water body detection worldwide using spaceborne remote sensing is a challenging task because many objects on the Earth’s surface have similar spectral properties (within the spectral range of the applied sensor) as water bodies. Nevertheless, several spaceborne missions have been used to address this. The Moderate Resolution Imaging Spectrometer (MODIS) sensor was used to create the 250 m land-water mask (MOD44W) for the year 2000 [11]. Recently, MODIS data were used to assess global inland water body dynamics on a daily basis [12]. The SPOT VEGETATION (VGT) sensor was first used to create the dynamic Small Water Bodies (SWB) monitoring product for the African continent [13]. The same sensor is used to construct the global water bodies products for the whole archive, years 1999 to 2014, based on the work presented in this paper and becoming available by the end of 2016. The thematic Mapper (TM) and Enhanced Thematic Mapper Plus (ETM+) sensors ( and , respectively), which allow high-resolution mapping of small water

Remote Sens. 2016, 8, 1010; doi:10.3390/rs8121010 www.mdpi.com/journal/remotesensing Remote Sens. 2016, 8, 1010 2 of 25 bodies, were used for water quality assessment [14] and water bodies extraction [15]. The latter were also used for mapping inland surface water at global scale for the years 2000 and 2010 resulting in the Global Land 30-water 2000 and Global Land 30-water 2010 datasets, respectively [16]. The Global Land Survey (GLS) Landsat data collection was used to produce a global, 30-m-resolution inland surface water body dataset (GIW) for circa-2000 [17], as well as to produce the Global 3 arc-second Water Body Map (G3WBM) [18] and to study Earth’s surface water change over the past 30 years [19]. Progress was made with Landsat 8, containing the nine-band Operational Land Imager (OLI) sensor and the Thermal InfraRed Sensor (TIRS) featuring two spectral bands. The cirrus band (band 9) allows for a better cloud screening while the coastal band might improve water detection [20]. Recently, all Landsat 8 acquisitions between June 2013 and June 2014 were processed using the Engine platform to map water surfaces at global scale [21]. Although these sensors have a high spatial resolution (30 m), their revisit time is long (16 days), which makes near real-time water body mapping at global scale impossible. Medium resolution missions, like PROBA-V, with a near-daily revisit time are more suited for near real-time water body mapping at global scale. The PROBA-V microsatellite was launched in May 2013 as a successor for the SPOT-VEGETATION missions. The instrument has been fully operational since December 2013 and delivers daily near-global synthesis with a spatial resolution of 1 km and 333 m. The high revisit time and the four spectral bands (Blue, Red, NIR and SWIR) make this instrument ideal for climate impact assessments, agricultural monitoring, food security estimates and surface water resources management. Several techniques are available for water surface extraction. Supervised classification methods make use of ground reference data [22], while unsupervised classification methods first search for suitable endmembers [22,23]. Linear spectral unmixing uses endmembers to unravel the image spectra and, in this case, decide on the fraction of water within each pixel spectrum [24]. Spectral indices are the most frequently used. Well known is the normalized difference water index (NDWI) using the reflectance of the green and NIR bands [25]. Bands thresholding makes use of thresholds that are applied directly on the reflectance bands, on derived spectral indices or on transformed bands [1,26,27]. An overview of global-scale water body products developed in recent years using different techniques is presented by Yamazaki [18]. The objective of the research presented was to develop a global, multi-temporal and multi-spectral image analysis method for near real-time dynamic water body detection using PROBA-V 10-day composite images and which is based on the approach developed by Pekel [28]. This algorithm, which applies bands thresholding, was previously developed for MODIS and, after extensive validation, proved to be very successful [28]. The final output product (i.e., “Area of Water Bodies detected worldwide”) identifies the pixels covered by water on a global scale. The areas of water bodies are understood here with respect to the instrument resolution, i.e., surfaces more or less covered by water with a size of one PROBA-V pixel, about 1 km2 [29].

2. Materials and Methods

2.1. Overview As shown by the general overview in Figure1, the Water Body Detection Algorithm (WDBA) uses five different sets of input data, which are processed in four different steps. Firstly, the PROBA-V daily Top-Of-Canopy synthesis products (S1-TOC) are assembled into a 10-day mean composite (MC10). During the mean compositing process, the MC10 Status Map (MC10-SM) is constructed, which, in turn, is used as input for the water body detection algorithm. Secondly, the SWIR, NIR and Red bands are transformed to HUE, SATURATION and VALUE using a RGB to HSV color transformation, a technique often used in image processing. Thirdly, the application of specific threshold values on HUE and VALUE applied per pixel allows water body detection. To minimize commission errors, four additional data layers are taken into account, i.e., the MC10 Status Map, Water Body Potential Remote Sens. 2016, 8, 1010 3 of 25 Remote Sens. 2016, 8, 1010 3 of 25

MaskMap, (WBPM),Water Body Permanent Potential Glacier Mask Mask(WBPM), (PGM) Permanent and the VolcanicGlacier Mask Soils Mask(PGM) (VSM). and the Finally, Volcanic to obtain Soils qualitativeMask (VSM). information Finally, aboutto obtain the detected qualitative water information bodies, an occurrence about the is calculateddetected whichwater isbodies, provided an inoccurrence the output is productcalculated as which a quality is provided layer. The in input the ou datatput and product the different as a quality steps layer. of the The algorithm input data are describedand the different in more steps detail of in the the algorithm subsequent are paragraphs. described in more detail in the subsequent paragraphs.

FigureFigure 1.1. General overview of thethe WaterWater BodiesBodies DetectionDetection Algorithm.Algorithm. Five sets of inputinput datadata areare processedprocessed inin fourfour main main steps steps to to finally finally obtain obtain two two output output layers, layers, i.e., i.e., the the detected detected water water bodies bodies and and the waterthe water bodies bodies occurrence. occurrence.

2.2. Input Data 2.2. Input Data

2.2.1.2.2.1. Daily Top-Of-Canopy SynthesisSynthesis ProductProduct (S1-TOC)(S1-TOC) TheThe PROBA-VPROBA-V instrumentinstrument ensuresensures aa dailydaily coveragecoverage betweenbetween Lat.Lat. 3535°N◦N andand 7575°N,◦N, andand 3535°S◦S andand 5656°S,◦S, and a fullfull coveragecoverage every two days at the equator.equator. The S1-TOC product is provided at 1 kmkm spatialspatial resolution. Surface Surface reflectance reflectance is isavailable available for for four four spectral spectral bands: bands: Blue, Blue, 0.447–0.493 0.447–0.493 µm, Red,µm, Red,0.610–0.690 0.610–0.690 µm; µNIR,m; NIR, 0.770–0.893 0.770–0.893 µm;µ andm; and SWIR, SWIR, 1.570–1.650 1.570–1.650 µm.µ m.The The atmospheric atmospheric correction correction is isperformed performed using using SMAC SMAC 4.0 4.0 [30]. [30]. Standard Standard inpu inputt data data layers layers includes NormalizedNormalized DifferenceDifference VegetationVegetation IndexIndex (NDVI),(NDVI), geometricgeometric viewingviewing andand illuminationillumination conditions,conditions, referencereference toto datedate andand timetime ofof observations,observations, statusstatus mapmap (containing(containing identificationidentification ofof snow,snow, ice,ice, clouds,clouds, land/sealand/sea for every pixel) andand fourfour reflectancereflectance bands.bands. MoreMore infoinfo cancan bebe foundfound inin thethe ProductsProducts UserUser ManualManual [[31].31].

2.2.2.2.2.2. MC10 Status MapMap (MC10-SM)(MC10-SM) AtAt thethe final final stage stage of theof the mean mean compositing compositing step, st aep, status a status map is map created. is created. The status The map status indicates map whichindicates observation which observation type was type used was in theused mean in the compositing mean compositing step, i.e., step, un-obscured, i.e., un-obscured, cloud orcloud snow. or Itsnow. also hasIt alsoa flag has thata flag indicates that indicates if theobservation if the observat wasion done was abovedone above land or land sea. or Finally, sea. Finally, it has it for has each for reflectanceeach reflectance band anband additional an additional quality quality flag that flag is setthat to is 0 set in caseto 0 in no case compositing no compositing is done. is This done. is causedThis is whencaused there when was there no validwas no observation valid observation to be used to inbe the used 10-day in the period. 10-day The period. water The bodies water detection bodies processdetection only process takes only un-obscured takes un-obscured land pixels land into pixels account. into account.

2.2.3.2.2.3. Water BodyBody PotentialPotential MaskMask (WBPM)(WBPM) PixelsPixels locatedlocated in in mountainous mountainous areas areas often often have have lowered lowered reflectance reflectance values values due todue shadow to shadow or dark or vegetationdark vegetation and are and as suchare as confused such confused with water with bodies water because bodies theirbecause VALUE their dropsVALUE below drops the below threshold the level.threshold Moreover, level. Moreover, water bodies water cannot bodies exist cannot or appear exis fort or pixels appear located for pixels in hilly located terrain in havinghilly terrain steep slopeshaving and steep a large slopes height and differencea large he withight theirdifference neighboring with their pixels. neighboring To minimize pixels. the commission To minimize errors the incommission hilly terrain errors a Water in hilly Bodies terrain Potential a Water Mask Bodies (WBPM), Potential derived Mask from (WBPM), the Global derived Land from Survey the DigitalGlobal ElevationLand Survey Model Digital (GLSDEM) Elevation [32 Model] (data (GLSDEM) collected in[32] the (d earlyata collected 2000s), isin usedthe early as an 2000s), extra is input used in as the an waterextra input bodies in detection the water process. bodies The detection GLSDEM process. has a 90The m GLSDEM spatial and has 1 m a vertical90 m spatial resolution. and 1 Comparedm vertical resolution. Compared to the PROBA-V 1 km product the spatial resolution of the GLSDEM is much

Remote Sens. 2016, 8, 1010 4 of 25

Remote Sens. 2016, 8, 1010 4 of 25 to the PROBA-V 1 km product the spatial resolution of the GLSDEM is much higher. To prepare the GLSDEMhigher. To for prepare use in the water GLSDEM body detection, for use flatin water areas body were considereddetection, flat and areas extracted were as considered a WBPM. Thisand maskextracted was as constructed a WBPM. This in three mask steps. was constructed in three steps. • First the pixels having the lowest elevation in their local neighborhood are located, i.e., a pixel is indicated as a lowest pointpoint whenwhen itsits elevationelevation isis lowerlower thanthan oror equalequal toto itsits eighteight neighbors.neighbors. • Next, thethe detecteddetected lowest lowest points points are are filtered filtered and and expanded expanded depending depending on the on topography the topography in order in toorder obtain to obtain a 90 m a spatial90 m spatial resolution resolution potential pote waterntial bodywater mask.body mask. For each For detected each detected lowest lowest point, anpoint, imaginary an imaginary water levelwater is level raised is inraised steps in of steps 1 m tillof the1 m maximumtill the maximum rise of 5 rise m or of the 5 m flooding or the conditionflooding condition is reached. is Floodingreached. Flooding in this context in this means context detection means ofdetection a neighboring of a neighboring pixel having pixel an elevationhaving an that elevation is lower that than is lower the rise than level. the As rise long level. as theAs long edge as of the edg potentiale of the water potential body water is not flooded,body is not its flooded, area is extended its area is according extended to accordin the raisedg to level, the raised i.e., all level, neighboring i.e., all neighboring pixels having pixels the additionalhaving the elevation additional are addedelevation to theare potential added waterto the body potential area. Thiswater is schematicallybody area. This shown is inschematically a two dimensional shown representationin a two dimensional in Figure representation2. The initial detected in Figure lowest 2. The point initial contains detected two pixels.lowest Risingpoint contains the water two level pixels. 1 m will Rising expand the the wa potentialter level water1 m will body expand with an the additional potential5 pixels.water Becausebody with no an flooding additional occurs 5 pixels. the water Because level no isflooding increased occurs again the with water 1 m.level Now, is increased an additional again 3with pixels 1 m.are Now, added an to additional the potential 3 pixels water are body. added Further to the rising potential the water water level body. will Further start flooding rising the water bodylevel becausewill start the flooding first pixel the elevation water body right tobecause the expanded the first water pixel body elevation is lower. right If the to sizethe ofexpanded the final water expanded body water is lower. body If isthe less size than of ninethe final pixels, expanded it is removed water from body further is less processing than nine andpixels, therefore it is removed not considered from further as a potential processing water and body. therefore Otherwise, not considered the pixels as initially a potential detected water as lowestbody. pointOtherwise, and surrounded the pixels by initially eight neighbors detected of as equal lowest height point are markedand surrounded as level-1. Theby othereight initiallyneighbors detected of equal lowest height points are marked and the as expanded level-1. The pixels other are initially marked detected as level-2. lowest The points two levels and arethe expanded used in the pixels laststep are marked when resizing as level-2. the The potential two levels water are body used map in the to last the step PROBA when -V resizing spatial resolutionthe potential to obtainwater body the final map WBPM. to the PROBA -V spatial resolution to obtain the final WBPM. • In thethe last last step, step, the the 90 m90 spatial m spatial resolution resolution potential potential water bodywater mask body is resampledmask is resampled to the PROBA-V to the 1PROBA-V km spatial 1 resolution.km spatial Forresolution. each pixel For in each the georeferenced pixel in the georeferenced PROBA-V image, PROBA-V the corresponding image, the pixelscorresponding in the georeferenced pixels in the potentialgeoreferenced water pote bodyntial mask water are located.body mask The are 1 kmlocated. resized The pixel 1 km is indicatedresized pixel as a is potential indicated water as a bodypotential when water at least body one when of the at correspondingleast one of the90 corresponding m pixels is labeled 90 m aspixels level-1 is labeled or when as atlevel-1 least nineor when of the at correspondingleast nine of the 90 corresponding m pixels are labeled 90 m pixels as level-2. are labeled as level-2.

Figure 2. Expanding the initially detected lowest point by systematically rising an imaginary water level in in steps steps of of 1 1 m. m. The The corresponding corresponding 90 90 m mspatial spatial resolution resolution pixels pixels are are indi indicatedcated by the by thedots dots at the at thebottom. bottom.

An example of the final final obtained WBPM, for the Rift Valley test area in Ethiopia (Table A1 in AppendixA A),), isis shownshown inin FigureFigure3 .3. A A profile profile according according to to the the red red line line on on the the images images is is shown shown in in the the elevation plot. Seven potential water bodies are lo locatedcated along the transect and are indicated by the blue shaded boxes. Two of those potential water bodies are real lakes, Lake Abijato and Lake Langano, and are indicated in Figure 3a.

Remote Sens. 2016, 8, 1010 5 of 25 blue shaded boxes. Two of those potential water bodies are real lakes, Lake Abijato and Lake Langano, Remoteand are Sens. indicated 2016, 8, 1010 in Figure3a. 5 of 25

Figure 3. (a) PROBA-V mean composite for the dekad 21 October 2013 taken over the Rift Valley in Figure 3. (a) PROBA-V mean composite for the dekad 21 October 2013 taken over the Rift Valley in Ethiopia (SWIR, NIR and Red bands assigned to the Red, Green and Blue (RGB) channels, respectively). Ethiopia (SWIR, NIR and Red bands assigned to the Red, Green and Blue (RGB) channels, Some of the larger lakes can be easily recognized. (b) The Water Body Probability Mask for the same respectively). Some of the larger lakes can be easily recognized. (b) The Water Body Probability Mask area. The white pixels indicate locations of potential water bodies. The plot shows the vertical profile for the same area. The white pixels indicate locations of potential water bodies. The plot shows the according to the red line. Marked with the blue boxes are the seven larger potential water bodies. vertical profile according to the red line. Marked with the blue boxes are the seven larger potential The arrows indicate the location of: Lake Abijato (1); and Lake Langano (2). water bodies. The arrows indicate the location of: Lake Abijato (1); and Lake Langano (2).

2.2.4.2.2.4. Permanent Permanent Glacier Glacier Mask Mask (PGM) (PGM) ToTo avoidavoid commissioncommission errors errors due due to confusionto confusion with with permanent permanent glaciers, glaciers, which which often have often spectral have spectralproperties properties similar tosimilar water to bodies, water bodies, a permanent a permanent glacier maskglacier was mask constructed. was constructed. The most The recentmost recentpermanent permanent glacier glacier data were data downloaded were downloaded from the from National the National Snow and Snow Ice Dataand Ice Centre Data [33 Centre]. The [33]. data Thewere data provided were provided as a shape as file, a shape which file, was which subsequently was subsequently rasterized to rasterized the PROBA-V to the pixel PROBA-V size to obtainpixel sizethe permanentto obtain the glacier permanent mask atglacier global mask scale. at global scale. 2.2.5. Volcanic Soil Mask (VSM) 2.2.5. Volcanic Soil Mask (VSM) Locations comprised of dark volcanic soils which are spread over wider flat areas are easily Locations comprised of dark volcanic soils which are spread over wider flat areas are easily confused with water bodies because their pixel VALUE drops below the threshold level. To account confused with water bodies because their pixel VALUE drops below the threshold level. To account for these commission errors, a volcanic soil mask was made. A list of volcanoes worldwide was for these commission errors, a volcanic soil mask was made. A list of volcanoes worldwide was obtained from the Smithsonian Institution, National Museum of Natural History, Global Volcanism obtained from the Smithsonian Institution, National Museum of Natural History, Global Volcanism Program. The Holocene Volcano List was used as a reference for the construction of a volcanic soil Program. The Holocene Volcano List was used as a reference for the construction of a volcanic soil mask. Volcanic soils could be easily located on Google Earth using the information from the Holocene mask. Volcanic soils could be easily located on Google Earth using the information from the Volcano List. These areas were manually delineated on Google Earth and were subsequently exported Holocene Volcano List. These areas were manually delineated on Google Earth and were and converted to shape files from which a volcanic soil mask was made by rasterizing the shape files subsequently exported and converted to shape files from which a volcanic soil mask was made by to the PROBA-V pixel size at global scale. rasterizing the shape files to the PROBA-V pixel size at global scale.

2.3. Mean Compositing In the first processing step, temporal mean compositing over a 10-day window is performed to obtain (to the degree possible) a cloud free image. The MC10 composites (SWIR, NIR, Red, and Blue) are obtained using the mean compositing method [34] applied on the PROBA-V S1 daily reflectance values over a 10-day window. The mean compositing algorithm improves the radiometric quality of the temporal synthesis by averaging the reflectances and therefore reduces the random component

Remote Sens. 2016, 8, 1010 6 of 25

2.3. Mean Compositing In the first processing step, temporal mean compositing over a 10-day window is performed to obtain (to the degree possible) a cloud free image. The MC10 composites (SWIR, NIR, Red, and Blue) are obtained using the mean compositing method [34] applied on the PROBA-V S1 daily reflectance Remote Sens. 2016, 8, 1010 6 of 25 values over a 10-day window. The mean compositing algorithm improves the radiometric quality of the temporalof the noise. synthesis The bymean averaging composite the reflectances is calculated, and pixel therefore by reducespixel, across the random the four component spectral ofband the noise.simultaneously. The mean compositeFor each pixel is calculated, location, pixelthe input by pixel, data across over a the time four period spectral of band10 days simultaneously. are handled Forsequentially. each pixel location,Only valid the inputreflectances data over of aa timegiven period pixel of for 10 daysa given are handleddate in sequentially.the 10-day Onlyperiod valid are reflectancesconsidered in of the a given computation. pixel for aThese given valid date inreflecta the 10-daynces are period determined are considered using the in actual the computation. reflectance Thesevalue. validThe PROBA-V reflectances S1 are input determined images define using a the “No actual data” reflectance reflectance value. value The for PROBA-Veach spectral S1 inputband. imagesAll four define spectral a “No bands data” must reflectance have valuea valid for refl eachectance spectral value band. otherwise All four spectralthe given bands day must does have not acontribute valid reflectance to the valuemean otherwise composition the given for daythat doespixel not location. contribute To to achiev the meane consistency composition with for thatthe pixelSPOT-VGT location. sensor, To achieve and enable consistency the creation with theof a SPOT-VGT large time-series, sensor, a and spectral enable co therrection creation is applied of a large on time-series,the different a spectralspectral correctionbands before is applied calculating on the th differente mean reflectance spectral bands values. before The calculating mean compositing the mean reflectancealgorithm always values. tries The to mean return compositing the most optimal algorithm result. always To achieve tries to this, return it only the mostcomposites optimal the result. most Tooptimal achieve daily this, observations. it only composites For each the day most in optimal the 10-day daily period, observations. a pixel’s For observation each day in class the 10-dayis first period,determined a pixel’s as depicted observation in Figure class is4. first determined as depicted in Figure4.

FigureFigure 4.4. TheThe MC10MC10 algorithmalgorithm assignsassigns anan observationobservation classclass toto eacheach pixelpixel locationlocation eacheach andand everyevery dayday thatthat isis computed.computed.

Un-obscured land pixels are preferred before pixels that are classified as snow or cloudy, with Un-obscured land pixels are preferred before pixels that are classified as snow or cloudy, with cloudy pixels having the least priority. Input pixels not marked as land are not processed. To cloudy pixels having the least priority. Input pixels not marked as land are not processed. To compute compute the mean value of all reflectance bands and the SZA, the algorithm keeps track of the the mean value of all reflectance bands and the SZA, the algorithm keeps track of the reflectance bands reflectance bands and SZA sum of the corresponding pixel values and the number of pixels and SZA sum of the corresponding pixel values and the number of pixels contributing to these sums, contributing to these sums, for a given observation class. If a more preferred observation class is for a given observation class. If a more preferred observation class is detected for a pixel location, detected for a pixel location, the sum and number of observation of the less preferred class are the sum and number of observation of the less preferred class are discarded and a new sum and discarded and a new sum and number of observations is calculated, containing only pixel values of number of observations is calculated, containing only pixel values of the new class. Once all days in the new class. Once all days in the compositing window are computed, the reflectance band sum and the compositing window are computed, the reflectance band sum and SZA sum are divided by the SZA sum are divided by the number of observations to obtain the mean reflectance values and an number of observations to obtain the mean reflectance values and an averaged SZA. averaged SZA. 2.4. Water Bodies Detection by Decision Tree Classification 2.4. Water Bodies Detection by Decision Tree Classification An essential step before the actual water bodies detection takes place is the transformation of the Red, NIRAn essential and SWIR step bands before to the HSVactual (HUE, water SATURATION bodies detection and takes VALUE) place coloris the space transformation as described of the Red, NIR and SWIR bands to the HSV (HUE, SATURATION and VALUE) color space as described by Pekel [28]. The detection of water bodies involves three main steps: (i) invalid data filtering; (ii) incompatibility masking; and (iii) water body detection. The decisions made in these three steps were implemented in a Decision Tree Classifier (DTC). Once water bodies are detected, the occurrence of the detection is estimated. An overview of the water body detection algorithm is shown in Figure 5.

Remote Sens. 2016, 8, 1010 7 of 25 by Pekel [28]. The detection of water bodies involves three main steps: (i) invalid data filtering; (ii) incompatibility masking; and (iii) water body detection. The decisions made in these three steps were implemented in a Decision Tree Classifier (DTC). Once water bodies are detected, the occurrence ofRemote the Sens. detection 2016, 8, is 1010 estimated. An overview of the water body detection algorithm is shown in Figure7 of 525.

Figure 5. The Water Body Detection Algorithm consists of three main steps: (i) invalid data filtering; Figure 5. The Water Body Detection Algorithm consists of three main steps: (i) invalid data filtering; (ii) incompatibility masking; and (iii) water body detection. Qualitative information about the (ii) incompatibility masking; and (iii) water body detection. Qualitative information about the detected detected water bodies is expressed by the occurrence estimation. Both layers, i.e., the detected water water bodies is expressed by the occurrence estimation. Both layers, i.e., the detected water bodies and bodies and the occurrences, are available in the Water Bodies V2 product. the occurrences, are available in the Water Bodies V2 product.

2.4.1. Invalid Data Filtering 2.4.1. Invalid Data Filtering A low sun elevation causes elongated shadows, which are observed in the remote sensing A low sun elevation causes elongated shadows, which are observed in the remote sensing imagery imagery as darkened Earth surfaces. These might be detected as water bodies when the VALUE as darkened Earth surfaces. These might be detected as water bodies when the VALUE drops below drops below the threshold level. A check on the MC10’s averaged Solar Zenith Angle (SZA) the threshold level. A check on the MC10’s averaged Solar Zenith Angle (SZA) prohibits the detection prohibits the detection of invalid WBs, i.e., when a MC10 pixel has a SZA larger than 65° no WB of invalid WBs, i.e., when a MC10 pixel has a SZA larger than 65◦ no WB detection is done and the detection is done and the pixel is classified as “No Data”. pixel is classified as “No Data”. The next step in the WBDA is filtering the invalid data comprising snow, cloud and cloud The next step in the WBDA is filtering the invalid data comprising snow, cloud and cloud shadow, shadow, using the information from the MC10 Status Map. Each image pixel has a bit array assigned using the information from the MC10 Status Map. Each image pixel has a bit array assigned reflecting reflecting its status, pixels indicated as cloud or snow are not considered in the WB detection its status, pixels indicated as cloud or snow are not considered in the WB detection algorithm. Cloud algorithm. Cloud shadow on the Earth’s surface often has similar spectral properties as WBs and is shadow on the Earth’s surface often has similar spectral properties as WBs and is as such regularly as such regularly classified as WB. Therefore, to avoid confusion due to cloud shadow in the classified as WB. Therefore, to avoid confusion due to cloud shadow in the immediate neighborhood immediate neighborhood of the initial defined cloud pixel, an additional dilate filtering is of the initial defined cloud pixel, an additional dilate filtering is performed. In the case a pixel is performed. In the case a pixel is indicated as cloud, its twelve neighboring pixels (the circular indicated as cloud, its twelve neighboring pixels (the circular neighborhood with radius 2 pixels) are neighborhood with radius 2 pixels) are indicated as cloud as well and are therefore not taken into indicated as cloud as well and are therefore not taken into account for WB detection. account for WB detection.

2.4.2. Incompatibility Masking In addition to glaciers, dark soils and shadows, vegetated surfaces are also often a cause of confusion with WBs. Therefore, the NDVI value, computed as Equation (1) from the Red (ρRed) and NIR (ρNIR) reflectances of MC10, is used to identify the vegetated areas. − = , (1) + Pixels having an NDVI value higher or equal to 0.32 and having a WBPM pixel set (locating lowland areas where holding a WB is possible) are considered to be “Lowland Vegetation”. Some WBs have a high NDVI value (due to algae load or adjacency effect of the surrounding vegetation)

Remote Sens. 2016, 8, 1010 8 of 25

2.4.2. Incompatibility Masking In addition to glaciers, dark soils and shadows, vegetated surfaces are also often a cause of confusion with WBs. Therefore, the NDVI value, computed as Equation (1) from the Red (ρRed) and NIR (ρNIR) reflectances of MC10, is used to identify the vegetated areas.

ρNIR − ρRed NDVI = , (1) ρNIR + ρRed

Pixels having an NDVI value higher or equal to 0.32 and having a WBPM pixel set (locating lowland areas where holding a WB is possible) are considered to be “Lowland Vegetation”. Some WBs have a high NDVI value (due to algae load or adjacency effect of the surrounding vegetation) and could as such wrongly be classified as “Lowland Vegetation”. To avoid these omission errors an additional check on VALUE is added, this threshold level is denoted as NV. If the VALUE of a pixel which is designated as “Lowland Vegetation”, is less or equal to 0.11 (a value empirically defined) it is classified as “Water”. These thresholds were empirically defined over the Rift Valley test area. The Rift Valley area is unique in the sense that it contains some larger lakes with a wide variety of reflectance values, some are bright blue, others are dark blue while others are almost black (see Figure3). Besides, these lakes are surrounded with different shades of green vegetation (especially after the rain season). This situation easily allows observing how different thresholds levels influence the detection of WBs.

2.4.3. Water Body Detection Threshold values on the HUE and VALUE bands are used for detecting WBs worldwide. The thresholds were iteratively and empirically defined by decision tree classifications using different settings for the thresholds levels. First, initial fixed thresholds were obtained by observing the HUE and VALUE for the lakes in the Ethiopian Rift Valley test region shown in Figure3a. The initial fixed threshold for HUE and VALUE was set to 100 and 0.13, respectively. Subsequently, the threshold levels were iteratively adapted in different steps (VALUE = 0.13, 0.14; HUE = 100, 95; NDVI = 0.32, 0.42; and NV = 0.11, 0.13) while introducing an innovative refinement using quadratic functions, which were empirically defined. The performance of the various cases were evaluated and the thresholds were calibrated over 19 test regions worldwide (Table A2 of AppendixA) according to the Water Surface Ratio (WSR) and the metrics Commission Error (CE) and Omission Error (OE) as defined by Equations (C1)–(C3) in AppendixC. The choice of the final threshold levels is a balance between omission and commission errors: the best settings in terms of fixed thresholds are VALUE = 0.14, HUE = 100, NDVI = 0.32, and NV = 0.11 while the best quadratic functions (shown by the red lines in Figure6) are defined as:

1. The left part (for the HUE value = 0 till 34) can be written as:

(34 − Hue )2 × Scale Value = L L + O f f set , (2) L 2 L

HueL = [0 . . . 34], (3) 0.41 Scale = , (4) L 342

O f f setL = 0.14, (5)

2. The right part (for the HUE value = 34 till 100) is a bit more complex because it is based on a parabola rotated over 0.2◦ for which the rotated coordinates can be written as:

x = Hue × cos(α) − Hue2 × sin(α), (6)

y = Hue × sin(α) + Hue2 × cos(α), (7) Remote Sens. 2016, 8, 1010 9 of 25

This equation can be solved using the standard form: √ −b ± b2 − 4ac x = , (8) 2a

From which, only one solution is used to calculate the corresponding VALUE threshold: q 2 −(−cos(α)) − (−cos(α)) − 4 × sin(α) × HueR x = , (9) 2 × sin(α) Remote Sens. 2016, 8, 1010 9 of 25 2  x × sin(α) + x × cos(α) × ScaleR Value = + O f f set , (10) R 2 R =0.2°, (11) with α = 0.2◦, (11) = 34 … 360 −28.5, (12) Hue = [34 . . . 360] − 28.5, (12) R 1 = 1 , (13) Scale = 95000, (13) L 95000 =0.14, (14) O f f setR = 0.14, (14) TheThe division division by by two two in in both both Equation Equation (2) (2) and and Equation Equation (10) (10) is doneis done to to lower lower the the parabolic parabolic curve curve inin order order to reduceto reduce the the commission commission errors. errors. The The value valu 28.5e in28.5 Equation in Equation (12) was (12) empirically was empirically defined, defined, and is usedand is to used broaden to broaden the parabolic the parabolic curve (higher curve (highe valuesr willvalues broaden will broaden the curve). the Thecurve). value The 95,000 value in 95,000 the scalein the factor scale was factor empirically was empirically deined, and deined, defines and the de heightfines ofthe the height parabolic of the curve parabolic (higher curve values (higher will lowervalues the will curve). lower the curve).

FigureFigure 6. 2D6. 2D scatter scatter plots plots of of the the Vietnam Vietnam test test area area showing showing the the final final threshold threshold levels. levels. Land Land pixels pixels are are indicatedindicated by theby the dark dark grey grey dots, dots, reference reference WBs (obtainedWBs (obtained from Landsat from Landsat 8) are indicated 8) are indicated by the colored by the pluscolored symbols plus for symbols different for ranges different of water ranges surface of wate ratios.r surface The fixed ratios. threshold The fixe levelsd threshold are indicated levels by are

theindicated blue lines by (V maxthe andblue H minlines). The(Vmax refined and thresholdHmin). The levels, refined indicated threshold by the levels, red parabolic indicated lines, by allowthe red theparabolic detection lines, of WBs allow outside the detection the fixed of threshold WBs outside levels th ase fixed indicated threshold by the levels blue arrow.as indicated The red by arrowthe blue locatesarrow. the The intermixed red arrow water locates and the land intermixed pixels. water and land pixels.

The 2D scatter plot in Figure 6 gives a visual representation of the threshold levels. The fixed threshold levels (Hmin = 100, Vmax = 0.14) are indicated by the blue lines and the refined threshold levels are indicated by the red parabolic lines. The plot shows the reference WBs for the Vietnam test area indicated by the colored plus symbols and this for three ranges of Water Surface Ratio (WSR), i.e., blue for WSR = 0.95 to 1, cyan for WSR = 0.8 to 0.95 and magenta for WSR = 0.7 to 0.8. The other WSR ranges are not indicated as they would overrule the plot. The non-WB pixels, which are denoted as land, are indicated by the dark grey dot symbols. The refined thresholds allow the detection of WBs outside the fixed threshold levels as indicated by the blue arrow. Although the refined thresholds are tuned to detect as much WBs as possible, not all WBs can be detected because they are intermixed with land pixels as indicated by the red arrow. The 2D scatter plots of the eight first test areas are shown in Appendix B.

Remote Sens. 2016, 8, 1010 10 of 25

The 2D scatter plot in Figure6 gives a visual representation of the threshold levels. The fixed threshold levels (Hmin = 100, Vmax = 0.14) are indicated by the blue lines and the refined threshold levels are indicated by the red parabolic lines. The plot shows the reference WBs for the Vietnam test area indicated by the colored plus symbols and this for three ranges of Water Surface Ratio (WSR), i.e., blue for WSR = 0.95 to 1, cyan for WSR = 0.8 to 0.95 and magenta for WSR = 0.7 to 0.8. The other WSR ranges are not indicated as they would overrule the plot. The non-WB pixels, which are denoted as land, are indicated by the dark grey dot symbols. The refined thresholds allow the detection of WBs outside the fixed threshold levels as indicated by the blue arrow. Although the refined thresholds are tuned to detect as much WBs as possible, not all WBs can be detected because they are intermixed with land pixels as indicated by the red arrow. The 2D scatter plots of the eight first test areas are shown in AppendixB.

2.5. Water Bodies Occurrence Estimation To qualify the occurrence of the detected WBs, a Water Body Occurrence (WBO) is calculated for each detected WB pixel in each dekad. This measure gives an idea about the permanency or seasonality of the detected WBs. To obtain a WBO map for each dekad, per pixel statistics are calculated by temporal-sequential processing of the available dekads. To get an idea of the WB occurrence, the observation period over which the statistics are calculated has to be long enough. An observation period over two years requires 72 dekads. However, for computational reasons, only 64 dekads are observed; that is, information that can be stored in one “long word” (64 bits). If more than 64cloud free MC10 composite observations are available, only the last 63 cloud free observations and the actual observation are used for calculating the statistics. The per-pixel temporal-sequential statistics are calculated over the available observation dekads (max 64) and can be summarized as:

• Total number of temporal cloud free observations (ntObs); • Total number of temporal WB detections (ntWBs); and • Maximum number of continuous temporal WB detections (mctWBs).

From these statistics, the frequency of detected WBs can be calculated as:

ntWBs WB f = × 100(%), (15) ntObs As shown in Figure7, the calculated WB frequency ( WBf ) and the maximum number of continuous temporal WB detections (mctWBs) can be visualized in a 2D scatter plot, which, in turn, was used for defining the occurrence threshold levels. The different occurrence levels are indicated by the different colored regions in the figure. The slope and position of the four threshold lines, marked L (Low), M (Medium), H (High) and VH (Very High), define the WBO threshold levels and were empirically defined by observations of WB time series as shown in Section3. WBs that appear with a high frequency (WBf ) or with long continuous observation periods (mctWBs) are most likely stable WBs that exist for a longer period and therefore have a higher occurrence than those with lower frequency (WBf ) or shorter continuous observation periods (mctWBs). The defined occurrence threshold levels can be expressed as:    Very Low = (mctWBs > 0) & mctWBs < 2 + αL × WB f , (16)

      Low = mctWBs ≥ 2 + αL × WB f & mctWBs < 3 + αM × WB f , (17)       Medium = mctWBs ≥ 3 + αM × WB f & mctWBs < 4 + αH × WB f , (18)       High = mctWBs ≥ 4 + αH × WB f & mctWBs < 5 + αVH × WB f , (19) Remote Sens. 2016, 8, 1010 11 of 25

  Very High = mctWBs ≥ 5 + αVH × WB f , (20)

Permanent = WB f ≥ 95, (21) where αx is the slope of the threshold line x, for example,

5 − 0 αVH = , (22) 0 − 60

The red line (o1) in Figure7 shows the minimum WBf in relation to the mctWBs. The slope depends on the maximum number of observations (ntObs). The plot shows that a certain pixel is classified as a “Very High” WBO if it is identified as WB in at least five consecutive dekads (point “m” in the plot). Note that this number is independent of the ntObs. As an example, if a pixel was detected as a WB in three consecutive dekads (with nObs = 31), its WBf will be 9.7% and therefore will have a WBO “Medium” (point “a”). If this pixel would have subsequently four additional WB detections in separate single dekads, its WBf will rise to 22.6% and its WBO will increase to “High” (point “b”). RemoteNote alsoSens. 2016 that, if8, 1010 pixels have a WBf of at least 95% their WBO will be “Permanent”. 11 of 25

Figure 7. ExampleExample of of a a 2D scatter plot of the WB frequency (WBf)) and and the the number of continuous temporaltemporal WB WB observations observations ( (mctWBs).). The The plus plus symbols symbols belong belong to to pixels pixels of of detected detected WBs WBs for for the the Rift Rift Valley testtest areaarea calculatedcalculated for for the the 31 31 available available dekads. dekads. The The Water Water Body Body Occurrence Occurrence threshold threshold levels levels are areindicated indicated by theby the different different colored colored areas. areas.

2.6. Minimum Minimum Size Size of of Detectable Detectable Water Bodies A WBWB hashas to to have have a minimuma minimum spatial spatial size insize order in order to be detectedto be detected by the WB by detectionthe WB algorithm.detection algorithm.To find the To minimum find the sub-pixelminimum size sub- ofpixel a detectable size of a detectable WB, linear WB, spectral linear mixture spectral analysis mixture was analysis used. wasTherefore, used. fiveTherefore, water spectrafive water (Figure spectra8a) with (Figure an increasing 8a) with reflectance an increasing level werereflectance spectrally level mixed were spectrallywith a dark mixed and brightwith a vegetation dark and bright and soil vegetation spectrum and (Figure soil spectrum8b). Only (Figure the Red, 8b). NIR Only and the SWIR Red, bands NIR andwere SWIR used asbands these were are transformedused as these to are HUE transf andormed VALUE to after HUE the and mixture. VALUE The after mixing the mixture. was done The by mixingincreasing was the done sub-pixel by increasing fraction the of thesub-pixel WB from fraction 0 to 1of in the steps WB of from 0.05. 0 to 1 in steps of 0.05.

(a) (b)

Figure 8. Spectra used in the spectral mixture analysis to define the minimum WB size in relation to the PROBA-V pixel size: (a) five water spectra with an increasing intensity; and (b) two vegetation and two soil spectra having a low and high reflectance.

The WB detection threshold levels were set fixed to HUE = 100 and VALUE = 0.14. By analyzing the HUE and VALUE of the mixed spectra in relation to the detection threshold levels, the minimum fraction of the pixel to be covered by water could be derived. Figure 9 summarizes the results obtained after the mixed pixel analysis. In general, the dark WBs (WB 1, WB 2) needs larger

Remote Sens. 2016, 8, 1010 11 of 25

Figure 7. Example of a 2D scatter plot of the WB frequency (WBf) and the number of continuous temporal WB observations (mctWBs). The plus symbols belong to pixels of detected WBs for the Rift Valley test area calculated for the 31 available dekads. The Water Body Occurrence threshold levels are indicated by the different colored areas.

2.6. Minimum Size of Detectable Water Bodies A WB has to have a minimum spatial size in order to be detected by the WB detection algorithm. To find the minimum sub-pixel size of a detectable WB, linear spectral mixture analysis was used. Therefore, five water spectra (Figure 8a) with an increasing reflectance level were spectrally mixed with a dark and bright vegetation and soil spectrum (Figure 8b). Only the Red, NIR Remoteand Sens. SWIR2016, 8bands, 1010 were used as these are transformed to HUE and VALUE after the mixture. The12 of 25 mixing was done by increasing the sub-pixel fraction of the WB from 0 to 1 in steps of 0.05.

(a) (b)

Figure 8. Spectra used in the spectral mixture analysis to define the minimum WB size in relation to Figure 8. Spectra used in the spectral mixture analysis to define the minimum WB size in relation to the PROBA-V pixel size: (a) five water spectra with an increasing intensity; and (b) two vegetation the PROBA-Vand two soil pixel spectra size: having (a) five a waterlow and spectra high reflectance. with an increasing intensity; and (b) two vegetation and two soil spectra having a low and high reflectance. The WB detection threshold levels were set fixed to HUE = 100 and VALUE = 0.14. By analyzing Thethe HUE WB detectionand VALUE threshold of the mixed levels spectra were in set relation fixed to to the HUE detection = 100 andthreshold VALUE levels, = 0.14. the minimum By analyzing the HUEfraction and of VALUE the pixel of theto be mixed covered spectra by water in relation could tobe thederived. detection Figure threshold 9 summarizes levels, the the results minimum obtained after the mixed pixel analysis. In general, the dark WBs (WB 1, WB 2) needs larger fraction of the pixel to be covered by water could be derived. Figure9 summarizes the results obtained after the mixed pixel analysis. In general, the dark WBs (WB 1, WB 2) needs larger sub-pixel water Remote Sens. 2016, 8, 1010 12 of 25 coverage than the bright WBs (WB 4, WB 5) in order to be detected. E.g., the dark WB 1 needs a water coveragesub-pixel of 45% water when coverage mixed than with the “Dark bright Veg” WBs (WB and 4, this WB increases 5) in order to to a be water detected. coverage E.g., the of dark 85% when mixed withWB 1 the needs “Bright a water Soil”. coverage The brightof 45% WBwhen 5 needsmixed awith water “Dark coverage Veg” and of this 15% increases when mixed to a water with “Dark Veg” andcoverage 40% when of 85% mixed when withmixed “Bright with the Soil”.“Bright As Soil”. could Thebe bright observed WB 5 needs on different a water coverage WB detection of 15% results when mixed with “Dark Veg” and 40% when mixed with “Bright Soil”. As could be observed on worldwide, small (single pixel) WBs are mostly dark and therefore need a large sub-pixel coverage in different WB detection results worldwide, small (single pixel) WBs are mostly dark and therefore order toneed be detected. a large sub-pixel coverage in order to be detected.

Figure 9. The minimum fraction of the pixel to be covered by water in order to be detected as WB. Figure 9. The minimum fraction of the pixel to be covered by water in order to be detected as WB. 3. Results 3. Results Figure 10 shows an example of the Decision Tree Classification (DTC) result using the final Figurethreshold 10 shows levels. anThe exampleimage shows of thethe Decisiondetection ob Treetained Classification for the 10-days (DTC) composite result of usingthe first the final dekad of December 2013 (MC10_20131201) over the Rift Valley test area in Ethiopia for which the threshold levels. The image shows the detection obtained for the 10-days composite of the first dekad visual representation is given in Figure 3a. As shown in detail, some smaller lakes with size in the of Decemberorder of 2013 one PROBA-V (MC10_20131201) pixel are de overtected. the Figure Rift 11 Valley shows testthe Water area inBody Ethiopia Occurrences forwhich (WBO) thefor visual representationthe sameis area. given Because in Figure the larger3a. Aslakes shown are detected in detail, in each some cloud smaller free dekad, lakes they with have size a detection in the order of one PROBA-Vfrequency pixel of more are than detected. 95% and Figure are therefore 11 shows indicated the Water with a Body “Permanent” Occurrences occurrence. (WBO) As shown for the same area. Becausein the detail the larger window lakes in Figure are detected 11, the smaller in each lakes cloud have free a WBO dekad, ranging they from have “Very a detection High” to “Very frequency of more thanLow” 95% due and to temporal are therefore variability indicated and are, withas such a, “Permanent” not detected in occurrence.each dekad. This As is shown indicated in thein detail the time series analysis shown in Figure 12. window in FigureFigure 12 11 shows, the smallerthe time lakesseries analysis have a WBOfor severa rangingl smaller from lakes, “Very some High”of them to are “Very indicated Low” in due to detail in Figure 11. Lake Wedecha consists of 2 pixels that have a “Very High” occurrence because they were detected in more than five consecutive dekads. A “Very High” occurrence can also be found for several other lakes which have frequent WBs detected, i.e., K’oftu (pixel b), Chelekleka, Hora and Bishoftu. One pixel (a) of Lake K’oftu has a “Very Low” occurrence because it was detected in two non-consecutive dekads only. Lake Chilotes has a “High” occurrence because it was detected in three and two consecutive dekads. Lake Budamada, indicated by the red arrow in Figure 11, is a small lake of about 1 km diameter. It was detected in only one dekad and therefore has a “Very Low” occurrence. Nevertheless, this crater lake is permanent, i.e., it will not dry up during the drought period. Its irregular detection is due to its small size which is in the order of one PROBA-V pixel. The small size causes mixed pixels, i.e., the water body spectral response is mixed with the spectral response of the surroundings, and therefore it is not detected as WB. This phenomenon accounts for all WBs with size in the order of one PROBA-V km pixel.

Remote Sens. 2016, 8, 1010 13 of 25 temporal variability and are, as such, not detected in each dekad. This is indicated in the time series analysisRemote shown Sens. 2016 in Figure, 8, 1010 12. 13 of 25 Remote Sens. 2016, 8, 1010 13 of 25

Figure 10.FigureDecision 10. Decision Tree ClassificationTree Classification (DTC) (DTC) result for for the the part part of image of image MC10_20131201 MC10_20131201 taken over taken over Rift Valley in Ethiopia (center: 7°11′31′′N, 38°51′10′′E) for which the visual representation is given in Rift ValleyFigure in 10. Ethiopia Decision (center:Tree Classification 7◦1103100 (DTC)N, 38 result◦51010 for00E) the for part which of image the MC10_20131201 visual representation taken over is given RiftFigure Valley 3a. in Beside Ethiopia the larger(center: lakes 7°11 in′31 the′′N, area, 38°51 some′10′′ smallerE) for which lakes arethe detected visual representation as shown by the is detailgiven in in Figure3a. Beside the larger lakes in the area, some smaller lakes are detected as shown by the Figurewindow. 3a. Beside the larger lakes in the area, some smaller lakes are detected as shown by the detail detail window.window.

Figure 11. The Water Body Occurrence (WBO) for part of image MC10_20131201 taken over Rift Valley in Ethiopia (center: 7°11′31′′N, 38°51′10′′E).

FigureFigure 11. The 11. Water The Water Body Body Occurrence Occurrence (WBO) (WBO) for for part part of of image image MC10_20131201 MC10_20131201 taken taken over over Rift Rift Valley Valley in Ethiopia (center: 7°11′31′′N, 38°51′10′′E). in Ethiopia (center: 7◦1103100N, 38◦5101000E).

Figure 12 shows the time series analysis for several smaller lakes, some of them are indicated in detail in Figure 11. Lake Wedecha consists of 2 pixels that have a “Very High” occurrence because they were detected in more than five consecutive dekads. A “Very High” occurrence can also be found Remote Sens. 2016, 8, 1010 14 of 25 for several other lakes which have frequent WBs detected, i.e., K’oftu (pixel b), Chelekleka, Hora and Bishoftu. One pixel (a) of Lake K’oftu has a “Very Low” occurrence because it was detected in two non-consecutive dekads only. Lake Chilotes has a “High” occurrence because it was detected in three and two consecutive dekads. Lake Budamada, indicated by the red arrow in Figure 11, is a small lake of about 1 km diameter. It was detected in only one dekad and therefore has a “Very Low” occurrence. Nevertheless, this crater lake is permanent, i.e., it will not dry up during the drought period. Its irregular detection is due to its small size which is in the order of one PROBA-V pixel. The small size causes mixed pixels, i.e., the water body spectral response is mixed with the spectral response of the surroundings, and therefore it is not detected as WB. This phenomenon accounts for all WBs with size in the order of one PROBA-V km pixel. Remote Sens. 2016, 8, 1010 14 of 25

Dekads

Lake Occurrence 20131021 20131101 20131111 20131121 20131201 20131211 20131221 20140101 20140111 20140121 20140201 20140211 20140221 20140301 20140311 20140321 20140401 20140411 20140421 20140501 20140511 20140521 20140601 20140611 20140621 20140701 20140711 20140721 20140801 20140811 20140821 a Very high Wedecha b Very high a Very low K'oftu b Very high Chilotes High Guda Very high a Very high Chelekleka b Very high a Very high Hora b Very high Bishoftu Very high Budamada Very low No detection Water body detected Cloud Figure 12. Time series analysis for several small lakes showing the detections per dekad and the Figure 12.resultingTime seriesWB occurrence. analysis Each for row several shows the small behavior lakes of showingone pixel of the the water detections body during per dekadthe 31 and the resulting WBdekads occurrence. starting from Each 21 October row 2013 shows to 21 the August behavior 2014. The of blue one marked pixel ofdekads the show water the body dekads during the 31 dekadswherestarting the pixel from is 21 detected October as WB. 2013 to 21 August 2014. The blue marked dekads show the dekads where the pixel is detected as WB. The rain season over the Ethiopian Rift Valley peaks in the months July and August while the dry season appears in the period between mid-October and mid-March. This seasonal variation can The rainbe observed season in over the appearance the Ethiopian of some Rift WBs Valley as shown peaks in Figure in the 13. The months wetland July area and near August Wererso while the contains no WB pixels at the end of March, i.e., at the end of the dry season. In the next three dekads, dry season appears in the period between mid-October and mid-March. This seasonal variation the number of detected WB pixels increases. In the last dekad of July and the first dekad of August, can be observedan extremely in thelarge appearance number of pixels of someindicate WBs WBs for as this shown wetland in area Figure and its13 .neighborhood. The wetland The area near Wererso containstwo other no lakes, WB Lake pixels Boio at and the the end Lake of Koka, March, show i.e., a similar at the endbehavior, of the i.e., dry their season. size grows In theover next three dekads, thetime, number which is of in detectedaccordanceWB with pixelsthe progression increases. of the In rain the season. last dekad of July and the first dekad of August, an extremely large number of pixels indicate WBs for this wetland area and its neighborhood. The two other lakes, Lake Boio and the Lake Koka, show a similar behavior, i.e., their size grows over time, which is in accordance with the progression of the rain season. Figure 14 shows the Water Body Occurrence (WBO) time series for the Rift Valley test area. The plot shows per dekad the cumulated number of pixels for the occurrences ranging from “Very Low” to “Very High”. The WBs with a “Permanent” occurrence are very large and would overrule the plot, therefore they are not indicated. Note that cloud cover may mask WBs with the result that those WBs are not accounted for in the plot and therefore the plot may show some irregularity. Nevertheless, the plot shows some correlation with the seasonal variation as described before. From the first observed dekad, end of October 2013, to mid-February 2014, there is a gradual decrease of detected WB pixels which is in accordance with the dry season. The amount of detected WB pixels with a “Very High” occurrence stays more or less stable. This period is followed by a more random increase of detected WBs as a result of the upcoming rain season. The pixels that have a “Low” and “Very Low” occurrences show the largest variability. In general the number of detected WBs clearly correlates with the seasonal variability, i.e., the rain season, which also correlates with the cloud cover.

Figure 13. Water body evolution for three WBs in the Rift Valley, the wetland near Wererso, Lake Boio and Lake Koka. The water bodies grow in size during the rainy season, which peaks in the months July and August.

Remote Sens. 2016, 8, 1010 14 of 25

Dekads

Lake Occurrence 20131021 20131101 20131111 20131121 20131201 20131211 20131221 20140101 20140111 20140121 20140201 20140211 20140221 20140301 20140311 20140321 20140401 20140411 20140421 20140501 20140511 20140521 20140601 20140611 20140621 20140701 20140711 20140721 20140801 20140811 20140821 a Very high Wedecha b Very high a Very low K'oftu b Very high Chilotes High Guda Very high a Very high Chelekleka b Very high a Very high Hora b Very high Bishoftu Very high Budamada Very low No detection Water body detected Cloud Figure 12. Time series analysis for several small lakes showing the detections per dekad and the resulting WB occurrence. Each row shows the behavior of one pixel of the water body during the 31 dekads starting from 21 October 2013 to 21 August 2014. The blue marked dekads show the dekads where the pixel is detected as WB.

The rain season over the Ethiopian Rift Valley peaks in the months July and August while the dry season appears in the period between mid-October and mid-March. This seasonal variation can be observed in the appearance of some WBs as shown in Figure 13. The wetland area near Wererso contains no WB pixels at the end of March, i.e., at the end of the dry season. In the next three dekads, the number of detected WB pixels increases. In the last dekad of July and the first dekad of August, an extremely large number of pixels indicate WBs for this wetland area and its neighborhood. The Remote Sens. two2016 ,other8, 1010 lakes, Lake Boio and the Lake Koka, show a similar behavior, i.e., their size grows over 15 of 25 time, which is in accordance with the progression of the rain season.

Remote Sens. 2016, 8, 1010 15 of 25

Figure 14 shows the Water Body Occurrence (WBO) time series for the Rift Valley test area. The plot shows per dekad the cumulated number of pixels for the occurrences ranging from “Very Low” to “Very High”. The WBs with a “Permanent” occurrence are very large and would overrule the plot, therefore they are not indicated. Note that cloud cover may mask WBs with the result that those WBs are not accounted for in the plot and therefore the plot may show some irregularity. Nevertheless, the plot shows some correlation with the seasonal variation as described before. From the first observed dekad, end of October 2013, to mid-February 2014, there is a gradual decrease of Figure 13.detectedFigureWater 13.WB body Water pixels evolution body which evolution is in for accordance threefor three WBs WBswith in inthe the the dry RiftRi season.ft Valley, Valley, The thethe amount wetland wetland of near detected nearWererso, WB Wererso Lakepixels (Centre: ◦ 0 withBoio a◦ and“Very0 Lake High” Koka. occurrence The water stays bodies more grow◦ or less 0in sistzeable. during This◦ 0 theperiod rainy is season,followed which by a peaksmore randomin the ◦ 0 8 52 46”N,increasemonths 38 22 of July46”E), detected and August. Lake WBs Boioas a result (Centre: of the 7 upcoming30 48”N, rain 38 season.2 24”E) The and pixels Lake that Koka have (Centre:a “Low” and 8 23 50”N, ◦ 0 39 4 33”E).“Very The Low” water occurrences bodies show grow the in largest size during variabilit they. In rainy general season, the number which of detected peaks inWBs the clearly months July

and August.correlates with the seasonal variability, i.e., the rain season, which also correlates with the cloud cover.

Figure 14. The Water Body Occurrence (WBO) time series for the Rift Valley test area. The Figure 14. The“Permanent”Water Body larger Occurrence lakes are not (WBO)shown in timethe gr seriesaph because for the their Rift large Valley size would test area.overrule The the “Permanent” larger lakes areplot. not shown in the graph because their large size would overrule the plot.

4. Assessment of Algorithm Performance 4. Assessment ofTo Algorithm demonstrate the Performance maturity of the PROBA-V WBDA, reference water bodies were extracted from high spatial resolution Landsat 8 images over 15 regions selected worldwide (Table A3 of To demonstrateAppendix theA). The maturity workflow of and the algorithm PROBA-V to derive WBDA, the reference reference WBs water from Landsat bodies 8 wereimages extractedis from high spatial resolutiondescribed in Appendix Landsat C. 8 images over 15 regions selected worldwide (Table A3 of AppendixA). The workflow andThe reference algorithm water to bodies derive extracted the from reference Landsat WBs8 imagery from might Landsat also contain 8images omission and is described in commission errors and might influence the PROBA-V performance analysis. The reference AppendixC. commission errors occur at the Landsat 8 spatial resolution and it concerns groups of pixels ranging The referencefrom one water to several bodies tens. The extracted corresponding from calc Landsatulated WSR 8 imagery is therefore might very small also (a contain PROBA-V omission and commission errorspixel for and which might the WSR influence is calculated the PROBA-Vcomprises a few performance hundred Landsat analysis. 8 pixels). The Typically, reference the commission WSR for the wrongly detected WB is much lower than the minimum WSR of 0.5 used in the analysis. errors occur atOmission the Landsat errors are 8 much spatial more resolution difficult to quantify. and it concernsNevertheless, groups it can be of said pixels that rangingthese also from one to several tens.concern The corresponding small groups of pixels calculated resulting in WSR a very is small therefore reference very error smallnot influencing (a PROBA-V the analysis. pixel for which the WSR is calculatedTable 1 comprisesshows the commission a few hundred and omission Landsat errors on 8 pixels).the PROBA-V Typically, images for the six WSR minimum for the wrongly WSR ranges and this for the 15 test areas. detected WB is muchAs seen lower in Table than 1, the the commission minimum and WSRomission of errors 0.5 used vary for in the different analysis. test Omissionareas. An errors are much more difficultaverage commission to quantify. error Nevertheless, of 1.5% was obtained it can with be saidan average that theseomission also error concern of 9.8% small for groups of pixels resultingminimum in a veryWSR small0.6 and reference15.4% for mi errornimum not WSR influencing 0.5. Notice that the the analysis. user requirements define a

Remote Sens. 2016, 8, 1010 16 of 25

Table1 shows the commission and omission errors on the PROBA-V images for six minimum WSR ranges and this for the 15 test areas. As seen in Table1, the commission and omission errors vary for the different test areas. An average commission error of 1.5% was obtained with an average omission error of 9.8% for minimum WSR 0.6 and 15.4% for minimum WSR 0.5. Notice that the user requirements define a maximum omission and commission error of 15% [35]. The acceptable omission errors for the highest minimum WSR are highlighted in the table. Though the Argentina test area shows the best results for the omission errors it has the highest commission error of all test areas (10.7%). This high commission error is due to dark soil pixels, which were wrongly detected as water bodies. The commission error of 4.9% for the Zambia test area is also due to dark soil pixels while the commission error of 4.0% for the Canada test area is due to dark vegetation pixels, which were detected as water bodies. The Finland test area shows the worst performance for the omission error, i.e., an acceptable omission error of 13.6% for minimum WSR 0.95. These high omission errors are due to the presence of many small and narrow lakes, in the order of the size of a PROBA-V 1 km pixel, by which they were not detected. This effect accounts for all other regions as well, i.e., for the Russian test area: 12.2% for minimum WSR 0.8; and, for the USA test area: 14.9% for minimum WSR 0.7. The other test areas show the highest omission errors for minimum WSR 0.6 and 0.5.

Table 1. Assessment results for the 15 test areas. The user requirements define a maximum omission and commission error of 15%. All commission errors fall within the requirements. The acceptable omission errors for the highest WSR are highlighted.

OE (%)

Area CE (%) Minimum WSR 0.95 0.9 0.8 0.7 0.6 0.5 Argentina 10.7 0 0 0 0 0.4 0.8 Australia 1.4 0 0 0 0 0.8 1.1 Bolivia 0.7 0 0 0.1 0.7 2.8 9.4 Canada 4.0 0 0 0.2 0.3 1.5 5.0 China 0 0.1 0.2 0.4 1.2 3.3 7.3 Finland 1 13.6 17.3 26.8 36.3 46.6 56.2 India 0 0 0 0 1.3 7.6 16 New Zealand 0 0.2 0.2 1.4 3.9 8.1 12.3 Russia 0.1 5.3 6.9 12.2 19.0 27.6 39.8 Swiss 0 0.6 1.1 3.5 6.5 11.8 18.9 Thailand 0 0 0 0.4 2.2 7.4 16.9 Chad 0 0 0 0 0 0.5 1.7 Turkmenistan 0.3 0 0 0.1 0.8 2.6 6.5 USA 0 0.4 0.9 6.7 14.9 24.7 32.6 Zambia 4.9 0 0 0 0 1.8 7.1 Average 1.5 1.3 1.8 3.5 5.8 9.8 15.4

5. Discussion The user requirements define a maximum omission and commission error of 15% [36]. As shown in this work, a mean commission error of 9.8% is obtained for minimum WSR = 0.6 while the mean omission error is 1.5%. Though minimization of omission and commission errors is achieved by the use of additional data layers, WBPM, VSM, PGM and NDVI, they sometimes are inevitably. Small sized WBs in the order of one PROBA-V 1km pixel and WBs with spectral properties that fall outside the defined thresholds are the cause of omission errors. Commission errors are mostly caused by anthropogenic structures having spectral signatures equal to WBs, e.g., some agricultural fields, green houses, build-up areas, etc. In addition, natural surfaces like salt planes, dark soils, mining sites, etc. might cause commission errors. An important source of commission errors are dark surfaces, e.g., due to undetected cloud shadows, dark soils due to heavy industry, dark surfaces due to shadow of high structures, etc. Therefore, to minimize confusion the user should use the water body product with attention to the quality layer which reflects the history of the detected WBs. Remote Sens. 2016, 8, 1010 17 of 25

Special attention has to be paid to the three layers used during incompatibility masking. The WBPM was made based on the GLSDEM for which data was collected in the early 2000s. Because of the high resolution of the GLSDEM dataset (90 m spatial and 1 m vertical), the derived WBPM (1 km spatial resolution) reveals the finest detail (as shown in Figure3b). Although the Earth’s topography changes only notably over large geological time scales, natural events (e.g., earthquakes) or anthropogenic activities (e.g., dam building) can influence the topography on shorter time scales. Therefore, when such events take place, the WBPM needs to be updated. Just to give an idea; the PROBA-V 1 km world image contains 40,320 samples × 15,680 lines or 632.22 × 106 pixels. Of those, 29% (183.5 × 106 pixels) consist of landmass (i.e., not ocean), and 50.5% (92.7 × 106 pixels) of the landmass is indicated as potential water body. Because the extent of glaciers worldwide are strongly influenced by climate (global warming), the PGM needs to be frequently updated. Frequent updates of the permanent glacier data are available at the National Snow and Ice Data Center (NSIDC). The geological situation in the volcanic areas of South America, i.e., Chili and Peru, is very complex and delineating the volcanic soils in these areas was not always easy. It is therefore very well possible that some smaller regions in this area are still missing in the VSM. On the other hand, future volcanic eruptions might produce new dark volcanic soils leading to false detected WBs. In both cases, an update of the VSM is needed.

6. Conclusions This water bodies detection algorithm is implemented in the operational processing chain of the Copernicus Global Land Service (http://land.copernicus.eu/global/). It delivers 10-day global water bodies detection maps accompanied by a quality layer map. Although some minor artifacts were introduced during the algorithm assessment using Landsat 8 reference WBs, the assessment gives a good idea about the algorithm performance. The analysis, performed over 15 test regions, shows that the product reaches a mean commission error of 1.5% while the mean omission error for minimum WSR = 0.6 is 9.8%. The developed methodology can easily be applied to the 333 m and 100 m PROBA-V products and 300 m Sentinel-3 products, by which smaller water bodies will be detected. Applying the methodology on higher resolution sensors as Sentinel-2 is possible, but its dynamics will be hampered by the lower temporal frequency of such sensors. The water bodies detection algorithm is also tuned to detect water bodies on 1 km SPOT-VGT dataset. The whole SPOT-VGT archive is being processed and water body detection maps from 1999 to 2014 will become available by the end of 2016. An exhaustive validation exercise of the PROBA-V and SPOT/VGT 1 km WB products has been performed by independent entity and the report is available on the Copernicus Global Land Service website. The water bodies products can be downloaded from the Copernicus Global Land Service data portal: http://land.copernicus.vgt. vito.be. Additional information is available at: http://land.copernicus.eu/global/products/wb.

Acknowledgments: The product was generated by the land service of Copernicus, the Earth Observation programme of the European Commission. The research leading to the current version of the product has received funding from various European Commission Research and Technical Development programmes. The product is based on PROBA-V 1km data ((c) ESA and distributed by VITO). Author Contributions: Luc Bertels designed the water bodies detection algorithm. Davy Wolfs implemented the algorithm in the operational processing chain. Bruno Smets managed the project and revised the paper. Conflicts of Interest: The authors declare no conflict of interest.

Appendix A Table A1 lists the region for fixed thresholds definition, Table A2 lists the 19 test regions used for refining the threshold levels, and Table A3 lists the 10 test regions selected worldwide to assess the performance of the detection algorithm. Remote Sens. 2016, 8, 1010 18 of 25

Table A1. The Ethiopian Rift Valley test area used for fixed threshold definition, the upper left corner coordinate and number of samples (ns) and lines (nl) are given.

Upper Left Coordinate PROBA-V Area Dekad Lat Lon ns nl Rift Valley 20131201 8.96429722 37.08032222 400 400

Table A2. Landsat 8 scenes of the 19 test regions selected worldwide were used to refine the threshold levels. As much as possible, cloud free scenes were selected. The corresponding PROBA-V dekad and the upper left corner coordinate are given with the corresponding number of samples (ns) and lines (nl).

Upper Left Coordinate PROBA-V Area Landsat 8 Scene Dekad Lat Lon ns nl Argentina 1 LC82270862014111LGN00 20140421 −36.40625000 −63.94196429 299 239 Argentina 2 LC82310932014011LGN00 20140111 −46.29910714 −73.83482143 376 257 Australia LC81110782014131LGN00 20140511 −24.94196429 118.49553571 267 236 Bangladesh LC81380442014112LGN00 20140421 24.17410714 87.74553571 244 240 Bolivia LC82330692013310LGN00 20131101 −11.95089286 −66.85267857 232 239 Brazil LC82220742014028LGN00 20140121 −19.17410714 −51.55803571 248 237 Canada LC80180272014151LGN00 20140521 48.53125000 −80.06696429 332 246 China LC81210382014121LGN00 20140501 32.79910714 116.05803571 271 240 Congo LC81740662014140LGN00 20140521 −7.62053571 25.22767857 235 238 India LC81460422014072LGN00 20140311 27.04910714 76.05803571 249 240 Mali LC81950502013332LGN00 20131121 15.51339286 −2.25446429 234 239 Mexico LC80270422014118LGN00 20140421 27.04017857 −100.12053571 260 238 Poland LC81870222014087LGN00 20140401 55.60267857 21.60267857 393 250 Rusia LC81770212014129LGN00 20140501 56.99553571 37.60267857 426 248 South Africa LC81690802014121LGN00 20140501 −27.77232143 28.12946429 277 246 Spain LC82010332014121LGN00 20140501 39.95089286 −5.66517857 306 239 Turkey LC81730332013306LGN00 20131101 39.95982143 37.65625000 299 240 USA LC80430342014134LGN00 20140511 38.54017857 −121.87946429 281 242 Vietnam LC81240512014062LGN00 20140301 14.08482143 107.11160714 241 242

Table A3. Landsat 8 scenes of the 15 test regions selected worldwide were used to assess the water bodies detection algorithm. As much as possible, cloud free scenes were selected. The corresponding PROBA-V dekad and the upper left corner coordinate are given with the corresponding number of samples (ns) and lines (nl).

Upper Left Coordinate PROBA-V Area Landsat 8 Scene Dekad Lat Lon ns nl Argentina LC82310872014011LGN00 20140111 −37.83418611 −70.58413611 303 241 Australia LC80950832013304LGN00 20131021 −32.12027778 141.27138333 288 238 Bolivia LC82330692014233LGN00 20140821 −11.95757500 −66.85950000 233 239 Canada LC80180262014295LGN00 20141021 49.94476111 −79.54769167 340 247 Chad LC81830512014011LGN00 20140111 14.06290278 15.97345556 234 238 China LC81210392014121LGN00 20140501 31.35326944 115.68581667 270 239 Finland LC81920132014154LGN00 20140601 68.02264167 21.52997222 686 253 India LC81420442014028LGN00 20140121 24.16315278 81.55028611 247 239 New Zealand LC80750912014294LGN00 20141021 −43.52644444 168.42483889 329 243 Russia LC81100132014252LGN00 20140901 68.03188333 148.26286944 676 255 Swiss LC81950272014079LGN00 20140311 48.49976111 6.28843889 360 241 Thailand LC81280482014090LGN00 20140321 18.38631389 101.85821667 247 237 Turkmenistan LC81570332014133LGN00 20140511 39.95980833 62.34785278 298 242 USA LC80260372014207LGN00 20140721 34.21517222 −96.81486667 290 237 Zambia LC81730712014133LGN00 20140511 −14.84697222 25.17750000 240 238 Remote Sens. 2016, 8, 1010 19 of 25

Remote Sens. 2016, 8, 1010 19 of 25 Appendix B Appendix B

FigureFigure B1. B1.2D 2D scatter scatter plots plots of of the the firstfirst eight test test areas. areas.

Remote Sens. 2016, 8, 1010 20 of 25 Remote Sens. 2016, 8, 1010 20 of 25

Appendix C Appendix C Figure C1 shows the workflow for extracting the reference WBs and for defining the Water Surface Figure C1 shows the workflow for extracting the reference WBs and for defining the Water Ratio (WSR). Surface Ratio (WSR).

FigureFigure C1.C1.Workflow Workflow for for the the extraction extraction and and definition definition of reference of reference water water bodies bodies using Landsatusing Landsat 8 scenes. 8 TOA:scenes. Top TOA: Of Atmosphere; Top Of Atmosphere; AS Temp: At AS Temp: Brightness At satellite Temperature; Brightness ACCAm: Temperature; Automatic ACCAm: Cloud CoverAutomatic Assessment–modified. Cloud Cover Assessment–modified.

AppendixAppendix C.1.C.1. Building the ReferenceReference DatasetDataset TheThe USGSUSGS “EarthExplorer”“EarthExplorer” waswas usedused toto searchsearch forfor LandsatLandsat 88 OLI/TIRSOLI/TIRS cloud free scenes over 1919 selectedselected testtest areas areas worldwide. worldwide. These These test test areas areas were we selectedre selected based based on visual on visual observations observations using Google using EarthGoogle (water Earth bodies(water mustbodies be must present). be present). Subsequently Subseq oneuently cloud one freecloud scene, free orscene, one or which one which is as much is as asmuch possible as possible cloud free, cloud was downloadedfree, was downloaded and unpacked. and The unpacked. data was subsequentlyThe data was pre-processed subsequently to convertpre-processed the eleven to convert bands tothe TOA eleven radiance, bands to the TOA 9 OLI radiance, bands to the TOA 9 OLI reflectance bands to and TOA the reflectance 2 TIRS bands and tothe at-satellite 2 TIRS bands brightness to at-satellite temperature. brightness temperature. TheThe “Automatic“Automatic Cloud Cover Assessment-modified”Assessment–modified” (ACCAm) was used for cloud detection. ACCAmACCAm isis aa modifiedmodified versionversion ofof thethe ACCAACCA [[36]36] developeddeveloped forfor LandsatLandsat 7.7. InIn thisthis modifiedmodified cloudcloud detectiondetection algorithmalgorithm only the spectralspectral cloudcloud identificationidentification is retainedretained andand anan additionaladditional checkcheck onon thethe cirruscirrus bandband isis added.added. InIn thethe nextnext stepstep thethe “SWIR“SWIR 1”,1”, “NIR”“NIR” andand “Red”“Red” bandsbands areare colorcolor transformedtransformed toto thethe HSV-colorHSV-color system.system. This allows for an easy water bodies detection using thresholds on the HUE and VALUE band.band. As shown by the Decision Tree ClassifierClassifier (DTC)(DTC) inin FigureFigure C2C2,, thethe classclass “No“No Data”Data” isis basedbased onon aa zerozero valuevalue inin thethe TIRTIR 1,1, SWIRSWIR 1,1, NIRNIR oror RedRed band.band. The “Cloud” class is set according the cloud mask generatedgenerated by ACCAmACCAm (1 meansmeans cloudcloud detected).detected). In thethe finalfinal step, water is separatedseparated from land byby aa thresholdthreshold valuevalue onon HUEHUE ((≥≥160)160) and and VALUE VALUE (<0.4).

Remote Sens. 2016, 8, 1010 21 of 25 Remote Sens. 2016, 8, 1010 21 of 25

FigureFigure C2.C2. Decision Tree ClassifierClassifier forfor WBWB detectiondetection onon LandsatLandsat 88 images.images. AA checkcheck onon different bandsbands andand onon cloudcloud cover cover removes removes incompatible incompatible pixels pixels from from the the scene. scene. A check A check on HUE on HUE and VALUEand VALUE is used is toused detect to detect water water bodies. bodies.

AppendixAppendix C.2.C.2. CalculationCalculation ofof thethe WaterWater SurfaceSurface RatioRatio andand EvaluationEvaluation MetricsMetrics

For each pixel (xy) of the PROBA-V WB product, the Water Surface Ratio (WSRxy) is calculated For each pixel (xy) of the PROBA-V WB product, the Water Surface Ratio (WSRxy) is calculated as theas the ratio ratio of the of numberthe number of Landsat of Landsat 8 water 8 pixelswater (DTCpixels results) (DTC byresults) the total by numberthe total of number valid Landsat of valid 8 pixelsLandsat contained 8 pixels incontained the PROBA-V in the pixelPROBA-V (Equation pixel (C1)).(Equation (C1)).

(° ) = (N◦ of water pixels) , (C1) ( ° Kxy) WSRxy = ◦ , (C1) (Total N of valid pixels)Kxy The WSRxy has a value greater than zero and less than or equal to one when the PROBA-V pixel has “Water”The WSR classifiedxy has a value pixels greater present than in zerothe corresponding and less than orLandsat equal to8 oneDTC when pixels. the The PROBA-V WSRxy is pixel not hascalculated “Water” when classified the cloud pixels cover present of the in thecorresponding corresponding Landsat Landsat 8 DTC 8 DTC pixels pixels. is more The thanWSR 10%xy is or not if calculatedmore than when10% of the the cloud pixels cover contain of theno correspondingdata. Pixels in the Landsat final WSR 8 DTC image pixels are is indicated more than as 10% “Cloud” or if more(WSR than = 3) or 10% “No of Data” the pixels (WSR contain = 0), respectively. no data. Pixels When in theno “Water” final WSR classified image are pixels indicated are present as “Cloud” in the (WSRcorresponding = 3) or “No DTC Data” pixels (WSR (WSR = 0),xy = respectively. 0), the final WSR When pixel no “Water” is set toclassified “Land” (WSR pixels = are2). present in the correspondingFigure C3a,c DTC shows pixels (theWSR Landsatxy = 0), the 8 finalscene WSR LC82270862014319LGN00, pixel is set to “Land” (WSR downloaded = 2). for test locationFigure Argentina1 C3a,c shows (37°S, the 63°W). Landsat Figure 8 scene C3b,d LC82270862014319LGN00, shows the Decision Tree downloaded Classifier (DTC) for test result location for Argentina1the same scene. (37◦ S,The 63 details◦W). Figure in Figure C3b,d C3c,d shows show thes Decisionfour corresponding Tree Classifier PROBA-V (DTC) pixels, result forindicated the same by scene.the blue The squares. details inIn Figure this particular C3c,d shows case, four each corresponding PROBA-V pixel PROBA-V contains pixels, 858 indicatedpixels, however by the bluethis squares.number Invaries this depending particular case, on the each latitude. PROBA-V pixel contains 858 pixels, however this number varies dependingFigure on C4a the shows latitude. the corresponding PROBA-V WB product extracted for the second dekad of NovemberFigure 2014.C4a shows Figure the C4b corresponding shows the calculated PROBA-V WSR. WB The product detail extracted in Figure for C4c the indicates second dekadthe same of Novemberfour pixels 2014.as in Figure Figure C3c,d.C4b shows These the pixels calculated have a WSR.WSR of: The 0.7 detail (a); 0.03 in Figure (b); 0.02 C4 (c);c indicates and 0.09 the (d). same four pixels as in Figure C3c,d. These pixels have a WSR of: 0.7 (a); 0.03 (b); 0.02 (c); and 0.09 (d).

Remote Sens. 2016, 8, 1010 22 of 25 Remote Sens. 2016, 8, 1010 22 of 25 Remote Sens. 2016, 8, 1010 22 of 25

Figure C3. (a) The Landsat 8 scene LC82270862014319LGN00, downloaded for location: 37°S, 63°W Figure C3. (a) The Landsat 8 scene LC82270862014319LGN00, downloaded for location: 37◦S, 63◦W (Argentina),Figure C3. (wasa) The acquired Landsat on 8 15scene November LC82270862014319LGN00, 2014. (b) Decision downloadedTree Classifier for (DTC) location: result 37°S, for 63°W the b (Argentina),same(Argentina), Landsat was 8was scene. acquired acquired (c,d) onDetails on 15 15 November showingNovember in 2014.blue2014. the (b )four Decision PROBA-V Tree Tree pixelsClassifier Classifier overlaid, (DTC) (DTC) which result result are for also for the the sameindicatedsame Landsat Landsat in Figure 8 scene.8 scene. C4. (c (,dc,)d Details) Details showing showing inin blueblue thethe four PROBA-V PROBA-V pixels pixels overlaid, overlaid, which which are are also also indicatedindicated in in Figure Figure C4 C4..

Figure C4. (a) The PROBA-V WB area extracted from 2nd dekad of November 2014, which Figure C4. (a) The PROBA-V WB area extracted from 2nd dekad of November 2014, which Figurecorresponds C4. (a) Thewith PROBA-Vthe downloaded WB area Landsat extracted 8 area. from ( 2ndb) The dekad Water of NovemberSurface Ratio 2014, (WSR) which (PROBA-V corresponds corresponds with the downloaded Landsat 8 area. (b) The Water Surface Ratio (WSR) (PROBA-V withspatial the downloadedresolution) which Landsat is calculated 8 area. (b using) The the Water Landsat Surface 8 DTC Ratio result. (WSR) (c (PROBA-V) The four indicated spatial resolution) pixels havespatial a different resolution) WSR: which (a) 0.7; is calculated(b) 0.03; (c) using 0.02; andthe Landsat(d) 0.09. 8These DTC fourresult. pixels (c) The are alsofour overlaidindicated on pixels the which is calculated using the Landsat 8 DTC result. (c) The four indicated pixels have a different imageshave a in different the previous WSR: figure.(a) 0.7; (b) 0.03; (c) 0.02; and (d) 0.09. These four pixels are also overlaid on the WSR: (a) 0.7; (b) 0.03; (c) 0.02; and (d) 0.09. These four pixels are also overlaid on the images in the images in the previous figure. previous figure.

Remote Sens. 2016, 8, 1010 23 of 25

In the final step, the omission errors (OE) and commission errors (CE) are derived from the confusion matrix [37,38]. The omission error is calculated per minimum WSR range, i.e., 0.95, 0.9, 0.8, 0.7, 0.6 and 0.5 (e.g., a minimum WSR of, e.g., 0.95 means only water body pixels with a WSR ≥ 0.95 are considered as reference water body) and per test area. The confusion matrix can be written as shown in Table C1.

Table C1. Confusion matrix frim which the Commission and omission errors are calculated.

Reference WB No WB WB p11 p12 WBV2 No WB p21 p22 where p11 = the number of pixels indicated as WB in both reference and WBV2 and having a WSR larger or equal to the minimum WSR; p12 = the number of pixels not indicated as WB in the reference but indicated as WB in WBV2 and having a WSR larger or equal to the minimum WSR; p21 = the number of pixels indicated as WB in the reference but not in WBV2 and having a WSR larger or equal to the minimum WSR; p22 = the number of pixels not indicated as WB in both reference and WBV2 and having a WSR larger or equal to the minimum WSR.

From which, the CE and OE are calculated:

p12 CE = × 100, (C2) p11 + p12

p21 OE = × 100. (C3) p11 + p21

References

1. Vörösmarty, C.J.; Green, P.; Salisbury, J.; Lammers, R.B. Global water resources: Vulnerability from climate change and population growth. Science 2000, 289, 284–288. [CrossRef][PubMed] 2. Alsdorf, D.E.; Lettenmaier, D.P. Tracking fresh water from space. Science 2003, 301, 1491–1494. [CrossRef] [PubMed] 3. Lacaux, J.-P.; Tourre, Y.M.; Vignolles, C.; Ndione, J.-A.; Lafaye, M. Classification of ponds from high-spatial resolution remote sensing: Application to Rift Valley fever epidemics in Senegal. Remote Sens. Environ. 2007, 106, 66–74. [CrossRef] 4. Breman, H.; Groot, J.J.R.; Van Keulen, H. Resource limitations in Sahelian agriculture. Glob. Environ. Change Hum. Policy Dimens. 2001, 11, 59–68. [CrossRef] 5. Haas, E.M.; Bartholomé, E.; Combal, B. Time series analysis of optical remote sensing data for the mapping of temporary surface water bodies in sub-Saharan Western Africa. J. Hydrol. 2009, 370, 52–63. [CrossRef] 6. Hein, L. The impacts of grazing and rainfall variability on the dynamics of a Sahelian rangeland. J. Arid Environ. 2006, 64, 488–504. [CrossRef] 7. Shevyrnogov, A.P.; Kartushinsky, A.V.; Vysotskaya, G.S. Application of satellite data for investigation of dynamic processes in inland water bodies: Lake Shira (Khakasia, Siberia), a case study. Aquat. Ecol. 2002, 36, 153–164. [CrossRef] 8. Carroll, M.L.; Townshend, J.R.G.; DiMiceli, C.M.; Loboda, T.; Sohlberg, R.A. Shrinking lakes of the Arctic: Spatial relationships and trajectory of change. Geophys. Res. Lett. 2011, 38, 1–5. [CrossRef] Remote Sens. 2016, 8, 1010 24 of 25

9. Cole, J.J.; Prairie, Y.T.; Caraco, N.F.; McDowell, W.H.; Tranvik, L.J.; Striegl, R.G.; Duarte, C.M.; Kortelainen, P.; Downing, J.A.; Middelburg, J.J.; et al. Plumbing the global carbon cycle: Integrating inland waters into the terrestrial carbon budget. Ecosystems 2007, 10, 172–185. [CrossRef] 10. Craglia, M.; de Bie, K.; Jackson, D.; Pesaresi, M.; Remetey-Fülöpp, G.; Wang, C.; Annoni, A.; Bian, L.; Campbell, F.; Ehlers, M.; et al. Digital Earth 2020: Towards the vision for the next decade. Int. J. Digit. Earth 2012, 5, 4–21. [CrossRef] 11. Carroll, M.L.; Townshend, J.R.; DiMiceli, C.M.; Noojipady, P.; Sohlberg, R.A. A new global raster water mask at 250 m resolution. Int. J. Digit. Earth 2009, 2, 291–308. [CrossRef] 12. Klein, I.; Dietz, A.; Gessner, U.; Dech, S.; Kuenzer, C. Results of the Global WaterPack: A novel product to assess inland water body dynamics on a daily basis. Remote Sens. Lett. 2015, 6, 78–87. [CrossRef] 13. Bartholomé, E. Monitoring the environment in Africa: The VGT4Africa and the AMESD projects. In Proceedings of the 2nd International Workshop on “Crop and Rangeland Monitoring in East Africa”, Nairobi, Kenya, 27–29 March 2007; pp. 303–317. 14. Sudheer, K.; Chaubey, I.; Garg, V. Lake water quality assessment from Landsat thematic mapper data using neural network: An approach to optimal band combination selection 1. J. Am. Water Resour. Assoc. 2006, 42, 1683–1695. [CrossRef] 15. Wang, Y.N.; Huang, F.; Wei, Y.C. Water body extraction from Landsat ETM+ image using MNDWI and KT transformation. In Proceedings of the 21st International Conference on Geoinformatics, Kaifeng, China, 20–22 June 2013. 16. Liao, A.; Chen, L.; Chen, J.; He, C.; Cao, X.; Chen, J.; Peng, S.; Sun, F.; Gong, P. High-resolution remote sensing mapping of global land water. Sci. China Earth Sci. 2014, 57, 2305–2316. [CrossRef] 17. Feng, M.; Sexton, J.O.; Channan, S.; Townshend, J.R. A global, high-resolution (30-m) inland water body dataset for 2000: First results of a topographic-spectral classification algorithm. Int. J. Digit. Earth 2015, 9, 113–133. [CrossRef] 18. Yamazaki, D.; Trigg, M.A.; Ikeshima, D. Development of a global ~90 m water body map using multi-temporal Landsat images. Remote Sens. Environ. 2015, 171, 337–351. [CrossRef] 19. Donchyts, G.; Baart, F.; Winsemius, H.; Gorelick, N.; Kwadijk, J.; Giesen, N. Earth’s surface water change over the past 30 years. Nat. Clim. Change 2016, 6, 810–813. [CrossRef] 20. Ko, B.C.; Kim, H.H.; Nam, J.Y. Classification of potential water bodies using Landsat 8 OLI and a combination of two boosted random forest classifiers. Sensors 2015, 15, 13763–13777. [CrossRef][PubMed] 21. Pekel, J.F.; Cottam, A.; Clerici, M.; Belward, A.; Dubois, G.; Bartholome, E.; Gorelick, N. A global scale 30 m water surface detection optimized and validated for Landsat 8. AGU Fall Meet. Abstr. 2014, 1, 01P. 22. Nath, R.K.; Deb, S.K. Water-body area extraction from high resolution satellite images—An introduction, review, and comparison. Int. J. Image Process. 2010, 3, 353–372. 23. Sivanpillai, R.; Miller, S.N. Improvements in mapping water bodies using ASTER data. Ecol. Inf. 2010, 5, 73–78. [CrossRef] 24. Sethre, P.R.; Rundquist, B.C.; Todhunter, P.E. Remote detection of prairie pothole ponds in the Devils Lake Basin, North Dakota. GISci. Remote Sens. 2005, 42, 277–296. [CrossRef] 25. McFeeters, S.K. The use of the normalized difference water index (NDWI) in the delineation of open water features. Int. J. Remote Sens. 1996, 17, 1425–1432. [CrossRef] 26. Work, E.A., Jr.; Gilmer, D.S. Utilization of satellite data for inventorying prairie ponds and lakes. Photogramm. Eng. Remote Sens. 1976, 42, 685–694. 27. Rundquist, D.C.; Lawson, M.P.; Queen, L.P.; Cerveny, R.S. The relationship between summer season rainfall events and lake-surface area. J. Am. Water Resour. Assoc. 1987, 23, 493–508. [CrossRef] 28. Pekel, J.-F.; Vancutsem, C.; Bastin, L.; Clerici, M.; Vanbogaert, E.; Bartholomé, E.; Defourny, P. A near real-time water surface detection method based on HSV transformation of MODIS multi-spectral time series data. Remote Sens. Environ. 2014, 140, 704–716. [CrossRef] 29. Bartholomé, E.; Combal, B. VGT4Africa User Manual, 1st ed.; Office for Official Publication of the European Communities: Luxembourg, 2006. 30. Rahman, H.; Dedieu, G. SMAC: A simplified method for the atmospheric correction of satellite measurements in the solar spectrum. Int. J. Remote Sens. 1994, 15, 123–143. [CrossRef] 31. Products User Manual. Available online: http://proba-v.vgt.vito.be/ (accessed on 25 May 2016). Remote Sens. 2016, 8, 1010 25 of 25

32. Global Land Cover Facility: 90m Scene GLSDEM_p123r024_utmz13; Global Land Cover Facility, University of Maryland: College Park, Maryland, 2008; Available online: http://glcf.umd.edu/data/glsdem/ (accessed on 17 July 2015). 33. Global Land Ice Measurements from Space (GLIMS) Glacier Database; National Snow and Ice Data Center: Boulder CO, USA, 2014; Available online: http://glims.colorado.edu/glacierdata/ (accessed on 28 July 2015). 34. Vancutsem, C.; Pekel, J.-F.; Bogaert, P.; Defourny, P. Mean compositing, an alternative strategy for producing temporal syntheses. Concepts and performance assessment for SPOT VEGETATION time series. Int. J. Remote Sens. 2007, 28, 5123–5141. [CrossRef] 35. GCOS-154. Systematic Observation Requirements for Satellite-Based Products for Climate: Supplemental Details to the Satellite Based Component of the “Implementation Plan for the Global Observing System for Climate in Support of the UNFCCC (2010 Update)”; World Meteorological Organization: Geneva, Switzerland, 2011. 36. Irish, R.R.; Barker, J.L.; Goward, S.N.; Arvidson, T.J. Characterization of the Landsat-7 ETM+ automated cloud-cover assessment (ACCA) algorithm. Photogramm. Eng. Remote Sens. 2006, 72, 1179–1188. [CrossRef] 37. Padilla, M.; Stehman, S.V.; Litago, J.; Chuvieco, E. Assessing the temporal stability of the accuracy of a time series of burned area products. Remote Sens. 2014, 6, 2050–2068. [CrossRef] 38. Roy, D.P.; Boschetti, L. Southern Africa validation of the MODIS, L3JRC, and GlobCarbon burned-area products. IEEE Trans. Geosci. Remote Sens. 2009, 47, 1032–1044. [CrossRef]

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).