UCL – Geomatics – 2013 Issue 0 – Rev 1 (v0.1)

Sentinel-2 Agriculture

VOLUME 1

Prepared by: Pierre Defourny & project partners
Function: Project coordinator
Signature: Signed on original

UCL\ELI-Geomatics Croix du Sud, 2 bte L7.05.16 B - 1348 Louvain-la-Neuve BELGIUM

CHAPTER 1 – Executive summary

Achieving sustainable food security for all people will require agricultural production to grow by 70%, and by up to 100% in developing countries, relative to 2009 levels [RD.17]. In addition, attention must be paid to the fact that agriculture not only provides food and feed but also generates products for the energy (biofuels), materials (wood, fibers, textiles) and chemicals industries, and that the competition among these sectors (Food, Feed, Fuel and Fiber) for agricultural resources is increasing [RD.4]. The need to adapt agriculture to climate change and the importance of improving the efficiency of water and soil use in a sustainable manner are also major challenges ahead. To this end, the development of information technologies to monitor agriculture and all related practices is essential to support policy makers and to report on science-based options for improving resource use efficiency (water, land, fertilizers, pesticides) in agriculture, including on small farms.

Today, better agricultural monitoring capabilities are clearly considered a critical tool for strengthening food production information and market transparency, thanks to timely data and information about crop status, crop area and yield forecasts. An enhanced understanding of global production will contribute to reducing price volatility by allowing local, national and international operators to make decisions and anticipate market trends with reduced uncertainty. It is also a prerequisite for the definition and monitoring of any agricultural policy.

Satellite remote sensing is clearly a major source of information, but there is a large gap between the current practices of operational systems and the scientific literature in the field of crop remote sensing. On the one hand, this gap can be explained by the fact that most scientific experiments cover very limited test sites and that scaling up to the national or international level is a distinct research effort. On the other hand, the poor availability of suitable in-situ and satellite data over large areas hampers large-scale demonstrations.

The Sen2-Agri project is designed to develop, demonstrate and facilitate the contribution of Sentinel-2 time series to the satellite EO component of agriculture monitoring for many agricultural systems distributed all over the world. The overall objective is to provide the international user community with validated algorithms to derive EO products relevant for crop monitoring, open source software and best practices to process Sentinel-2 data in an operational manner for major, globally representative agriculture systems. In the context of the Data User Element (DUE) programme, a user-oriented approach will drive the entire project in order to address concrete user needs and requirements and to develop ownership among operational users around the world.

To achieve such an ambitious objective in a limited time frame, the project has to rely on already well-established components (i) for user community involvement and the global network of agriculture sites, (ii) for the practice and knowledge of image time series pre-processing and (iii) for the processing toolbox development. First, the international agriculture community of practice, which led to the development of the JECAM network and the GEOGLAM design, is very well known to the consortium, which has been actively committed to it since the early days. Second, our consortium has implemented multi-sensor pre-processing tools that fully use the multi-temporal dimension of the Sentinel-2 mission and has gathered extensive practical experience of the difficulties inherent to this work through the processing of thousands of images (Formosat-2, Landsat, SPOT4-Take5). Third, the consortium is also committed to the development and tuning of multi-temporal processing methods in the open source Orfeo ToolBox. These points are a major asset of the proposed partnership for advancing the current capabilities for agriculture monitoring in many countries thanks to the Sentinel-2 mission.

The project outputs are of different natures. First, the project will deliver a core of processing strategies combining advanced algorithms to produce four types of EO agriculture products and able to deal with the large range of agricultural landscapes observed around the world. Second, an open source and portable solution will be developed from the Orfeo ToolBox to convert Sentinel-2 L1C data into cloud-free multispectral surface reflectance mosaics and into relevant EO products, thanks to the efficient implementation of the processing strategies. Using this software solution, the project will produce a set of 4 validated Sentinel-2 derived products for each of the 8 demonstration cases: surface reflectance cloud-free composite, cropland mask, crop type map with the respective area estimates, and crop-specific vegetation status map. Each product will be delivered with quality flags which characterize its uncertainty. At least 5 sites well distributed over the world will be provided by the consortium as service demonstrations for a full Sentinel-2 scene (290 x 290 km). For at least 3 demonstration cases at national scale, these sets will be produced locally using a system running at the premises of the end-user in the country. The performance of these products and the end-users' assessment will be analysed and discussed in the validation report.

More precisely, the suite of Sen2-Agri EO products consists of complementary outputs building on each other. While the generic user requirements in terms of timeliness and accuracy reported in the Statement of Work are of interest, it is expected that they will have to be further defined according to the respective context, i.e. farming system complexity, already existing crop information, and expected use of the information in the existing decision-making process.

Such an ambitious project is organized in three consecutive phases over a total period of 36 months. Each phase has specific objectives:
• Phase 1 corresponds to an initial design phase. It is dedicated to user requirements consolidation (task 1), data collection and pre-processing (task 2), and agricultural EO products specification, algorithm benchmarking and development, and system development (task 3). Its duration is set to 12 months.
• Phase 2 aims at implementing the Sen2-Agri system, prototyping the agricultural EO products and assessing the performance of the developed system and products. It will also serve to prepare phase 3 by defining the plan that will be applied to demonstrate the agricultural EO products with Sentinel-2 data. It corresponds to task 4 and lasts 10 months.
• Phase 3 mainly consists of the final demonstration (task 5), which focuses on the demonstration of the Sentinel-2 agricultural EO products with the champion user groups in real-life conditions. This phase also includes activities that promote the project (task 7), draw the main conclusions from the project and deliver recommendations both to users and to ESA (task 6). Phase 3 is planned to last 14 months. It is intended to start after the commissioning phase of the Sentinel-2 mission and could thus be delayed.

The project will consider at least 4 types of sites:
• for phase 1: a set of 13 test sites, to be used in methods benchmarking and S2 data simulation;
• for phase 3: at least 5 sites among the 13 for local scale demonstration and in situ validation of the different EO agricultural products with Sentinel-2 data;
• for phase 3 also: 3 demonstration cases dealing with national coverage and national diversity, 2 of them corresponding to African countries;
• additional voluntary sites, either for local scale demonstration or nationwide production, thanks to proactive networking and appropriate technical support; “voluntary” means that these sites can use the developed S2-Agri processing chain but without receiving any funding (working on a voluntary basis) or technical support (even if their participation in the training workshop could be foreseen).

The set of selected test sites will allow the methods and algorithms to be benchmarked with regard to the diversity of agro-ecological contexts, the various landscape patterns, the different agricultural practices and the actual satellite observation conditions (atmospheric perturbations, sun zenith angle and cloud coverage). The challenge is to identify the most appropriate methods to scale up to the national and possibly the global scale. One of the most challenging aspects for automated algorithms is probably the spatial heterogeneity of growing plants, both within the field and across fields for a given crop, due to local heterogeneity of conditions (diversity of practices, non-synchronisation of practices, soil and weather heterogeneity, etc.). For programmatic reasons, it is also very important to select sites where in-situ data and satellite time series are already available, so that the project can start with these data from the very beginning. The JECAM sites and the SPOT4-Take5 experiment are key elements in this context.

During phase 1, the high spatial resolution optical data available for each selected site which might contribute to the corresponding Test Data Sets (TDS) will be identified. The target sensors will be: SPOT 4 (data collected in the frame of the Take 5 experiment), Landsat 5, 7 or 8 and RapidEye. These data sets will serve to benchmark candidate strategies for using satellite image time series (SITS) that are suitable for operational production of cloud-free composites, cropland maps (crop dynamics, crop types, crop area) and vegetation status (i.e. vegetation indices, LAI, fAPAR).

An important step in the project is the development of the software applications along with the execution chain used to generate the prototype products. In order to maximize the result for a given development effort, existing open source software will be used by several Sen2-Agri prototypes. Additional development will be done around these open source libraries for the functionalities that are not covered. The libraries used will be under an open source public license.

The baseline for our proposal is the Orfeo ToolBox (OTB), developed by CNES and distributed as an open source library of image processing algorithms. OTB implements a set of algorithmic components adapted to large remote sensing images, which allows the methodological know-how to be capitalized on and therefore supports an incremental approach that benefits from the results of methodological research. Most functionalities are also adapted to process huge images without the need for a supercomputer, using streaming and multi-threading as often as possible.

The Sen2-Agri software proposed by the consortium is composed of:
• S2Agri-SC: processing components which build image processing chains to obtain the products required by the users;
• S2Agri-SC Orchestrator: a tool which launches the different processing chains following a data-driven process and monitors the different chains.
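
The data-driven behaviour of the Orchestrator can be pictured as a script that scans an input directory and hands each newly arrived product to a processing chain through a job manager such as SLURM, the tool retained for the Orchestrator in this chapter. The Python sketch below is only an illustration under stated assumptions: the `.SAFE` suffix filter, the chain script name `composite_chain.sh` and all function names are hypothetical and are not taken from the Sen2-Agri design. It builds `sbatch` command lines but does not execute them.

```python
from pathlib import Path


def scan_new_products(input_dir, already_seen, suffix=".SAFE"):
    """Return products present in input_dir that have not been seen yet.

    The '.SAFE' suffix is an assumption made for this illustration.
    """
    found = sorted(p.name for p in Path(input_dir).iterdir()
                   if p.name.endswith(suffix))
    return [name for name in found if name not in already_seen]


def build_job_command(product_name, chain="composite_chain.sh", cpus=4):
    """Build (but do not run) a SLURM submission command for one product.

    'composite_chain.sh' is a hypothetical chain script; sbatch and its
    --job-name / --cpus-per-task options are standard SLURM.
    """
    return ["sbatch", f"--job-name={product_name}",
            f"--cpus-per-task={cpus}", chain, product_name]


def poll_once(input_dir, already_seen):
    """One data-driven pass: detect new products and prepare their jobs."""
    commands = []
    for name in scan_new_products(input_dir, already_seen):
        commands.append(build_job_command(name))
        already_seen.add(name)  # remember it so it is queued only once
    return commands
```

Calling `poll_once` repeatedly with the same `already_seen` set queues each product only once; in a real deployment the returned commands would be passed to the scheduler, which then manages the queue of pending work.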

For the processing components, two tools have been selected for their quality, their degree of re-use and the consortium partners' thorough mastery of them:
• Orfeo ToolBox (OTB): for composite generation, biophysical and added-value products generation;
• Sentinel Exploitation Tools (BEAM ToolBox + S2PAD module): for atmospheric corrections and cloud detection.

For the S2Agri-SC Orchestrator, we have selected the SLURM tool to manage the resources of the S2Agri system. First, this tool allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so that they can perform work. Second, it provides a framework for starting, executing and monitoring work (typically a parallel job) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work. Even though this tool is mainly used for cluster task management, it is also very useful on a single platform with multiple cores. The data-driven mechanism will be implemented by a main script which scans the input directory.

The validation of the Sentinel-2 EO products will rely on 3 complementary pillars: (i) confidence building, (ii) statistical accuracy assessment and (iii) comparison with existing products. In addition, a user-oriented assessment will be set up in order to assess the utility and benefit of the Sentinel-2 demonstration products. This approach has the advantages of:
• reinforcing the overall acceptance of the product by users by removing macroscopic errors;
• providing accuracy figures obtained from an independent quantitative validation in line with current standards;
• characterizing the strengths and weaknesses of the new Sen2-Agri EO products with respect to other existing agricultural products;
• engaging in a real user dialogue in which users' feedback feeds into the final discussions and recommendations.
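
For the statistical accuracy assessment pillar, a common practice is to cross-tabulate the map against independent reference samples in a confusion matrix and to derive overall, user's and producer's accuracies from it. The Python sketch below illustrates this for a two-class cropland mask. The counts are invented for the example and the function name is hypothetical; the actual validation protocol of the project may differ.

```python
def accuracy_metrics(matrix):
    """Compute accuracy figures from a square confusion matrix.

    matrix[i][j] = number of samples mapped as class i whose reference
    (ground truth) class is j. Every class is assumed to have at least
    one mapped sample and one reference sample.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n_classes))
    overall = correct / total
    # User's accuracy: correct / all samples mapped as that class (row sum).
    users = [matrix[i][i] / sum(matrix[i]) for i in range(n_classes)]
    # Producer's accuracy: correct / all reference samples of that class
    # (column sum).
    producers = [matrix[j][j] / sum(row[j] for row in matrix)
                 for j in range(n_classes)]
    return overall, users, producers


# Invented counts: rows = mapped (cropland, non-cropland),
# columns = reference (cropland, non-cropland).
example = [[80, 20],
           [10, 90]]
overall, users, producers = accuracy_metrics(example)
```

With these invented counts, 170 of the 200 samples lie on the diagonal, so the overall accuracy is 0.85; the user's accuracy for cropland is 80/100 and its producer's accuracy is 80/90.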
A demonstration plan will be prepared at the end of Phase 2 in direct interaction with ESA, taking into account the updated planning of Sentinel-2 data availability and the selection of the final national scale demonstration cases. This demonstration will be carried out at two different levels: 1) the local scale demonstration sites, selected among the test sites used in Phase 1, plus any voluntary test sites interested in joining the test and demonstration effort (a Netherlands partner has already been identified and is committed - see the letter in section 16); 2) the national scale demonstration sites, also selected among the test sites used for Phase 1 based on a set of criteria; at this stage, four countries have already been identified as potential candidates.

According to ESA specifications, the consortium plans to set up the Sen2-Agri system at user premises for at least 3 different organisations. Staff from these organisations will be involved in these installations, and the work carried out by the consortium in this frame will include four main components:
• First, some tests and, if necessary, an update of the installation procedure will be carried out during an internal Factory Deployment Test (at consortium premises);
• Second, the Sen2-Agri system will be installed at user premises (after a complete check of the user hardware and system configuration);
• Third, a set of tests based on the Acceptance Document will be run on the installed system to check its consistency with the internal one;
• Fourth, Sen2-Agri system installation training (with dedicated user staff) will be planned to enable users to perform this kind of installation themselves.

The structure to ensure a continuous dialogue with the user community during the different phases of the project includes three main areas:
1) Identification of specific user needs for product specifications (SoW and WP 1100):
   a. Initial user requirements established during a consultation exercise organized by ESA in April 2012 with about 50 members of the agricultural user and expert communities;
   b. User requirements consolidation through an additional questionnaire sent to all users to address mainly (i) the main priorities of the user, (ii) the most desired products and (iii) the technical details of every product, including the delivery mode;
   c. User requirements consolidation through direct contact with the users.
2) Critical user review of the Sen2-Agri production process:
   a. Selected users will be invited to participate in the product development, implementation and validation and asked to provide input and feedback at different points during the process.
3) User application and feedback mechanism on the use of the products and related potentials and limitations (WP 4400-5500-6300):
   a. Users will use the generated products in their applications to provide first indications of their impact, added value, potentials and limitations;
   b. Final discussions with the users will yield feedback on the products and result in a set of recommendations to further improve agricultural monitoring beyond this part of the project.

Supporting the engagement of the global crop monitoring community is a core strategy of the Sen2-Agri project. The community's assessment of the EO products and system performance will be the key criterion for the success of this project, which aims to support the use of Sentinel-2 data around the world for agriculture monitoring.

CHAPTER 2 - Technical proposal for the European Space Agency

Page 1 Sen2-Agri Ref. UCL – Geomatics – 2013 Part 2: Technical proposal Issue 0 Rev. 1 Date 2013/05/21

Table of contents

Sentinel-2 Agriculture
VOLUME 1
CHAPTER 1 – Executive summary
CHAPTER 2 - Technical proposal for the European Space Agency
1 Introduction
  1.1 Purpose and scope
  1.2 Structure of the document
  1.3 References
    1.3.1 Applicable documents
    1.3.2 Reference documents
    1.3.3 Acronyms and abbreviations
2 Understanding of the requirements
  2.1 Background
    2.1.1 Context
    2.1.2 Review of current agriculture monitoring practices
    2.1.3 Review of agriculture monitoring state of the art
  2.2 Objectives
    2.2.1 Sentinel-2 Agriculture objectives
    2.2.2 Perspectives beyond Sentinel-2 Agriculture
  2.3 Project outputs
3 Problem understanding
  3.1 Addressing the large diversity of agricultural systems
  3.2 Timeliness of the EO products derived from S2 large volume
  3.3 Portable open source solution for operational production
  3.4 Algorithms selection and solution development before S2 era
  3.5 EO products validation
4 Work description
  4.1 Task 1 - Users requirements consolidation
    4.1.1 Consolidation of the initial UR by survey with champion user group (WP 1100)
    4.1.2 Selection of potential sites and main crops (WP 1200)
    4.1.3 Sentinel-2 exploitation scenario development (WP 1300)
  4.2 Task 2 - Data acquisition and pre-processing
    4.2.1 Site selection (WP 2100)
    4.2.2 Design and collection of in-situ data for all sites (WP 2200)
    4.2.3 Collection of high spatial resolution optical time series (WP 2300)
    4.2.4 Development of pre-processing chains (WP 2400)
    4.2.5 Preprocessing of high resolution optical time series (WP 2500)
    4.2.6 Validation of test data set (WP 2600)
  4.3 Task 3 - EO Products specification and algorithm design
    4.3.1 Products and system specifications (WP3100)
    4.3.2 Candidate algorithm selection based on literature and best practices review (WP3200)
    4.3.3 Algorithms inter-comparison, benchmarking and selection (WP3300)
    4.3.4 Operational system design (WP3400)
    4.3.5 Operational system validation protocol development (WP3500)
  4.4 Task 4 - System development
    4.4.1 System implementation (WP4100)
    4.4.2 System documentation (WP4200)
    4.4.3 Generation of prototype products (WP4300)
    4.4.4 Assessment of prototype products and system performance (WP4400)
    4.4.5 Demonstration plan (WP4500)
  4.5 Task 5 - Demonstration use cases
    4.5.1 In-situ Sen-2 Agri system installation (WP 5100)
    4.5.2 Capacity development and training (WP 5200)
    4.5.3 Internal Sen2-Agri EO production (WP 5300)
    4.5.4 In-situ Sen-2 Agri EO production support (WP 5400)
    4.5.5 Field data collection in minimum 8 sites (WP 5500)
    4.5.6 Use cases validation and assessment (WP 5600)
  4.6 Task 6 - Conclusions and recommendations
    4.6.1 Data and results dissemination (WP 6100)
    4.6.2 Users training on the developed system (WP 6200)
    4.6.3 Sentinel-2 products added-value assessment (WP 6300)
    4.6.4 Synthesis and recommendations for further improvement and generalization (WP 6400)
  4.7 Task 7 - Promotional activities
    4.7.1 Website development and maintenance (WP 7100)
    4.7.2 Promotional material development (WP 7200)
    4.7.3 User community federation and scientific communication (WP 7300)
  4.8 Task 8
5 Coordination with Ongoing and Complementary Activities
6 User-oriented approach
  6.1 User consultation plan
  6.2 Link with user requirements
7 Specification of satellite and in-situ data
  7.1 Satellite data
    7.1.1 SPOT
    7.1.2 LANDSAT
    7.1.3 Rapideye
  7.2 In-situ data
8 Preliminary list of test sites
9 Preliminary list of algorithms
  9.1 Cloud-free composite generation
    9.1.1 Step 1: compositing
    9.1.2 Gap filling
  9.2 Crop and vegetation status mapping
    9.2.1 Existing classification system
    9.2.2 Products
    9.2.3 Needs in terms of reference data and product validation
    9.2.4 Processing chains
10 Algorithm benchmarking concept and plan
  10.1 Benchmarking exercise within the project framework
  10.2 Benchmarking plan
11 Technical description of tools and analysis system
  11.1 Solution overview
  11.2 Solution description
  11.3 Presentation of proposed re-use software
    11.3.1 Sentinel Exploitation Tools (BEAM + S2PAD)
    11.3.2 Orfeo ToolBox
    11.3.3 GDAL/OGR library
    11.3.4 SLURM
  11.4 Global architecture
    11.4.1 S2Agri-SC component
    11.4.2 Orchestrator component
12 Preliminary proposal for use case studies
  12.1 In-situ 1 : Senegal
    12.1.1 User presentation
    12.1.2 Expected result and impact
    12.1.3 Work to be performed
    12.1.4 Interaction with other project activities
    12.1.5 Expected value of the demonstration
  12.2 In-situ 2 : Kenya
    12.2.1 User presentation
    12.2.2 Expected result and impact
    12.2.3 Work to be performed
    12.2.4 Interaction with other project activities
    12.2.5 Expected value of the demonstration
  12.3 In-situ 3 : Russia
    12.3.1 User presentation
    12.3.2 Work to be performed
    12.3.3 Interaction with other project activities
    12.3.4 Expected value of the demonstration
  12.4 In-situ 4 : Morocco
    12.4.1 User presentation
    12.4.2 Expected result and impact
    12.4.3 Proposed approach
  12.5 Local production
13 Proposed approach for the sentinel-2 EO product processing
  13.1 Sentinel-2 EO product processing
    13.1.1 Huge data volume to consider
    13.1.2 Hardware configuration
    13.1.3 System processing
    13.1.4 Sentinel-2 EO product dissemination
  13.2 Sentinel-2 EO product validation
14 Appendix A - Draft software development plan
  14.1 Introduction
    14.1.1 Purpose of the annex
    14.1.2 Structure of the annex
  14.2 Project overview
  14.3 Management approach
  14.4 Development approach
    14.4.1 Software development strategy
    14.4.2 Development life cycle
    14.4.3 Software engineering standards and techniques
    14.4.4 Software development and software testing environment
    14.4.5 Software documentation plan
15 Appendix B - Software Reuse File
  15.1 Introduction
    15.1.1 Purpose of the annex
    15.1.2 Scope of the annex
    15.1.3 Structure of the annex
  15.2 Third Party products and required Software licenses
    15.2.1 Software Licence and Intellectual property on the proposed solution
    15.2.2 Presentation of the software intended to be reused
    15.2.3 [Orfeo ToolBox]
    15.2.4 [Sentinel2 ToolBox and Sentinel Exploitation Tools]
    15.2.5 [Geospatial Data Abstraction Library]
    15.2.6 [OpenJPEG]
    15.2.7 [SLURM]
    15.2.8 Others
    15.2.9 Compatibility of existing software items with project requirements
  15.3 Conclusion
16 Appendix C - Document Requirement Definition
  16.1 User Requirements Document (URD)
  16.2 Technical Specifications (TS)
  16.3 Design Justification File
  16.4 Design Definition File
  16.5 Acceptance Test Document
  16.6 Qualification Review Report
  16.7 Prototype Validation and Assessment Report
  16.8 Demonstration Plan
  16.9 Capacity Building Plan
  16.10 Validation Report
  16.11 Exploitation Report
  16.12 Final Report
17 Appendix D - Risk register
18 Appendix E - Traceability and compliance matrix
19 Appendix F - Letters of support

List of figures
Figure 2-1: The observation requirements as defined by the GEO Ag CoP for CEOS (2012)
Figure 2-2: A synthesis of the agriculture applications of satellite remote sensing as endorsed by the GEO Agriculture CoP (RD.189).
Figure 2-3: Example of operational precision farming system using satellite EO data (credit: Farmstar)
Figure 2-4: Example of final products for croplands extension for 1990-2000-2010 from Landsat 10x10 km extracts using semi-automated processing methods to contribute to the EU country profile (as completed by UCL/JRC/GISAT in Geoland-2).
Figure 3-1: High resolution image and derived crop specific mask for six different agricultural landscapes distributed on different agro-climatic zones (RD.207).
Figure 4-1: Proposed Sentinel-2 Agriculture facility and network bandwidth
Figure 4-2: Examples of Level 1C, 2A, 2B products simulated with a time series of Formosat-2 data (CESBIO). The Level 3A is a cloud free composite of the images gathered during 15 days around the date.
Figure 4-3: Comparison of cloud masks (circled in red) obtained from MACCS (left) and from LEDAPS (LANDSAT USGS cloud mask) (right). MACCS nearly classifies the whole image as cloudy whereas LEDAPS only classifies 30% of the image as cloudy. The difference between both approaches should be reduced with Landsat 8, thanks to the 1.38µm spectral band that can easily detect high and thin clouds.
Figure 4-4: This kind of display enables a quick evaluation of the various masks generated by our processors. The masks are outlined in different colours (green for clouds, black for cloud shadows, blue for water and pink for snow). Here, one of the SPOT4 (Take5) images processed with MACCS in a difficult case (presence of thin cirruses (RD.168))
Figure 4-5: Comparison of surface reflectances derived from FORMOSAT-2 image time series over the Crau calibration site (small symbols connected by line) with surface reflectance provided by the measuring station ROSAS (big points)
Figure 4-6: Block diagram for a land cover map production system
Figure 4-7: Choices of algorithms leading to strategy comparisons
Figure 4-8: Overall organization of the validation and related user assessment activities
Figure 7-1: Example of thumbnail SPOT4-Take5 images gathered above the Argentina site. The date is provided in the filename.
Figure 9-1: Comparison of the noise of a Max NDVI composite (left), with a CYCLOPES composite (right), from VEGETATION data. The noise is much higher in the Max-NDVI composite (the black spots on the CYCLOPES composite are due to missing data since the algorithm required at least 3 observations)
Figure 9-2: Comparison of level 2 product (left) and “most recent pixel” Level 3 composite, at the same date, from SPOT4-Take5 data. Places where clouds or shadows are found in the level 2A are replaced by pixels from an older date, for which the ground was much wetter (after some rain); large discontinuities can be observed.
Figure 9-3: Weighted average composite derived from FORMOSAT-2 data. The two available images are both partly cloudy. The resulting composite shows no artefacts. Some pixels in the composite were flagged as cloudy in both input products. They are flagged as cloudy in the composite, but their reflectance is filled using a minimum blue reflectance criterion.
Figure 9-4: Initial information with clouds and cloud shadows
Figure 9-5: Cloud data mask obtained with the MTCD method
Figure 9-6: Resulting cloudless data after interpolation
Figure 9-7: One-pixel evolution example: data values in time identified as erroneous or missing (in red, outside the fitted lines), are interpolated to their corresponding new values (in green, on the fitted lines). Reflectance values are given here against corresponding day-of-year calculation since the midnight Coordinated Universal Time (UTC), January 1st 1970 (Unix time)
Figure 9-8: Initial data affected by clouds
Figure 9-9: Resulting cloudless information after processing
Figure 9-10: Initial data affected by clouds
Figure 9-11: Resulting cloudless information after processing
Figure 9-12: Land-cover map of the South-West of France
Figure 9-13: Block diagram of CESBIO’s classification chain
Figure 9-14: Example of land-cover classification in the South of France
Figure 9-15: Example of land cover classification
Figure 9-16: Example of land cover classification
Figure 9-17: General scheme of the GlobCover classification methodology
Figure 9-18: Overview of the CCI land cover products, over US
Figure 9-19: Overview of the CCI land cover products, over Central and South America
Figure 9-20: Planting dates for winter wheat
Figure 9-21: Planting dates for maize
Figure 9-22: Harvest dates for maize
Figure 9-23: Example of phenological descriptors which can be extracted from a temporal profile
Figure 9-24: Results of the BVNET evaluation [RD.161]. X-axis: in-situ GAI, FAPAR and FCOVER estimated by the use of hemispherical photographs and CAN-EYE software. Y-axis: variables estimated by the use of the BV-NET tool.
Figure 10-1: Schematic illustration of the articulation of the processing modules that will be tested in the benchmarking exercise
Figure 11-1: Orfeo ToolBox ecosystem
Figure 11-2: Dashboard of the MACCS project
Figure 11-3: Proposal architecture
UCL\ELI-Geomatics Croix du Sud, 2 bte L7.05.16 B - 1348 Louvain-la-Neuve BELGIUM

Page 10 Sen2-Agri Ref. UCL – Geomatics – 2013 Part 2: Technical proposal Issue 0 Rev. 1 Date 2013/05/21

Figure 11-4: General organization of a S2Agri Software component...... 178 Figure 12-1: Agro-pastoral bulletin of August 2012 as delivered by CSE (http://svr- web.cse.sn/IMG/pdf/Suivi_de_la_campagne_agro- pastorale_2012_bilan_de_fin_de_saison_des_pluies-2.pdf) ...... 181 Figure 12-2: Point sampling frame for visual interpretation of annual aerial coverage of the country in order to deliver area statistics (C. Situma, AGRISAT 2010, Brussels) ...... 184 Figure 12-3: Example of IKI output for a Southern part of Russia (RD.208) ...... 186 Figure 14-1 : Principle of incremental integration ...... 201 Figure 14-2 : First V0 “STUB” version of the Sen2Agri Software ...... 202 Figure 14-3 : Submission of source code based on ATBD and prototype source code ...... 202 Figure 14-4 : Regular submission of source code for the updated V1 modules ...... 203


List of tables

Table 1-1: Applicable documents
Table 1-2: Applicable documents
Table 1-3: Acronyms
Table 4-1: Preliminary list of test sites
Table 4-2: Validation operations for each prototype products
Table 4-3: Validation operations for each Sentinel-2 demonstration product
Table 6-1: Participants of the Sen2-Agri user group
Table 6-2: Outline Users’ Requirements consolidated from the initial requirements of the users listed in Table 6-1
Table 7-1: Number of cloud free (>80%) images obtained for each site and each month, until May 15th
Table 8-1: Preliminary list of test sites
Table 9-1: Summary of jobs and tasks managed in the pre-processing chain of the CCI Land Cover project
Table 11-1: OTB third party
Table 11-2: Availability and quality status
Table 15-1: Functions implemented
Table 15-2: Availability and quality status
Table 15-3: Summary of SW reuse
Table 16-1: List of deliverables described in the Document Requirement Definition
Table 16-2: List of deliverables not described in the Document Requirement Definition


1 Introduction

1.1 Purpose and scope

This document is the technical proposal in response to the ESA/ESRIN Invitation To Tender ESRIN/AO/1-7465/13/I–AM - “DATA USER ELEMENT (DUE) - Sentinel-2 Agriculture” concerning the user-driven development and demonstration of agricultural Earth Observation products based on validated algorithms dedicated to the Sentinel-2 mission. The present proposal is based on the experience acquired by our key personnel and on scientific expertise from recognized laboratories in the domain of agriculture monitoring based on Earth Observation data.

1.2 Structure of the document

This document contains 12 sections that describe:
• The understanding of the requirements;
• The problem understanding;
• The work description;
• The coordination with on-going and complementary activities;
• The user-oriented approach;
• The specification of all satellite and in-situ data;
• The preliminary list of test sites;
• The preliminary list of algorithms;
• The algorithm benchmarking concept and plan;
• The technical description of tools and analysis system;
• The preliminary proposal for use case studies;
• The proposed approach for the Sentinel-2 EO product processing.

In addition, this document includes the following 6 appendices:
• The draft software development plan (Appendix A);
• The Software Reuse File (Appendix B);
• The Document Requirement Definition (Appendix C);
• The Risk Register (Appendix D);
• The Traceability and compliance matrix (Appendix E);
• The letters of support (Appendix F).


1.3 References

1.3.1 Applicable documents

ID | Title | Code | Issue | Date
AD.1 | Statement of Work for Sentinel-2 Agriculture Project | EOEP-DUEP-EOPS-SW-13-0004 | 1.0 | 26/03/2013
AD.2 | Special Conditions of Tender for Sentinel-2 Agriculture project | | |
AD.3 | ECSS Space Engineering Software | ECSS-E-ST-40 | C | 06/03/2009

Table 1-1: Applicable documents

1.3.2 Reference documents

ID Title

RD.1 UN Millennium Development Goals Report, 2012 (http://mdgs.un.org/unsd/mdg/Resources/Static/Products/Progress2012/English2012.pdf)

RD.2 JECAM website

RD.3 SPOT4-Take 5 blog

RD.4 GEO-GLAM proposal, 2011 (http://www.earthobservations.org/documents/cop/ag_gams/201106_g20_global_agricultural_monitoring_initiative.pdf)

RD.5 GEOSS Community of Practice Ag 0703a ()

RD.6 World Bank Food Price Index (http://web.worldbank.org/WBSITE/EXTERNAL/TOPICS/EXTPOVERTY/0,,contentMDK:22838758~pagePK:210058~piPK:210062~theSitePK:336992,00.html)

RD.7 Sentinel-2 Atmospheric Correction ATBD: Sentinel-2 MSI - Level 2A Products Algorithm Theoretical Basis Document, 20/09/2012, ref. S2PAD-ATBD-0001, issue 2.0

RD.8 Sentinel-2 mission: ESA’s Optical High-Resolution Mission for GMES Operational Services, Remote Sensing of Environment, 120, 25-36, 2012

RD.9 Sentinel-2 User Consultation: Presentations of the Sentinel-2 User Consultation, Frascati, April 2012 (http://due.esrin.esa.int/meetings/meetings281.php)

RD.10 DUE (Data User Element) Programme (http://due.esrin.esa.int/)

RD.11 BEAM toolbox - Sentinel 2 exploitation tools (https://www.brockmann-consult.de/beam-jira/browse/SUHET)

RD.12 Sentinel-2 spatio-temporal synthesis ATBD: Sentinel-2 MSI - Level 3 Products Algorithm Theoretical Basis Document, Volume A, issue 2.0, 18 Jun 2010, ref. S2PAD-VEGA-ATBD-0005

RD.13 Sentinel-2 ESA’s Optical High-Resolution Mission for GMES Operational Services, March 2012 (ESA SP-1322/2)

RD.14 L1C Product Definition Document: Sentinel-2 PDGS Products Definition Document, issue 2.3, 30 March 2012, ref GMES-GSEG-EPG-TN-09-0029

RD.15 L1C Product Specification Document: Sentinel-2 Product Specification Document, issue 7, 22 Feb 2013, ref S2-PDGS-TAS-DI-PSD

RD.16 FAO, 1999, World Food Summit (http://www.fao.org/docrep/x2051e/x2051e00.HTM)

RD.17 FAO, 2011, The state of the world’s land and water resources for food and agriculture (http://www.fao.org/nr/water/docs/SOLAW_EX_SUMM_WEB_EN.pdf)

RD.18 Pittman, K.; Hansen, M.C.; Becker-Reshef, I.; Potapov, P.V.; Justice, C.O. Estimating global cropland extent with multi-year MODIS data. Remote Sens. 2010, 2, 1844–1863.

RD.19 Biradar, C.M.; Thenkabail, P.S.; Noojipady, P.; Li, Y.; Dheeravath, V.; Turral, H.; Velpuri, M.; Gumma, M.K.; Gangalakunta, O.R.P.; Cai, X.L.; et al. A global map of rainfed cropland areas (GMRCA) at the end of last millennium using remote sensing. Int. J. Appl. Earth Obs. Geoinf. 2009, 11, 114–129.

RD.20 Thenkabail, P.S.; Biradar, C.M.; Noojipady, P.; Dheeravath, V.; Li, Y.J.; Velpuri, M.; Gumma, M.; Reddy, G.P.O.; Turral, H.; Cai, X.L. A Global irrigated area map (GIAM) using remote sensing at the end of the last millennium. Int. J. Remote Sens. 2009, 30, 3679–3733.

RD.21 Portmann, F.T.; Siebert, S.; Doll, P. MIRCA2000—Global monthly irrigated and rainfed crop areas around the year 2000: A new high-resolution data set for agricultural and hydrological modeling. Glob. Biogeochem. Cy. 2010, doi:10.1029/2008GB003435.

RD.22 Ramankutty, N.; Evan, A.; Monfreda, C.; Foley, J.A. Farming the planet. Part 1: The geographic distribution of global agricultural lands in the year 2000. Glob. Biogeochem. Cy. 2008, doi:10.1029/2007GB002952.

RD.23 Friedl, M.A.; Mciver, D.K.; Hodges, J.C.F.; Zhang, X.Y.; Muchoney, D.; Strahler, A.H.; Woodcock, C.E.; Gopal, S.; Schneider, A.; Cooper, A.; et al. Global land cover mapping from MODIS: Algorithms and early results. Remote Sens. Environ. 2002, 83, 287–302.

RD.24 Mayaux, P.; Bartholome, E.; Fritz, S.; Belward, A. A new land-cover map of Africa for the year 2000. J. Biogeogr. 2004, 31, 861–877.

RD.25 Fritz, S.; You, L.; Bun, A.; See, L.; McCallum, I.; Schill, C.; Perger, C.; Liu, J.; Hansen, M.; Obersteiner, M. Cropland for sub-Saharan Africa: A synergistic approach using five land cover data sets. Geophys. Res. Lett. 2011, doi:10.1029/2010GL046213.

RD.26 Defourny, P.; Vancutsem, C.; Pekel, J.F.; Bicheron, P.; Brockmann, C.; Niño, F.; Schouten, L.; Leroy, M. Towards a 300 m Global Land Cover Product—The Globcover Initiative. In Proceedings of Second Workshop of the EARSeL Special Interest Group on Land Use and Land Cover, Bonn, Germany, 28–30 September 2006.

RD.27 Vancutsem, C.; Marinho, E.; Kayitakire, F.; See, L.; Fritz, S. Harmonizing and Combining Existing Land Cover/Land Use Datasets for Cropland Area Monitoring at the African Continental Scale. Remote Sens. 2013, 5, 19-41.

RD.28 Vancutsem, C.; Pekel, J.-F.; Kayitakire, F. Dynamic Mapping of Cropland Areas in Sub-Saharan Africa Using MODIS Time Series. In Proceedings of the 6th International Workshop on the Analysis of Multi-Temporal Remote Sensing Images, Trento, Italy, 12–14 July 2011; pp. 25–28.

RD.29 Thenkabail, P.S.; Wu, Z. An Automated Cropland Classification Algorithm (ACCA) for Tajikistan by Combining Landsat, MODIS, and Secondary Data. Remote Sens. 2012, 4, 2890-2918.

RD.30 VINTROU, Elodie, DESBROSSE, Annie, BÉGUÉ, Agnès, et al. Crop area mapping in West Africa using landscape stratification of MODIS time series and comparison with existing global land products. International Journal of Applied Earth Observation and Geoinformation, 2012, vol. 14, no 1, p. 83-93.

RD.31 SHAO, Yang, LUNETTA, Ross S., EDIRIWICKREMA, Jayantha, et al. Mapping cropland and major crop types across the Great Lakes Basin using MODIS-NDVI data. Photogrammetric engineering and remote sensing, 2010, vol. 76, no 1, p. 73-84.

RD.32 ARVOR, Damien, JONATHAN, Milton, MEIRELLES, Margareth Simões Penello, et al. Classification of MODIS EVI time series for crop mapping in the state of Mato Grosso, Brazil. International Journal of Remote Sensing, 2011, vol. 32, no 22, p. 7847-7871.

RD.33 BORYAN, Claire and CRAIG, Mike. Multiresolution Landsat TM and AWiFS sensor assessment for crop area estimation in Nebraska. Proceedings from Pecora, 2005, vol. 16, p. 22-27.

RD.34 YANG, Chenghai, EVERITT, James H., and MURDEN, Dale. Evaluating high resolution SPOT 5 satellite imagery for crop identification. Computers and Electronics in Agriculture, 2011, vol. 75, no 2, p. 347-354.

RD.35 MURTHY, C. S., RAJU, P. V., and BADRINATH, K. V. S. Classification of wheat crop with multi-temporal images: performance of maximum likelihood and artificial neural networks. International Journal of Remote Sensing, 2003, vol. 24, no 23, p. 4871-4890.

RD.36 SIMONNEAUX, Vincent, DUCHEMIN, Benoît, HELSON, D., et al. The use of high-resolution image time series for crop classification and evapotranspiration estimate over an irrigated area in central Morocco. International Journal of Remote Sensing, 2008, vol. 29, no 1, p. 95-116.

RD.37 WARDLOW, Brian D. and EGBERT, Stephen L. State-level crop mapping in the US Central Great Plains agroecosystem using MODIS 250-meter NDVI data. In: Pecora 16 Symposium. 2005. p. 25-27.

RD.38 LEITE, Paula Beatriz Cerqueira, FEITOSA, Raul Queiroz, FORMAGGIO, Antônio Roberto, et al. Hidden Markov Models for crop recognition in remote sensing image sequences. Pattern Recognition Letters, 2011, vol. 32, no 1, p. 19-26.

RD.39 CONRAD, Christopher, FRITSCH, Sebastian, ZEIDLER, Julian, et al. Per-field irrigated crop classification in arid Central Asia using SPOT and ASTER data. Remote Sensing, 2010, vol. 2, no 4, p. 1035-1056.

RD.40 CONRAD, Christopher, COLDITZ, René R., DECH, Stefan, et al. Temporal segmentation of MODIS time series for improving crop classification in Central Asian irrigation systems. International Journal of Remote Sensing, 2011, vol. 32, no 23, p. 8763-8778.

RD.41 MURAKAMI, T., OGAWA, S., ISHITSUKA, N., et al. Crop discrimination with multitemporal SPOT/HRV data in the Saga Plains, Japan. International journal of remote sensing, 2001, vol. 22, no 7, p. 1335-1348.

RD.42 LÖW, Fabian, SCHORCHT, Gunther, MICHEL, Ulrich, et al. Per-field crop classification in irrigated agricultural regions in middle Asia using random forest and support vector machine ensemble. In: SPIE Remote Sensing. International Society for Optics and Photonics, 2012. p. 85380R-85380R-11.

RD.43 VAN NIEL, Thomas G. and MCVICAR, Tim R. Determining temporal windows for crop discrimination with remote sensing: a case study in south-eastern Australia. Computers and Electronics in Agriculture, 2004, vol. 45, no 1, p. 91-108.

RD.44 HELLER, Elizabeth, RHEMTULLA, Jeanine M., LELE, Sharachchandra, et al. Mapping Crop Types, Irrigated Areas, and Cropping Intensities in Heterogeneous Landscapes of Southern India Using Multi-Temporal Medium-Resolution Imagery: Implications for Assessing Water Use in Agriculture. Photogrammetric engineering and remote sensing, 2012, vol. 78, no 8, p. 815-827.

RD.45 MINGWEI, Zhang, QINGBO, Zhou, ZHONGXIN, Chen, et al. Crop discrimination in Northern China with double cropping systems using Fourier analysis of time-series MODIS data. International Journal of Applied Earth Observation and Geoinformation, 2008, vol. 10, no 4, p. 476-485.

RD.46 TURKER, M. and ARIKAN, M. Sequential masking classification of multi-temporal Landsat7 ETM+ images for field-based crop mapping in Karacabey, Turkey. International journal of remote sensing, 2005, vol. 26, no 17, p. 3813-3830.

RD.47 IPPOLITI-RAMILO, G. A., EPIPHANIO, J. C. N., and SHIMABUKURO, Y. E. Landsat-5 Thematic Mapper data for pre-planting crop area evaluation in tropical countries. International Journal of Remote Sensing, 2003, vol. 24, no 7, p. 1521-1534.

RD.48 ORMECI, Cankut, ALGANCI, Ugur, and SERTEL, Elif. Identification of Crop Areas Using SPOT–5 Data. In: Proceedings of the FIG Congress. 2010.

RD.49 YANG, Chenghai, EVERITT, James H., FLETCHER, Reginald S., et al. Using high resolution QuickBird imagery for crop identification and area estimation. Geocarto International, 2007, vol. 22, no 3, p. 219-233.

RD.50 OK, Asli Ozdarici, AKAR, Ozlem, and GUNGOR, Oguz. Evaluation of random forest method for agricultural crop classification. European Journal of Remote Sensing, 2012, vol. 45, p. 421-432.

RD.51 DE WIT, A. J. W. and CLEVERS, J. G. P. W. Efficiency and accuracy of per-field classification for operational crop mapping. International journal of remote sensing, 2004, vol. 25, no 20, p. 4091-4112.

RD.52 CASTILLEJO-GONZÁLEZ, Isabel Luisa, LÓPEZ-GRANADOS, Francisca, GARCÍA-FERRER, Alfonso, et al. Object- and pixel-based analysis for mapping crops and their agro-environmental associated measures using QuickBird imagery. Computers and Electronics in Agriculture, 2009, vol. 68, no 2, p. 207-215.

RD.53 DELRUE, Josefien, BYDEKERKE, Lieven, EERENS, Herman, et al. Crop mapping in countries with small-scale farming: a case study for West Shewa, Ethiopia. International Journal of Remote Sensing, 2013, vol. 34, no 7, p. 2566-2582. Lobell, D.B.

RD.54 ATZBERGER, Clement and REMBOLD, Felix. Portability of neural nets modeling regional winter crop acreages using AVHRR time series. Eur. J. Remote Sens, 2012, vol. 45, p. 371-392.

RD.55 OZDOGAN, Mutlu. The spatial distribution of crop types from MODIS data: temporal unmixing using independent component analysis. Remote Sensing of Environment, 2010, vol. 114, no 6, p. 1190-1204.

RD.56 Asner, G.P. Cropland distributions from temporal unmixing of MODIS data. Remote Sens. Environ. 2004, 93, 412–422.

RD.57 VIEIRA, Carlos, MATHER, P. M., and MCCULLAGH, Michael. The Spectral-Temporal Response Surface and its use in the multi-sensor, multi-temporal classification of agricultural crops. International Archives of Photogrammetry and Remote Sensing, 2000, vol. 33, no B2; PART 2, p. 582-589.

RD.58 BORYAN, Claire, YANG, Zhengwei, MUELLER, Rick, et al. Monitoring US agriculture: The US Department of Agriculture, National agricultural statistics service, cropland data layer program. Geocarto International, 2011, vol. 26, no 5, p. 341-358.

RD.59 GU, Xiaohe, PAN, Yaozhong, WANG, Huifang, et al. Study on measuring the planting area of winter wheat based on per-field classification of remote sensing. In: Sixth International Symposium on Multispectral Image Processing and Pattern Recognition. International Society for Optics and Photonics, 2009. p. 74981V-74981V-8.

RD.60 Nitze I., Schulthess U., Asche B., Comparison of machine learning algorithms random forest, artificial neural network and support vector machine to maximum likelihood for supervised crop type classification, in Proceedings of the 4th GEOBIA, 2012, p. 035.

RD.61 OMKAR, S. N., SENTHILNATH, J., MUDIGERE, Dheevatsa, et al. Crop classification using biologically-inspired techniques with high resolution satellite image. Journal of the Indian Society of Remote Sensing, 2008, vol. 36, no 2, p. 175-182.

RD.62 XIAO, Xiangming, BOLES, Stephen, LIU, Jiyuan, et al. Mapping paddy rice agriculture in southern China using multi-temporal MODIS images. Remote Sensing of Environment, 2005, vol. 95, no 4, p. 480-492.

RD.63 JAKUBAUSKAS, Mark E., LEGATES, David R., and KASTENS, Jude H. Crop identification using harmonic analysis of time-series AVHRR NDVI data. Computers and Electronics in Agriculture, 2002, vol. 37, no 1, p. 127-139.

RD.64 BRODSKY, Lukas, SOURKOVA, Lucie, and KODESOVA, Radka. Supervised Crop Classification from Middle-resolution Multitemporal Images. In: Proc. 2nd MERIS-(A) ATSR Workshop, Eds H. Lacoste & L. Ouwehand, ESA SP-666 (CD-ROM), ESA Communication Production Office, European Space Agency, Noordwijk, The Netherlands. 2008.

RD.65 GALFORD, Gillian L., MUSTARD, John F., MELILLO, Jerry, et al. Wavelet analysis of MODIS time series to detect expansion and intensification of row-crop agriculture in Brazil. Remote Sensing of Environment, 2008, vol. 112, no 2, p. 576-587.

RD.66 EPIPHANIO, Rui Dalla Valle, FORMAGGIO, Antonio Roberto, RUDORFF, Bernardo Friedrich Theodor, et al. Estimating soybean crop areas using spectral-temporal surfaces derived from MODIS images in Mato Grosso, Brazil. Pesquisa Agropecuária Brasileira, 2010, vol. 45, no 1, p. 72-80.

RD.67 PETITJEAN, François, INGLADA, Jordi, and GANÇARSKI, Pierre. Satellite image time series analysis under time warping. Geoscience and Remote Sensing, IEEE Transactions on, 2012a, vol. 50, no 8, p. 3081-3095.

RD.68 PETITJEAN, François, KURTZ, Camille, PASSAT, Nicolas, et al. Spatio-temporal reasoning for the classification of satellite image time series. Pattern Recognition Letters, 2012b.

RD.69 Anderson J.E., 1982, Domestic Crops and Land Cover THEMATIC MAPPER SIMULATOR DATA COLLECTED OVER EASTERN NORTH DAKOTA, Technical report, 22 pp.

RD.70 Davies C., 2009. Area Frame Design for Agricultural Survey. United States Department of Agriculture, National Agricultural Statistics Service. Research and Development Division. Washington DC.

RD.71 Gallego F. J., 2005. Stratified sampling of satellite images with a systematic grid of points. ISPRS Journal of Photogrammetry & Remote Sensing, 59, 369-376.

RD.72 Gallego F. J., 2012. The efficiency of sampling very high resolution images for area estimation in the European Union. International Journal of Remote Sensing, 33, 1868-1880.

RD.73 Gallego F. J. and Stibig H. J., 2012. Area estimation from a sample of satellite images: The impact of stratification on the clustering efficiency. International Journal of Applied Earth Observation and Geoinformation, 22, 139-146.

RD.74 MacDonald, R.B. (editor), 1979. The LACIE Symposium, Proceedings of Technical Sessions, 23–26 October 1978, NASA, Lyndon B. Johnson Space Center, Houston, Texas, 1125 p.

RD.75 GEOSS, 2009. Best practices for crop area estimation with Remote Sensing. Edited by Gallego J., Craig M., Michaelsen J., Bossyns B., Fritz S. Ispra, June 5-6, 2008.

RD.76 Grace K., Husak G.J., Harrison L., Pedreros D., Michaelsen J. 2012. Using high resolution satellite imagery to estimate cropped area in Guatemala and Haiti. Applied Geography, 32, 433-440.

RD.77 Husak G. J., Marshall M.T., Michaelsen J., Pedreros D., Funk C., and Galu G., 2008. Crop area estimation using high and medium resolution satellite imagery in areas with complex topography. Journal of geophysical research, 113, D141112.

RD.78 Jacques, P. and Gallego, J., 2006. The LUCAS 2006 project - A new methodology. Joint Research Centre, European Commission.

RD.79 Marshall M. T., Husak G. J., Michaelsen J., Funk C., Pedreros D., Adoum A., 2011. Testing a high resolution satellite interpretation technique for crop area monitoring in developing countries. International Journal of Remote Sensing, 32:23, 7997-8012.

RD.80 MacDonald R.B. and Hall, F.G. 1980. Global crop forecasting. Science, 208, 670-679.

RD.81 Ozdogan, M. and Woodcock C. Resolution dependent errors in remote sensing of cultivated areas. 2006. Remote Sensing of Environment, 103, 203-217.

RD.82 Pan, Y., Wang, M., Wei, G., Wei, F., Shi, K., Li, L., & Sun, G. Application of Area-frame sampling for agricultural statistics in China.

RD.83 Stehman S. V., 2009. Sampling designs for accuracy assessment of land cover. International Journal of Remote Sensing, 30:20, 5243-5272.

RD.84 Wu, B. F., & Li, Q. Z., 2004. Crop acreage estimation using two individual sampling frameworks with stratification. Journal of Remote Sensing, 8(6), 551-569.

RD.85 FAO, 2012. The State of Food Insecurity in the World. FAO, Rome, Italy.

RD.86 United Nations, 2011. World Population prospects: The 2010 Revision. Department of Economic and Social Affairs Population Division, New York, United States of America.

RD.87 Becker-Reshef I., Vermote E., Lindeman M., Justice C., 2010. A Generalized Regression-based Model for Forecasting Winter Wheat Yields in Kansas and Ukraine Using MODIS Data. Remote Sensing of Environment, 114, 1312-1323.

RD.88 Doraiswamy, P. C., Sinclair, T. R., Hollinger, S., Akhmedov, B., Stern, A., and Prueger, J. 2005. Application of MODIS derived parameters for regional crop yield assessment. Remote Sensing of Environment, 97(2):192–202.

RD.89 Holben, B. N. (1986). Characteristics of maximum-value composite images from temporal AVHRR data. International Journal of Remote Sensing, 7(11), 1417-1434.

RD.90 Vancutsem, C., Pekel, J. F., Bogaert, P., & Defourny, P. (2007). Mean Compositing, an alternative strategy for producing temporal syntheses. Concepts and performance assessment for SPOT VEGETATION time series. International Journal of Remote Sensing, 28(22), 5123-5141.

RD.91 Hagolle, O., Lobo, A., Maisongrande, P., Cabot, F., Duchemin, B., & De Pereyra, A. (2005). Quality assessment and improvement of temporally composited products of remotely sensed imagery by combination of VEGETATION 1 and 2 images. Remote sensing of environment, 94(2), 172-186.

RD.92 Schaaf, C. B., Gao, F., Strahler, A. H., Lucht, W., Li, X., Tsang, T., … & Roy, D. (2002). First operational BRDF, albedo nadir reflectance products from MODIS. Remote sensing of Environment, 83(1), 135-148.

RD.93 Item 7, Page 25/32, “S-2 Preparatory Symposium recommendations”, 14/05/2012 Issue 0 Rev 2.

RD.94 Vermote, E., Justice, C. O., & Breon, F. M. (2009). Towards a generalized approach for correction of the BRDF effect in MODIS directional reflectances. Geoscience and Remote Sensing, IEEE Transactions on, 47(3), 898-908.

RD.95 Claverie, M; E.Vermote; M.Weiss; F.Baret, O.Hagolle, V.Demarez, Validation of coarse spatial resolution GAI and FAPAR time series over cropland in southwest France, submitted to Remote sensing of Environment.

RD.96 Jin Chen, Per. Jönsson, Masayuki Tamura, Zhihui Gu, Bunkei Matsushita, and Lars Eklundh, “A simple method for reconstructing a high-quality NDVI time-series data set based on the Savitzky-Golay filter,” Remote Sensing of Environment, vol. 91, no. 3-4, pp. 332–344, 2004.

RD.97 O. Hagolle, M. Huc, D. Villa Pascual and G. Dedieu, “A multi-temporal method for cloud detection, applied to FORMOSAT-2, VENμS, LANDSAT and SENTINEL-2 images,” Remote Sensing of Environment, vol. 114, pp. 1747–1755, 2010.

RD.98 GAO, Feng, MASEK, Jeff, SCHWALLER, Matt, et al. On the blending of the Landsat and MODIS surface reflectance: Predicting daily Landsat surface reflectance. Geoscience and Remote Sensing, IEEE Transactions on, 2006, vol. 44, no 8, p. 2207-2218.

RD.99 http://eo.belspo.be/Directory/ProjectDetail.aspx?projID=862

RD.100 YIN, Tiangang, INGLADA, Jordi, and OSMAN, Julien. Time series image fusion: Application and improvement of STARFM for land cover map and production. In: Geoscience and Remote Sensing Symposium (IGARSS), 2012 IEEE International. IEEE, 2012. p. 378-381.

RD.101 Fasbender D., Obsomer V., Bogaert P., and Defourny P. Updating scarce high resolution images with time series of coarser images: a Bayesian data fusion solution. In N. Milisavljević, editor, Sensor and Data Fusion, pages 246–261. I-Tech, Vienna (Austria), February 2009.

RD.102 RODES, I., INGLADA, J., HAGOLLE, O., et al. Sampling strategies for unsupervised classification of multitemporal high resolution optical images over very large areas. In: Geoscience and Remote Sensing Symposium (IGARSS), 2012 IEEE International. IEEE, 2012. p. 6761-6764.

RD.103 J. Inglada, O. Hagolle, and G. Dedieu, “Assessment of the Land Cover Classification Accuracy of Venus and Sentinel-2 Image Time Series with respect to Formosat-2,” in Third Recent Advances in Quantitative Remote Sensing, 2010, pp. 247–250.

RD.104 J. Inglada, B. Beguet, J-F. Dejoux, et al., “Use of Dense Time Series of High Resolution Images for Change Detection and Land Use Classification,” in Third Recent Advances in Quantitative Remote Sensing, 2010, pp. 251–255.

RD.105 OSMAN, Julien, INGLADA, Jordi, DEJOUX, Jean-Francois, et al. Fusion of multi-temporal high resolution optical image series and crop rotation information for land-cover map production. In: Geoscience and Remote Sensing Symposium (IGARSS), 2012 IEEE International. IEEE, 2012. p. 6785-6788.

RD.106 ROBIN, Amandine, MOISAN, Lionel, HEGARAT-MASCLE, Le, et al. An a-contrario approach for subpixel change detection in satellite imagery. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2010, vol. 32, no 11, p. 1977-1993.

RD.107 INGLADA, Jordi and MERCIER, Grégoire. A new statistical similarity measure for change detection in multitemporal SAR images and its extension to multiscale change analysis. Geoscience and Remote Sensing, IEEE Transactions on, 2007, vol. 45, no 5, p. 1432-1445.

RD.108 HABIB, Tarek, INGLADA, Jordi, MERCIER, Grégoire, et al. Support vector reduction in SVM algorithm for abrupt change detection in remote sensing. Geoscience and Remote Sensing Letters, IEEE, 2009, vol. 6, no 3, p. 606-610.

RD.109 BARALDI, Andrea, PUZZOLO, Virginia, BLONDA, Palma, et al. Automatic spectral rule-based preliminary mapping of calibrated Landsat TM and ETM+ images. Geoscience and Remote Sensing, IEEE Transactions on, 2006, vol. 44, no 9, p. 2563-2586.

RD.110 Sacks, W.J., D. Deryng, J.A. Foley, and N. Ramankutty (2010). Crop planting dates: an analysis of global patterns. Global Ecology and Biogeography 19, 607-620. http://www.sage.wisc.edu/download/sacks/crop_calendar.html

RD.111 TUIA, Devis, RATLE, Frédéric, PACIFICI, Fabio, et al. Active learning methods for remote sensing image classification. Geoscience and Remote Sensing, IEEE Transactions on, 2009, vol. 47, no 7, p. 2218-2232.

RD.112 PETITJEAN, François, INGLADA, Jordi, and GANÇARSKI, Pierre. Satellite image time series analysis under time warping. Geoscience and Remote Sensing, IEEE Transactions on, 2012, vol. 50, no 8, p. 3081-3095.

RD.113 KOETZ, Benjamin, BARET, Frédéric, POILVÉ, Hervé, et al. Use of coupled canopy structure dynamic and radiative transfer models to estimate biophysical canopy characteristics. Remote Sensing of Environment, 2005, vol. 95, no 1, p. 115-124.

RD.114 INGLADA, Jordi and MICHEL, Julien. Qualitative spatial reasoning for high-resolution remote sensing image analysis. Geoscience and Remote Sensing, IEEE Transactions on, 2009, vol. 47, no 2, p. 599-612.

RD.115 ALBOODY, Ahed, SEDES, Florence, and INGLADA, Jordi. Post-classification and spatial reasoning: new approach to change detection for updating gis database. In: Information and Communication Technologies: From Theory to Applications, 2008. ICTTA 2008. 3rd International Conference on. IEEE, 2008. p. 1-7.

RD.116 MICHEL, J., GRIZONNET, M., INGLADA, J., et al. Local feature based supervised object detection: Sampling, learning and detection strategies. In: Geoscience and Remote Sensing Symposium (IGARSS), 2011 IEEE International. IEEE, 2011. p. 2381-2384.

RD.117 A. P. Cracknell, “Synergy in remote sensing: what’s in a pixel?,” International Journal of Remote Sensing, vol. 19, no. 11, pp. 2025–2047, 1998.

RD.118 V. Dey, Y. Zhang and M. Zhong, “A review on image segmentation techniques with remote sensing perspective,” ISPRS Commission VII Symposium: 100 Years ISPRS - Advancing Remote Sensing Science, pp. 5–7, 2010.

RD.119 T. Blaschke and J. Strobl, “What’s wrong with pixels? Some recent developments interfacing remote sensing and GIS,” Proceedings of GIS, Zeitschrift für Geoinformationssysteme, pp. 12–17, 2001.

RD.120 J. Schiewe, “Segmentation of high-resolution remotely sensed data - concepts, applications and problems,” The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. XXXIV, no. 4, pp. 358–363, 1998.

RD.121 A.P. Carleer, O. Debeir and E. Wolff, “Assessment of very high spatial resolution satellite image segmentations,” Photogrammetric Engineering & Remote Sensing, vol. 71(11), pp. 1285–1294, 2005.

RD.122 S. Beucher (Centre de Morphologie Mathématique), “The watershed transformation applied to image segmentation,” in Scanning Microscopy International, pp. 299–314, 1991.


RD.123 R. Ketting and D. Landgrebe, “Classification of multispectral image data by extraction and classification of homogeneous objects,” IEEE Transactions on Geoscience Electronics, vol. 14(11), pp. 19–26, 1976.
RD.124 M. Baatz and A. Schäpe, “Multiresolution segmentation: an optimization approach for high quality multiscale image segmentation,” In: Strobl, J., Blaschke, T. (Eds.), Angewandte Geogr. Informationsverarbeitung, vol. XII. Wichmann, Heidelberg, pp. 12–23, 2000.
RD.125 D. Comaniciu, “Mean shift: a robust approach toward feature space analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603–619, 2002.
RD.126 K. S. R.M. Haralick and I. Dinstein, “Textural features for image classification,” IEEE Transactions on Systems, Man and Cybernetics, vol. SMC-3, no. 6, pp. 610–621, 1973.
RD.127 F. Petitjean, C. Kurtz, N. Passat, and P. Gançarski, “Spatio-temporal reasoning for the classification of satellite image time series,” Pattern Recognition Letters, vol. 33, no. 13, pp. 1805–1815, 2012.
RD.128 J. Benediktsson, L. Bruzzone, J. Chanussot, M. Dalla Mura, P. Salembier, and S. Valero, “Hierarchical analysis of remote sensing data: Morphological attribute profiles and binary partition trees,” in Mathematical Morphology and Its Applications to Image and Signal Processing (P. Soille, M. Pesaresi, and G. Ouzounis, eds.), vol. 6671 of Lecture Notes in Computer Science, pp. 306–319, 2011.
RD.129 G. Hay and G. Castilla, “Object-based image analysis: Strengths, weakness, opportunities and threats,” in The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Science, vol. XXXVI, part 6, 2006.
RD.130 T. Blaschke, “Object based image analysis for remote sensing,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 65, no. 1, pp. 2–16, 2010.
RD.131 S. Valero, P. Salembier, and J. Chanussot, “Hyperspectral image representation and processing with binary partition trees,” IEEE Transactions on Image Processing, vol. 22, no. 4, pp. 1430–1443, 2013.
RD.132 A. Alonso-Gonzalez, S. Valero, J. Chanussot, C. Lopez-Martinez, and P. Salembier, “Processing multidimensional SAR and hyperspectral images with binary partition tree,” Proceedings of the IEEE, vol. 101, no. 3, pp. 723–747, 2013.
RD.133 HABIB, Tarek, INGLADA, Jordi, MERCIER, Gregoire, et al. Assessment of feature selection techniques for support vector machine classification of satellite imagery. In: Geoscience and Remote Sensing Symposium, 2008. IGARSS 2008. IEEE International. IEEE, 2008. p. IV-85-IV-88.
RD.134 HAMPSHIRE, John B. and PEARLMUTTER, Barak. Equivalence proofs for multi-layer perceptron classifiers and the Bayesian discriminant function. In: Proc. of the 1990 Connectionist Models Summer School. 1990. p. 1-17.
RD.135 CORTES, C. and VAPNIK, V. Support vector machine. Machine learning, 1995, vol. 20, no. 3, p. 273-297.
RD.136 HARTIGAN, John A. and WONG, Manchek A. Algorithm AS 136: A k-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), 1979, vol. 28, no. 1, p. 100-108.
RD.137 KOHONEN, Teuvo. The self-organizing map. Proceedings of the IEEE, 1990, vol. 78, no. 9, p. 1464-1480.
RD.138 BALL, Geoffrey H. and HALL, David J. ISODATA, a novel method of data analysis and pattern classification. STANFORD RESEARCH INST MENLO PARK CA, 1965.
RD.139 MOUNTRAKIS, Giorgos, IM, Jungho, and OGOLE, Caesar. Support vector machines in remote sensing: A review. ISPRS Journal of Photogrammetry and Remote Sensing, 2011, vol. 66, no. 3, p. 247-259.
RD.140 MIAO, Xin, HEATON, Jill S., ZHENG, Songfeng, et al. Applying tree-based ensemble algorithms to the classification of ecological zones using multi-temporal multi-source remote-sensing data. International Journal of Remote Sensing, 2012, vol. 33, no. 6, p. 1823-1849.
RD.141 VADUVA, Corina, GAVAT, Inge, and DATCU, Mihai. Deep learning in very high resolution remote sensing image information mining communication concept. In: Signal Processing Conference (EUSIPCO), 2012 Proceedings of the 20th European. IEEE, 2012. p. 2506-2510.


RD.142 LONGBOTHAM, Nathan, PACIFICI, Fabio, GLENN, Taylor, et al. Multi-modal change detection, application to the detection of flooded areas: Outcome of the 2009–2010 data fusion contest. Selected Topics in Applied Earth Observations and Remote Sensing, IEEE Journal of, 2012, vol. 5, no. 1, p. 331-342.
RD.143 LICCIARDI, Giorgio, PACIFICI, Fabio, TUIA, Devis, et al. Decision fusion for the classification of hyperspectral data: Outcome of the 2008 GRS-S data fusion contest. Geoscience and Remote Sensing, IEEE Transactions on, 2009, vol. 47, no. 11, p. 3857-3865.
RD.144 BRUZZONE, Lorenzo, CHI, Mingmin, and MARCONCINI, Mattia. A novel transductive SVM for semisupervised classification of remote-sensing images. Geoscience and Remote Sensing, IEEE Transactions on, 2006, vol. 44, no. 11, p. 3363-3373.
RD.145 BRUZZONE, Lorenzo and MARCONCINI, Mattia. Domain adaptation problems: A DASVM classification technique and a circular validation strategy. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2010, vol. 32, no. 5, p. 770-787.
RD.146 PETITJEAN, François, INGLADA, Jordi, and GANÇARSKI, Pierre. Temporal domain adaptation under time warping. In: Geoscience and Remote Sensing Symposium (IGARSS), 2011 IEEE International. IEEE, 2011. p. 3578-3581.
RD.147 Suits, G. (1972), The calculation of the directional reflectance of a vegetative canopy, Remote Sensing of Environment, 2, 117–125.
RD.148 Verhoef, W. (1984), Light scattering by leaf layers with applications to canopy reflectance modeling: the SAIL model, Remote Sensing of Environment, 16, 125–141.
RD.149 Jacquemoud S, Baret F. PROSPECT: a model of leaf optical properties spectra. Remote Sens. Environ. 34:75–91

RD.150 Baret, F., S. Jacquemoud, G. Guyot, and C. Leprieur (1992), Modeled analysis of the biophysical nature of spectral shifts and comparison with information-content of broad bands, Remote Sensing of Environment, 41 (2-3), 133–1
RD.151 Jacquemoud, S., W. Verhoef, F. Baret, C. Bacour, P. J. Zarco-Tejada, G. P. Asner, C. Francois, and S. L. Ustin (2009), PROSPECT plus SAIL models: A review of use for vegetation characterization, Remote Sensing of Environment, 113, S56–S66.
RD.152 Baret, F., Hagolle, O., Geiger, B., Bicheron, P., Miras, B., Huc, M., Berthelot, B., Nino, F., Weiss, M., Samain, O., Roujean, J.L., & Leroy, M. (2007). LAI, fAPAR and fCover CYCLOPES global products derived from VEGETATION - Part 1: Principles of the algorithm. Remote Sensing of Environment, 110, 275-286.
RD.153 Bacour, C., F. Baret, D. Beal, M. Weiss, and K. Pavageau (2006), Neural network estimation of LAI, fAPAR, fCover and LAIxCab, from top of canopy MERIS reflectance data: Principles and validation, Remote Sensing of Environment, 105 (4), 313–325.
RD.154 Ganguly, S., A. Samanta, et al. (2008). “Generating vegetation leaf area index Earth system data record from multiple sensors. Part 2: Implementation, analysis and validation.” Remote Sensing of Environment 112(12): 4318-4332.
RD.155 McCallum, I., W. Wagner, et al. (2010). “Comparison of four global FAPAR datasets over Northern Eurasia for the year 2000.” Remote Sensing of Environment 114(5): 941-949.
RD.156 Weiss, M., F. Baret, S. Garrigues, and R. Lacaze (2007), LAI and fAPAR CYCLOPES global products derived from VEGETATION. Part 2: validation and comparison with MODIS collection 4 products, Remote Sensing of Environment, 110 (3), 317–331.
RD.157 Duveiller, G., M. Weiss, F. Baret, and P. Defourny (2011), Retrieving wheat green area index during the growing season from optical time series measurements based on neural network radiative transfer inversion, Remote Sensing of Environment, 115 (3), 887–896.
RD.158 Rivalland V., Olioso A., Claverie M., Weiss M., Demarty J., Baret F., 2006: Neural Net Techniques Used to Estimate Temporal and High Resolution Canopy Biophysical Variables from 3 Remote Sensing Data Sources. Poster. The 2nd International Symposium on Recent Advances in Quantitative Remote Sensing: RAQRS’II, Valencia, Spain, 25-29 September 2006.
RD.159 Bsaibes, A., Courault, D., Baret, F., Weiss, M., Olioso, A., Jacob, F., Hagolle, O., Marloie, O., Bertrand, N., Desfond, V., & Kzemipour, F. (2009). Albedo and LAI estimates from FORMOSAT-2 data for crop monitoring. Remote Sensing of Environment, 113, 716-72
RD.160 Verger, A., Baret, F., & Camacho, F. (2011). Optimal modalities for radiative transfer-neural network estimation of canopy biophysical characteristics: Evaluation over an agricultural area with CHRIS/PROBA observations. Remote Sensing of Environment, 115, 415-426.
RD.161 Claverie M., Estimation spatialisée de la biomasse et des besoins en eau des cultures à l’aide de données satellitales à haute resolution spatial et temporelle: application aux agrosystèmes du Sud-Ouest de la France, PhD thesis defended on 11 January 2012 at Université Paul Sabatier, Toulouse, 262 p.
RD.162 Demarez, V., Duthoit, S., Baret, F., Weiss, M., & Dedieu, G. (2008). Estimation of leaf area and clumping indexes of crops with hemispherical photographs. Agricultural and Forest Meteorology, 148, 644-655.
RD.163 INGLADA, Jordi, HAGOLLE, Olivier, and DEDIEU, Gérard. A framework for the simulation of high temporal resolution image series. In: Geoscience and Remote Sensing Symposium (IGARSS), 2011 IEEE International. IEEE, 2011. p. 39-42.
RD.164 FORESTIER, Germain, INGLADA, Jordi, WEMMERT, Cédric, et al. Comparison of optical sensors discrimination ability using spectral libraries. International Journal of Remote Sensing, 2013, vol. 34, no. 7, p. 2327-2349.
RD.165 Julien, Y., and J. A. Sobrino. 2009. “Global Land Surface Phenology Trends from GIMMS Database.” International Journal of Remote Sensing 30: 3495–3513.
RD.166 Pettorelli, N., J. O. Vik, A. Mysterud, J.-M. Gaillard, C. J. Tucker, and N. C. Stenseth. 2005. “Using the Satellite-Derived NDVI to Assess Ecological Responses to Environmental Change.” Trends in Ecology & Evolution 20: 503–510.
RD.167 Reed, B. C., J. F. Brown, D. Vanderzee, T. R. Loveland, J. W. Merchant, and D. O. Ohlen. 1994. “Measuring Phenological Variability from Satellite Imagery.” Journal of Vegetation Science 5: 703–714.
RD.168 Hagolle, Olivier, Mireille Huc, David Villa Pascual, and Gérard Dedieu. 2010. “A multi-temporal method for cloud detection, applied to FORMOSAT-2, VENµS, LANDSAT and SENTINEL-2 images.” Remote Sensing of Environment 114 (8) (16 August): 1747-1755. doi:10.1016/j.rse.2010.03.002.

RD.169 MICHEL, Julien, INGLADA, Jordi, and MALIK, Julien. Object based and geo-spatial image analysis: a semi-automatic pre-operational system. In: Remote Sensing. International Society for Optics and Photonics, 2010. p. 78300H-78300H-8.
RD.170 Bizzell, R. M., Feiveson, A. H., Hall, F. G., et al. (1975), Crop Identification Technology Assessment for Remote Sensing (CITARS): Interpretation of Results, Volume X.
RD.171 Defourny, P., Bicheron, P., Brockmann, C., Bontemps, S., Van Bogaert, E., Vancutsem, C., Huc, M., Leroy, M., Ranera, F., Achard, F., Di Gregorio, A., Herold, M., and Arino, O., 2009. The first 300-m Global Land Cover Map for 2005 using ENVISAT MERIS time series: a Product of the GlobCover System. In: Proceedings of the 33rd International Symposium on Remote Sensing of Environment (ISRSE). Stresa, Italy.
RD.172 Bontemps S., Van Bogaert E., Defourny P., Kalogirou V. and Arino O., GlobCover 2009, “Products description manual”, December 2010.
RD.173 Bontemps, S., Defourny, P., Brockmann, C., Herold, M., Kalogirou, V., Arino, O. 2012, New Global Land Cover mapping exercise in the framework of the ESA Climate Change Initiative, Proceedings IEEE International Geoscience and Remote Sensing Symposium (IGARSS).
RD.174 Fritz, S., See, L., You, L., Justice, C., Becker-Reshef, I., Bydekerke, L., Cumani, R., Defourny, P., Erb, K., Foley, J., Gilliams, S., Gong, P., Hansen, M., Hertel, T., Herold, M., Herrero, M., Kayitakire, F., Latham, J., Leo, O., McCallum, I., Obersteiner, M., Ramankutty, N., Rocha, J., Tang, H., Thornton, P., Vancutsem, C., van der Velde, M., Wood, S. and Woodcock, C. (2013), The Need for Improved Maps of Global Cropland. Eos, 94: 31–32. doi: 10.1002/2013EO030006
RD.175 Boryan, C. G., Yang, Z., Liping, D., Hunt, K., “A new Cropland Data Layer based automatic stratification method for U.S. agricultural area sampling frame construction,” National Agricultural Statistics Service, Washington DC, USDA, February 2013.
RD.176 Baret, F., Weiss, M., Troufleau, D., Prevot, L., Combal, B., & Bryson, R. J. (2000). Maximum information exploitation for canopy characterization by remote sensing. In Remote sensing in agriculture, Royal Agricultural College, Cirencester, UK, 26-28 June 2000. (No. 60, pp. 71-82).
RD.177 Chen, J. M., Menges, C. H., & Leblanc, S. G. (2005). Global mapping of foliage clumping index using multi-angular satellite data. Remote Sensing of Environment, 97(4), 447-457.
RD.178 Garrigues, S., Allard, D., Baret, F., & Weiss, M. (2006). Influence of landscape spatial heterogeneity on the non-linear estimation of leaf area index from moderate spatial resolution remote sensing data. Remote Sensing of Environment, 105(4), 286-298.
RD.179 Hoefsloot, P., Ines, A., van Dam, J., Duveiller, G., Kayitakire, F., & Hansen, J. (2012). Combining Crop Models and Remote Sensing for Yield Prediction: Concepts, Applications and Challenges for Heterogeneous Smallholder Environments.
RD.180 Atzberger, C. (2013). Advances in remote sensing of agriculture: Context description, existing operational monitoring systems and major information needs. Remote Sensing, 5(2), 949-981.
RD.181 Chen, J. M., & Black, T. A. (1992). Defining leaf area index for non-flat leaves. Plant, Cell & Environment, 15(4), 421-429.
RD.182 Curnel, Y., & Oger, R. Agrophenology Indicators from remote sensing: State of the art. In ISPRS Archives XXXVI-8/W48 Workshop Proceedings: Remote Sensing Support to Crop Yield Forecast and Area Estimates, ISPRS, ed.
RD.183 Bach, H., & Mauser, W. (2003). Methods and examples for remote sensing data assimilation in land surface process modeling. Geoscience and Remote Sensing, IEEE Transactions on, 41(7), 1629-1637.
RD.184 Launay, M., & Guerif, M. (2005). Assimilating remote sensing data into a crop model to improve predictive performance for spatial applications. Agriculture, ecosystems & environment, 111(1), 321-339.
RD.185 Rembold, F., Atzberger, C., Savin, I., & Rojas, O. (2013). Using low resolution satellite imagery for yield prediction and yield anomaly detection. Remote Sensing, 5(4), 1704-1733.
RD.186 Bartholomé, E. M. and Belward A. S., 2005, GLC2000; a new approach to global land cover mapping from Earth Observation data, International Journal of Remote Sensing, 26, 1959-1977.
RD.187 Friedl MA, DK McIver, JCF Hodges, XY Zhang, D. Muchoney, A. H. Strahler et al., 2002. Global land cover mapping from MODIS: algorithms and early results, Remote Sensing of Environment 83 (1-2): 287-302.
RD.188 Defourny, P., Blaes, X., Bogaert, P. (2006), Respective contribution of yield and area estimates to the error in crop production forecasting, ISPRS Archives XXXVI-8/W48 Workshop proceedings: Remote sensing support to crop yield forecast and area estimates.
RD.189 Defourny, P. (2012), AG-01 Global Agricultural Monitoring and Early Warning, Group on Earth Observations (GEO) Work Plan Symposium 2012.
RD.190 Duveiller, G. and Defourny, P. 2010b. A conceptual framework to define the spatial resolution requirements for agricultural monitoring using remote sensing. Remote Sensing of Environment, 114(11):2637–2650.
RD.191 Weiss, M., Baret, F., Myneni, R.B., Pragnère, A., Knyazikhin, Y. (2000), Investigation of a model inversion technique to estimate canopy biophysical variables from spectral and directional reflectance data, Agronomie 20 (1), 3-22.
RD.192 Ahamed, T., Tian, L., Zhang, Y., and Ting, K.C. (2011). A review of remote sensing methods for biomass feedstock production. Biomass and Bioenergy, in press.
RD.193 Liu, X., Li, X., Zeng, Q., Mao, J., Chen, Q., and Guan, C. (2010). Validating MODIS surface reflectance based on field spectral measurements. International Journal of Remote Sensing 31, 1645–1659.
RD.194 Balint, Z., Mutua, F., and Muchiri, P. (2011). Drought Monitoring with the Combined Drought Index. Nairobi: FAO-SWALIM.


RD.195 Zhang, X., Friedl, M.A., Schaaf, C.B., Strahler, A.H., Hodges, J.C.F., Gao, F., Reed, B.C., and Huete, A. (2003). Monitoring vegetation phenology using MODIS. Remote Sensing of Environment 84, 471–475.
RD.196 Zhang, Y., Chen, J.M., and Miller, J.R. (2005). Determining digital hemispherical photograph exposure for leaf area index estimation. Agricultural and Forest Meteorology 133, 166–181.
RD.197 Begue, A., Hanan, N.P., and Prince, S.D. (1994). Radiative transfer in shrub savanna sites in Niger: preliminary results from HAPEX-Sahel. 2. Photosynthetically active radiation interception of the woody layer. Agricultural and Forest Meteorology 69, 247–266.
RD.198 Chen, J.M., and Black, T.A. (1992). Defining leaf area index for non-flat leaves. Plant, Cell and Environment 15, 421–429.
RD.199 Houlès, V., Mary, B., Machet, J., Guérif, M., and Moulin, S. (2001). Do crop characteristics available from remote sensing allow to determine crop nitrogen status.
RD.200 Green, D.S., Erickson, J., and Kruger, E. (2003). Foliar morphology and canopy nitrogen as predictors of light-use efficiency in terrestrial vegetation. Agricultural and Forest
RD.201 Duveiller, G. (2011). Crop specific green area index retrieval from multi-scale remote sensing for agricultural monitoring. Université catholique de Louvain.
RD.202 Global Agricultural Monitoring systems by integration of earth observation and modelling techniques (GLOBAM) Project, STEREO 2 Program, SR/00/101 Contract, Belgian Science Policy Office (Belspo).
RD.203 Yannick Curnel, Allard J.W. de Wit, Grégory Duveiller, Pierre Defourny, Potential performances of remotely sensed LAI assimilation in WOFOST model based on an OSS Experiment, Agricultural and Forest Meteorology, Volume 151, Issue 12, 15 December 2011, Pages 1843-1855, ISSN 0168-1923, 10.1016/j.agrformet.2011.08.002.
RD.204 Sepulcre-Cantó, Guadalupe; Gellens-Meulenberghs, Françoise; Arboleda, Alirio; Duveiller, Gregory; De Wit, Allard; Eerens, Herman; Djaby, Bakary; Defourny, Pierre. Estimating crop-specific evapotranspiration using remote-sensing imagery at various spatial resolutions for improving crop growth modelling. International Journal of Remote Sensing, 39, 9-10, 3274-3288, 2013, Taylor & Francis.
RD.205 de Wit, Allard; Duveiller, Gregory; Defourny, Pierre. Estimating regional winter wheat yield with WOFOST through the assimilation of green area index retrieved from MODIS observations. Agricultural and Forest Meteorology, 164, 39-52, 2012, Elsevier.
RD.206 Justice C.O. and Becker-Reshef I. (Eds.) (2007) Report from the Workshop on Developing a Strategy for Global Agricultural Monitoring in the framework of Group on Earth Observations (GEO), UN FAO, July 2007, Geography Dept., University of Maryland, 66 pp.
RD.207 Duveiller, Grégory; Defourny, Pierre. A conceptual framework to define the spatial resolution requirements for agricultural monitoring using remote sensing. Remote Sensing of Environment, 114, 11, 2637-2650, 2010, Elsevier.
RD.208 Latham, J., Defourny, P., Korme, T., Savin, I., Beard, L., Agricultural mapping (land use change understanding). AGRISAT Workshop 2010, Brussels.
RD.209 Dixon, J., Gulliver, A., Gibbon, D., Farming Systems and Poverty - Improving farmers’ livelihoods in a Changing World, FAO and World Bank.
RD.210 Verhegghen, A. and Defourny, P. Global land surface vegetation phenology using 13 years of SPOT VEGETATION daily observations. (2013) PhD thesis.
RD.211 Van Wart J, van Bussel LGJ, Wolf J, Licker R, Grassini P, Nelson A, Boogaard H, Gerber J, Mueller ND, Claessens L, van Ittersum MK, KG Cassman. 2013a. Use of agro-climatic zones to upscale simulated crop yield potential. Field Crops Research. 143, 44-55.
RD.212 MOCCCASIN D’ANDRIMONT, R., DUVEILLER, G., DEFOURNY, P. 2011. Exploring the capacity to grasp multi-annual seasonal variability of winter wheat in continental climates with MODIS. Multitemp: 6th International Workshop on the Analysis of Multi-Temporal Remote Sensing Images. Trento (Italy), 12-14 July 2011. Proceeding.
RD.213 Claverie, M., E. Vermote, M. Weiss, F. Baret, O. Hagolle, V. Demarez, Validation of coarse spatial resolution GAI and FAPAR time series over cropland in southwest France, submitted to Remote Sensing of Environment.


RD.214 AMESD, 2013 reference: Sustainable development in Africa & satellites, e-book http://www.satellites-and-africa.com/ 115 p.
RD.215 Brink, A.; Bodart, C.; Brodsky, L.; Defourny, P.; Ernst, C.; Donney, F.; Lupi, A.; Tuckova, K., Anthropogenic pressure in East Africa - monitoring 20 years of land cover changes by means of medium resolution satellite data, 2013, International Journal of Applied Earth Observation and Geoinformation, Geoland 2 special issue, submitted.

Table 1-2: Applicable documents

1.3.3 Acronyms and abbreviations

Acronym       Definition
AD            Applicable Document
ADD           Architecture Design Document
AgRISTAR      Agricultural and Resources Inventory Surveys through Aerospace Remote Sensing
AMIS          Agricultural Market Information System
AOD           Aerosol Optical Depth
API           Application Programming Interface
AR            Acceptance Review
AT            Aerosol Type
ATBD          Algorithm Theoretical Basis Document
ATD           Acceptance Test Document
AVHRR         Advanced Very High Resolution Radiometer
BEAM          Basic ERS & Envisat (A)ATSR and MERIS toolbox
BGT           System Technical Budget Document
BIPR          Background Intellectual Property Rights
BV-NET        Biophysical Variable - Network
BOA           Bottom Of Atmosphere
Cab           Leaf Chlorophyll content
CB            Capacity Building Plan
CCI           Climate Change Initiative
CCM           Cirrus Cloud Mask
CDR           Critical Design Review
CeCILL        CEA CNRS INRIA Logiciel Libre
CEOS          Committee on Earth Observation Satellites
CESBIO        Centre d'Etudes Spatiales de la Biosphere
CFI           Customer Furnished Item


CHRIS-PROBA   Compact High Resolution Imaging Spectrometer – PROBA
CITARS        Crop Identification Technology Assessment for Remote Sensing
CLC           Corine Land Cover
CNES          Centre National d'Etudes Spatiales
CSK           Cosmo-Skymed
Cw            Leaf water content
DDF           Design Definition File
DDV           Dark Dense Vegetation
DJF           Design Justification File
DP            Demonstration Plan
DRD           Document Requirements Definition
DUE           Data User Element
EARS          European Association of Remote Sensing
EC            European Commission
ECHO          Extraction and Classification of Homogeneous Objects
ECSS          European Cooperation for Space Standardization
EMMAH         Environnement Méditerranéen et Modélisation des Agro-Hydrosystèmes
EO            Earth Observation
ER            Exploitation Report
ESA           European Space Agency
ESA TPM       ESA’s Third Party Mission
EU            European Union
FAO           Food and Agriculture Organization
fAPAR         Fraction of Absorbed Photosynthetically Active Radiation
FAS           Foreign Agricultural Service
FAT           Factory Acceptance Test
FCOVER        Fraction of vegetation Cover
FEWS-NET      Famine Early Warning Systems Network
FNEA          Fractal Net Evaluation Approach
FORMOSAT      FORMOsa SATellite
FOSS          Free Open Source Software
FR            Full Resolution
FTP           File Transfer Protocol
FVC           Fractional Vegetation Cover
GAI           Green Area Index


GCC           General Conditions of Contract
GCP           Ground Control Point
GDAL          Geospatial Data Abstraction Library
GEO           Group on Earth Observations
GEO-GLAM      Global Agricultural Geo-Monitoring
GEOSS         Global Earth Observation System of Systems
GFOI          Global Forest Observation Initiative
GIAM          Global Irrigated Area Map
GIEWS         Global Information and Early Warning System
GIS           Geographic Information System
GLAM          Global Agriculture Monitoring
GLC2000       Global Land Cover 2000
GLCC          Generic Land Cover Classification
GLCN          Global Land Cover Network
GMES          Global Monitoring for Environment and Security
GMFS          Global Monitoring for Food Security
GMRCA         Global Map of Rainfed Cropland Areas
GPL           General Public Licence
GPU           Ground Power Unit
GRSS          Geoscience and Remote Sensing Society
HPC           High Performance Computing
HR            High Resolution
HTTPS         HyperText Transfer Protocol Secure
HW            Hardware
ICD           Interface Control Document
IEEE          Institute of Electrical and Electronics Engineers
IPR           Intellectual Property Rights
ITK           Insight Segmentation and Registration Toolkit
ITT           Invitation To Tender
JECAM         Joint Experiment for Crop Assessment and Monitoring
JRC           Joint Research Center
L1A           Level 1A (product): Radiometrically corrected images, with metadata for geometric correction
L1C           Level 1C (product): Ortho-rectified product expressed in Top of Atmosphere reflectance


L1T           Level 1T (product): Landsat ortho-rectified product (expressed in radiance)
L2A           Level 2A (product): Ortho-rectified product expressed in surface reflectance, provided with a cloud mask
L3A           Level 3A (product): Cloud-free time and space composite of Level 2A products
LACIE         Large Area Crop Inventory Experiment
LAI           Leaf Area Index
LANDSAT       Land Satellite
LICOR         LAI 2000 sensor
LUCAS         Land Use/Cover Area-Frame Survey
MACCS         Multisensor Atmospheric Correction and Cloud Screening
MARS          Monitoring Agricultural ResourceS
MDG           Millennium Development Goal
MERIS         MEdium Resolution Imaging Spectrometer
MF            Maintenance File
MGT           Management File
MGVI          Meris Global Vegetation Index
MOCCCASSIN    MOnitoring Crops in Continental Climates through Assimilation of Satellite Information
MODIS         Moderate Resolution Imaging Spectroradiometer
MRD           Mission Requirement Document
MS            Multi-Spectral
MSI           Multi-Spectral Instrument
MTCD          Multi-Temporal method for Cloud Detection
MTCI          Meris Terrestrial Chlorophyll Index
NASA          National Aeronautics and Space Administration
NASS          National Agricultural Statistics Service
NBAR          Nadir BRDF-Adjusted Reflectance
NDVI          Normalized Difference Vegetation Index
NDWI          Normalized Difference Water Index
NgEO          Next Generation User Services for Earth Observation
OBIA          Object Based Image Analysis
ODK           Open Data Kit
OSAT          On Site Acceptance Test
OSSIM         Open Source Security Information Management
OTB           Orfeo Tool Box


PAP           Product Assurance Plan
PDGS          Payload Data Ground Segment
PDR           Preliminary Design Review
PM            Progress Meeting
POLDER        POLarization and Directionality of the Earth's Reflectances
PP            Promotional Package
PPWS          Password-Protected Web Site
PROBA-V       Project for On-Board Autonomy - Vegetation
PSF           Point Spread Function
PTSC          French Land Data Center (French acronym)
PVAR          Prototype Validation and Assessment Report
PWS           Public Website
QR            Qualification Review
QRR           Qualification Review Report
RB            Requirements Baseline
RD            Reference Document
RID           Review Item Disposition
S2            Sentinel 2
S2AGRI-SC     Sentinel 2 Agriculture software Component
S2AgriP       Sentinel 2 Agriculture Prototype
S2PAD         Sentinel 2 Product and Algorithm Definition
Sen2-Agri     Sentinel2 - Agriculture
SAIL          Scattering by Arbitrarily Inclined Leaves
SAR           Synthetic Aperture Radar
SCF           Software Configuration File
SDP           Software Development Plan
SFTP          Secured File Transfer Protocol
SITS          Satellite Image Time Series
SoW           Statement of Work
SPOT          Système Pour l'Observation de la Terre
SRF           Software Reuse File
SRS           System Requirements Specification
STARFM        Spatial and Temporal Adaptive Reflectance Fusion Model
STC


SUM           Software User Manual
SVM           Support-Vector Machine
SW            Software
SWIR          Short Wave Infra Red
TDS           Test Data Sets
TOA           Top Of Atmosphere
TS            Technical Specification
UCL           Université catholique de Louvain
UN            United Nations
URD           User Requirement Document
USAID         United States Agency for International Development
USDA          United States Department of Agriculture
USGS          United States Geological Survey
VALERI        Validation of Land European Remote sensing Instruments
VALSE2        Exploitation of ESA Campaigns for Sentinel-2 Level 2 Prototype Validation (“VALidation Sentinel-2”)
VHR           Very High Resolution
VI            Vegetation Index
VNIR          Visible Near Infra Red
VR            Validation Report
WFS           Web Feature Service
WMO           World Meteorological Organization
WMS           Web Map Service
WP            Work Package
WV            Water Vapour Content

Table 1-3: Acronyms


2 Understanding of the requirements

2.1 Background

2.1.1 Context

Achieving sustainable food security for all people is a key priority that was already stressed at the World Food Summit held from 13 to 17 November 1996 [RD.16]. This historic event, convened at the Food and Agriculture Organization (FAO) headquarters in Rome, brought together close to 10,000 participants and provided a forum for debate on one of the most important issues facing world leaders in the new millennium: the eradication of hunger. This priority was reaffirmed at the Millennium Summit of the United Nations (UN) in 2000, which defined the eradication of extreme poverty and hunger as one of the eight Millennium Development Goals (MDGs). Three years before the MDG deadline, broad progress has been reported on several targets, including poverty reduction. However, hunger remains a global challenge [RD.1].

In June 2012, the declaration of the G20 Mexico summit emphasized the need to enhance food security and address commodity price volatility. Following the series of international food supply crises first observed in 2008, these top decision-makers recognize that increasing production and productivity on a sustainable basis, while considering the diversity of agricultural conditions, is one of the most important challenges that the world faces today. Excessive commodity price volatility has significant implications for all countries: it increases uncertainty for economic actors and potentially hampers the stability of budgets and the predictability of economic planning. Mitigating the negative effects of commodity price volatility on the most vulnerable is an important component of reducing poverty and boosting economic growth. Because greater transparency contributes significantly to reducing food price volatility, the implementation of the Agricultural Market Information System (AMIS) should lead to a more stable, predictable, distortion-free, open and transparent trading system.
The lack of progress on hunger in several regions (even as income poverty has decreased) is well reflected by the most recent FAO estimates of undernourishment, which set the mark at 850 million people living in hunger in the 2006-2008 period, corresponding to 15.5% of the world population [RD.1]. The prevalence of hunger remains uncomfortably high in sub-Saharan Africa and in Southern Asia (outside of India). In the developing regions, the proportion of children under age five who are underweight declined from 29% in 1990 to 18% in 2010. Progress was recorded in all regions where comparable data are available, but it remains insufficient to reach the global target by 2015. Food demand is driven by population size and consumption habits, which in turn are influenced by culture and standards of living. With regard to the demographic aspect, the UN reports that the world’s population has increased by 130% during the last 50 years, from an estimated 3.0 billion in 1960 to 4.4 billion in 1980 and 6.9 billion in 2010 [RD.4]. It also projects that the population will continue to increase by 0.8% per year to reach 9.2 billion people in 2050, with population densities and demographic trends varying from one region to another.

Page 32 Sen2-Agri Ref. UCL – Geomatics – 2013 Part 2: Technical proposal Issue 0 Rev. 1 Date 2013/05/21

Achieving sustainable food security for all people will require agricultural production to grow by 70%, and by up to 100% in developing countries, relative to 2009 levels [RD.17]. It will also require recognizing that agriculture not only provides food and feed but also generates products for the energy (biofuels), materials (wood, fibers, textiles) and chemicals industries, and that the competition among these sectors (Food, Feed, Fuel and Fiber) for agricultural resources is increasing [RD.4]. The need to adapt agriculture to climate change and the importance of improving the efficiency of water and soil use in a sustainable manner are also major challenges ahead. To this end, the development of information technologies to monitor agriculture and all related practices is essential to support policy makers and to report on science-based options for improving resource use efficiency (water, land, fertilizers, pesticides) in agriculture, including for small farms. Clearly, sustainable agricultural growth is a critical component of efforts to meet the demands and challenges faced by agriculture worldwide and to discover new opportunities for poverty and hunger reduction in the developing and transitional world. The policies, practices and technologies needed to boost production and strengthen food security have long been discussed [RD.17]. Institutional mechanisms, the development of trade and markets, and the financial facilities needed to raise productivity in a sustainable way have been negotiated at the international level. At the national level, measures to raise output and strengthen food security are being put in place, including investment in pro-poor, market-friendly policies, institutions and incentives, as well as the infrastructure and services needed to improve productivity.
In the context of the Group on Earth Observations (GEO), which supports the sustainable management of the Earth’s resources using satellite remote sensing, the Global Earth Observation System of Systems (GEOSS) identified the development of remote sensing agriculture applications as one of its strategic targets. The GEO Agricultural Monitoring Community of Practice was created in 2007 in the framework of the GEO Agriculture Societal Benefit Area to gather representatives from the various organizations and institutions interested in enhancing international monitoring capabilities around the world, including the FAO and the World Meteorological Organization (WMO). This Community of Practice fosters increased communication and sharing of experience, and engages institutions in building capacity around the world through international and bilateral cooperation. Thanks to an international working group meeting convened by the Canadian Space Agency in July 2012, we have tentatively defined the short-term observation requirements for CEOS in a global perspective for agriculture monitoring (Figure 2-1). The information requirements identified here do not include the meteorological variables and should be completed with spatial and temporal components, such as area maps and crop calendars. Such an overall analysis highlights the critical importance of decameter resolution capabilities to cover the whole diversity of agricultural landscapes, as already identified during the Agriculture User Consultation of the S2 Preparatory Symposium [RD.9].


Figure 2-1: The observation requirements as defined by the GEO Ag CoP for CEOS (2012)


In this overall context, the continuous availability of Landsat-8 and Sentinel-2 time series is a unique step forward for large-scale agriculture monitoring using satellite remote sensing. The great potential of satellite observation for agriculture was recognized from the early days of the Landsat program in the 1970s, through the North-American Large Area Crop Inventory Experiment (LACIE), the Agriculture and Resources Inventory Surveys through Aerospace Remote Sensing (AgRISTARS) and, later, the first European Union (EU) Monitoring Agricultural ResourceS (MARS) initiatives. Nevertheless, most current operational monitoring still relies heavily on proxies and qualitative indicators based on medium to coarse spatial and spectral resolutions. Today the development of better agricultural monitoring capabilities is clearly considered a critical tool for strengthening food production information and market transparency, thanks to timely data and information about crop status, crop area and yield forecasts. The enhanced understanding of global production will help tackle price volatility by allowing local, national and international operators to make decisions and anticipate market trends with reduced uncertainty. This is also a prerequisite for the definition and monitoring of any agricultural policy.

2.1.2 Review of current agriculture monitoring practices

As previously mentioned, remote sensing was quickly recognized as a valuable tool for monitoring crop production. A few years after the 1972 launch of Landsat-1, the first civilian EO satellite, the North-American Large Area Crop Inventory Experiment (LACIE) demonstrated the feasibility of using satellite-based multispectral data to estimate wheat production [RD.80]. In agricultural monitoring, timely and spatially explicit information about the general growing conditions of vegetated land is of high importance. Several agricultural monitoring systems currently operate at national and international scales. Most build on the experience acquired from the pioneering work done in LACIE, AgRISTARS and MARS, and use remote sensing to compare observed growth conditions from year to year in order to estimate the risk of deviation from a normal year. The primary international monitoring systems include the FAO Global Information and Early Warning System (GIEWS), the ongoing MARS project of the European Commission (EC) at the Joint Research Centre (JRC), the United States Agency for International Development (USAID) Famine Early Warning System (FEWS-NET), the Crop Watch Program of the Chinese Academy of Sciences and the United States Department of Agriculture (USDA) Foreign Agricultural Service (FAS) Global Agriculture Monitoring (GLAM) System. National or regional systems have also been developed in Brazil, India, China, Russia, the US and the EU [RD.157]. A series of activities of the GEO Agricultural Monitoring Community of Practice aimed at documenting the current monitoring capabilities at the national and global levels. These activities led to the publication of best practices documents, a synthesis of all the operational agriculture monitoring systems using remote sensing around the world, and an overall diagram depicting the relationships between EO and agriculture applications (Figure 2-2).
The black arrows describe most of the current operational systems while the grey ones indicate more advanced use corresponding to specific cases (typically rice), prototypes or pilot scale applications. The empty vertical arrows show the scientific state of the art demonstrated on local test sites.


Figure 2-2: A synthesis of the agriculture applications of satellite remote sensing as endorsed by the GEO Agriculture CoP (RD.189).

Agriculture monitoring systems require at least three different types of information, often combined to deliver production information:

• cropland area, cultivated area and possibly crop type area estimates for a given year; this information can be obtained before, during or after the growing season, with increasing accuracy;

• crop growth condition or status, describing plant development stages (sowing, emerging, tillering, heading/flowering and ripening phases, for instance), plant stress (water and diseases), crop damage (frost kill, storm impact, etc.) or the increase in vegetative condition (e.g. greenness, above-ground biomass, fAPAR, LAI); this information, delivered along the season, is often used to detect anomalies in the growing cycle (departure from normal conditions) or to drive crop growth models more or less quantitatively;

• yield estimate or indicator; in principle this can only be crop specific and is typically derived from in situ observation or model simulation, as grain yield cannot be directly assessed by remote sensing. In operational systems, however, it is currently often approximated by an overall analysis of crop conditions (anomalies, profile similarities with previous years, etc.) or by empirical regression (e.g. RD.87; RD.88).


The timing of these different types of information and their combination at various scales make it possible to deliver different levels of information, from the preliminary area outlook and early warning report to the yield forecast and then the post-harvest production estimate. It is of paramount importance to note that the value of such information depends very much on its timeliness and decreases rapidly over time. In practice, most large-scale agriculture monitoring and crop-yield forecasting systems rely on three pillars: (i) regionalized analyses of cultivated areas, crop type distribution and crop condition based on near-real-time satellite imagery merged with available in-situ observations; (ii) meteorological monitoring and mid-term forecasts based on observation networks and model outputs; and (iii) regionalized knowledge of agricultural systems and their sensitivity to meteorological conditions. EO data provided by advanced satellite constellations are currently used at the global scale for meteorological forecasts, and some actors, either public or private, have developed operational geo-information applications for global agricultural monitoring using crop growth models (e.g. CropExplorer, CropWatch, EARS company). These various programs and systems, often based on the analysis of local reports, in situ observation, satellite data and model output, currently generate crop yield forecasts at the national, regional or global scales through bulletins or as information brokers. The main national or international operational systems are described briefly in the brochure Global Agriculture Monitoring edited by the GEO Agricultural Task (2011). As reported in the preparation document of the GEOGLAM initiative, in which we are actively involved, there are today a small number of global agricultural monitoring systems in place and a large number of national monitoring capabilities in different stages of development.
The practical expertise for monitoring is found in these institutions, which have demonstrated an operational capacity for providing such services. In addition, a number of research organizations and institutes are involved in developing basic research and, in some cases, in transitioning proven methods into the operational domain. However, the potential for further improvement of national and global agricultural monitoring capacity is high, and it raises various challenges. Sharing experience, enhancing capacity building and transferring advanced methods that take advantage of new technology are widely recognized as major issues. In this perspective, we launched the Joint Experiment of Crop Assessment and Monitoring (JECAM) to enable the global agricultural monitoring community to compare results based on disparate sources of data, using various methods, over a variety of local or regional cropping systems. These experiments should facilitate international standards for data products and reporting, eventually supporting the development of a global system of systems for agricultural crop assessment and monitoring. The network of test sites is now well established and still growing; space agencies are well aware of the data requirements, but this call is one of the very first to effectively develop activities across sites. There is therefore a great opportunity to leverage the capacities of the whole agriculture monitoring community thanks to well-targeted Sen2-Agri project activities. More recently, the use of satellite remote sensing has also been developed for operational agriculture monitoring in the context of precision farming (see the Farmstar example in Figure 2-3). While the satellite instruments used will become more and more similar with the launch of Sentinel-2, the system objectives, the expected products and the ancillary inputs are completely different from those of national or regional monitoring. This is the reason why precision farming will not be considered in this proposal.

Figure 2-3 : Example of operational precision farming system using satellite EO data (credit: Farmstar)

2.1.3 Review of agriculture monitoring state of the art

As mentioned before, there is a large gap between the current practices of operational systems and the scientific literature in the field of crop remote sensing. On the one hand, this gap can be explained by the fact that most scientific experiments cover very limited test sites and that scaling up to the national or international level is a very distinct research effort. Similarly, experimental results published in the literature often deal with one or two agricultural years, which can be quite specific, while operational systems often require mid-term demonstration before switching to a more advanced approach. On the other hand, the poor availability of suitable in-situ and satellite data over large areas hampers large-scale demonstrations.

2.1.3.1 Mapping cropland and crop types

Recent and forthcoming developments in satellite remote sensing offer many possibilities for mapping cropland and crop types in various agricultural landscapes. A large diversity of cropland mapping strategies at different scales, associated with various degrees of accuracy, can be found in the literature. From local to regional scale, croplands are often depicted according to a land cover typology focusing mainly on natural vegetation types. Croplands are often included in mosaic or mixed classes, which makes them difficult to use for agricultural applications (either as an agricultural mask or as a source for area estimates).


This is typical of global land cover products, such as GLC2000 (RD.186), GlobCover 2005/2009 (RD.171, RD.172), GLCShare and MODIS Land Cover (RD.23), which do not specifically target the agriculture component of the landscape. Even the most recent and more precise ESA Climate Change Initiative (CCI) Land Cover products, obtained from a multi-year multi-sensor approach, still treat croplands like any other land cover class (RD.173). Alternatively, a few crop maps were produced at global and continental scale. RD.18 produced a map of global cropland extent at 250 m spatial resolution using multi-year MODIS data and thermal data. Two other global maps specifically dedicated to croplands were produced with an emphasis on water management: the global map of rainfed cropland areas (GMRCA) (RD.19) and the global irrigated area map (GIAM) (RD.20). However, their coarse spatial resolution (10 km) does not meet the needs of operational applications, and they suffer from large uncertainties (RD.21), especially in the complex farming systems of Africa. Several initiatives adopted existing land cover products as inputs. RD.22 combined two satellite-derived land cover maps (Boston University’s MODIS-derived land cover product (RD.187) and the Global Land Cover 2000 (GLC2000) data set (RD.24)) with an agricultural inventory to produce a cropland mask at 10 km. One alternative for better cropland information is hybridization, i.e. the integration of all available maps into a single product (RD.174). In that vein, Fritz (RD.25) combined existing land use/land cover data sets (i.e., GLC2000, MODIS Land Cover, and GlobCover (RD.171)), relying on expert knowledge and national statistics, to produce a probability map of cropland areas. However, the product merges general land cover products that do not focus on agricultural areas and whose spatial resolution (from 300 m to 1 km) is unsuitable for mapping cropland.
To cope with these issues, RD.27 compared and combined ten data sets through an expert-based approach in order to derive a map of cropland areas at 250 m covering the whole African continent. Still at the regional scale, RD.28 proposed a dynamic mapping of cropland areas in Sub-Saharan Africa using MODIS time series. Yet these compilation efforts, while extremely valuable, present the twofold disadvantage of being spatially inconsistent and out of date. More recently, large-scale remote sensing products have been completed to deliver a global cropland mask (RD.18) or a global soybean distribution map (Hansen et al., 2012 – oral communication). The GEOLAND-2 SATCHMO product (RD.215) focuses on 10 x 10 km Landsat extracts to automate the mapping of croplands and cropland conversion at 30-m spatial resolution over many countries, in order to deliver agricultural expansion statistics at national scale. At the national level, RD.29 developed an automated cropland classification algorithm combining Landsat, MODIS, and secondary data to differentiate cropland extent, areas, and characteristics (e.g., irrigated vs. rainfed). In addition, RD.30 proposed a stratified approach to discriminate the cultivated areas in the fragmented rural landscapes of Mali. Locally, cropland is often extracted in a two-step classification scheme to support further crop type distinction (RD.31 and RD.32).

Crop inventory by remote sensing has been studied extensively since the early days of the discipline. CITARS (Crop Identification Technology Assessment For Remote Sensing) was among the first experiments designed to quantitatively evaluate crop identification performance using a well-defined set of automatic remote sensing data processing techniques [RD.170]. A few years later, in the framework of AgRISTARS (the successor of LACIE, the Large Area Crop Inventory Experiment [RD.74]), accurate delineation of major crops was proven achievable and subsequently improved the land cover estimates [RD.69]. Today, the USDA releases a yearly crop-specific map, the Cropland Data Layer, based on satellite imagery and ancillary data such as the NASS June Agricultural Survey [RD.175].

Crop type discrimination is based on the differential spectral response of crops. Single-date multispectral imagery (RD.33; RD.49) yields good accuracy if the imagery is taken during the optimum crop discrimination period for a given region (RD.34). Multi-date imagery often outperforms the single-date approach (RD.35; RD.36; RD.37; RD.38) because it takes phenological changes into account. The simplest way to deal with multi-date data is to stack the spectral bands and classify them all at once (RD.39). Another efficient approach is to carefully select the bands with the most discriminative power according to spectral or statistical distances (RD.41), data mining algorithms (RD.42) or other sophisticated methods (RD.43). Hierarchical or multi-stage approaches have also been developed to deliver a range of thematic maps for the same season (RD.44; RD.45). For example, RD.37 produced a series of four crop-related maps that progressively classified crop/non-crop, general crop types, specific summer crop types, and irrigated/non-irrigated crops. In fact, few studies integrated time in their classification framework in order to deliver in-season estimates. RD.46 adopted a sequential masking for identifying summer crops on a field basis.
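The band-stacking strategy mentioned above can be illustrated with a minimal sketch. It assumes two acquisition dates with two bands each and uses a minimum-distance (nearest class mean) classifier purely for illustration; the reflectance values, class names and function names are hypothetical and are not taken from the cited studies.

```python
from math import dist

def stack_dates(per_date_bands):
    """Concatenate the band values of one pixel across acquisition dates."""
    stacked = []
    for bands in per_date_bands:
        stacked.extend(bands)
    return stacked

def class_means(samples):
    """Compute the mean stacked vector of each class from labelled pixels."""
    sums, counts = {}, {}
    for label, vector in samples:
        acc = sums.setdefault(label, [0.0] * len(vector))
        for i, value in enumerate(vector):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(vector, means):
    """Assign the class whose mean vector is closest in Euclidean distance."""
    return min(means, key=lambda label: dist(vector, means[label]))

# Hypothetical (red, NIR) reflectances for two dates (spring, summer).
training = [
    ("winter_wheat", stack_dates([(0.05, 0.45), (0.20, 0.25)])),
    ("winter_wheat", stack_dates([(0.06, 0.40), (0.22, 0.27)])),
    ("maize", stack_dates([(0.18, 0.22), (0.05, 0.50)])),
    ("maize", stack_dates([(0.20, 0.24), (0.04, 0.48)])),
]
means = class_means(training)

# A pixel that is bare in spring and green in summer is closest to maize.
predicted = classify(stack_dates([(0.19, 0.23), (0.05, 0.47)]), means)
print(predicted)
```

In this toy case the two crops would be hard to separate on a single date chosen at the wrong time, whereas the stacked vector captures the opposite seasonal behaviour of the two classes.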
RD.47 described an approach in which multi-temporal Landsat-TM data from the pre-planting season, when cloud cover is minimal, were used to evaluate the area to be planted with annual crops in the rainy season. Depending on geographic area, crop diversity, field size, crop phenology, and soil condition, different band ratios of multispectral data and classification schemes have been applied. The capabilities of traditional classifiers, such as parallelepiped, minimum distance, Mahalanobis distance, spectral angle mapper and maximum likelihood, were extensively tested for crop type mapping (RD.52; RD.34; RD.57). Decision trees have also been widely implemented for crop classification purposes and are used operationally by the National Agricultural Statistics Service (NASS) of the USDA to produce the Cropland Data Layer (RD.58). Machine learning and pattern recognition algorithms have generated encouraging results, e.g. random forests (RD.50), artificial neural networks (RD.64) or support vector machines (RD.59); see RD.60 for a comparison. Evolutionary algorithms were also included successfully in neural networks (RD.61). The large availability of multi-resolution images with a high temporal density favored the exploitation of the time domain (RD.62) with methods such as harmonic analysis (RD.63), wavelet decomposition (RD.65) or spectral-temporal response surfaces (RD.66). The current temporal sampling of high resolution data leads to less dense and irregular time series due to meteorological phenomena. As a result, analyses accounting for spatial information (e.g. contextual information) became more common than those exploiting the time domain. RD.38 proposed Hidden Markov Models relating the varying spectral response along the crop cycle to plant phenology, for different crop classes, in order to recognize agricultural crops by analyzing their spectral profiles over a sequence of images.
RD.67 presented an approach to image time series analysis, dynamic time warping, which deals with irregularly sampled series and compares pairs of time series with different numbers of samples. Anticipating the availability of time series with both high spatial and high temporal resolution in the coming years, RD.68 blended temporal and spatial features in the same classification algorithm and demonstrated its relevance for crop type classification.

Another aspect, independent of the classification algorithm, is the spatial unit considered, which can be either the pixel or the field. Pixel-based classification techniques often fail to determine the borders of agricultural parcels (RD.48). Spatial filters increase the accuracy by removing the small inclusions of other classes within the dominant class (RD.49). Parcel-based approaches were found to be more accurate than pixel-based ones (RD.50). Field limits can be derived either from a digital vector database (RD.51) or by segmentation (RD.52). Even if the Sentinel-2 spatial resolution is expected to resolve most fields, it seems that in particular conditions (RD.53) sub-pixel approaches (as developed by RD.56; RD.54 and RD.55) will remain necessary.
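To make the idea of comparing time series with different numbers of samples concrete, the following minimal sketch implements the classic dynamic time warping distance. It is a textbook formulation given for illustration only, not necessarily the variant used in RD.67, and the NDVI-like values are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D series.

    cost[i][j] holds the minimal cumulative absolute difference needed to
    align the first i samples of a with the first j samples of b.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # A sample may be matched, or either series may be stretched.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical vegetation index profiles of one crop cycle.
reference = [0.2, 0.5, 0.8, 0.5, 0.2]
# Same cycle observed with two extra acquisitions (repeated values).
denser = [0.2, 0.2, 0.5, 0.8, 0.8, 0.5, 0.2]
# A flat profile, e.g. a field that never greened up.
flat = [0.2, 0.3, 0.3, 0.3, 0.2]

print(dtw_distance(reference, denser))
print(dtw_distance(reference, flat))
```

Because the warping path may repeat samples, the denser series aligns with the reference at zero cost, while the flat profile yields a strictly positive distance.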

2.1.3.2 Crop area extent (acreage) estimates

Currently, only a limited number of operational crop acreage estimation systems exist. They can typically be differentiated according to the intensity of their use of EO data and remote sensing techniques. Following the GEOSS Workshop on Crop Area Estimation with remote sensing in 2008, a reference document was published to summarize the best practices in the field (RD.75). According to this report, the existing operational crop area estimation systems rely essentially on field data (rather than on EO data alone) because of timing and accuracy requirements as well as classification feasibility. The timing requirement is straightforward: the earlier the estimate (ideally before the harvest), the more efficient the political and agricultural responses. Such an objective requires classifying images early in the cropping season, which makes it difficult to discriminate one crop from another when their spectral signatures are very similar. This is for instance the case when separating corn and soybean around mid-August in the Midwestern United States and when distinguishing between spring-planted crops. In the latter case, the NASS has taken the initiative to publish a Prospective Planting Report at the end of March, based on interviews with farmers about their planting intentions (RD.70). As for the accuracy of the crop area estimate, it is related to the commission and omission errors that appear during the classification process, partly due to mixed pixels. These errors do not cancel each other out and therefore bias the results. To some extent, the estimated accuracy is also influenced by the image processing method. For instance, in countries with mixed agricultural and pastoral land cover classes (e.g. Sahelian countries), using image segmentation methods instead of the commonly used pixel-based approaches could be a considerable advantage, since these land cover types are structurally fairly dissimilar while being spectrally similar (RD.79).
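The effect of unbalanced commission and omission errors on an area estimate obtained by simple pixel counting can be illustrated with a short worked example. The pixel counts below are hypothetical, and the function name is introduced only for this sketch.

```python
def pixel_count_bias(tp, fn, fp):
    """Summarize how classification errors bias a pixel-counting estimate.

    tp: crop pixels correctly mapped as crop
    fn: crop pixels missed by the classification (omission)
    fp: non-crop pixels wrongly mapped as crop (commission)
    """
    true_area = tp + fn      # pixels that really are crop
    mapped_area = tp + fp    # pixels counted as crop on the map
    return {
        "omission_error": fn / true_area,
        "commission_error": fp / mapped_area,
        "relative_bias": (mapped_area - true_area) / true_area,
    }

# Hypothetical scene: 300 true crop pixels, 30 missed, 35 false alarms.
result = pixel_count_bias(tp=270, fn=30, fp=35)
print(result)

# The bias vanishes only when omissions and commissions happen to balance.
balanced = pixel_count_bias(tp=270, fn=30, fp=30)
print(balanced["relative_bias"])
```

In the first case the map counts 305 crop pixels instead of 300, so the pixel-counting estimate overstates the crop area even though each error rate looks moderate.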
In any case, the error of early estimates should be less than the uncertainty on the crop area estimated at the beginning of the season using only data from the previous year (RD.75). Such uncertainty mainly depends on the degree of historical stability of the crop area; it also depends on how predictable the reaction to policy measures or to market changes is. As computed by RD.188, in the larger European cereal producers (France, Poland, Spain, Germany, Italy, UK) for instance, the average rate of inter-annual change of the total cereal area is between 2% and 3%. Changes are often driven by known policy measures, such as compulsory set-aside; therefore the level of uncertainty at the beginning of the agricultural season is generally less than 2%, and early estimates need a better accuracy than that to be useful, which is very challenging. Uncertainty is larger in some countries of central Europe (Romania, Hungary, Bulgaria), where the average inter-annual change ranges between 7% and 9%; consequently, there is a strong need for early estimates, which have more room to be useful to decision makers.

At the operational level, almost all major agricultural monitoring systems have set up a sampling scheme of in-situ data and/or surveys to assess crop area (NASS, MARS, CropWatch, Land Use/Cover Area-Frame Survey (LUCAS), etc.). Several types of design have been explored (simple/cluster, random/systematic, stratified/not stratified), each with its own strengths and weaknesses (RD.71, RD.82, RD.83, RD.84). For example, the latest version of LUCAS is a two-stage systematic design scheme of unclustered points (RD.78). Remote sensing data are then used to produce a crop mask and a stratification of the territory to improve the efficiency of the sampling (RD.73). Sometimes, in-situ data cannot be collected because of high cost or difficult access (e.g. in insecure regions). In these cases, and when the aim is to obtain a crop map rather than statistics only, remote sensing is almost exclusively used. The images can be processed more or less automatically or visually photo-interpreted from high spatial resolution data (RD.72, RD.76, RD.77). Sentinel-2 performances may pave the way to new statistics production strategies in the coming years.
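As an illustration of how a stratification derived from remote sensing can be combined with in-situ sample points, the sketch below applies a textbook stratified estimator: each stratum contributes its area multiplied by the crop proportion observed among its sampled points. This is an assumption made for illustration, not the design of any of the systems cited above, and all figures and names are hypothetical.

```python
def stratified_crop_area(strata):
    """Estimate crop area as a sum over strata.

    strata: iterable of (stratum_area_ha, n_sampled_points, n_crop_points).
    Each stratum contributes its area times the crop proportion observed
    among the points sampled in that stratum.
    """
    total = 0.0
    for area_ha, n_points, n_crop in strata:
        total += area_ha * (n_crop / n_points)
    return total

# Hypothetical strata taken from a satellite-derived crop mask:
# a "mostly cropland" stratum sampled densely and a "mostly non-crop"
# stratum sampled more sparsely.
strata = [
    (40000, 200, 150),  # 40,000 ha, 150 of 200 points fall on crop
    (60000, 100, 10),   # 60,000 ha, 10 of 100 points fall on crop
]
estimate_ha = stratified_crop_area(strata)  # 40000 * 0.75 + 60000 * 0.10
print(round(estimate_ha))
```

Concentrating the sample in the stratum where most of the crop is expected is one way in which stratification can improve sampling efficiency for a given number of field visits.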

2.1.3.3 Crop status

The principle to evaluate the vegetation status is to evaluate the variation of photosynthetic activity. Along with the sensors characteristics availability improvements, scientists and institutions have developed various methodologies to characterize the photosynthetic activity. This review focuses on vegetation status monitoring methodologies that were proven to be operational. As vegetation status is highly linked with plant vigor, yield-biomass and phenology, those variables will be briefly evocated even if the review itself is not exhaustive concerning those aspects. Although vegetation status is linked with many other factors such as water, soil and other environmental features, emphasis is put on direct or semi direct methods using remote sensing to retrieve crop vegetation status.  Crop information retrieved from reflectance data o Biomass and yield The major biophysical properties that directly influence the yield of biomass are vegetative indices, chlorophyll content, soil nutrients, water stress and salinity stress (RD.192). In general, the above-ground biomass can be directly estimated using remotely sensed data with different approaches. A recent review (RD.185) distinguishes three main groups of techniques that are widely used for coarse scale crop monitoring and yield estimation: (1) qualitative monitoring, (2) quantitative monitoring using regressions and (3) quantitative monitoring using crop growth model. (1) In general, the qualitative methods are based on the comparison of the actual crop status to previous seasons or to what can be assumed to be the average or ―normal‖ situation. Detected anomalies are then used to draw conclusions on possible yield limitations. In these “early warning” approaches, the actual vegetation index (VI) (at time t) is compared against the corresponding long-term-average to indicate if the


current vegetation conditions are better (VI higher than average) or worse (VI lower than average) than the ‘usual’ situation (RD.180).

(2) In contrast to the qualitative approaches, the regression approaches must be calibrated using appropriate reference information. In most cases, agricultural statistics and, specifically, crop yield are used as reference information. This prerequisite limits their applicability in many regions of the world.

(3) While the photosynthetically active plant tissues and their activities can be monitored with a vegetation index approach, the biomass of tissues that are not photosynthetically active cannot be estimated (RD.193). To quantify total biomass accumulation, key crop descriptors need to be estimated and assimilated in crop growth models (RD.183; RD.184). Crop growth models simulate growth processes, such as photosynthesis and biomass allocation, in response to solar radiation and other meteorological factors for given soil properties and management practices.

o Plant vigor and stress

For monitoring vegetation stress (in particular, drought), simple VI-based approaches are often not sufficient (RD.180). The main drawbacks are that they often rely on only one parameter (the observed VI) and do not consider the persistence of the stress periods (RD.194). Hence, in the absence of easy-to-use monitoring tools and methodologies, rudimentary indicators such as the annual rainfall amount are often used. However, even a few weeks of unfavorable climate conditions can induce serious plant stress, so simple rainfall anomalies are not well suited for real-time monitoring purposes. A suitable drought index should combine information about precipitation, temperature and soil moisture, together with the strength and persistence of the stress.
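The anomaly comparison described under qualitative monitoring, and the persistence aspect raised for stress monitoring, can be sketched as follows. This is only an illustration under stated assumptions: the VI series are assumed to be already composited per period (e.g. per dekad), the standardization by the inter-annual standard deviation and the threshold of -1 are choices made for the example, and all function names are hypothetical.

```python
import numpy as np


def vi_anomaly(history, current):
    """Compare the current VI profile with its long-term average.

    history: array (n_years, n_periods) of past VI values
    current: array (n_periods,) of VI values for the ongoing season
    Returns (difference, standardized anomaly), one value per period.
    """
    history = np.asarray(history, dtype=float)
    current = np.asarray(current, dtype=float)
    mean = np.nanmean(history, axis=0)
    std = np.nanstd(history, axis=0, ddof=1)
    diff = current - mean
    # Leave the standardized anomaly undefined where there is no variability.
    z = np.divide(diff, std, out=np.full_like(diff, np.nan), where=std > 0)
    return diff, z


def longest_negative_run(z, threshold=-1.0):
    """Length of the longest run of consecutive periods with z < threshold."""
    longest = run = 0
    for value in z:
        run = run + 1 if value < threshold else 0
        longest = max(longest, run)
    return longest


# Hypothetical NDVI values: three past seasons, four periods each.
history = [[0.30, 0.50, 0.70, 0.60],
           [0.32, 0.52, 0.72, 0.62],
           [0.28, 0.48, 0.68, 0.58]]
current = [0.31, 0.44, 0.62, 0.61]
diff, z = vi_anomaly(history, current)
print(diff, z, longest_negative_run(z))
```

A single negative anomaly and a run of several consecutive negative anomalies yield the same minimum value but very different run lengths, which is the kind of persistence information a single-date VI comparison does not capture.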
o Phenology

The phenological dynamics of terrestrial ecosystems reflect the response of the Earth’s biosphere to inter- and intra-annual dynamics of the Earth’s climate and hydrologic regimes (RD.195). Satellite remote sensing enables the monitoring of simple phenological events, such as the start and peak of vegetation growth. Weekly or 10-day composites are often used for deriving phenological indicators such as start, peak, end and length of the season (RD.185). Anomalies in the timing of any of these indicators can then be used as symptoms of variation. Different remotely sensed indicators for assessing vegetation phenology, for the most part based on smoothed Normalized Difference Vegetation Index (NDVI) curves, have already been proposed in various studies (for a review, see RD.182). These indicators are computed from moving averages, NDVI thresholds, logistic curves or maximum rates of change. The phenological metrics directly derived from remote sensing information are generally the start and the end of the growing season and the moment of maximum greenness.

• Biophysical variables

The variables considered are those usually obtained by remote sensing.

o Leaf Area Index (LAI)

LAI is defined as half the total developed area of green leaves per unit of horizontal ground area (RD.181). It determines the size of the interface for exchange of energy (including radiation) and mass between the canopy and the atmosphere. This is an intrinsic


canopy primary variable that should not depend on observation conditions. LAI is strongly non-linearly related to reflectance. Therefore, its estimation from remote sensing observations is strongly scale dependent (RD.176; RD.178). Note that vegetation LAI as estimated from remote sensing includes all the green contributors, i.e. including the understory when it exists under forest canopies. However, except when using directional observations (RD.196), LAI is not directly accessible from remote sensing observations because of the possible heterogeneity in leaf distribution within the canopy volume. Remote sensing observations are therefore sensitive to the “effective” leaf area index, i.e. the value that would produce the same remote sensing signal as that actually recorded, while assuming a random distribution of leaves. The difference between the actual LAI and the effective LAI may be quantified by the clumping index (RD.177), which varies roughly between 0.5 (very clumped canopies) and 1.0 (randomly distributed leaves). The satellite-based LAI products are generally not the same variables as the LAI in crop growth models or the LAI measured in a field. A main reason for this discrepancy is that available satellite LAI products are derived from reflectance obtained from coarse spatial resolution pixels, in which various types of vegetation cover are present. For the same reason, several studies have shown that the satellite-based LAI can differ considerably from field-measured LAI (RD.179).

o Fraction of Absorbed Photosynthetically Active Radiation (FAPAR)

FAPAR corresponds to the fraction of photosynthetically active radiation absorbed by the canopy. The FAPAR value results directly from the radiative transfer model in the canopy, which is computed instantaneously.
It depends on canopy structure, vegetation element optical properties and illumination conditions. FAPAR is very useful as input to a number of primary productivity models based on simple efficiency considerations (RD.197). Most of the primary productivity models using this efficiency concept run at a daily time step. Consequently, the product definition should correspond to the daily integrated FAPAR value, which can be approximated by computing the clear-sky daily integrated FAPAR value as well as the FAPAR value computed for diffuse conditions. FAPAR is relatively linearly related to reflectance values and is only slightly sensitive to scaling issues. This variable is actually more closely related to yield than LAI. For diverse reasons (one being that FAPAR is generally not a state variable in the current generation of simulation models) it seems to be much less popular for data assimilation in crop models, even though it probably avoids some of the problems and uncertainties encountered with LAI (RD.179).

o Fractional Vegetation Cover (FVC)

FVC corresponds to the complement to unity of the gap fraction in the nadir direction. FVC is used to separate vegetation and soil in energy balance processes, including temperature and evapotranspiration. It is computed from the leaf area index and other canopy structural variables and, unlike FAPAR, does not depend on variables such as the geometry of illumination. For this reason, it is a very good candidate to replace classical vegetation indices for the monitoring of green vegetation. Because of its quasi-linear relationship with reflectance, FVC is only marginally scale dependent (RD.191).

o Canopy chlorophyll content: LAI·Cab

The chlorophyll content is a very good indicator of stresses, including nitrogen deficiencies. It is strongly related to leaf nitrogen content (RD.199). This quantity can be calculated both at


the leaf level and at the canopy level, by multiplying the leaf-level chlorophyll content by the leaf area index. In this case it is an intrinsic secondary variable. Recent studies suggest that this product could be of very high interest in primary production models because it partly determines the photosynthetic efficiency (RD.200). In addition, studies have demonstrated that a direct estimation of LAI·Cab is more robust and accurate than an estimate based on the product of the individual estimates of LAI and Cab (RD.191).
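The relations between the biophysical variables reviewed above can be made concrete with a few lines of code. The sketch below is purely illustrative and rests on assumptions that are not taken from the cited references: the effective LAI is taken as the product of the clumping index and the actual LAI, the nadir gap fraction follows a Poisson (Beer-Lambert type) model with a leaf projection factor of 0.5 (spherical leaf angle distribution), and leaf chlorophyll content is expressed in µg per cm² of leaf. All function names are hypothetical.

```python
import math


def effective_lai(true_lai, clumping_index):
    """Effective LAI seen by the sensor (assumed: clumping index x actual LAI)."""
    if not 0.0 < clumping_index <= 1.0:
        raise ValueError("clumping index expected in (0, 1]")
    return clumping_index * true_lai


def nadir_vegetation_cover(true_lai, clumping_index=1.0, g_nadir=0.5):
    """FVC as the complement to unity of the nadir gap fraction.

    Assumed Poisson gap fraction model: P0 = exp(-G * clumping * LAI),
    with G = 0.5 for a spherical leaf angle distribution.
    """
    gap_fraction = math.exp(-g_nadir * clumping_index * true_lai)
    return 1.0 - gap_fraction


def canopy_chlorophyll(lai, leaf_cab_ug_cm2):
    """Canopy chlorophyll content LAI x Cab, in g per m2 of ground.

    1 ug/cm2 of leaf equals 0.01 g/m2 of leaf.
    """
    return lai * leaf_cab_ug_cm2 * 0.01


# Hypothetical canopy: LAI of 4 with strong clumping, leaf Cab of 40 ug/cm2.
print(effective_lai(4.0, 0.5))
print(nadir_vegetation_cover(4.0, clumping_index=0.5))
print(canopy_chlorophyll(4.0, 40.0))
```

The exponential term shows why LAI saturates with respect to cover-like quantities: doubling a high LAI changes the gap fraction very little, whereas the canopy chlorophyll content keeps scaling linearly with LAI.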

The estimation of LAI·Cab has therefore been preferred to that of the leaf chlorophyll content.

• From state-of-the-art to operational monitoring

The literature is also full of interesting papers deriving crop status information using one year of data over a given test site. These studies develop advanced methods much needed in the field but have had almost no impact on operational systems for decades. The GLOBAM project, coordinated by UCL, was supported by the Belgian Science Policy Office in order to fill the methodological gap between the current state of the art of local crop monitoring and the operational global monitoring systems (RD.202). Three large sites (located in Northern Europe, China and Ethiopia respectively) were the experimental basis of the whole 4-year research. Within this project, several significant steps were achieved to better estimate crop yields at NUTS-3 level based on 300 x 300 km sites (RD.157, RD.205). Crop type mapping, Leaf Area Index (LAI) retrieval from optical and SAR imagery (RD.201), and crop growth modelling along with the corresponding assimilation method for the retrieved biophysical variables (RD.205; RD.203, RD.204) were improved thanks to quantitative and physically-based methods. These approaches were expected to be more generic, more portable and more robust with regard to exceptional growing conditions than empirical or statistical analyses. RD.190 also proposed a conceptual framework to define the spatial resolution requirements for crop mapping and biophysical retrieval, addressing the problem in terms of pixel purity and scale and taking into account the interactions of the instrument’s point spread function (PSF) and the pattern of the target fields in the landscape.
Finally, the compatibility and possible complementarity between time series of LAI estimated from SAR data and those retrieved from optical imagery were analysed with a view to their assimilation in crop growth models to forecast the forthcoming yield.

Based on this synthesis of current operational monitoring systems and recent scientific achievements in the field, the great opportunity related to the Sen2-Agri project is very clear. There is a widely expressed, critical need for major improvements in agriculture monitoring capabilities, and the technological advances expected from Sentinel-2 are instrumental in meeting these needs. In addition, the global community of practice is becoming more and more organized, and the recently established JECAM network should make it possible to address the scaling-up issue faced by many research studies when dealing directly with the diversity of the world.

2.2 Objectives

2.2.1 Sentinel-2 Agriculture objectives

The Sen2-Agri project is designed to develop, demonstrate and facilitate the contribution of Sentinel-2 time series to the satellite EO component of agriculture monitoring for many agricultural systems distributed all over the world. Indeed, the temporal revisit frequency, the


specific spectral bands and the high spatial resolution, combined with the 290 km wide swath of the Sentinel-2 mission, are particularly suited to regularly recording key variables of interest over cultivated landscapes. The overall objective is to provide the international user community with validated algorithms, open source code and best practices to process Sentinel-2 data in an operational manner for major, globally representative agricultural systems. In the context of the Data User Element (DUE) programme, a user-oriented approach will drive the entire project in order to address concrete user needs and requirements.

In order to achieve such an ambitious objective in a very limited time frame, the project has to rely on already well-established components (i) for user community involvement, (ii) for the practice and knowledge of image time series pre-processing and (iii) for the processing toolbox development. First, the international agriculture community of practice, which led to the development of the JECAM network and the GEOGLAM design, is very well known to the prime investigator, who has been actively committed to it since the early days. Second, our consortium has implemented multi-sensor pre-processing tools that fully use the multi-temporal dimension of the Sentinel-2 mission and has gained extensive practical experience of the difficulties inherent in this work through the processing of thousands of images (Formosat-2, Landsat, SPOT4-Take5). Third, our consortium is also committed to the development and tuning of multi-temporal processing methods in the open source OrfeoToolbox. These points are a major asset, supported by the consortium partnerships, on which to build additional capabilities for agriculture monitoring in almost any country.
The specific objectives of the project are therefore as follows:

• to consolidate the international user requirements and priorities related to the already identified key EO products, i.e. cloud free composite, dynamic cropland mask, cultivated crop type and areas, and vegetation status;
• to review, test, select and possibly combine algorithms into a processing chain or strategy, based on a network of at least 13 globally distributed sites, to provide the 4 key EO products already identified;
• to develop the corresponding ATBDs along with open source code to routinely deliver, in an operational context, Sentinel-2 agricultural EO products from the L1c data level;
• to demonstrate and validate in situ the developed Sentinel-2 agricultural EO products and services at local scale for many sites and at national scale for 3 countries, thanks to the scalability of the proposed solution, which facilitates the use of very large data volumes;
• to promote the proposed solution, based on Sentinel-2 and the open source toolbox, to key agriculture monitoring stakeholders and ease their ownership of it, through a specific relationship with the JECAM network and with the representative user group including the EU MARS project and the GEOGLAM partners.

These quite ambitious objectives can be addressed thanks to the highly complementary expertise gathered in the consortium. UCL has well-recognized experience in addressing the global scale for land cover production and actively contributes to the community of practice of global agriculture monitoring systems, drawing on long-standing research in agriculture monitoring in various parts of the world. CESBIO has recently acquired a unique experience


in processing Sentinel-2-like data sets and is very well known for its expertise in advanced image processing for agriculture applications. As CS is the main developer of the OrfeoToolbox, its expertise will be instrumental in further developing this open source software, already selected by ESA for Sentinel-2. Along with CS, CS-R will implement the code and be in charge of the production.

2.2.2 Perspectives beyond Sentinel-2 Agriculture

The Sen2-Agri project will provide the user community with validated methodologies, open source and portable software and best practices to generate EO products for agriculture monitoring while preparing the operational exploitation of Sentinel-2 observations, taking into account the gaps and challenges identified in the Agriculture section of the GEO science report (RD.206). The design of the overall project aims to have a significant, long-term impact on the community of practice of agriculture monitoring at global scale. The project will benefit from a number of current opportunities and developments in satellite systems, methods and techniques, including:

• Advances in crop mapping at local and national scale

Never before has such a dedicated remote sensing effort for agriculture monitoring been put together in a global perspective. The benchmarking of a large number of algorithms and processing strategies over well distributed sites will clearly help to fill the gap between operational systems and the state of the art in the field. The expected results will be a major step in crop and crop type mapping in particular. Through past and ongoing research projects, such as Geoland, ESA Land Cover CCI, GMFS, GlobCover (2005, 2009), GLCN, China Cover and others, methods for producing land cover classifications have reached a certain maturity. Although it is commonly recognized that the “crop class” is often a mixed class because of the resolution, with limited usability and accuracy, developments in these projects represent valuable experience for tackling cropland mapping at the global scale. These key results will become a cornerstone for the future of agriculture monitoring because of the overall context. From a technological point of view, the Sentinel-2 specifications, with well-suited spectral bands, spatial resolution and temporal revisit capabilities, are the best asset for this ambition and will hopefully remain available for a long time.
From an organizational point of view, the distributed network of sites already established by the JECAM initiative is a major element in enabling the global demonstration and transferring the acquired experience. From a software development point of view, the OrfeoToolbox will remain supported far beyond the project, allowing further development or adjustment, and its integration in the ESA package will enhance its dissemination in the remote sensing community.

• Transfer to operational agricultural monitoring systems

At the end of the project, 3 countries will have effectively experienced and demonstrated the Sentinel-2 contribution to their own ongoing agriculture monitoring activities (see the letters of the proposed countries in section 19). Their experiences, along with those from the local scale demonstrations, will then be shared widely and in detail through the Users Workshop and


thanks to a close connection with the JECAM network (joint meeting as expressed in their letter of interest). Major stakeholders such as EC JRC MARS, FEWS-NET, US-FAS, FAO-GIEWS, CropWatch and others will follow the project as champion users through annual Users Workshops and provide valuable insights into do’s and don’ts with respect to operational agricultural monitoring, which will result in clear research and development targets and objectives to contribute effectively to operational crop monitoring. Well beyond the classical users’ requirements, their active participation is very important because the move to an operational system is mainly driven by very practical constraints that are often neglected in the design of solutions.

• Existing and forthcoming availability of a variety of sensors supported by the proposed software solution

A wide variety of satellite systems exists globally, which ensures redundancy and operational availability. European missions such as the Sentinel series and the PROBA-V mission will very significantly change data access. Both the increased spectral and spatial detail of current and planned systems and the availability of longer-term satellite data time series represent unique opportunities to boost current agricultural monitoring by satellite remote sensing. The software development for Sentinel-2 will build on existing routines able to deal with various sensors and systems. The proposed solution will therefore also evolve beyond the project, allowing the user community to take advantage of forthcoming sensors.
• Global connectivity and global experience

The global breakthrough expected from Sentinel-2 imagery and the associated data policy will last far beyond the project, and the project will have served as a pioneer demonstration in one of the most active and strategic fields of optical Earth observation with regard to routinely operated systems. GEO has fostered international partnerships and, as a consequence, the exchange of research results has provided individual researchers with a more integrated view and pathways for possible improvements. In addition, data sharing mechanisms further stimulate the exchange of data and experiences. The JECAM network is expected to further develop its activities as the R&D platform for the GEOGLAM activity, and this project will be one of its first cross-cutting experiences with long-lasting influence. Last but not least, the consortium’s extensive involvement in past, present and future FP-7 projects in agriculture monitoring will allow the scientific results to be disseminated not only in the scientific literature but also to many other science partners.

2.3 Project outputs

Because of its structure, the Sen2-Agri project will deliver results of four different natures. Some of them will be delivered to ESA while other outputs will be provided elsewhere in other forms.


Main deliverables to ESA

1. A core of processing strategies combining advanced algorithms to produce four types of EO agriculture products and able to deal with the large range of agricultural landscapes observed around the world, as long as the Sentinel-2 performances are compatible with the cropping systems. This will be fully documented in the ATBD document.

2. Open source and portable software developed from the OrfeoToolbox to convert the Sentinel-2 L1c data into a cloud free multispectral surface reflectance mosaic and into relevant EO products thanks to the efficient implementation of the processing strategies defined in the ATBD. This will be fully documented and demonstrated for 8 sites distributed around the world plus some others on a voluntary basis.

3. A set of 4 validated Sentinel-2 derived products for each of the 8 demonstration cases, including surface reflectance cloud free composite, cropland mask, crop type map with the respective area estimates, and crop specific vegetation status map. Each product will be delivered with quality flags which characterize its uncertainty. At least 5 sites well distributed over the world will be provided by the consortium as a service demonstration for a full Sentinel-2 scene (290 x 290 km). For at least 3 demonstration cases at national scale, these sets will be locally produced using a system running at the premises of the end-user in the country. The performances of these products and the end-users’ assessment will be analysed and discussed in the validation report.

More precisely, the suite of Sen2-Agri EO products consists of complementary outputs building on each other. While the generic user requirements in terms of timeliness and accuracy reported in the Statement of Work are quite interesting, it is expected that they will have to be further defined according to the respective context, i.e.
farming system complexity, already existing crop information, and expected use of the information in the existing decision making process. However, it is foreseen that the sequence of output production will be very similar:

• Cloud free surface reflectance composite

The preprocessing of the L1c Sentinel-2 data will provide surface reflectance, which will serve as input for the subsequent products either as daily images or as composites. The compositing interval will be region specific in order to find the best trade-off between valid observation availability, information timeliness and period of the cropping cycle (a monthly composite is meaningless during the fast growing period of the plant cycle but may be quite useful in some regions for the cropland mask, which is better discriminated between soil preparation and emergence).

• Dynamic cropland mask

This is a binary map delivered several times during the agricultural season, ideally starting at the end of the previous growing season, right after harvest. The regular update should lead to an accuracy that increases progressively along the growing season. It must reach a sufficient accuracy one month after the sowing date to provide the mask required to monitor the vegetation status of cultivated plants only, unlike most current systems using medium resolution sensors. Figure 2-4 shows an example of a dynamic cropland mask for 1990-2000-2010 based on Landsat extracts.


Figure 2-4: Example of final products for cropland extension for 1990-2000-2010 from Landsat 10 x 10 km extracts using semi-automated processing methods to contribute to the EU country profile (as completed by UCL/JRC/GISAT in Geoland-2).
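As an illustration of the cloud free compositing step described above and of its region-specific interval, the sketch below averages the valid observations of each pixel within a compositing window. It does not represent the algorithm that the project will select during benchmarking: the plain per-pixel mean, the array layout and the function name are assumptions made for the example, and the validity mask is assumed to come from an upstream cloud and shadow detection.

```python
import numpy as np


def composite(reflectance, valid, days, start_day, interval_days):
    """Average the valid observations of each pixel within one window.

    reflectance: array (n_dates, rows, cols) of surface reflectance
    valid:       boolean array of the same shape, False for cloud or shadow
    days:        array (n_dates,) with the acquisition day of each image
    Returns (composite, count); pixels without valid data are set to NaN.
    """
    reflectance = np.asarray(reflectance, dtype=float)
    valid = np.asarray(valid, dtype=bool)
    days = np.asarray(days)
    in_window = (days >= start_day) & (days < start_day + interval_days)
    usable = valid & in_window[:, None, None]
    count = usable.sum(axis=0)
    total = np.where(usable, reflectance, 0.0).sum(axis=0)
    mean = np.divide(total, count,
                     out=np.full(count.shape, np.nan), where=count > 0)
    return mean, count


# Hypothetical stack: three dates over a 1 x 2 pixel area.
refl = np.array([[[0.10, 0.20]], [[0.30, 0.90]], [[0.50, 0.60]]])
valid = np.array([[[True, True]], [[True, False]], [[True, True]]])
days = np.array([1, 5, 15])
mean, count = composite(refl, valid, days, start_day=1, interval_days=10)
print(mean, count)
```

The returned count makes the trade-off explicit: shortening `interval_days` improves timeliness but lowers the number of valid observations per pixel, down to empty (NaN) pixels in cloudy regions.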

• Cultivated crop type map and area estimate

This crop type map will of course build on the best cropland mask to process the time series required for the discrimination of the main crops. As mentioned before, the accuracy obtained for the main crops is much more important than the discrimination of all crop types. Similarly, as explained in the literature review, a balanced distribution between omission and commission errors for each crop type may be better than a slightly higher overall accuracy, as this will lead to a much better area estimate than a biased product. The area estimate will be delivered at the most convenient aggregation level from a validation and user point of view.

• Vegetation status

This product describes, on a 5 to 10 day basis, the evolution of the green vegetation corresponding to the vegetative development of the crop. The very interesting set of Sentinel-2 spectral bands makes it possible to go much further than the classical NDVI anomalies, which may be requested by the users for the sake of continuity. Of course these features will be provided to the user, but biophysical variables such as FAPAR and/or LAI will also be derived according to the performances reported during the benchmarking phase. It is very important to mention that most of the retrieval algorithms will perform better if a crop specific radiative transfer or neural network model can be inverted (RD.157). Therefore, we anticipate the use of the crop type map to retrieve the biophysical variables using a crop specific model. Last but not least, unlike the agro-meteorological community, which is heavily involved in global crop monitoring, many agronomists and crop specialists in charge of crop monitoring from the field would much prefer the occurrence dates of some important development stages to the


biophysical variables to assess the quality of the ongoing season. This will most probably be only a more advanced product tested on very well known local demonstration cases.

Main outputs delivered to crop monitoring community

4. Three complete processing systems, including open source software implemented on hardware designed to process the Sentinel-2 acquisitions over the whole country and to derive these EO products, with a demonstration of their respective performances.

5. Training sessions and best practices document for Sentinel-2 agriculture applications, in situ data collection and EO crop product validation.

6. Exchange, demonstration and performance discussions with the champion users group through the 3 Users Workshops organized over the course of the project. Proactive Sentinel-2 promotion to the global agriculture monitoring community through the Community of Practices, and more specifically to the GEOGLAM initiative, thanks to the direct involvement of UCL-Geomatics in these arenas.

Other main outputs of the Sen2-Agri project

7. Scientific papers submitted to the best peer-reviewed remote sensing and agriculture journals, conference presentations, newsletters, and user workshop reports.

8. Scientific contribution to the JECAM network by completing one of the very first cross-cutting analyses across sites and by organizing, jointly with the JECAM coordinator, meetings and other possible activities to disseminate the Sen2-Agri progress and outputs in the community (see the letter of the JECAM coordinator).

9. Support for the Sentinel-2 programme through the illustration of its societal benefits at global scale for food security and price volatility reduction, one of the hot topics for the future as identified by the G20.
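The remark made earlier in this section for the crop type product, namely that balanced omission and commission errors can be preferable to a slightly higher overall accuracy, can be illustrated numerically. The sketch below is a simple pixel-counting example with hypothetical figures; it assumes a 10 m pixel (0.01 ha) and is not the area estimation method of the project.

```python
def crop_map_summary(tp, fp, fn, total_pixels, pixel_area_ha=0.01):
    """Summarize accuracy and area bias for one crop class.

    tp: pixels correctly mapped as the crop
    fp: pixels wrongly mapped as the crop (commission)
    fn: crop pixels missed by the map (omission)
    """
    mapped = tp + fp        # pixels labelled as the crop in the map
    reference = tp + fn     # pixels of the crop in the reference data
    return {
        "overall_accuracy": 1.0 - (fp + fn) / total_pixels,
        "commission_error": fp / mapped,
        "omission_error": fn / reference,
        # Area bias of pixel counting: commission minus omission.
        "area_bias_ha": (mapped - reference) * pixel_area_ha,
    }


# Hypothetical maps of the same 10 000-pixel area, 1 000 reference crop pixels.
map_a = crop_map_summary(tp=900, fp=20, fn=100, total_pixels=10_000)
map_b = crop_map_summary(tp=880, fp=120, fn=120, total_pixels=10_000)
print(map_a)
print(map_b)
```

In this example map A has the higher overall accuracy but underestimates the crop area, because its omission errors are not compensated by commission errors, whereas map B, with more errors in total, yields an unbiased pixel count.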


3 Problem understanding

3.1 Addressing the large diversity of agricultural systems

Farming systems and agricultural landscapes are the long-term result of the combination of natural resources and a climatological context with a cultural heritage, a political history and a socio-economic production. Rural landscapes have been progressively shaped by land use trajectories, most often over centuries or even longer. There is very little in common between the landscapes of the Red River Delta in Northern Vietnam, where 1000 persons per square kilometer live from food production (water management for centuries, with three to four cropping cycles a year), and those of the southern part of the Central African Republic, where 8 persons per square kilometer can live from shifting cultivation and forest harvesting without any problem. This is the reason why agricultural landscapes are so diverse in spite of a relative standardization of agricultural products and the extensive international market integration for the main commodities. In addition, agriculture has recently been undergoing rapid structural transformations, as reflected by a number of key trends concerning land tenure status, reliance on non-agricultural activities, modes of market integration, new forms of enterprise, urban-rural relationships, new food system niches, etc.

The existing classification of farming systems made by FAO and the World Bank (RD.209) relies on the available natural resource base; on the dominant pattern of farm activities and household livelihoods, including relationship to markets; and on the intensity of production activities. This typology of 72 farming systems is not really relevant for their monitoring by remote sensing, as the systems are far too closely related to socio-economic profiles. The ongoing study on the mapping of the field parcel size distribution around the globe, the Global Agro-Ecological Zones framework (RD.211) and the global reference phenological database (RD.210) are of much more interest for stratifying cultivated lands at global scale.
The major ambition of this proposal is to use a single technology, i.e. satellite remote sensing, to deliver standardized products to map and measure in a relevant manner the cultivated lands of the whole range of agricultural systems. While the Sentinel-2 specifications are most probably the best suited to address such a challenge thanks to the spatial, spectral and temporal resolutions combined with the wide swath capabilities, it is still not realistic to deal with the full global diversity of the agricultural world.


Figure 3-1: High resolution image and derived crop specific mask for six different agricultural landscapes distributed over different agro-climatic zones (RD.207).

In order to maintain the project ambition while making it achievable in a 3-year project, we propose to scope the target based on our previous experience of global mapping research, on our agriculture monitoring experience in many countries, and on the community of practice discussion started in 2007. Of course, the consolidation of the users’ requirements will allow further adjustment of the following starting points:

• the cultivated areas or croplands are here restricted to surfaces planted on an annual basis (excluding all perennial crops, orchards and multiyear crops such as cassava); this may still include some agroforestry landscapes, such as in the Sahelian region, as long as crops dominate the landscape; this also excludes vegetables and market gardening, which are often important for African household supply in suburban areas but are not compatible with a remote sensing approach;
• the crop types of interest are limited to the 3 to 5 most dominant crops which play a major role in the production systems (excluding annual grasslands); some crop types that are closely synchronized and difficult to discriminate on the ground will be clustered in a single class; in the case of several cropping cycles over a year, the focus is on the main or the two main crops; for mixed or associated crops which cannot be untangled by remote sensing, again only the main production is considered;


• only agricultural systems with field patterns compatible with 10-m remote sensing and with decent cloud-free data availability will be addressed by the Sen2-Agri project; these correspond to most of the agricultural lands of the world, and for the incompatible zones an alternative approach such as SAR monitoring should be considered.

Beyond the huge diversity of cropping systems around the world, the more hidden challenge is the local heterogeneity of agricultural practices. Indeed, more synchronized and standardized cropping practices, as observed in the 1990s in Europe, including regular rotation, allow many crop types to be discriminated easily. The success of alternative cropping systems like no-tillage and conservation agriculture introduces much higher diversity than before, in addition to the influence of the Common Agricultural Policy, which has led to the development of set-aside parcels, unusual crops, grassland edges, etc. Similarly, the spatial heterogeneity of extensive agricultural practices, as observed in Russia, as well as the agro-meteorological variability observed in mountainous regions, prevent the use of expected trajectories for crop discrimination.

The global ambition of the Sen2-Agri project can only be tackled thanks to the long experience of the UCL team in dealing with the global diversity of surface reflectances to extract meaningful land cover information. From a significant contribution to GLC2000 to the automated GlobCover processing chain providing the first 300-m global land cover map and the most recent Land Cover CCI multiyear land cover maps, the complexity of the scaling-up issues faced by many applications has been successfully addressed. Using 10 to 30 m resolution imagery, CESBIO as well as UCL have developed different automated processing chains to capture land cover information.
Based on these very complementary experiences, the overall strategy is to develop a set of advanced methods and algorithms which can be combined in a flexible way according to the region of the world. The proposed approaches should be robust and adaptable enough to work over a large variety of sites, landscapes and climates. While a global solution fully integrated in a single package will be proposed, its flexibility in terms of combinations and parameterization will allow the diversity of global agricultural systems to be addressed.

3.2 Timeliness of the EO products derived from S2 large volume

The timeliness of crop information is of paramount importance for any operational agriculture monitoring system. Indeed, the value of most agricultural information decreases rapidly over time. As a matter of fact, the annual production cycle leads to a sequence of critical dates for information delivery. The actual reasons for these information requirements vary, of course, according to the production and the regional context. Early warning for food-insecure countries, volatility of international market prices driven by export forecasts, and policy making each rely on different mechanisms for using regularly updated information. For instance, the sugar beet industry needs planted acreage and yield forecasts quite early in the growing season in order to book, well in advance, the trains needed for transportation after harvest. The European Union identified four dates for delivering up-to-date information, in relation to a calendar of four types of decisions for the management of the Common Agricultural Policy.


In addition to the end users' requirements, some of the EO products are needed for subsequent activities, such as the cropland mask used to monitor the growing season, typically using NDVI anomalies, or the crop status to be forced or assimilated into an agro-meteorological model in order to forecast the expected yield. The time constraints here come from the processing flows.

Delivering the EO products in a very timely manner and on a regular basis from a very large volume of satellite data is a major issue from methodological, numerical and organizational points of view. The Sen2-Agri project is a very first opportunity to experiment with frequent high spatial resolution data recorded in 13 bands across a large swath. Furthermore, the Sentinel-2 data arriving in less than 2 years may not be sufficient for operational agriculture monitoring, and Landsat-8 has to be considered as a complementary source of data while waiting for Sentinel-2B. The challenge here is to prove the capacity to rapidly handle the processing of whole countries, with the associated large volumes of data and products. It is important to highlight that the volume of the products will also be a challenge from the end-user perspective, and particular attention will be paid to anticipating this problem during the product specification.
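As an illustration of the downstream use of the cropland mask mentioned above, the following minimal sketch computes an NDVI anomaly restricted to cropland pixels. It is not the project's algorithm: the function names are hypothetical, and the anomaly is assumed here to be a simple difference between the current NDVI and a reference NDVI (for instance a multi-year mean for the same period), which is only one of several possible definitions.

```python
import numpy as np


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    # Pixels where both bands are zero get an NDVI of 0 instead of a division error.
    return np.divide(nir - red, denom, out=np.zeros_like(denom), where=denom != 0)


def masked_ndvi_anomaly(current_ndvi, reference_ndvi, cropland_mask):
    """Difference to a reference NDVI, kept only where the cropland mask is True.

    ASSUMPTION: the anomaly is a plain difference; non-cropland pixels are set to NaN.
    """
    anomaly = np.asarray(current_ndvi, dtype=float) - np.asarray(reference_ndvi, dtype=float)
    return np.where(np.asarray(cropland_mask, dtype=bool), anomaly, np.nan)


if __name__ == "__main__":
    # Hypothetical 1 x 2 pixel example: only the first pixel is cropland.
    current = ndvi([[0.50, 0.30]], [[0.10, 0.20]])
    reference = np.array([[0.50, 0.30]])
    print(masked_ndvi_anomaly(current, reference, [[True, False]]))
```

The sketch makes the dependency explicit: the anomaly product cannot be generated before the cropland mask is available, which is why the mask delivery time constrains the subsequent processing flow.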
To tackle this challenge, our vision is based on:

• a high level of standardization of the input data sets, providing the processing methods with input products of the same level of quality whatever the site, based on an accurate Level 2A atmospheric correction and on the production of gap-filled surface reflectance composites;
• a deep knowledge and practice of time series processing and a thorough understanding of very different agricultural practices, in order to quickly identify and mitigate new problems brought by the processing of diverse types of landscape;
• an automated processing approach that makes it possible to process large areas with a minimum of human intervention, and to test several approaches and qualify their performance and robustness;
• quick iterations between algorithm development and tuning, and the experimentation and validation of the results on the set of widely distributed sites.

In order to be able to process the data automatically for timely product delivery, a key idea is to standardize the remote sensing reflectance products thanks to advanced preprocessing including:

• a high quality ortho-rectification with sub-pixel accuracy;
• very good snow, cloud and cloud shadow masks;
• an accurate atmospheric correction;
• gap-filled time and space composites.

The huge data volume generated by Sentinel-2 time series at large scale also imposes a strong constraint on system performance. Thus the proposed solutions will be based on well-tested software used in operational contexts and providing maximal performance. The processing chains need to be deployed on a high-performance system administered by the consortium to maximize the global system performance. Solutions must be provided so that the orchestration of the different processes leading to the final products optimizes the occupation of hardware resources as much as possible, in order to generate the intended products in a timely fashion.

Last, the software used as the basis of the processing implementation is required to be efficient and capable of processing very large volumes of data. This will be provided by the consortium through the reuse of Orfeo Toolbox, which provides an architecture suitable for processing arbitrary volumes of data, through the use of streaming (processing of data by tiles) and multi-threading (processing of unitary tiles on an arbitrary number of cores). The architecture of the library provides the scalability needed to cope with future evolution of the hardware resources on which the system is deployed.

3.3 Portable open source solution for operational production

The solution developed within the frame of the project is required to be portable and open source. It must be open source to ensure that no restriction can be applied by IPR holders to the use of the system, and that ESA and end-users will not need to negotiate any special conditions with the original providers, either for the current system or for its reuse and modification in future systems if desired. The proposed solution therefore needs to build on pre-existing FOSS libraries and software, carefully chosen both for their technical functionalities and their licensing terms, so that the overall system can be released under a GPL license.

The solution needs to be portable to ensure the long term maintenance and scalability of the system, by making it possible to deploy it on arbitrary hardware and operating systems meeting basic requirements, and to maximize the users' appropriation of the system. The portability of the system is ensured by the reuse of widely deployed software libraries known to be portable across a very large number of platforms and operating systems, such as Orfeo Toolbox, GDAL, Python, etc. The purely open source status of all the software on which the development will rely ensures long term portability, since modifications to the core components remain possible if ever needed to support a specific platform not supported today.

The challenge is therefore to deliver within 2 years such an open source and portable solution, able to process huge amounts of data and to address the whole diversity of the agricultural world. The key answer is to capitalize on a widely recognized toolbox for high resolution image processing which is fully tested, open source and portable, and to build on it to complement the suite of already developed algorithms. These are probably the reasons why ESA has already selected the Orfeo Toolbox for Sentinel-2 development. This toolbox already includes state-of-the-art segmentation algorithms and many of the most advanced classification methods. As main developers of the Orfeo Toolbox, the CS partners and a member of the CESBIO team are highly experienced with this environment and well placed to take advantage of the already implemented algorithms and to add the specific developments required to customize the tool for operational agriculture applications.


3.4 Algorithms selection and solution development before S2 era

A peculiarity of the Sen2-Agri project is that all the experimental work has to be carried out without the specific input data, and that a global solution has to be selected and implemented beforehand. This is very uncomfortable and risky with regard to the full scale demonstration expected for the first year of Sentinel-2 data exploitation. Similar situations were already successfully handled by UCL for the development of the GlobCover classification chain.

There is no doubt that in the coming years, Sentinel-2 data will change the way land surfaces are monitored using satellite remote sensing. When both Sentinel-2 satellites are in orbit, users will have access to free data sets with unprecedented coverage and repetitivity of observations at a decametric resolution. Contrary to SPOT-like data, users will have the assurance that their site will be observed without needing to program it, and although cloud cover is unpredictable, they will be able to count on a cloud-free observation every month or every second month over most regions. Compared to Landsat, a large improvement will be available in terms of spatial and spectral resolution and repetitivity. A large number of operational applications will arise from this new capability, and existing ones (such as the production of land cover maps) will be greatly enhanced thanks to Sentinel-2.

However, compared to SPOT-like data, which European users are used to and trained on, the new features brought by Sentinel-2 will demand a change in methods and usage habits. Even if the theory of these methods is now well understood, there is a strong need to test them with realistic data sets, in order to confront them with the complexity of nature and the diversity of observed images. The flexibility mentioned previously to address the global diversity also allows coping with the risk of misleading results due to the absence of data with Sentinel-2 performance for the tests.
In most cases, it is expected that the test performances will be exceeded thanks to the better resolutions of Sentinel-2 imagery. However, the wide swath coverage clearly introduces significant agro-ecological gradients in the input data, enhancing the spatial heterogeneity of certain classes. Machine learning algorithms are designed to take most advantage of the available data set and should adapt to the Sentinel-2 image content.

The proposed site selection makes the most of the JECAM network, where time series have already been acquired thanks to the free data access policy supported by the CEOS commitment. The ESA acquisition strategy provided very good data sets over some of these sites, in particular thanks to the SPOT4 (Take5) experiment. Additional SPOT4 (Take5) sites have been selected to complement the list.

CESBIO has in fact already acquired some experience using Sentinel-2-like data sets, having worked hard to obtain remote sensing data sets that simulate the multi-temporal dimension of Sentinel-2 data. First works were done with Formosat-2 time series, which had the same revisit time but lacked coverage and were only available over a few sites. Landsat 5 and 7 data were also used to increase the coverage, but with a much reduced repetitivity. Finally, it is worth mentioning that CESBIO proposed the SPOT4 (Take5) experiment, accepted by CNES, in order to provide the land remote sensing community with a large data set of image time series, with Sentinel-2-like repetitivity and resolution, spanning a large range of climates and landscapes, sometimes over large sites (several sites are larger than 120 km x 120 km) and simultaneously with the collection of in-situ data. The consortium is therefore well acquainted with the preprocessing of the data sets to be used while waiting for Sentinel-2.

3.5 EO products validation

The performance of the EO products for agriculture applications will be assessed in various ways. From the end-user point of view, a critical aspect is the reliability and the accuracy of the products. Reliability refers to the reproducibility of the results with the same data and to the repeatability of the performance for another growing season. This will not be assessed in the course of the project because of its duration, although it is a key element for an operational stakeholder deciding whether to switch from one solution to another. The accuracy of the Sen2-Agri products is also very important and will define their usability. The challenge is mainly to collect reference data sets of good quality in many places in the world during a single year. This can only rely on well-designed field data collection strategies which are feasible and scientifically sound. A best practices guide for field data collection will be developed, and training for in situ data collection will also take place along with the software training. UCL and CESBIO will rely on the lessons learnt in phase 1 and on their respective experience in ground data collection for agriculture applications around the world (in the last ten years, e.g. Europe, Morocco, Tunisia, Mauritania, Senegal, Burkina, Niger, RDCongo, Ethiopia, Vietnam, Cambodia, Thailand, Philippines, Russia, China).
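To make the notion of product accuracy concrete, here is a minimal sketch of how thematic accuracy figures are commonly derived from a confusion matrix built by comparing a map with reference field data. The proposal does not prescribe these metrics; the function name, the matrix layout (rows = reference classes, columns = mapped classes) and the sample counts are assumptions chosen purely for illustration.

```python
import numpy as np


def accuracy_metrics(confusion):
    """Return (overall accuracy, producer's accuracies, user's accuracies).

    ASSUMPTION: confusion[i, j] counts samples of reference class i mapped as class j.
    Every class is assumed to occur at least once in both the reference and the map.
    """
    cm = np.asarray(confusion, dtype=float)
    correct = np.diag(cm)
    overall = correct.sum() / cm.sum()
    producers = correct / cm.sum(axis=1)  # share of each reference class correctly mapped
    users = correct / cm.sum(axis=0)      # share of each mapped class that is correct
    return overall, producers, users


if __name__ == "__main__":
    # Hypothetical counts for two classes (crop, non-crop); not project results.
    overall, producers, users = accuracy_metrics([[40, 10], [5, 45]])
    print(overall, producers, users)
```

Whatever metrics are finally retained, their value depends directly on the quality and sampling design of the reference data, which is why the field data collection strategy is the main challenge.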


4 Work description

The project is organized in three consecutive phases over a total period of 36 months. Each phase has specific objectives:

• Phase 1 corresponds to an initial design phase. It is dedicated to user requirements consolidation (task 1), data collection and pre-processing (task 2) and agricultural EO products specification, algorithms benchmarking and development and system development (task 3). Its duration is set to 12 months.
• Phase 2 aims at implementing the Sen2-Agri system, prototyping the agricultural EO products and assessing the performance of the developed system and products. It will also serve to prepare phase 3 by defining the plan that will be applied to demonstrate the agricultural EO products with Sentinel-2 data. It corresponds to task 4 and lasts 10 months.
• Phase 3 mainly consists of the final demonstration phase (task 5), which focuses on the demonstration of the Sentinel-2 agricultural EO products with the champion user groups in real life conditions. This phase also includes activities that promote the project (task 7), draw the main conclusions from the project and deliver recommendations both to users and to ESA (task 6). The length of phase 3 is planned to be 14 months. It is intended to start after the commissioning phase of the Sentinel-2 mission and could thus be delayed.

A complete description of the project structure and phasing is presented in the Management proposal (chapter 3 of this proposal) through a Work Breakdown Structure diagram and a Gantt chart. This section details the work to be performed under each task.

4.1 Task 1 - Users requirements consolidation

This task aims at consolidating the requirements for the Sen2-Agri EO products and services. The requirements are established paying specific attention to (i) users’ needs in terms of EO agricultural products, (ii) worldwide applicability and (iii) relevance for the Sentinel-2 mission. To fulfil these objectives, Task 1 is divided in 3 sub-tasks that are described hereafter:

• Consolidation of the initial UR by survey with champion user group (Section 4.1.1);
• Selection of potential sites and main crops (Section 4.1.2);
• Sentinel-2 exploitation scenario development (Section 4.1.3).


4.1.1 Consolidation of the initial UR by survey with champion user group (WP 1100)

The initial user requirements were established during a consultation exercise organized by ESA in April 2012 with about 50 members of the agricultural user and expert communities. A representative group of users, the so-called “champion users”, agreed to get actively involved in the Sen2-Agri project and to:

• advise ESA during the execution of the project;
• participate in some tasks of the project with advice and expertise (user requirements consolidation, products validation, service quality assessment);
• facilitate access to suitable in-situ data and any other ancillary data available to them, in order to help implement and validate the required products;
• evaluate the project outcomes at the mid-term and final project reviews.

The initial requirements focus on dynamic crop masks, crop area extent and type, as well as vegetation status indicators. These requirements specify, amongst others, the needed coverage, delivery time, spatial resolution and thematic accuracies. Additional requirements concern data format, provision of quality flags, and access to the data, products and software. These requirements, provided in table A2 of the Sentinel-2 Agriculture statement of work [AD.1], are to be consolidated under several aspects. For example:

• UR-1: the required delivery time for the dynamic cropland masks is “1-2 days after the end of composite period”, but the temporal frequency is “seasonal products, up to monthly updates”. The reason why the delivery of seasonal products is required within 1-2 days after the end of the composite period should be clarified, since in some circumstances (e.g. low internet bandwidth) it could lead to increased costs for a small benefit in terms of use;
• UR-2: the required thematic accuracy of crop type area extent is 10 %. Does this requirement apply to national coverage and seasonal products? Is it also applicable at sub-national level? If yes, what is the reference geographic extent?
• UR-3: the required temporal frequency of vegetation status indicators is 10 days. Even with the two Sentinel-2 satellites (A and B) in orbit plus Landsat-8, this requirement will be difficult to achieve over some cloudy areas or during cloudy seasons. What trade-off could be accepted in that case (e.g. a 15-day or 30-day frequency)?

In addition, information is required on the way the products are to be delivered to the users, such as for instance:

• Do they need products at the pixel level, at the field level, or aggregated over some administrative or natural (e.g. river basin) level? Do they intend to perform some Geographic Information System (GIS) processing, or to store the products (e.g. crop maps) in a database?


• Is FTP (or similar) the preferred way of downloading the products (in Phase 1, or if the user is not interested in in-house processing)? Is there an interest in a web service for easy access to the products?
• Which level of human interaction are they ready to accept, either to train the algorithms or to validate them?

The answers to these questions, and others, will probably differ from one user to another. This means that the user requirements first have to be consolidated. In order to reach this consolidation we propose the following approach:

1) We will first perform a detailed analysis of the objectives and interests of every champion user. This analysis will be based on publicly available information and on direct contacts, under ESA supervision. The focus will be on dynamic crop masks, crop area extent and type, as well as vegetation status indicators. From this analysis, we will derive a preliminary table of converging and diverging refined requirements.
2) When considering the whole champion user group, the resulting converging refined requirements might correspond to the lowest common denominator, which could lead to unused products. To avoid this trap, we propose to first discuss with ESA its high level priorities (e.g. priority given to a continent, to some kind of organization, to some type of products, etc.), from which high level criteria will be derived.
3) A questionnaire will be prepared and tested with some users before being sent to all users. The questions will mainly address (i) the main priorities of the user, (ii) the most wanted products and (iii) the technical details of every product, including the delivery mode. Direct contact with the users will also be useful. Whether we will indicate at this stage which product characteristics can be expected on the basis of the current technical and scientific state of the art is an open question.
4) From the answers to this questionnaire, a synthesis will be prepared and a list of refined requirements will be proposed to ESA and to the users. For each requirement, we will indicate the achievement expected during phases 1 and 3 of the project.
5) We suggest distributing the refined requirements to the users before the user requirement review (to be held at KO+3). One aim of this review will be to get formal approval of the list. Once approved, the list of refined requirements will become the consolidated requirement list.

4.1.2 Selection of potential sites and main crops (WP 1200)

During the project, two different types of sites are considered. In a first step, test sites will be used for the algorithm benchmarking and the testing and validating of EO product prototypes (Section 0). These test sites will be decided at the very beginning of Task 2 (section 4.2.1), using varying criteria (agro-ecological context, agricultural practices, availability of EO and in-situ data, etc.) based on a pre-selection achieved in this WP. In a second step, some sites will be used for operational production for the demonstration case of the phase 3 (Section 4.5), either at the national or the local scales.


More precisely, the project will consider at least 4 types of sites:

• Within phase 1:
  o a set of 13 test sites, to be used in methods benchmarking and S2 data simulation;
• Within phase 3:
  o at least 5 sites among the 13 for local scale demonstration and in situ validation of the different EO agricultural products with Sentinel-2 data;
  o 3 demonstration cases dealing with national coverage and national diversity, 2 of them corresponding to African countries;
  o additional voluntary sites, either for local scale demonstration or nationwide production, thanks to proactive networking and appropriate technical support, “voluntary” meaning that these sites can use the developed S2-Agri processing chain but without receiving any funding (working on a voluntary basis) or technical support (even if their participation in the training workshop could be foreseen).

The pre-selection will be done so as to cover as much as possible the needs expressed in the user requirements (crop types, etc.) while ensuring a good representativity in terms of geographical distribution, local landscape conditions and agricultural practices. It will be based on WP1100 (Section 4.1.1) but also on the list of JECAM and GMFS sites provided by ESA, which have a good coverage in terms of EO time series and field data.

4.1.3 Sentinel-2 exploitation scenario development (WP 1300)

The exploitation scenario for agriculture monitoring based on S2 data will be defined according to the collected user requirements (Section 4.1.1). This scenario informs users about the technical issues related to their expectations. Indeed, the huge data volume induced by large S2 time series requires careful attention to performance and to the capacity of the facilities. For example, one acquisition at national scale (around 500,000 km²) takes around 21 GBytes in lossless JPEG2000 format. If we consider the nominal revisit of 5 days over one 6-month season, we get around 762 GBytes of S2 L1C data. Providing an estimate of the data budget for each user requirement can give useful indications to the end-users. If we consider the existing UR products under some hypotheses (6-month seasons, a revisit period of five days, internal raster format in lossless JPEG2000, output format in GeoTIFF and five biophysical products), the processing system must deal with:

• at regional scale (290 km x 290 km), around 230 GBytes for one site;
• at national scale (around 500,000 km²), around 1.4 TBytes for one site.
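The L1C input figure quoted above can be reproduced with a short calculation. The sketch below is only a back-of-the-envelope check: the function name is hypothetical, the 6-month season is assumed to last 182 days, and a fractional number of acquisitions is allowed. The regional and national totals in the list include intermediate and output products and are not derived here.

```python
def season_volume_gbytes(gbytes_per_acquisition, season_days, revisit_days):
    """Input data volume accumulated over one season, one acquisition per revisit."""
    acquisitions = season_days / revisit_days
    return gbytes_per_acquisition * acquisitions


if __name__ == "__main__":
    # 21 GBytes per national-scale acquisition and a 5-day revisit come from the text.
    # ASSUMPTION: a 6-month season is taken as 182 days.
    volume = season_volume_gbytes(21.0, season_days=182, revisit_days=5)
    print(f"L1C input volume for one season: about {volume:.0f} GBytes")
```

With these assumptions the result lands within a few GBytes of the 762 GBytes quoted, the small difference coming from the exact season length used.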

These preliminary estimates have a strong impact on the hardware configuration, which needs to handle large data volumes. The hardware and the software used must therefore be carefully chosen. We estimate that a hardware system with high multi-threading capacity (minimum 8 threads available) and a large memory (minimum 24 GBytes) should be the most efficient to deal with the national scale. For the regional scale, a high-end commercial PC should be the minimum. The users' hardware configurations will be investigated in order to select the user premises.


One common point of all these facilities will be the data storage and the input data collection. Indeed, the data will be retrieved from ESA facilities through the internet (automatically via the ngEO downloader if possible) and need to be stored on disks with efficient I/O access (for example SAS disks). To obtain the best internet bandwidth (10 Gbps to the Internet and 300 Mbps from the Internet), we propose to use a dedicated hosting service to host the main Sentinel-2 Agriculture production facility (cf. Figure 4-1).
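To give an order of magnitude of what such a bandwidth means in practice, the following sketch estimates the time needed to retrieve a given data volume. It is an idealized calculation: the function name is hypothetical, GBytes are taken as decimal (10^9 bytes), and the link is assumed to be used at its full nominal rate with no protocol overhead, so real transfers will take longer.

```python
def transfer_hours(volume_gbytes, bandwidth_mbps):
    """Idealized transfer time in hours for a volume in GBytes over a link in Mbps."""
    bits = volume_gbytes * 1e9 * 8             # decimal GBytes converted to bits
    seconds = bits / (bandwidth_mbps * 1e6)    # Mbps converted to bits per second
    return seconds / 3600.0


if __name__ == "__main__":
    # 762 GBytes is the seasonal national-scale L1C volume estimated in Section 4.1.3;
    # 300 Mbps is the incoming bandwidth quoted for the hosted facility.
    print(f"{transfer_hours(762, 300):.1f} hours for one season of national L1C data")
    print(f"{transfer_hours(21, 300) * 60:.1f} minutes for one national acquisition")
```

Under these optimistic assumptions, the incoming link is not the bottleneck for a single national site; the same calculation applied to a low-bandwidth user site gives a very different picture.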

Figure 4-1: Proposed Sentinel-2 Agriculture facility and network bandwidth

This hosting will enhance the availability of data to end-users through the website. Moreover, using hosted equipment offers the guarantee that the hardware is always available during the service subscription without additional cost. We can propose that end-users host their own system at their premises, or host their system close to an internet node to enhance our support. This choice depends on the technical level of the users and on their facilities (internet bandwidth can be low in some African countries, for example).

The choice of efficient disk I/O is also a critical point for the processing chain, because we need to avoid losing time in disk I/O operations. We therefore propose to use SAS disks to store intermediate products and SATA disks to store the final user products. Moreover, to avoid backing up all data and systems on a third server, we propose to configure the data disks of the processing server in RAID 6. The operating system of the processing server will be backed up on the website server and, conversely, the critical parts of the website server will be backed up on the processing server.

Another important point to define in the exploitation scenario is the dissemination of the output data. The output data format and service should be carefully defined with the users, to give them the easiest and fastest access to the data. The GeoTIFF format could be considered the best choice to disseminate the raster data because it is widely accepted by users and their software. However, its lack of efficient compression could negatively impact its use at large scale. The JPEG2000 format used in Sentinel-2 L1C and L2A products is more efficient but less widely accepted, because the best JPEG2000 software solution is commercial. CS and UCL promote the use of OpenJPEG as an alternative open source solution.
CS has already made some useful optimizations of the library, which can be released within the Sentinel-2 Agriculture project. However, the GeoTIFF format will be the main one proposed to end-users.

The GML format will be proposed to end users as the format for vector data or masks produced by the system. Other choices could be the ESRI shapefile or SQLite formats. The latter is very interesting if we need to store a large set of geometries with attributes. Its extension SpatiaLite offers more functionality for georeferenced data. The choice will be made during the user requirement consolidation. The ancillary information provided by the system should also be easily accessible to end users through an XML file, based for example on the DIMAP format.

Concerning the output data service, we can offer different choices to end-users:

• delivery of data on secured FTP (SFTP) or via automatic generation of direct download links;
• delivery of data through WMS or WFS web services;
• delivery of data through the JPIP protocol for JPEG2000 data.

The first solution, based on SFTP, offers security and simplicity for users, who can do whatever they want with the data retrieved from the server. This is the classical way of providing data to end-users. Direct links can be more convenient than SFTP, and are already used by USGS for the delivery of data ordered via the EarthExplorer system. The second solution makes it possible to integrate the Sentinel-2 Agriculture output products directly, and in a standard way, into a web client or GIS software such as QGIS. This type of service allows data to be manipulated easily if the network bandwidth is sufficient. The last solution makes it possible to optimize access to JPEG2000 data for users, but it is not widely supported by open source software. We therefore propose to deliver the data via a system of direct download links generated for each user account. We will discuss with users the possibility of offering web service access to some outputs, such as UR-1 and UR-2, if they have specific needs.

The use of various hardware configurations at user premises leads to the choice of software components which are multi-platform and can exploit the full capacity of the system: for example, avoiding loading all the input data in memory if the memory is too low, or using all the threads if the system exposes a large number of them. A library such as Orfeo ToolBox is well adapted to this great variety of systems because it was designed for it and has been used in various contexts (from laptops to high performance computing servers). Moreover, using open-source software with a useful applications framework in the designed system gives end users the possibility of testing each software component at their premises. This test can be done via the interface mechanism which must be designed with the Sentinel-2 Toolbox.
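The principle of adapting to the available memory and threads, discussed above, can be sketched as tile-based processing distributed over a configurable number of workers. This is a conceptual illustration only and does not use the Orfeo ToolBox API: the function names and default values are hypothetical, and an in-memory NumPy array stands in for an image that a real streaming pipeline would read from disk one tile at a time.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def iter_tiles(height, width, tile_size):
    """Yield (row, col, tile_height, tile_width) covering the whole image."""
    for row in range(0, height, tile_size):
        for col in range(0, width, tile_size):
            yield row, col, min(tile_size, height - row), min(tile_size, width - col)


def process_by_tiles(image, func, tile_size=256, workers=4):
    """Apply a per-pixel function tile by tile, using a pool of worker threads.

    ASSUMPTION: func returns an array of the same shape as the tile it receives.
    A smaller tile_size lowers the memory needed per worker; workers sets the
    number of tiles processed concurrently.
    """
    out = np.empty_like(image, dtype=float)

    def work(tile):
        r, c, h, w = tile
        # Each worker writes to its own disjoint window of the output.
        out[r:r + h, c:c + w] = func(image[r:r + h, c:c + w])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consuming the iterator makes any exception raised in a worker visible here.
        list(pool.map(work, iter_tiles(image.shape[0], image.shape[1], tile_size)))
    return out


if __name__ == "__main__":
    image = np.arange(100 * 130).reshape(100, 130)
    doubled = process_by_tiles(image, lambda tile: tile * 2.0, tile_size=32, workers=3)
    print(np.array_equal(doubled, image * 2.0))
```

The two parameters map directly onto the two hardware constraints named in the text: the tile size is tuned to the available memory, and the number of workers to the available threads.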
To conclude, we propose the following exploitation scenario, which will be refined during the user requirements consolidation:
• A hosted processing server, to give end users the best access and to avoid losing time in data exchange between consortium premises and ESA Sentinel facilities;
• Use a high-performance computing server with a large disk volume and efficient I/O characteristics to host the Sentinel-2 Agriculture system;
• Provide output data to each end user through a system of direct download links, which can be sent via e-mail when products are available;
• Deliver these output data in GeoTIFF format. The possibility of providing JPEG2000 should be presented to end users. The output products should also respect the UTM tile footprint defined by ESA for L1C and L2A products and the standard Sentinel-2 product organization;


• Use a remote sensing library designed to handle large data volumes and able to exploit all the hardware specificities to optimize the processing.
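The per-user direct download links proposed in this scenario could, for instance, be generated as signed, expiring URLs. The Python sketch below is only one possible approach and is not taken from the proposal: the endpoint, the secret key, the query parameter names and the default validity period are all hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"replace-with-a-server-side-secret"  # hypothetical secret
BASE_URL = "https://example.org/download"          # hypothetical endpoint


def _signature(user, product, expires):
    """HMAC-SHA256 signature binding the user, the product and the expiry."""
    message = f"{user}|{product}|{expires}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_download_link(user, product, valid_seconds=7 * 24 * 3600, now=None):
    """Build a link for one user and one product, valid for a limited time."""
    now = int(time.time()) if now is None else int(now)
    expires = now + valid_seconds
    query = urlencode({"user": user, "product": product, "expires": expires,
                       "sig": _signature(user, product, expires)})
    return f"{BASE_URL}?{query}"


def is_link_valid(link, now=None):
    """Server-side check: the signature must match and the link not be expired."""
    now = int(time.time()) if now is None else int(now)
    params = parse_qs(urlparse(link).query)
    try:
        user = params["user"][0]
        product = params["product"][0]
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(user, product, expires)
    return hmac.compare_digest(sig.encode(), expected.encode()) and now <= expires
```

The link returned by `make_download_link` can be pasted into the notification e-mail; the server only needs the secret key to check it, without storing one record per link.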

4.2 Task 2 - Data acquisition and pre-processing

This task aims at providing the Test Data Sets (TDS) that will be used as input in Task 3 to select, experiment and validate the algorithms that will generate the Sen2-Agri EO products. The TDS will include in-situ data collected on the sites, as well as remote sensing time series. In order to ensure the robustness of the methods developed in Task 3, it is wise to generate TDS over at least a dozen sites, spanning different crop types, climates and weather conditions. As the time series will be provided by different satellites, probably with different processing levels, we have decided to include the pre-processing in Task 2 in order to provide standardized products to Task 3. As a result, the remote sensing data provided to Task 3 will be Level 2A products, as defined in the Sentinel-2 Mission Requirement Document [RD.13]. A description of the product levels and of the available processors is provided in Subtask 2.4 (section 4.2.4).

To fulfill all these objectives, Task 2 is divided into 6 sub-tasks that are described hereafter:
• Site selection (section 4.2.1);
• Design and collection of in-situ data for all sites (section 4.2.2);
• Collection of high resolution optical time series (section 4.2.3);
• Development of pre-processing chains (section 4.2.4);
• Preprocessing of high resolution optical time series (section 4.2.5);
• Validation of test data set (section 4.2.6).

4.2.1 Site selection (WP 2100)

This activity will be based on the site pre-selection achieved in WP1200 (section 4.1.2). The set of selected test sites will allow the methods and algorithms to be benchmarked with regard to the diversity of agro-ecological contexts, the various landscape patterns, the different agricultural practices and the actual satellite observation conditions (atmospheric perturbations, sun zenith angle and cloud coverage). The challenge is to identify the most appropriate methods to scale up to the national and possibly the global scale. One of the most challenging aspects for automated algorithms is probably the spatial heterogeneity of growing plants, both within a field and across fields for a given crop, due to the local heterogeneity of conditions (diversity of practices, non-synchronisation of practices, soil and weather heterogeneity, etc.). Russian winter wheat fields are a good example of an apparently simple case, as already observed in the FP7-MOCCCASIN project (RD.212).


For programmatic reasons, it is also very important to select sites where in-situ data and satellite time series are already available, so that the project can work with these data from the very beginning. The JECAM sites and the SPOT4-Take5 experiment are key elements in this context. More specifically, appropriate field observations corresponding to dense high resolution time series will constitute the backbone of the data sets, but other data sets will also be considered, both to complement the data sets after the end of the SPOT4-Take5 experiment and to increase the repetitivity of acquisitions. Furthermore, the sites should be distributed so as to spread the workload along the year (in situ data collection and satellite processing) and to learn from one site to the other. Therefore, sites in the Southern and Northern hemispheres, as well as in temperate versus Mediterranean regions, will be considered, as they provide four different growing periods spread over the year. Finally, some sites should also be valuable candidates for full-scale operational production for the demonstration cases of phase 3. While these demonstration cases will be jointly decided with ESA, the local partners should be involved from the beginning of the project as test sites for phases 1 and 2, in order to progressively acquire the expertise to manage the production in phase 3. It is also a key advantage of the consortium to be able to rely on many already existing relationships with local partners.
In addition to the mandatory list of 8 JECAM and GMFS sites, we include 7 other sites (see Table 8-1 in section 7.1) in a preliminary list.

As explained before (section 4.1.2), four different types of sites are currently considered in this proposal:
• During phase 1:
  o A set of 15 test sites, to be used in methods benchmarking and S2 data simulation;
• During phase 3:
  o A subset of at least 5 sites among the 12 for local scale demonstration and in situ validation of the different EO agricultural products;
  o 3 demonstration cases dealing with national coverage and national diversity, 2 of them corresponding to African countries;
  o Additional voluntary sites, either for local scale demonstration or nationwide production, “voluntary” meaning that these sites can use the developed S2-Agri processing chain but without receiving any funding (working on a voluntary basis) or technical support (even if their participation in the training workshop could be foreseen).

Table 8-1 presents the preliminary list of sites proposed in this project. A more detailed characterization of these sites can be found in section 4.2.1.

ID | Site name | Location | In-situ data | Remote sensing data
1 | JECAM-1 | Argentina | JECAM | Take5, L8, RE
2 | JECAM-2 | Belgium | JECAM | Take5, L8, RE
3 | JECAM-3 | China (Shandong) | JECAM | Take5, L8, RE
4 | JECAM-4 | Paraguay (near Iguazu) | JECAM | Take5, L8, RE
5 | JECAM-5 | Ukraine | JECAM | Take5, L8, RE
6 | JECAM-6 | South Africa | JECAM | Take5, L8, RE
7 | GMFS-1 | Morocco | GMFS-1 | Take5, L8, RE
8 | GMFS-2 | Ethiopia | GMFS-2 | Take5, L8, RE
9 | JECAM & Take5-2 | Madagascar | JECAM | Take5, L8, RE
10 | Take5-2 | Tunisia | - | Take5, L8, RE
11 | JEC CSudmipyO | - | Submitted as JECAM site | Take5, L8, RE
12 | CMaroc | - | Submitted as JECAM site | Take5, L8, RE
13 | - | Russia | Submitted as JECAM site | L8
14 | - | Kenya | Submitted as JECAM site | L8, Aster, SPOT
15 | - | Senegal | GMFS site | L8, SPOT

We recommend working over 13 sites, thus removing 2 sites from the preliminary list after discussion with users and ESA (e.g. to exclude one of the two Morocco sites). Half of the proposed sites (8) are located in Africa (South Africa, Morocco (x2), Ethiopia, Madagascar, Tunisia, Kenya and Senegal). A set of 5 sites is located in the Southern hemisphere (Argentina, Paraguay, South Africa, Madagascar, Kenya), thus allowing complementary field campaigns at the very beginning of the project (winter time for the Northern hemisphere) if needed after the quality control of the available in-situ data set (in phase 1). It should be mentioned, however, that winter crops in these Southern sites will not be observed with the SPOT4-Take 5 time series. Another set of 6 sites is in temperate climates (Belgium, Midi-Pyrénées, Ukraine, Russia, Argentina, Tunisia), including both irrigated and rainfed agriculture with very different cropping systems and parcel size distributions. The preliminary list of test sites, along with a short characterization for each of them, is presented in Table 4-1.

ID | Site name | Location | In-situ data | Remote sensing data | Interest, specificities
1 | JECAM-1 | Argentina | JECAM (2011 -) | Take5, L8, RE | Southern hemisphere. Temperate humid climate. Main crops: soybean, maize, wheat, sunflower, sorghum, barley; mainly rainfed.
2 | JECAM-2 | Belgium | JECAM (2011 -) | Take5, L8, RE | Northern hemisphere. Temperate climate. Main crops: wheat, barley, potatoes, sugar beet, maize, alfalfa, etc. Mainly rainfed. UCL’s garden. Access to European data bases (CAP).
3 | JECAM-3 | China (Shandong) | JECAM (2011 -) | Take5, L8, RE | Northern hemisphere. Temperate to semi-arid climate / monsoon climate. Main crops: winter wheat, corn, cotton, vegetables. High aerosol optical thickness, nebulosity, snow cover in February. Very few clear images available.
4 | JECAM-4 | Paraguay (near Iguazu) | JECAM | Take5, L8, RE | Southern hemisphere. Humid subtropical climate. Main crops: soybean, wheat and corn (planted in December or June); cotton, sesame and cassava can also be found.
5 | JECAM-5 | Ukraine | JECAM (2012 -) | Take5, L8, RE | Northern hemisphere. Humid continental climate, with snow cover in February and March. Main crops: winter wheat, spring barley, maize, soy beans, winter rapeseed, sunflower, sugar beet, potatoes, winter rye, and spring wheat. No typical simple crop rotation.
6 | JECAM-6 | South Africa | JECAM (2011 -) | Take5, L8, RE | Southern hemisphere. Sub-humid to semi-arid climate. Main crops: wheat and oats in winter; maize, sunflower, soya, groundnuts, sorghum, dry beans in summer.
7 | GMFS-1 | Morocco | GMFS-1 | Take5, L8, RE | Northern hemisphere. Moderate subtropical OR Semi-arid climate OR Mediterranean. Main crops: wheat mainly, early crops.
8 | GMFS-2 | Ethiopia | GMFS-2 | Take5, L8, RE | Northern hemisphere. Tropical OR subtropical climate. Main crops: cereals, pulses, oilseeds, and coffee.
9 | JECAM & Take5-2 | Madagascar | JECAM (2013-) | Take5, L8, RE | Southern hemisphere. Tropical (coast), temperate (inland), arid (coast) climate. Main crops: rice (mainly cultivated under irrigation, on terraces or basins), irrigated wheat, early crops. High density of images. Very small fields.
10 | Take5-2 | Tunisia | JECAM (2013-) | Take5, L8, RE | Northern hemisphere. Temperate (north), desert (south) climate. Main crops: irrigated wheat (but in a very dry year), extensive rainfed olive trees. High density of images. Instrumented sites. Land cover in-situ campaigns.
11 | JEC CSudmipyO | (Midi-Pyrénées) | JECAM (2013-) | Take5, L8, RE | Northern hemisphere. Temperate to Mediterranean climate. Main crops: wheat, corn, rapeseed, sunflower. Very large site (240*120), part of it being observed twice every 5 days with SPOT4-Take5. CESBIO’s garden, 2 instrumented sites, regular field campaigns for land cover and biomass. Access to European data bases (Common Agriculture Policy).
12 | CMaroc | (Tensift) | JECAM (2013-) | Take5, L8, RE | Northern hemisphere. Semi-arid climate. Main crops: irrigated wheat, alfalfa, olive trees, orchards. High density of images with SPOT4-Take5. CESBIO’s garden, instrumented sites. Land cover in-situ campaigns.
13 | - | Russia | Submitted as JECAM site | L8 | Northern hemisphere. Moderate continental climate. Main crops: winter wheat, spring barley, potatoes, maize, rape seeds, winter rye; only rainfed. Large spatial heterogeneity with large fields.
14 | - | Kenya | JECAM candidate | L8, Aster, SPOT | Southern hemisphere. Tropical climate. Main crops: maize, wheat, cotton.
15 | - | Senegal | GMFS site | L8, SPOT | Northern hemisphere. Tropical climate. Main crops: millet, sorghum, peanuts. Sahelian food insecure site.

Table 4-1: Preliminary list of test sites

Additional voluntary sites will be the Netherlands, as requested by Alterra/Wageningen University (see letter of interest), and, if possible, the Mali JECAM site, thanks to the continuous support of ILRI for in situ data collection.

4.2.2 Design and collection of in-situ data for all sites (WP 2200)

4.2.2.1 Approach and general requirements for in-situ data

The strategy for in situ observation is a key part of the interpretation process, for both calibration and validation purposes. Furthermore, contextual information on the farming systems, the crop calendar, the agro-ecological conditions and the local agricultural practices is of paramount importance for the date selection and the determination of the output typology. During phases 1 and 2, in situ observations are required in order to perform algorithm inter-comparison, benchmarking and selection, and to validate the EO product prototypes. In situ observations are particularly needed to validate the dynamic cropland mask, the cultivated crop type & area and the vegetation status EO services. Validation of the cloud free composite service will be performed within the benchmarking exercise described in WP3300.

While contextual information about the test sites is partly available through the JECAM network, national agricultural statistics and FAO databases, direct interactions with the respective site managers are expected to provide the necessary complementary information. This dialogue will also be supported by the participation of all site managers in the first and third Users Workshops organised in the framework of Sen2-Agri (WP7300).

Different types of field data will also have to be collected, either from the already completed field campaigns corresponding to the available imagery (phase 1) or from specific field campaigns to be carried out in the course of Sen2-Agri (phase 1 to complement some site databases, phase 2). We summarize hereafter the in situ data requirements:
• Dynamic cropland mask
The expected information is a digital map representing the cropland delineation based on a locally adjusted definition, for every test site and for the period considered in Sen2-Agri. Crop acreage or crop area statistics at various aggregation levels are also of interest.
• Cultivated crop type and area
The expected information is a database of sampled fields (parcel level) under cultivation during the period considered in Sen2-Agri, at least for the four main crops or crop groups. The database may contain the geographical reference of a single point per field or, preferably, the coordinates of a polygon that delimits every field. Other attributes shall include as a minimum the date of the survey and the crop type. Additional information such as crop cultivars and farming practices (irrigated/rain-fed, tilling/no tilling, sowing date and sowing density, bio-farming) would be useful.
• Vegetation status indicator
The expected information is a database of sampled fields under cultivation during the period considered in Sen2-Agri. Every field shall be geo-referenced (point or polygon), and the date (day, hour) of sampling has to be provided. The vegetation status indicator should include measurements of at least one of the following quantitative biophysical variables, obtained along the growing season: height, leaf area index (LAI), fAPAR, percentage vegetation cover, above ground biomass. Ideally, vegetation status should be monitored every 1-2 weeks during the vegetation cycle.
Additional qualitative and quantitative information would be useful, for instance: dates of the main phenological stages, dates and volumes of irrigation if any, pest/disease attacks and their severity, and weather events impacting plant development (e.g. flooding, drought, crop lodging).

The availability of these in-situ data together with simultaneous time series of satellite data will be an important criterion for selecting the sites (WP2100). In order to serve as reference information for validation purposes, the in situ data collection must fulfil a set of important criteria: independent, repeatable, transparent, scientifically sound and acceptable for the users’ community.
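The minimum content of an in situ record described in this section (geographical reference, survey date, crop type, optional biophysical measurements) can be sketched as a small data structure with a completeness check. This is only an illustrative Python sketch: the class name, the field names and the variable keys are hypothetical and do not come from an existing Sen2-Agri schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional, Tuple

Coordinate = Tuple[float, float]  # (longitude, latitude); WGS84 is an assumption

# Biophysical variables listed in the requirements; key names are hypothetical.
STATUS_VARIABLES = {"height", "lai", "fapar", "vegetation_cover", "biomass"}


@dataclass
class FieldSample:
    site_id: str
    survey_time: datetime                      # date (day, hour) of the survey
    crop_type: str
    point: Optional[Coordinate] = None         # single point per field, or...
    polygon: Optional[List[Coordinate]] = None  # ...preferably the field outline
    attributes: Dict[str, str] = field(default_factory=dict)  # cultivar, practices
    status: Dict[str, float] = field(default_factory=dict)    # biophysical values

    def problems(self, require_status=False):
        """Return the list of missing or inconsistent items (empty if none)."""
        issues = []
        if self.point is None and not self.polygon:
            issues.append("no geographical reference (point or polygon)")
        if self.polygon and len(self.polygon) < 3:
            issues.append("polygon needs at least 3 vertices")
        if not self.crop_type:
            issues.append("missing crop type")
        unknown = set(self.status) - STATUS_VARIABLES
        if unknown:
            issues.append("unknown status variables: " + ", ".join(sorted(unknown)))
        if require_status and not (set(self.status) & STATUS_VARIABLES):
            issues.append("no vegetation status measurement")
        return issues
```

Calling `problems()` on each incoming record gives a simple first consistency check; `require_status=True` applies the stricter rule for vegetation status samples, which need at least one biophysical variable.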

4.2.2.2 Quality of the datasets

Three main components constrain the quality of a validation data set: 1) the sampling design in space and time: the number, size, geographical distribution and representativeness of the sample sets; 2) the design and feasibility of the measurement protocol, including among others the observation typology, the instruments and the positioning equipment; 3) the expertise, commitment and familiarity with the region of the people available to carry out the work. The challenge is always to apply the initial data collection plan while coping with local practical constraints, in particular with regard to accessibility and unpredictable events occurring in the field. A field campaign always has to be adjusted according to the agro-ecological region, the agricultural practices, the local accessibility conditions and the human, financial and technical resources available. A control procedure for the quality of the validation data set must also be included.

The collection of field data for calibration, i.e. as a training data set, can be much more flexible and possibly completed on an ad hoc basis. The main objective for this training data set is often to cover the most diverse situations.

Quality assessment of already acquired data is a difficult task, and we will mainly rely on information provided by site managers and on i) a detailed analysis of the available data (e.g. checks for consistency); ii) an analysis of the protocols and sampling design used; iii) a comparison with satellite time series.

When using previously acquired data, it might happen that some information is lacking but can be reconstructed. For instance, if a cropland mask was not produced over a site, we will get in touch with the site manager to define the way to produce it. Photointerpretation of high resolution satellite or airborne images, combined with field surveys and local expertise, is a way to obtain the missing information. Photointerpretation can be assisted by producing and using a preliminary land cover classification. The same kind of approach, performed by an expert (e.g. an agronomist), could be used to produce a higher number of crop type samples than initially available. Of course, samples obtained this way shall be flagged differently from samples obtained from ground surveys.

4.2.2.3 Protocols outline for in-situ data collection

When in-situ data are still to be acquired (e.g. because SPOT4-Take 5 does not fit the crop calendar), we will get in touch with site managers in order to learn about their current approaches and, if needed, to discuss the additional surveys we need to reach the Sen2-Agri objectives. As far as possible, the protocol of these surveys will be harmonized between the various sites.
• Dynamic cropland mask
A random sampling of every test site should in principle be necessary, for instance by conducting ground surveys at the nodes of a regular grid covering the whole test site. The number of samples depends on the acreage and on the complexity of the landscape. Even a sparse sampling grid of 1 km corresponds to 3600 samples when considering a single SPOT scene (60 km x 60 km). In order to produce accurate cropland masks that will serve for algorithm benchmarking, we will work in close collaboration with site managers to produce maps based on a combination of ground surveys, ancillary data if they exist (e.g. forest inventories), satellite data, classification algorithms and photo-interpretation.
• Crop type and area
Since one of the aims is to produce reference maps suitable for benchmarking the algorithms, a large number of samples is desirable. For the croplands and the main crop types, one field visit a year, or two for multiple cropping systems, is sufficient to capture the observation. The sampling strategy for both variables usually consists of long transects randomly selected along roads or tracks, where observations are recorded at regular intervals. To cover the whole gradient of cropping practices in the region and to capture the diversity of field conditions related to meteorological, soil or topographic effects, several long transects will have to cross the whole region of interest. However, a preliminary assessment should also check that there is no bias related to the proximity of the road network. Alternatively, two-stage sampling strategies, in which large primary units are randomly selected and secondary sampling units are then selected within them, avoid this risk of bias but are much more time consuming because of the accessibility constraints of such samples. Wherever parcel delineation is already available, the sample support may correspond to the parcel itself. When this is not available, the field observation should correspond to a spatial unit of at least 3x3 pixels. For simple parcel patterns, it is also expected that segmentation techniques will allow the field boundaries to be delineated automatically from the imagery. Different sets of equipment allow the croplands and crop types to be collected efficiently, including a laptop coupled with a GPS, which makes it possible to use the most recent imagery in the field. Preparing such a field data collection may be time consuming in advance, but it saves a lot of time on the ground. CESBIO also uses the ODK application, available for free on Android smartphones (http://www.cesbio.ups-tlse.fr/multitemp/?tag=odk and http://opendatakit.org/). This application allows an easy-to-use form to be built and filled in during the ground surveys. The date and geographic coordinates given by the GPS are recorded with the form, and pictures taken with the smartphone camera can be added. All the information is stored on the phone and pushed to a web server when an internet connection is available. This smartphone application allows the number of persons involved in field surveys to be increased, and might pave the way for some kind of crowdsourcing.
• Crop status
For the crop status, two options are of interest and could be selected according to the test site conditions. On the one hand, the biophysical variables can be measured for any parcel randomly selected in the landscape.
The alternative is to identify a set of well identified parcels covering the whole range of situations and to monitor them along the season. The latter is more efficient (well known sequences of parcels, no need for geolocation at each measurement, etc.), but it does not ensure strict independence between samples. Assuming the second option and the choice of some biophysical variables to describe the vegetation status, the work would proceed as follows, as already experienced in the GLOBAM and various FP7 projects:
i. to identify the parcel boundaries of 30 large representative fields as early as possible in the season (minimum 20 to 30 for each main crop type);
ii. to measure a biophysical variable using an instrument (for instance Leaf Area Index (LAI)), averaged at parcel level per crop, 4 times during the growing season (for instance 4 visits, one every 3 weeks after sowing), in order to validate the LAI estimation from satellite images; the LAI measurement could be replaced by a Green Canopy Cover Fraction measurement, certainly for the early stages of growth (LAI < 2) and possibly for the whole growing period;
iii. to measure the Green Canopy Cover Fraction averaged at parcel level for 30 large representative fields per crop, twice, before and after the 50 % senescent leaves stage;
iv. to control the quality of the field measurements by some redundant measurements whenever possible;
v. to collect the average temperature recorded by a synoptic meteorological station in order to plot the plant growth in degree.days.

The hardware equipment can be summarized as 2-m high poles mounted with a digital camera equipped with a fisheye lens (for green canopy cover fraction measurement, and LAI at early stages) or the LiCor LAI-2000.

Leaf Area Index: measurements using the LAI 2000 sensor (LICOR) include 1 (above canopy) + 4 (below canopy) along the field row direction and 1 (above canopy) + 4 (below canopy) perpendicular to the field row direction. For both directions, below canopy measurements should be distributed along a 50 m transect in a part of the field where the crop development seems representative of the whole field (excluding a buffer of 15 m from the edges, the tractor tracks or any other marginal field area); each revisit should try to locate the transect approximately in the same part of the field. In sparse vegetation structured in rows, like maize or millet, the LAI measurement by the LAI 2000 sensor should be replaced by the Green Canopy Cover Fraction. In very heterogeneous crop canopies, such as in semi-arid regions, up to 15 LAI-2000 measurements may be needed. Diffuse light is much better for the measurement; quality control and post-processing are required to estimate the LAI average and standard deviation.

Green Canopy Cover Fraction: 10 vertical photographs using a 3-m high portable mast equipped with a hemispherical lens digital camera triggered by a remote command system; the photographs must be distributed every 10 m along a transect of 100 m in a part of the field where the crop development seems representative of the whole field (excluding a buffer of 15 m from field borders, the tractor tracks or any other marginal field area). The horizontal arm of the mast is placed parallel to the crop sowing lines, in the direction of the sun, to avoid shadowing effects. Each revisit should try to locate the transect approximately in the same part of the field.
In very heterogeneous crop canopies, such as in semi-arid regions, up to 15 vertical hemispherical photographs may be needed. A dedicated image processing routine (using the CAN-EYE software) must be used to estimate the Green Canopy Cover Fraction average and standard deviation at the field level.

Height: the total height of the canopy is measured with a flat and light cardboard of 1 square meter. The cardboard is placed horizontally on top of the canopy and the distance between the soil and the cardboard is measured 4 times (once per corner or once per side). The mean of the 4 values represents a single height value. For each parcel, 10 height values must be taken.

If possible, the acquisition of georeferenced large-scale aerial photographs widely distributed over the site, either by drone or by light aircraft, allows the crop type to be identified and the Green Canopy Cover Fraction to be estimated, and can validate green LAI estimates obtained from satellite images (in particular for early growth stages, when the Green Canopy Cover Fraction can serve as a LAI proxy). The use of such aerial coverage is of particular interest for the large scale demonstration case studies because of its cost-benefit ratio.
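The numeric parts of the protocols in this section can be summarized in a few lines of Python. The sketch below is illustrative only and rests on assumptions that are not in the text: a 60 km x 60 km scene for the sampling grid, photograph positions starting at 10 m along the transect, and a base temperature of 0 °C for the degree.days (the appropriate base depends on the crop). All function names are hypothetical.

```python
from statistics import mean, stdev


def grid_node_count(width_km, height_km, spacing_km):
    """Number of nodes of a regular sampling grid, one node per grid cell."""
    return int(width_km // spacing_km) * int(height_km // spacing_km)


def transect_positions(length_m=100, step_m=10):
    """Positions (m) of photographs taken every step_m along a transect."""
    return list(range(step_m, length_m + 1, step_m))


def canopy_height(distances_m):
    """One height value = mean of the 4 soil-to-cardboard distances."""
    if len(distances_m) != 4:
        raise ValueError("expected 4 distance measurements")
    return mean(distances_m)


def parcel_summary(values):
    """Average and standard deviation of repeated measurements in a parcel."""
    return mean(values), (stdev(values) if len(values) > 1 else 0.0)


def growing_degree_days(daily_mean_temps_c, base_temp_c=0.0):
    """Cumulative degree.days: daily mean temperature above a base, summed."""
    total, cumulative = 0.0, []
    for temp in daily_mean_temps_c:
        total += max(0.0, temp - base_temp_c)
        cumulative.append(total)
    return cumulative
```

With these assumptions, `grid_node_count(60, 60, 1)` gives the 3600 samples quoted for a 1 km grid, `transect_positions()` gives the 10 photograph positions along the 100 m transect, and `parcel_summary` can be applied to the 10 height values collected per parcel.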

4.2.2.4 Management of in situ data

The management of in situ data is an important part of the task. In the frame of the Sen2-Agri project, our aim is for the data to be used by the programmed algorithms in an automatic way. This will allow fast benchmarking of the algorithms and will limit the possible sources of error when manipulating large amounts of data. CESBIO has already developed such a database management system to analyze satellite time series together with crop declarations gathered in the frame of the Common Agricultural Policy (tens of thousands of parcels per year, over several years). The solution is based on the open source software PostgreSQL and PostGIS, and can be interfaced with web-GIS (OGC standards). It allows, for example, the automatic clipping of images to parcel boundaries and, more generally, the use of GIS functions directly through SQL requests.
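To make the clipping operation concrete, here is a small pure-Python stand-in for what the database does on the SQL side: it keeps the pixels whose centre falls inside a parcel polygon and averages them. It is not the CESBIO implementation and does not use PostGIS; the function names, the image layout (list of rows, north-up) and the coordinate convention are assumptions made for the example.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon [(x0, y0), ...]?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge crosses the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def clip_parcel(image, polygon, origin=(0.0, 0.0), pixel_size=1.0):
    """Return the values of the pixels whose centre falls inside the parcel.

    image is a list of rows, row 0 being the northernmost one;
    origin is the map coordinate of the upper-left corner of the image.
    """
    x0, y0 = origin
    values = []
    for r, row in enumerate(image):
        yc = y0 - (r + 0.5) * pixel_size
        for c, value in enumerate(row):
            xc = x0 + (c + 0.5) * pixel_size
            if point_in_polygon(xc, yc, polygon):
                values.append(value)
    return values


def parcel_mean(image, polygon, origin=(0.0, 0.0), pixel_size=1.0):
    """Mean pixel value over the parcel, or None if no pixel is inside."""
    values = clip_parcel(image, polygon, origin, pixel_size)
    return sum(values) / len(values) if values else None
```

Applied to every parcel and every date of a time series, `parcel_mean` yields one temporal profile per parcel, which is the kind of output that can then be compared with the crop declarations.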

4.2.3 Collection of high spatial resolution optical time series (WP 2300)

Once the list of sites is finalised by the consortium and agreed by ESA, the next issue will consist in collecting the time series of high spatial resolution optical data requested to support the specification, the development and the qualification of the Sen2-Agri system planned in WP 3000. During the first step of this activity, high spatial resolution optical data available for each selected site and which might contribute to the corresponding Test Data Sets (TDS) will be identified. The target sensors will be: Spot 4 (data collected in the frame of the Take 5 experiment), Landsat 5, 7 or 8 and RapidEye. We do not recommend the use of SPOT4 (outside of Take5) or SPOT5 data, because of the difficulties and inaccuracies that would arise in the pre-processing due to the absence of a blue band, which could not be replaced by multi-temporal methods because of variations in viewing angles. Special attention will be paid to the datasets already gathered within the JECAM and GMFS projects. As UCL is very much in setting up the JECAM network, the interactions with site managers and the access to the data concerning the Sen2-Agri sites already monitored in this context will be quite straightforward. This exercise will result in data lists which will then be used to define satellite time series to be processed for the development of the Sen2-Agri prototype products during WP 4300. The main criteria to build these time series are to find image series:  Matching the requirements expressed in the URD, more particularly: o With regard to the crop season, or crop status and development for the main crops under consideration. 
The availability of data covering more than one year or one crop season will enforce the interest for the corresponding images; o With suitable resolution and revisit frequency in order to develop the requested products;  Consistent with the in-situ data collected in WP2200 in order to make the expected validation of prototype products easier and more relevant. Once these data lists are ready, the next step will consist in the actual acquisition of the defined satellite time series. Several options can be considered and will be implemented to get the selected data:  For the data collected over the JECAM sites (SPOT4-Take 5, Landsat 8 and RapidEye imagery): as UCL is responsible for one of the corresponding sites, the connexion with the different bodies supervising the JECAM activities for each site of interest will be

UCL\ELI-Geomatics Croix du Sud, 2 bte L7.05.16 B - 1348 Louvain-la-Neuve BELGIUM

Page 76 Sen2-Agri Ref. UCL – Geomatics – 2013 Part 2: Technical proposal Issue 0 Rev. 1 Date 2013/05/21

made easier. The consortium therefore plans to get detailed knowledge of and access to these data through UCL;
• For the SPOT4-Take 5 images other than those collected through JECAM: they will be retrieved directly either from ESA, for the experiment sites under its responsibility, or from CESBIO for the other sites, since the latter is the Principal Investigator for the experiment;
• For the data collected within the GMFS project: detailed information will be sought from ESA before the data are retrieved either from the GMFS team or through the ESA Third Party Mission (TPM) initiative;
• Landsat data will be retrieved directly from the USGS archive using the Earth Explorer catalogue;
• As RapidEye recently joined the ESA TPM, archive data from this sensor will be accessed through this channel.
Concerning all data derived from the different EO missions under the ESA TPM initiative, the consortium intends to submit a proposal as soon as the Sen2-Agri project starts, so that these data can be retrieved at limited or no cost. This proposal will be prepared within WP 2300 and submitted by CS as Principal Investigator. After delivery, the image time series will be visually checked to detect any problem and documented as required to become part of the TDS. They will then be transferred to the pre-processing unit managed by CESBIO within WP 2500.

4.2.4 Development of pre-processing chains (WP 2400)

In the following pages, we reuse the product denomination used by ESA for Sentinel-2. As a reminder:
• Level 1A is a basic product with only radiometric corrections applied and with auxiliary data enabling a well-informed user to produce Level 1C data.
• Level 1C is a mono-date ortho-rectified product expressed in Top of Atmosphere (TOA) reflectance (or in radiance for USGS LANDSAT products).
• Level 2A is a mono-date ortho-rectified product expressed in surface reflectance (after atmospheric correction). Although it is not specified in the S2–MRD, in our view the Level 2A product must be provided with all the masks a user needs, i.e. a cloud mask, a cloud shadow mask, a water mask and a snow mask.
• Level 3A is a multi-date cloud-free composite product expressed in surface reflectance. Typically, a Level 3A product could be produced every 2 or 4 weeks, using data spanning 4 to 6 weeks (periodicity and period length will be tuned in this project).
Examples of Level 1C, 2A, 2B products are provided in Figure 4-2.

Figure 4-2 : Examples of Level 1C, 2A, 2B products simulated with a time series of Formosat-2 data (CESBIO). The Level 3A is a cloud free composite of the images gathered during 15 days around the date.

Our team is very experienced with the pre-processing of optical image time series:
• Regarding Level 1C, CNES has developed multi-sensor tools for the automatic ortho-rectification of images with Ground Control Points (GCP). This experience has been re-used in the OTB library and applications, which are well known at CESBIO (Jordi Inglada is one of the creators of the library) and at CS-SI, who developed most of the library on behalf of CNES.
• Regarding Level 2A, CESBIO has spent 8 years on research, prototyping, development and validation to produce a Multisensor Atmospheric Correction and Cloud Screening processor named MACCS. This processor is the first one to use the multi-temporal dimension to complement the standard mono-date methods (thresholds in the blue band for cloud detection and the “Dark Dense Vegetation” method to estimate aerosols). The multi-temporal dimension provides additional information that enhances the accuracy and robustness of cloud detection and atmospheric correction. The MACCS method has already been tested on thousands of images, including 400 Formosat-2 images, more than 1000 Landsat images and 1000 SPOT4 (Take5) images. Given the complexity of nature, a systematic verification of all the outputs enabled us to discover several issues and to find workarounds that enhance performance and increase robustness. Most recently, we produced the atmospheric correction of all the SPOT4 (Take5) images, with a good overall quality, despite the absence of a blue band on that satellite. For the sake of concision, we do not detail the MACCS methods here, but more information may be found in the Take5 blog (for basic explanations) or in the following references:
o Clouds: http://www.cesbio.ups-tlse.fr/multitemp/?p=876
o Cloud shadows: http://www.cesbio.ups-tlse.fr/multitemp/?p=911
o Atmospheric correction: http://www.cesbio.ups-tlse.fr/multitemp/?p=1211
o Aerosols: http://www.cesbio.ups-tlse.fr/multitemp/?p=1710
• Level 3A generation is included in the Task 3 description.
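To illustrate how a multi-temporal test can complement a mono-date threshold in the blue band, a minimal Python sketch is given below. It is not the MACCS implementation: the function names and both threshold values are hypothetical and chosen only for illustration. A pixel is flagged as cloudy either because its blue reflectance is high in absolute terms, or because it has increased sharply since the most recent clear observation of the same pixel.

```python
from typing import List, Optional

# Hypothetical thresholds, for illustration only (not the MACCS values).
MONO_DATE_THRESHOLD = 0.25      # absolute blue reflectance above which a pixel is cloudy
MULTI_DATE_THRESHOLD = 0.03     # blue reflectance increase above which a pixel is cloudy


def is_cloudy(blue_now: float, blue_ref: Optional[float]) -> bool:
    """Combine a mono-date test with a multi-temporal test.

    blue_now: blue-band reflectance at the current date.
    blue_ref: blue-band reflectance of the latest clear observation,
              or None when no clear reference is available yet.
    """
    if blue_now > MONO_DATE_THRESHOLD:
        return True
    if blue_ref is not None and blue_now - blue_ref > MULTI_DATE_THRESHOLD:
        return True
    return False


def cloud_mask(blue_now: List[List[float]],
               blue_ref: List[List[Optional[float]]]) -> List[List[bool]]:
    """Apply the per-pixel test to two images stored as lists of rows."""
    return [
        [is_cloudy(now, ref) for now, ref in zip(row_now, row_ref)]
        for row_now, row_ref in zip(blue_now, blue_ref)
    ]


current = [[0.05, 0.12], [0.30, 0.06]]
reference = [[0.04, 0.05], [None, 0.06]]
mask = cloud_mask(current, reference)
```

In this example the pixel at 0.12 is below the mono-date threshold and is only detected through its increase with respect to the reference, which is the kind of thin cloud a mono-date threshold tends to miss.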

The MACCS software is available in 2 versions:
• A prototype version developed at CESBIO, which nevertheless takes operational constraints into account and is therefore able to process thousands of images. The prototype version is currently implemented in the MUSCATE processing center within the French Land Data Center (named PTSC from its French acronym). This processor has the advantage of being easily modified and updated, and can be used to test algorithm enhancements;
• An operational version, developed by CS-SI under a CNES contract and based on the OTB library. Because of contractual issues, the processor was not ready for the start of the PTSC, but these issues are now solved.
Both processors are regularly compared and checked to ensure the compatibility of their results. The operational version brings the ability to parallelize the processing, which makes it more suitable for a massive processing environment. MACCS will only be used for the TDS generation. It is thus not a deliverable and will not be distributed as open source software. CNES is currently not willing to release the MACCS software as open source software, although it may be open to negotiations with ESA. For the operational processing in phases 2 and 3, the S2PAD package will be used as indicated in the SoW, under the conditions defined by ESA. To our knowledge, the current performance of the S2PAD may be a concern, in particular for the aerosol correction. However, we assume at this stage that the surface reflectance accuracy on green sites should be good enough for land cover applications. As all Sentinel-2 data above France will be processed to Level 2A by the French Land Data Center (PTSC), it will be possible to compare the pre-processing and to implement a version of the EO products. As the PTSC production capacity is dimensioned for 10 times the surface of France, CNES will be open to discussions with ESA about including other test zones within the PTSC production for further comparison if needed.
Depending on the satellite, the pre-processing will have different starting points:
• SPOT4 (Take5) data will be delivered by CNES at Level 2A, so it should not be necessary to pre-process these data. As SPOT4 (Take5) data will be processed operationally by CNES with the same processing parameters for all sites, a specific tuning might enhance the data set for a few sites. In that case, CESBIO, who developed the MACCS processor used for SPOT4 (Take5) processing at CNES, could use its own processor to adapt the parameters to the local conditions. However, from the experience of the early SPOT4 (Take5) processing performed at CESBIO, this should not be necessary.
• For LANDSAT 7 and 8, the USGS delivers L1T products, equivalent to L1C products. Very recently, USGS announced the delivery of a surface reflectance product (equivalent to L2A) provided with a cloud mask, a cloud shadow mask and a snow mask. However, while the atmospheric correction is correct, the cloud mask and above all the shadow mask are less accurate, because the algorithm does not use the temporal dimension but only applies mono-date thresholds. We propose to use the MACCS prototype, available at CESBIO, to process the LANDSAT 7 and 8 data. The MACCS software has already processed hundreds of LANDSAT 5 and 7 images and its adaptation to LANDSAT 8 is straightforward (the use of the 1.38 µm spectral band is already implemented in the MACCS software). Figure 4-3 shows an example comparison of the cloud masks obtained by the MACCS and LEDAPS methods.

Figure 4-3: Comparison of cloud masks (circled in red) obtained from MACCS (left) and from LEDAPS (LANDSAT USGS cloud mask) (right). MACCS classifies nearly the whole image as cloudy, whereas LEDAPS classifies only 30% of the image as cloudy. The difference between both approaches should be reduced with Landsat 8, thanks to the 1.38 µm spectral band, which can easily detect high and thin clouds.

• RapidEye data can be ordered at Level 1A or Level 1C. If Level 1A is selected, the ortho-rectification will be performed using OTB, which implements an automatic, GCP-based ortho-rectification based on the refinement of a sensor model. This scheme has already been used successfully at CESBIO, in the framework of the Mynerve2 project, funded by Fredec Midi Pyrénées. Regarding Level 2A, a RapidEye plugin will be added to the MACCS operational version. Since RapidEye data are not acquired with constant viewing angles, the multi-temporal methods are not fully applicable:
o A cloud mask can be obtained (already tested with regular SPOT data), but its accuracy will not be as good as that of the MACCS cloud masks usually obtained with constant viewing angles. Given the low number of RapidEye images to process, and the fact that mostly clear images will be used, an interactive cloud detection will be used when necessary to refine the MACCS cloud masks. It will rely on a simple classification scheme based on a Support Vector Machine (SVM) with manual training (already tested with SPOT and available in OTB).
o For the aerosol estimates, we will not use the multi-temporal method and will rely only on our version of the DDV method extracted from the MACCS software, which uses the blue and red spectral bands to estimate the aerosol optical model. This method works well on agricultural regions.
o The RapidEye plugin will be added to MACCS by CS-SI; CESBIO will provide a parameter set adapted to RapidEye.

4.2.5 Preprocessing of high resolution optical time series (WP 2500)

As said above, the processing of SPOT4-Take5 data will be performed at CNES, by the French Land Data Center (PTSC). As the early processing was done at CESBIO, and since the same processors are used in the PTSC, we are confident about the data quality. However, if necessary, a few sites could be processed at CESBIO with specially tuned parameters (possibly EChina, which is affected by very high aerosol contents). The Landsat 7 and 8 time series will be processed at CESBIO, using the MACCS prototype. The RapidEye data sets will be processed at CS-SI, using OTB for ortho-rectification and the MACCS operational version to produce Level 2A data. Because the viewing angles may not be constant over RapidEye time series, some of the images will need an interactive refinement of their cloud and shadow masks, using a tool developed at CS-SI; this is why the processing of the RapidEye data sets will be done at CS-SI.

4.2.6 Validation of test data set (WP 2600)

An independent evaluation of the TDS generated in WP 2500 will be performed. More precisely, this assessment will focus on the following aspects:
• Evaluate the performance of the most critical pre-processing steps (including ortho-rectification and water, cloud and cloud shadow masking) for each sensor (LANDSAT, SPOT4 and RapidEye);
• Evaluate the spatial consistency of the HR optical time series for each date over each site;
• Evaluate the temporal consistency of the HR optical time series over each site.

The ortho-rectification will be validated by examining the registration quality of the images, using the automatic GCP extraction tools available in OTB or in CNES libraries. To assess the performance of the various masks, visual analysis has been found to be the most efficient procedure. The masks will be overlaid on the images, as illustrated in Figure 4-4, with the contours of the masks displayed in different colors.
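As an indication of how the registration quality could be summarised once matching points have been extracted, the short Python sketch below computes the per-point displacement and the root mean square error (RMSE) between reference GCP coordinates and the coordinates measured in the ortho-rectified image. The function names and the choice of RMSE as the summary statistic are assumptions made for illustration; the extraction of the points themselves (for instance with OTB) is not shown.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def registration_errors(reference_points: List[Point],
                        image_points: List[Point]) -> List[float]:
    """Euclidean displacement of each matched point (same units as the inputs)."""
    return [
        math.hypot(xi - xr, yi - yr)
        for (xr, yr), (xi, yi) in zip(reference_points, image_points)
    ]


def registration_rmse(reference_points: List[Point],
                      image_points: List[Point]) -> float:
    """Root mean square of the point displacements."""
    errors = registration_errors(reference_points, image_points)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Two matched points, coordinates in pixels (illustrative values).
reference = [(0.0, 0.0), (10.0, 10.0)]
measured = [(3.0, 4.0), (10.0, 10.0)]
errors = registration_errors(reference, measured)
rmse = registration_rmse(reference, measured)
```

Computed per image and per site, such a statistic would make it easy to spot the dates whose registration departs from the rest of the time series.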

Figure 4-4: This kind of display enables a quick evaluation of the various masks generated by our processors. The masks are outlined in different colours (green for clouds, black for cloud shadows, blue for water and pink for snow). Here, one of the SPOT4 (Take5) images processed with MACCS in a difficult case (presence of thin cirrus clouds (RD.168)).

From the radiometric point of view, the validation of surface reflectance products is already planned in the frame of the SPOT-4 Take 5 experiment. It will use the Crau calibration site (called ROSAS), located in the South East of France and operated by CNES. In short, this site is permanently equipped with CIMEL radiometers which simultaneously and automatically measure sky radiance and ground radiance for different viewing angles and in several spectral bands. SPOT-4 images have been acquired over this site every 5 days during the Take 5 experiment. The site will also be used for Sentinel-2 in-flight calibration and for the validation of atmospheric correction processing chains. It was used in the frame of the Venµs preparation to validate atmospheric correction algorithms applied to Formosat-2 image time series. Figure 4-5 presents the results obtained.

Figure 4-5: Comparison of surface reflectances derived from FORMOSAT-2 image time series over the Crau calibration site (small symbols connected by lines) with the surface reflectance provided by the ROSAS measuring station (big points).

We consider that, since CESBIO and CNES will perform surface reflectance validation for Take 5 and Sentinel data, this task is not part of the Sen2-Agri project. In addition, the precision and the relative radiometric accuracy in space and over time are considered to be much more critical than the absolute accuracy, since any classification algorithm dealing with reflectance time series relies on temporal consistency and proceeds by relative statistical comparison or similarity analysis in space. In other words, what matters most is the possibility to work with surface reflectance products which are spatially and temporally consistent. Regarding the combined use of Landsat 8, SPOT-4 and then Sentinel-2 data, CESBIO is collaborating with NASA (Vermote’s team) on the comparison of cloud masks, atmospheric and BRDF effect corrections, and biophysical variable retrievals (RD.213). The pre-processing chain of the Sen2-Agri project will clearly benefit from this collaboration. The spatial consistency of the TDS will be visually assessed through a comparison with the WELD Landsat products. Finally, it is planned to evaluate the temporal consistency through a comparison with expected LAI temporal trajectories derived from a Canopy Structure Dynamic Model. The dynamics of the canopy structure reflect the natural crop growth processes, which result in a relatively smooth and typical temporal profile of LAI (Koetz et al. 2007).

4.3 Task 3 - EO Products specification and algorithm design

The main objectives of this task will be, first, to provide a technical answer to the user requirements by detailing the technical specifications of the four S2-Agri EO products (dynamic cropland masks, cultivated crop type and area extent, vegetation status indicator and composites of cloud-free surface reflectance).

Second, the design and specification team will provide a technical answer to the user requirements on how to deliver the four S2-Agri EO products to the end users (data formats, metadata, data access, service access). To fulfil these objectives, Task 3 is divided into 5 sub-tasks, described hereafter:
• Products and system specifications (Section 4.3.1);
• Candidate algorithm selection based on literature and best practices review (Section 4.3.2);
• Algorithms inter-comparison, benchmarking and selection (Section 4.3.3);
• Operational system design (Section 4.3.4);
• Operational system validation protocol development (Section 4.3.5).

4.3.1 Products and system specifications (WP3100)

The definition of the technical specifications of the system and of its output products is a key point for the entire system design. These two aspects are the main components of the Technical Specification (TS) document defined by the ECSS standard. More precisely, the system specification will be described in the System Requirements Specification (SRS) document. The output product specification and the interface of each software component will be described in a Product Specification Document (PSD). This last document should respect the guidelines defined in RD-15 and RD-14 for Sentinel-2 L1C and L2A products. This task strongly depends on the quality of the URD: the more detailed and clear it is, the easier it will be to produce the technical specification. If necessary, clarifications could be requested from the end users. The open issues also need to be identified in order to produce the best technical specifications.

4.3.1.1 Preliminary System Specifications

The preliminary system specification provides a system overview which lists the different considerations applicable to the system and the different types of requirements. In the following, we define these points for the Sentinel-2 Agriculture project and try to provide preliminary answers. The main environmental considerations related to the targeted system are:
• The Sentinel-2 Agriculture processing server will be a dedicated hosted system, in order to have the best Internet bandwidth to retrieve input L1C data as soon as possible after their processing and to deliver products to end users over the Internet in the best conditions.
• The Sentinel-2 Agriculture processing server needs a strong hardware configuration to process large data volumes within a reasonable time. After analysis of the system budget, we think the following platform is necessary: Intel Xeon Sandy Bridge E5-2643 3.3 GHz, 16 threads available, 64 GB of RAM, 12 TB of SAS disk and an SSD disk for the system.
• The operating system of the processing server will be Linux/Debian in its latest release, to provide a stable and well-supported Linux distribution. It will be the targeted operating system but, as our solution uses multi-platform software components, it will be quite easy to deploy the system on other Linux flavours.
The main relations of the system with external systems are related to two SoW requirements:
• A fully automatic system from S2 L1C data;
• The integration of the Sen2-Agri components as plug-ins into the future Sentinel-2 ToolBox.
To provide a fully automatic system, we propose to use a simple orchestrator based on the data-driven model. The automatic download of L1C products should be managed by the NgEO downloader designed by ESA to provide automatic download from the S2 production facilities based on a predefined spatial area. When a new L1C product is detected in the input directory of the system, the orchestrator will launch the different processing steps on this product. The first one is the L2A processing. If all the data of the composite period are available, the orchestrator will launch the composite processing. Then, for each product generation component, the orchestrator will launch the processing if all required inputs are available. This mechanism will be specified in the TS document.
Concerning the integration of the software components into the future Sentinel-2 ToolBox, we propose to use the application framework of OTB, which provides for each application a library that can be connected to various languages such as Python or Java via a specific launcher. This launcher interprets the different parameters exposed by the small library, which can be used in the targeted language. The main constraint of the Sen2-Agri system will be the use of open source components only, as described in the SoW. The requirements defined in the SRS are mainly related to the user requirements (UR) or derived from the SoW. The list of URs of the SoW will be used to initiate this section. These requirements should define the expected performance as expressed by users.
CS-SI has already dealt with this type of document, for example for the Sentinel-2 IPF or in the Sentinel-2 MPC project. Concerning the SRS, the main goal of this task is to translate the user requirements into a set of requirements which will guide the design and implementation phases.
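As an illustration of the data-driven principle proposed for the orchestrator in this section, a minimal Python sketch is given below. It is only a sketch under stated assumptions: the class names, the step names and the dependency of the cropland mask on the composite are hypothetical, and the actual mechanism will be specified in the TS document. Each step declares the data items it needs; whenever a new item arrives, every step whose inputs are all available is launched, and its output may in turn trigger further steps.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Step:
    """One processing component: it runs once all its inputs are available."""
    name: str
    inputs: FrozenSet[str]
    output: str


@dataclass
class Orchestrator:
    steps: List[Step]
    available: Set[str] = field(default_factory=set)
    log: List[str] = field(default_factory=list)

    def notify(self, item: str) -> None:
        """Register a new data item and launch every step it unlocks."""
        self.available.add(item)
        launched = True
        while launched:  # repeat until no further step can start
            launched = False
            for step in self.steps:
                ready = step.inputs <= self.available
                done = step.output in self.available
                if ready and not done:
                    self.log.append(step.name)  # placeholder for the real processing
                    self.available.add(step.output)
                    launched = True


# Hypothetical chain: two acquisition dates in one compositing period.
steps = [
    Step("L2A date 1", frozenset({"L1C_d1"}), "L2A_d1"),
    Step("L2A date 2", frozenset({"L1C_d2"}), "L2A_d2"),
    Step("composite", frozenset({"L2A_d1", "L2A_d2"}), "L3A"),
    Step("cropland mask", frozenset({"L3A"}), "cropland_mask"),
]
orchestrator = Orchestrator(steps)
orchestrator.notify("L1C_d1")   # only the first L2A processing can run
orchestrator.notify("L1C_d2")   # unlocks the second L2A, the composite and the mask
print(orchestrator.log)
```

The design choice illustrated here is that no step is scheduled explicitly: the arrival of data is the only trigger, which is what makes the system fully automatic from the L1C products onwards.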

4.3.1.2 Interface definition

The output product specification is based on the user requirements and on the product specifications of the Sentinel-2 L1C and L2A products. We therefore propose to follow the product architecture of the previous levels. The general architecture of the Sen2-Agri user products will consist of:
• A product metadata file;
• A granule folder which contains the tiles composing the product;
• A datastrip folder which contains the datastrips linked to the selected tiles;
• An auxiliary data folder which provides all the GIPP files used during the processing;
• A preview image of the product at low resolution (320 m).
All the raster data and masks are provided in each tile folder of the main granule folder. The names of each user product and of its elements follow the naming conventions set out in the PSD document [RD-7]. A specific code is used to identify Sen2-Agri names. New GIP names are defined for each component of the Sen2-Agri system and list all the parameters used. Following the user requirements already identified, the output format of the raster data could be changed to a more standard format such as GeoTIFF. The HDF5 format could also be of interest to handle raster data with different spatial resolutions. The use of the JPEG2000 format will depend on the existence of a stable and optimized open source solution for JPEG2000 decoding, such as OpenJPEG. For all metadata, the DIMAP format will mainly be used, because it is widely used by several products and by some ESA software such as BEAM. For the preview image, the JPEG-LS format is an interesting alternative to JPEG2000. The interfaces of the software components will be defined in relation with the algorithm design phase. They will be expressed at a sufficient level of detail to allow the best implementation. All these documents will be provided for the PM1 meeting and should be updated for the PDR meeting according to the RIDs made by ESA.

4.3.2 Candidate algorithm selection based on literature and best practices review (WP3200)

The main objective of WP3200 is to review the existing approaches using satellite image time series (SITS) that are suitable for the operational production of:
• cloud-free composites;
• cropland maps (crop dynamics, crop types, crop area, etc.);
• vegetation status (vegetation indices, LAI, fAPAR).
The groundwork for this review has been laid while preparing this proposal but, for the sake of conciseness, the bulk of the text has been moved to section 9, to which the reader is invited to refer for further details. The required inputs for this task are the Technical Specifications document (TS) and the documentation of the algorithms for Sentinel-2 atmospheric correction, compositing and readers [RD.7]. This task will produce a contribution to the Design Justification File (DJF), summarizing the algorithm review by selecting a minimum of 5 algorithms or strategies (see below for this distinction) for each EO product defined in the TS, based on performance criteria and on their adaptability to Sentinel-2. The interface between this WP and WP3300 will be made easier by the fact that most algorithms of the literature are available in OTB, so that it will be possible to make an informed selection before the actual benchmarks implemented in WP3300. For instance, the following classifiers are readily available:
• Support Vector Machines;
• Decision Trees;
• AdaBoost;
• Gradient Boosted Trees;
• Multi-layer Perceptron Neural Networks;
• Normal Bayes;
• Random Forests;
• K-Nearest Neighbours.
They can be used through a generic set of command-line applications (Training, Validation and Classification). This allows a first-guess quantitative comparison for the algorithm selection, which will be more pertinent than a selection based on the literature review only. This comparison does not replace the benchmarks implemented in task 3.3, but it has given very good results in the development of a land-cover map production system. This system has been developed at CESBIO in the frame of the preparation of the exploitation of Sentinel-2 data and is described in 9.2.1.1.

4.3.2.1 Algorithm selection

4.3.2.1.1 Algorithm selection rationale

The literature overview presented in section 9 shows that there are endless possible choices when combining feature extraction, image segmentation techniques and classification algorithms in order to design a land-cover mapping processing chain. Before implementing benchmarks, it is therefore necessary to perform a preliminary selection of the algorithms based on theoretical choices. We propose the following criteria for this preliminary selection:
1. Theoretical performances derived from the literature review;
2. Existence of a validated implementation of the algorithm;
3. Scalability of the algorithm.

• Performances from the literature
This is of course the main criterion for the selection of an algorithm. The 2 major aspects to be taken into account are:
• quality of the results:
o smoothness of images for the composites;
o kappa index or overall accuracy for classification techniques;
o estimation error (bias, SME) for vegetation status parameters;
• computational cost.
However, other aspects which may be difficult to quantify have to be taken into account, such as simplicity, genericity (is the algorithm specific to a particular kind of data or thematic field, or is it robust enough to work under various conditions?), the degree of supervision needed for operation and the amount of in-situ data needed for training.
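For reference, the quality indicators listed above can be computed as in the following Python sketch. The overall accuracy and the kappa index are derived from a confusion matrix (rows: reference classes, columns: predicted classes); for the estimation error, the sketch reports the bias and the root mean square error, the latter being used here as an assumed stand-in for the squared-error measure. Function names and sample values are illustrative only.

```python
import math
from typing import List, Sequence, Tuple


def overall_accuracy(confusion: List[List[int]]) -> float:
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def kappa_index(confusion: List[List[int]]) -> float:
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = overall_accuracy(confusion)
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)  # row total * column total
        for i in range(n)
    ) / (total * total)
    return (observed - expected) / (1.0 - expected)


def bias_and_rmse(estimated: Sequence[float],
                  reference: Sequence[float]) -> Tuple[float, float]:
    """Mean error and root mean square error of an estimated variable (e.g. LAI)."""
    errors = [e - r for e, r in zip(estimated, reference)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(x * x for x in errors) / len(errors))
    return bias, rmse


# Illustrative two-class confusion matrix (crop / non-crop), 100 samples.
confusion = [[40, 10],
             [5, 45]]
accuracy = overall_accuracy(confusion)
kappa = kappa_index(confusion)
bias, rmse = bias_and_rmse([1.0, 2.0, 4.0], [1.0, 1.0, 2.0])
```

With this matrix the overall accuracy is 0.85 while kappa is 0.70, which shows why both figures are worth reporting when class proportions make chance agreement non-negligible.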

• Existing implementations
Regarding the generation of land cover maps, which is in our view the most challenging task, our proposal is strongly based on the use of the ORFEO Toolbox (OTB) library, which contains validated implementations of hundreds of image processing algorithms. The existence of an algorithm in the library is an important advantage, since it reduces the implementation costs and the risks associated with its validation. OTB also allows an easy integration of existing implementations of algorithms. As an example, the full machine learning module of the OpenCV open source software has been integrated in OTB. If an algorithm is not present in OTB but has an open source implementation available, it will therefore be possible to integrate it without much effort.

• Scalability
The scalability of an algorithm refers to its ability to be implemented efficiently for the operational processing of large amounts of data. The two dimensions in which an algorithm has to be scalable are space (amount of memory required) and time (computational complexity). The classical approaches used to scale up an algorithm are multi-threading (parallelisation) for the time constraints and streaming (sequential block processing) for the space constraints. Simply put, if an algorithm is able to process the data by splitting them into independent blocks, both multi-threading and streaming can be used. However, not every algorithm is suitable for this kind of chunk-based processing (many image segmentation or supervised learning algorithms are not). In such cases, approximate solutions can usually be implemented, for instance polygon stitching after a tile-based image segmentation procedure. OTB implements multi-threading and streaming natively for most of its algorithms, since image splitting and mosaicking are integrated into its pipeline architecture. Furthermore, this pipeline architecture implements on-demand processing, which allows only the required parts of the processing chain to be run. Therefore, OTB is HPC-ready, and even computing on GPUs has been demonstrated, although this requires special-purpose hardware which, while not particularly expensive, may not be available to some champion users. Finally, for large-scale Object-Based Image Analysis, OTB can interface with spatially-enabled relational database systems (i.e. PostGIS), and application results have been demonstrated (RD.169).
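The block-based principle behind streaming and multi-threading can be sketched in a few lines of Python. This is a generic illustration, not OTB code: the function names are hypothetical, the image is a plain list of rows of (red, near-infrared) pairs, and NDVI is used only as an example of a per-pixel operation whose blocks are independent. Python threads are used to show the pattern rather than to obtain an actual speed-up.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Pixel = Tuple[float, float]          # (red, near-infrared) reflectances
Image = List[List[Pixel]]


def split_rows(image: Image, block_rows: int) -> List[Image]:
    """Split the image into blocks of at most block_rows rows (streaming unit)."""
    return [image[i:i + block_rows] for i in range(0, len(image), block_rows)]


def ndvi_block(block: Image) -> List[List[float]]:
    """Per-pixel NDVI; each block can be processed independently of the others."""
    return [
        [(nir - red) / (nir + red) if (nir + red) != 0 else 0.0 for red, nir in row]
        for row in block
    ]


def process_by_blocks(image: Image, block_rows: int = 2, workers: int = 4) -> List[List[float]]:
    """Process the blocks in a thread pool, then mosaic the results in order."""
    blocks = split_rows(image, block_rows)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(ndvi_block, blocks))   # map preserves block order
    return [row for block in results for row in block]


# A 5-row, 2-column image; the last block is smaller than the others.
image = [[(0.1, 0.5), (0.2, 0.2)] for _ in range(5)]
streamed = process_by_blocks(image)
whole = ndvi_block(image)
```

Because the operation is purely local, the mosaicked result is identical to processing the whole image at once; algorithms without this property, such as segmentation, need the approximate stitching solutions mentioned above.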

4.3.2.1.2 Strategies vs algorithms

The SoW requires that at least 5 different algorithms be proposed and compared for each of the defined EO products. Since the term algorithm can be applied to different levels of granularity in the context of a processing chain, we propose some terminology for the following paragraphs. For instance, a land cover map production processing chain can be schematised as being composed of the following building blocks (Figure 4-6):
• Input data selection
• Feature extraction and selection
• Estimation/decision
• Post-processing and fusion

Figure 4-6: Block diagram for a land cover map production system

It is therefore a generalised version of the diagram presented in Figure 4-7. For each of these blocks, different approaches can be used. These approaches are implemented in the form of algorithms. A combination of specific algorithms for each step, chosen in order to instantiate a processing chain, is called a strategy. In this context, we understand that at least 5 different strategies have to be compared. Figure 4-7 illustrates several choices for the algorithms involved in the land-cover map production. An example of a simple strategy could be:
1) no segmentation (pixel-based approach);
2) only reflectance values used as features;
3) SVM classification;
4) no fusion applied, since a single classification is used.
A more sophisticated strategy would be:
1) Mean-Shift segmentation (object-based approach);
2) Several sets of features:
o TOA reflectance values;
o Spectral indices (NDVI, NDWI);
o Temporal features (onset and senescence dates for vegetation);
3) SVM and Random Forests for the classification;
4) Dempster-Shafer fusion of the results of the classifiers.
All the choices for the algorithms listed in Figure 4-6 are already implemented in OTB.
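The distinction between algorithms and strategies can be made concrete with the short Python sketch below, in which a strategy is simply one choice of algorithm per building block. The data structure, the field names and the rule that fusion is applied only when several classifiers are combined are illustrative assumptions; the two examples reproduce the simple and the more sophisticated strategies described above.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Strategy:
    """One algorithm choice per building block of the processing chain."""
    segmentation: str
    features: Tuple[str, ...]
    classifiers: Tuple[str, ...]
    fusion: str


simple = Strategy(
    segmentation="none (pixel-based)",
    features=("reflectance",),
    classifiers=("SVM",),
    fusion="none",
)

sophisticated = Strategy(
    segmentation="Mean-Shift",
    features=("TOA reflectance", "spectral indices", "temporal features"),
    classifiers=("SVM", "Random Forests"),
    fusion="Dempster-Shafer",
)


def enumerate_strategies(segmentations: Sequence[str],
                         feature_sets: Sequence[Tuple[str, ...]],
                         classifier_sets: Sequence[Tuple[str, ...]]) -> List[Strategy]:
    """Build every combination; fusion is only needed with several classifiers."""
    strategies = []
    for seg, feats, clfs in product(segmentations, feature_sets, classifier_sets):
        fusion = "Dempster-Shafer" if len(clfs) > 1 else "none"
        strategies.append(Strategy(seg, feats, clfs, fusion))
    return strategies


candidates = enumerate_strategies(
    segmentations=[simple.segmentation, sophisticated.segmentation],
    feature_sets=[simple.features, sophisticated.features],
    classifier_sets=[simple.classifiers, sophisticated.classifiers],
)
```

Even with two options per block this yields 8 candidate strategies, which illustrates why a preliminary selection is needed before the benchmarks.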

Figure 4-7: Choices of algorithms leading to strategy comparisons

4.3.2.2 Products and performances

Our extensive experience already allows us to propose a detailed description of possible choices and implementations for the target products. This is done in section 9. In this section we give a quick overview of the approaches that will be reviewed in WP3200.

4.3.2.2.1 Cloud-free composites

Cloud-free composites of surface reflectance (also called Level 3A products) may be useful for several reasons:
- they cover surfaces larger than that of a single satellite image;
- they can be provided at the same date every year and do not depend on a cloud-free acquisition date;
- they enable a data volume reduction compared to the Level 2A products (but they also represent a data loss compared to Level 2A);
- many algorithms for classification or segmentation do not easily handle the presence of data gaps in the time series.

A detailed analysis is presented in section 9.1. It is based on the experience gained at CESBIO during the development and exploitation of our Level 3 processing chain in the framework of the Venµs project. The main steps of our strategy for producing a cloud-free composite with our chain are:
- The compositing itself, which aims at choosing the most suitable reflectance value for each pixel of the composite. Several compositing methods can be tested.
- The gap-filling task, which provides reflectance values for the pixels for which no clear observation was available. Here again, several ways of filling the gaps can be tested.

The choice of the compositing period is also an important issue that has been analysed in the detailed description of the algorithms (section 9.1.2.3).
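The two steps above, compositing and gap-filling, can be sketched for a single pixel as follows. This is only an illustration under stated assumptions: the median of the clear observations is used as the compositing rule and linear interpolation in time as the gap-filling rule, both being one possible choice among the several methods to be tested, not the method of the CESBIO chain. Function names and reflectance values are hypothetical.

```python
from statistics import median


def composite(observations):
    """Median of the clear observations of one pixel in one period.

    `observations` holds reflectances, with None for cloudy dates.
    Returns None when no clear observation exists (a gap).
    """
    clear = [v for v in observations if v is not None]
    return median(clear) if clear else None


def fill_gaps(series):
    """Linearly interpolate None values between valid neighbours in time.

    Leading or trailing gaps take the nearest valid value; a series with
    no valid value at all is returned unchanged.
    """
    valid = [i for i, v in enumerate(series) if v is not None]
    if not valid:
        return list(series)
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        before = [j for j in valid if j < i]
        after = [j for j in valid if j > i]
        if before and after:
            a, b = before[-1], after[0]
            weight = (i - a) / (b - a)
            filled[i] = series[a] + weight * (series[b] - series[a])
        else:
            nearest = before[-1] if before else after[0]
            filled[i] = series[nearest]
    return filled


# Three compositing periods for one pixel; the second one is fully cloudy.
periods = [[0.10, None, 0.12], [None, None], [0.20, 0.22, None]]
composites = [composite(p) for p in periods]
filled = fill_gaps(composites)
```

In this toy series the second period has no clear observation, so its composite value is a gap that the second step fills from the neighbouring periods.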

4.3.2.2.2 Agricultural products

Three agricultural products will be generated by the processing chains:
1) Dynamic cropland mask: binary mask resulting from the detection of the cropland areas;
2) Cultivated crop type and area: a multi-class map with a very strict error rate constraint;
3) Vegetation status: biophysical parameters, such as FAPAR or LAI.

The first two will be produced using supervised classification. The third could be generated either by using empirical relationships between reflectance values (or vegetation indices like NDVI) and the target variables, or by the inversion of models like PROSAIL.

The processing chains for the dynamic cropland mask and the crop type map will most likely be based on the fusion of several classifiers, each of which may operate on a different set of features. Indeed, recent literature shows that this kind of approach performs better. Nevertheless, since all of the state-of-the-art techniques are available in OTB, many different combinations of algorithms will be compared.

One final point which is important to note is the high spatial resolution of Sentinel-2 images, which will allow the use of object-based approaches for the land-cover map production. These approaches rely on robust and scalable image segmentation methods, which are available in OTB. Again, a detailed explanation of the different approaches is given in section 9.
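The idea of fusing several classifiers can be sketched with the simplest possible rule, a per-pixel majority vote. This is a deliberately simplified stand-in chosen for illustration: it is not the Dempster-Shafer fusion mentioned elsewhere in this proposal, and the classifier outputs below are made-up labels.

```python
from collections import Counter


def fuse_by_majority(predictions):
    """Fuse per-pixel labels from several classifiers by majority vote.

    `predictions` is a list of label sequences, one per classifier, all of
    the same length. Ties go to the earliest classifier holding a top label.
    """
    fused = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        best = max(counts.values())
        # Preserve classifier order when breaking ties.
        fused.append(next(v for v in votes if counts[v] == best))
    return fused


# Hypothetical outputs of three classifiers on four pixels.
svm = ["crop", "crop", "no-crop", "crop"]
random_forest = ["crop", "no-crop", "no-crop", "no-crop"]
third = ["no-crop", "no-crop", "no-crop", "crop"]
fused = fuse_by_majority([svm, random_forest, third])
```

Each classifier could be trained on a different feature set (reflectances, spectral indices, temporal features); the fusion step only needs their per-pixel decisions.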

4.3.2.2.3 Performances

The land cover products have very strict constraints in terms of error rate and resolution. Error rate and resolution are of course related since, for the same input image resolution, lowering the target resolution of the map allows better error rates, because regularization schemes can be implemented. Error rates between 5% and 10% are close to the best that can nowadays be achieved in operational systems. Of course, the remote sensing image processing literature often presents better results, but they are only achieved on selected data sets and are rarely validated across different sites and landscape types. Therefore, it will be very difficult to achieve the expected error rates at the target resolutions (this is identified as a risk), and workarounds and improvements will have to be developed. Among those, we can already list the following ones:
- use a large amount of ancillary data and knowledge about previous agricultural campaigns;
- increase the number of training samples by using Active Learning and semi-supervised approaches;
- increase the temporal sampling by using the synergy with MR imagery (Proba-V and Sentinel-2).

All these approaches are reviewed in detail in section 9.

4.3.3 Algorithms inter-comparison, benchmarking and selection (WP3300)

This activity aims at benchmarking and comparing algorithms or strategies in order to select the “best” ones, i.e. those fulfilling the user requirements and the product specifications to the maximum extent. The activity should also improve the understanding of algorithm performance by enabling inter-comparison. The selection of the “best” algorithms or strategies follows algorithm inter-comparison and product assessment exercises that pay critical attention to the user focus. To this end, algorithm performance and product quality are evaluated against a list of objective criteria.

The benchmarking exercise will be dedicated to testing the algorithms previously identified from the state of the art in the literature (WP 3200 - section 4.3.2). A minimum of 5 algorithms must have been selected for each EO product, namely the cloud-free composites, the dynamic cropland mask, the cultivated crop type and area map and the vegetation status map. For each product, the benchmarking exercise can be viewed as composed of:
- an input dataset, which can be either the TDS for the compositing module or, in the other cases, the multi-date cloud-free composites;
- a set of alternative processing algorithms;
- different output products to compare;
- a methodology for comparison (e.g. graphical visualisation methods and/or statistical analyses), along with an appropriate validation dataset made of in-situ data.

This exercise will be documented in the Design Justification File, which shall describe all important design choice justifications, trade-offs and feasibility analyses, as well as the supporting technical assessment (test procedure, results analysis and evaluation) showing that the products meet the requirements. The articulation of this benchmarking exercise within the project and the proposed approach are presented in detail in section 10.
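The four ingredients of the benchmarking exercise (input dataset, alternative algorithms, output products, comparison methodology with a validation dataset) can be sketched as a small harness. This is a minimal illustration, assuming overall accuracy as the single comparison criterion; the candidate "algorithms" are trivial thresholds on made-up values, and all names are hypothetical.

```python
def overall_accuracy(predicted, reference):
    """Fraction of validation samples whose label matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


def benchmark(algorithms, inputs, reference):
    """Run every candidate on the same inputs and rank by accuracy.

    `algorithms` maps a name to a callable taking the input samples and
    returning one label per sample.
    """
    scores = {name: overall_accuracy(run(inputs), reference)
              for name, run in algorithms.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: one index value per sample and in-situ labels (illustrative only).
inputs = [0.1, 0.4, 0.6, 0.8, 0.3]
reference = ["no-crop", "crop", "crop", "crop", "no-crop"]
algorithms = {
    "threshold_0.35": lambda xs: ["crop" if x > 0.35 else "no-crop" for x in xs],
    "threshold_0.50": lambda xs: ["crop" if x > 0.50 else "no-crop" for x in xs],
    "always_crop": lambda xs: ["crop" for _ in xs],
}
ranking = benchmark(algorithms, inputs, reference)
```

In practice the list of objective criteria would contain more than one score per algorithm, but the structure (same inputs, same reference, one ranking per criterion) stays the same.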

4.3.4 Operational system design (WP3400)

The main objectives of this work package are (1) to provide a detailed description and technical justification of the system architecture and (2) to collect and aggregate in one document all the inputs provided by the preliminary studies about processing chain algorithms and choices. The tasks below will be carried out sequentially during this activity:
- Review input documentation and upgrade the assessment of the S2-Agri system presented at bid time;
- Update documentation about the S2-ToolBox and the Sentinel-2 product data format;
- Raise any inconsistency or discrepancy found in the reviewed documentation. Typical problems that may be identified are inconsistencies, non-quantifiable requirements, missing information, over-specification, etc.;
- Contribute to the D.4 - Design Justification File (DJF) about the different design options for processing chains which can handle huge data volumes;
- Present the result of all significant trade-offs, feasibility analyses, technical decisions and supporting technical assessments made to cope with all S2-Agri requirements;
- Analyse the S2 ToolBox architecture and provide a technical solution to integrate the Sen2-Agri system;
- Collaborate with the S2-ToolBox to define the best solution;
- Trace all the relevant information and choices which led to the Sen2-Agri solution;
- Provide a description of the software architectural design and the software detailed design to give all practical specifications to the system development task;
- Attend the Critical Design Review (CDR) meeting;
- Answer the RIDs raised during the CDR meeting by the Review Board and update the documentation according to CDR meeting issues.

The purpose of this activity is to produce and refine the top-level architectural design of the software product, i.e. the top-level structure and software components meeting the software requirements. A top-level design for external interfaces (i.e. to other software pieces or systems) and internal interfaces (i.e. between software components of the software product) is identified. These activities apply to the design of the entire Sentinel-2 Agriculture processing chains, even if the algorithms are analysed in parallel with this activity (WP 3200 and 3300 - see sections 4.3.2 and 4.3.3). Indeed, the definition of the processing chain will not impact the design but only the core of the process. The design phase will be divided into two sub-phases, the preliminary design and the detailed design.
All tasks linked to this activity are described and presented with more details in the Software development plan (see section 0: Appendix A of this proposal).

4.3.5 Operational system validation protocol development (WP3500)

A vital aspect of the development of any system is to check that it fulfils its purpose with respect to potential users. Validation activities will ensure that all requirements are taken into account by the technical work packages. Furthermore, as requirements might evolve, so will the validation, to ensure that the developed system is complete and consistent, and that it does what it is supposed to do in its intended environment. Validation determines the correctness and completeness of the end product, and ensures that the system will satisfy the actual needs of the users. These concepts are applied not only to the entire system, but also to all its components.

The software validation process will consist of: validation process implementation, validation activities with respect to the technical specification, and validation activities with respect to user requirements. The software verification process will consist of: verification process implementation and verification activities.

The Software Validation and Verification Plan will introduce the set of methods and techniques that will be carried out in the context of the project, as well as the tools necessary to perform the validation and verification tasks and a clear regression test strategy. According to the project vision and its expected validation activities, this plan will divide the complexity of the processes into accessible perspectives, addressing specific problems from different points of view.

The Software Validation and Verification Specification document will define the set of test cases and procedures for all user requirements and those defined by the technical specification:
- The mission data and user scenarios are a priority in the testing activities, next to ensuring a successful performance of the software in an operational and non-intrusive environment. The validation is completed with tests regarding stress, boundary and singular inputs, protocols, timing, and also the software’s ability to isolate and reduce the effect of errors. The validation tests will be “black box”, through test campaigns comprising mainly test procedures, with additional analyses, inspections and demonstrations where test procedures cannot be created.
- The verification of user requirements will ensure that they are consistent and verifiable, providing a clear description of the environment in which the software operates, of the characteristics of all external systems, and of the fault detection, identification and recovery strategy to be implemented, specifying data configurations, memory and CPU allocations and defining clear operational scenarios.
- The verification of the technical specification will ensure consistent and verifiable software requirements and interfaces, complete traceability between system requirements and software requirements (with justifications where traceability is not possible), correctly identified implementation constraints, hardware environment constraints and software requirements related to safety, security and criticality, and a feasible software design and maintenance.
- The verification of the software architecture and detailed design will ensure its consistency and complete traceability with the requirements of the software product (with justifications where traceability is not possible), a correct implementation of sequences of events, inputs, outputs, interfaces, logic flow, error handling, safety, security and critical requirements, a feasible detailed design and maintenance, task definitions and priorities, synchronization mechanisms, and shared resources management with justified real-time choices.
- The code verification will ensure its consistency and traceability with the requirements and design of the software product (with justifications where traceability is not possible), the implementation of proper event sequences, consistent interfaces, correct data and control flow, completeness, error handling, safety, security and critical requirements, correct numerical protection mechanisms, the absence of memory leaks, and correct code robustness verification (resource sharing, division by zero, pointers, run-time errors).

- The activity is completed with the verification of the achieved code coverage, measured by analysing the results of test execution. Inspection, analysis or review of the design will be applied where execution does not ensure 100% coverage.

The software validation and verification process will be based on a test platform and a set of test cases and procedures corresponding to each component. Where applicable, the tests will be automated as much as possible. For this purpose, depending on the programming language used for developing components (C or C++), we envisage using the open-source frameworks Check (for C) and CppUnit (for C++).

All test procedures will have the following structure:
- Pre-requisites and specific context defined within the test platform;
- Actions to be performed;
- Expectations to be verified / predetermined expected results;
- Status after execution: OK if the final result corresponds to the defined expectation, NOK otherwise.

All test cases must be executed in the order specified in the test procedure, as they are usually connected to one another and ensure a normal execution flow for the user operations. If specified, the pre-requisite of a test case must be respected; otherwise a false NOK may be generated. Test cases will be divided into smaller verifications, in order to make problem identification easier. All verifications will be numbered, and a result matrix will be generated after each test campaign, ensuring conformance to expected results.

The operational testing activities will test the software in the following conditions: operating hardware environment, system configuration, sequence of operations, and cases in which the software is designed to be fault tolerant. The Software Validation and Verification Report document will contain the test results, grouped in traceability matrices.

4.4 Task 4 - System development

An important step in the project is the development of the software applications, along with the execution chain used to generate the prototype products. This task aims at producing all this software along with the whole system documentation, defining a procedure for validating the created prototype products and checking the system performance. To fulfil all these objectives, Task 4 is divided into 5 sub-tasks that are described hereafter:
- System implementation (section 4.4.1);
- System documentation (section 4.4.2);
- Generation of prototype products (section 4.4.3);
- Assessment of prototype products and system performance (section 4.4.4);
- Demonstration plan (section 4.4.5).

4.4.1 System implementation (WP4100)

4.4.1.1 Software Development

In order to minimize the development effort, existing open source software will be used by several Sen2-Agri prototypes. Additional development will be done around these open source libraries for the functionalities that are not covered. The libraries used will be under an open source public license.

- Image processing

The baseline for our proposal is the ORFEO Toolbox (OTB), developed by CNES and distributed as an open source library of image processing algorithms. OTB implements a set of algorithmic components, adapted to large remote sensing images, which make it possible to capitalize on methodological know-how and therefore to use an incremental approach to benefit from the results of methodological research. Most functionalities are also adapted to process huge images without the need for a supercomputer, using streaming and multi-threading as often as possible. Targeted algorithms for high resolution optical images (SPOT, Quickbird, Worldview, Landsat, Ikonos), hyperspectral sensors (Hyperion) or SAR (TerraSarX, ERS, Palsar) are available. As the library is written in C++, the enhancements to this library will also be made in C++. In order to adapt OTB to the requirements of Sen2-Agri, the development will include:
  - Implementation of new image processing algorithms
  - Code industrialisation: enhancements and optimizations performed on algorithm prototypes produced during WP2000/WP3000

In order to have a flexible architecture, several software applications will be developed, each application being responsible for producing a certain product. This will allow several applications to be developed and tested in parallel, each with its own iteration cycle. To have an automatic system that produces the other products (L2a, L3, L3Agri and L3Agri) from the L1c products, a set of applications and scripts will be implemented. They will be in charge of monitoring the occurrence of a new L1c product, executing the corresponding application implementing a certain algorithm and saving the resulting products in the archives. As soon as the Sentinel-2 toolbox interface specifications are available from ESA, the software applications will be enhanced with the interfaces needed for the integration with this toolbox.

- Visualization

For the visualisation of the Sentinel products, a good candidate is the Quantum GIS (QGIS) software application. QGIS aims to be an easy-to-use GIS, providing common functions and features, and supports a number of raster and vector data formats, with new format support easily added using the plugin architecture. QGIS is developed using the Qt toolkit and C++ and is released under the GNU General Public License. Thanks to this pluggable architecture, new features needed by Sen2-Agri will be added for particular needs. The visualisation tool will be used during the validation of the platform but also on the production platforms, by the final users.

- Testing

The final stage of the software development task will be the creation of unit tests and integration tests. The implementation of the tests can be done in parallel with the coding phase, once the application programming interfaces have been defined.

The development process will be iterative, with up to 3 iterations. The first iteration will create the initial version of the software. The following iterations will take place after the internal software validation and after the final user validation, when changes may be required. During each iteration, the DDF and CDR documents might also be updated, along with the Software Unit and Integration Test Plan document and the Software User Manual, if changes occur in the architecture, in the configuration or in the visualisation tools.
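The automatic chain described under "Image processing" above (detect a new L1c product, run the corresponding application, save the result in the archive) can be sketched as a simple polling step. This is a minimal sketch under stated assumptions: the directory layout, the file naming and the `fake_l2a_processor` stand-in are all hypothetical, and a real deployment would call the actual processing application instead of copying a file.

```python
import shutil
import tempfile
from pathlib import Path


def process_new_products(inbox, archive, processor, seen):
    """Run `processor` on every product in `inbox` not handled yet.

    `processor` takes the input path and an output directory and returns
    the path of the product it wrote. Handled names are added to `seen`.
    """
    archive.mkdir(parents=True, exist_ok=True)
    produced = []
    for product in sorted(inbox.iterdir()):
        if product.name in seen:
            continue
        produced.append(processor(product, archive))
        seen.add(product.name)
    return produced


def fake_l2a_processor(l1c_path, out_dir):
    # Stand-in for a real application: copies the file under a new name.
    target = out_dir / l1c_path.name.replace("L1C", "L2A")
    shutil.copy(l1c_path, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    inbox, archive = Path(tmp) / "inbox", Path(tmp) / "archive"
    inbox.mkdir()
    (inbox / "S2_L1C_tile_A.dat").write_text("pixels")
    seen = set()
    # First pass handles the new product; second pass finds nothing new.
    first = [p.name for p in process_new_products(inbox, archive, fake_l2a_processor, seen)]
    second = process_new_products(inbox, archive, fake_l2a_processor, seen)
```

Calling `process_new_products` periodically (for instance from a scheduler) gives the monitoring behaviour; keeping one processor per product type matches the choice of one application per product.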

4.4.1.2 Software Installation

The operating system used on the development and production platforms will be a Debian-based Linux distribution (Debian, Ubuntu, etc.). This offers the advantage of an easy installation of both the operating system and the required packages, and such distributions are continuously maintained and developed. The applications developed during the Sen2-Agri project will also be packaged as RPM or deb files, thus allowing a very fast installation and avoiding time-consuming compilations. Additional scripts will also be created to further automate the installation process.

4.4.2 System documentation (WP4200)

The second part of the system development is dedicated to generating the Software User Manual (SUM) for:
- the system implementation
- the build and installation procedures
- the maintenance procedure

Each of these will be described in a different document, as they are intended for different user types.

The System Implementation document will contain a detailed description of each developed prototype, of the automated processing chains, of the compliance with the Sentinel-2 toolbox interface requirements and of the visualisation tools. For each developed prototype, the document will describe the open source libraries used, the components used from each library and the features modified or added to each library. For certain components, the documentation can be generated from the Doxygen comments in the developed source code or from any wikis written for those components. Particular attention will be paid to the compliance with the Sentinel-2 toolbox interface. The elements required by the Sentinel-2 toolbox will be presented in detail, describing the compatibility and conformance with the requirements from ESA.

The Build and Installation Procedures document will contain the mechanism for creating and deploying the developed applications. As the developed source code and the open source libraries used will be available on a public repository, a document will be created in order to provide information about how to:
- retrieve the sources from the repository
- build each prototype
- install and configure each developed component and the open source libraries used by the component

This document will also contain a brief description of how to install the operating system and the tools needed for compiling the applications on the production or development platforms, without going into too much detail, as these procedures can be found on various web sites. Links to the detailed procedures will nevertheless be provided in the document.

A Maintenance Procedures document will be created, as the software will be used mostly by non-ICT people. It will describe maintenance and troubleshooting procedures for different normal or possible abnormal situations. The troubleshooting procedures can be described as a “how to” section for the most common operations.

4.4.3 Generation of prototype products (WP4300)

After the beginning of the implementation of the Sen2-Agri system and the delivery of the corresponding documentation, the next major part of Task 4 will consist in generating EO prototype products. The prototype products will be developed from the Test Datasets gathered during WP 2200 - Design and collection of in-situ data for all sites and WP 2300 - Collection of HR optical time series. The resulting products will be used to assess the system previously designed and implemented before running it in a more operational context during Task 5. It should be noted that all these activities will be carried out in an iterative way, so that any significant feedback from the generation of prototype products could lead to an update in the implementation of the Sen2-Agri system if necessary.

The work carried out by the consortium at this stage will include two main components:
- The generation of the Sen2-Agri prototype products themselves (Cloud Free Surface Reflectance Composites, Dynamic Cropland Masks, Cultivated Crop Types and Area Extents, Vegetation Status Indicators). These products will be developed following the specifications expressed in the User Requirements Document (URD) delivered at the end of WP 1000 and using the Sen2-Agri system implemented in WP 4100, together with the documentation prepared in WP 4200. The EO data collected over the concerned test areas for this work will consist of:
  - First, Spot 4 data already available from the JECAM or GMFS projects (depending on the sites actually selected during WP 1200) or resulting from the ongoing Spot 4 Take 5 experiment;
  - Second, Landsat 5 or 7 imagery, either collected in both above-mentioned projects or using the ESA Category-1 supplying mechanism;
  - Third, RapidEye data coming from the same channels;
  - Finally, Landsat 8 imagery, provided that image time series are available over the test areas at this stage of the Sen2-Agri project.
  Additional information about the generation of the prototype products (for instance processing time) will also be collected to contribute to the Prototype Validation and Assessment Report (PVAR).
- A preliminary validation of the Sen2-Agri prototype products using the in-situ data collected during WP 2200. The developed prototype products will be analysed together with the corresponding in-situ data to check for any major discrepancy. If one is found, an in-depth study of the concerned prototype product and of the corresponding input EO and in-situ data will be carried out in order to explain the observed anomaly. Otherwise, the prototype product will be considered acceptable and ready to be submitted to the user approval planned in WP 4400 - Assessment of prototype products and system performance.
The development of the Sen2-Agri prototype products will be carried out according to ESA specifications, i.e.:
- Cloud Free Surface Reflectance Composites will be supplied over 3 of the selected test sites, covering a 200 x 200 km area at a 10 or 20 m resolution, with a monthly frequency and during at least one crop season, provided that the collected Test Datasets allow these requirements to be met;
- Vegetation Status Indicators will be supplied over 5 of the selected test sites, covering a 60 x 60 km area including relevant in-situ measurements at a 20 m resolution, with a 10-day frequency and during at least one crop season, provided that the collected Test Datasets allow these requirements to be met;
- Dynamic Cropland Masks will be supplied over all the selected test sites, covering a 60 x 60 km area at a 20 m resolution and derived from multitemporal EO imagery of at least one crop season, provided that the collected Test Datasets allow these requirements to be met;
- Cultivated Crop Types and Area Extents will be supplied over all the selected test sites, covering a 60 x 60 km area at a 10 m resolution and derived from multitemporal EO imagery of at least one crop season, provided that the collected Test Datasets allow these requirements to be met. The nomenclature of the product will depend on the considered site and the corresponding main crop types.
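To give an order of magnitude of the raster sizes implied by the specifications above, the following sketch converts each extent and resolution into a pixel count for one single-band layer of one site. The extents and resolutions come from the text; the dictionary keys and function names are illustrative, and the number of bands, dates and sites is deliberately left out.

```python
def pixels_per_side(extent_km, resolution_m):
    """Number of pixels along one side of a square area."""
    return int(extent_km * 1000 // resolution_m)


def pixels_per_layer(extent_km, resolution_m):
    """Number of pixels in one single-band layer covering the area."""
    side = pixels_per_side(extent_km, resolution_m)
    return side * side


# (side of the square area in km, resolution in m), as stated in the text.
PRODUCTS = {
    "composite_10m": (200, 10),
    "composite_20m": (200, 20),
    "vegetation_status": (60, 20),
    "cropland_mask": (60, 20),
    "crop_type": (60, 10),
}

sizes = {name: pixels_per_layer(*spec) for name, spec in PRODUCTS.items()}
```

A 200 x 200 km composite at 10 m is therefore a 20 000 x 20 000 pixel grid (400 million pixels per band), against 6 000 x 6 000 pixels (36 million) for a 60 x 60 km crop type map at the same resolution.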

4.4.4 Assessment of prototype products and system performance (WP4400)

4.4.4.1 Overall validation strategy

Validation is essential for providing a high-quality product that is accepted and applied by the user community. Different validation steps that jointly lead to the achievement of the validation objectives are anticipated in the project (Figure 4-8). The internal validation procedures will take place during the benchmarking exercise and will be documented in the D.4 Design Justification File and D.5 Design Definition File. In addition, the validation process relies on 3 other complementary pillars: (i) the confidence-building, (ii) the statistical accuracy assessment and (iii) the comparison with existing products.

[Figure 4-8 diagram. The validation comprises four components, the first documented in the DJF/DDF and the other three forming the products validation:
- Internal validation: performed by the dataset production team to evaluate system and processing;
- Confidence-building: performed in a systematic way to detect macroscopic errors;
- Statistical accuracy assessment: performed to assess the thematic accuracy in a statistically rigorous protocol;
- Comparison with other products: performed to evaluate usability, impact on model performance and limitations to further improve CCI-LC quality.

Together they lead to the achievement of the validation objectives:
1. Provide robust assessment of product accuracy, precision and consistency;
2. Build user confidence in applying the products for model applications;
3. Increase acceptance and legitimacy of the product with the international community of users and producers of Land Cover data.]

Figure 4-8: Overall organization of the validation and related user assessment activities

Prior to the independent and statistical quantitative assessment of the thematic accuracy of the Sen2-Agri EO products, a confidence-building procedure will be conducted in order to assess the quality of the products in a systematic manner. This step is aimed at reinforcing the overall acceptance of the land cover product by users. The results of such a qualitative systematic assessment will also make it possible to investigate the influence of different parameters on the quality of the agricultural products, such as landscape heterogeneity, crop types, observation conditions, etc.

The statistical quantitative assessment will consist in validating the products using a validation dataset made of in-situ data collected within this project. The outcomes will include the various parameters describing a map’s accuracy: contingency matrix, user’s and producer’s accuracy, Kappa statistics, and area statistics.

Finally, a complementary comparison with other agricultural products will also be performed. These other products can be other maps, but also existing statistics.
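The accuracy parameters listed for the statistical quantitative assessment can all be derived from the contingency (confusion) matrix, as the sketch below shows. The convention used here (rows are map classes, columns are reference classes) and the counts are assumptions made for illustration; Kappa is computed as Cohen's kappa.

```python
def accuracy_metrics(matrix):
    """Accuracy figures from a square confusion matrix.

    Rows are map (predicted) classes, columns are reference classes.
    Assumes every class has at least one mapped and one reference sample.
    Returns overall accuracy, user's and producer's accuracy per class,
    and Cohen's kappa.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diagonal = [matrix[i][i] for i in range(n)]

    overall = sum(diagonal) / total
    users = [diagonal[i] / row_sums[i] for i in range(n)]
    producers = [diagonal[j] / col_sums[j] for j in range(n)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, users, producers, kappa


# Illustrative counts: class 0 = crop, class 1 = no-crop.
matrix = [[40, 10],
          [5, 45]]
overall, users, producers, kappa = accuracy_metrics(matrix)
```

User's accuracy answers "how often is a pixel mapped as crop really crop?", while producer's accuracy answers "how much of the reference crop area was found by the map?"; the two differ whenever commission and omission errors are unbalanced.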

4.4.4.2 Validation of prototype products

For each product, a list of assessment criteria will be established, based on the literature review and on users’ requirements. Before the beginning and during the first months of the Sen2-Agri project, successive users’ requirement analyses have been conducted to derive the specifications for EO agricultural products relevant for monitoring applications. The assessment will rely on the main findings of these analyses in terms of (i) product thematic content, (ii) accuracy (overall and class-specific accuracy values), (iii) temporal resolution for the composites and the dynamic cropland mask, etc. The validation process made of the 3 pillars presented in Figure 4-8 concerns the four kinds of products generated during the Sen2-Agri project: the composites, the dynamic cropland masks, the cultivated crop type and area map and the crop status map.

| Product | Confidence-building procedure to remove macroscopic errors | Statistical accuracy assessment | Comparison with other products |
|---|---|---|---|
| Cloud-free composites | E.g.: spatial or temporal gaps, radiometry anomalies | / | With WELD Landsat mosaics |
| Dynamic cropland mask | E.g.: confusion between crop and no-crop, temporal inconsistency | With in-situ data. Confusion matrix (overall accuracy, user’s and producer’s accuracy) | With existing maps (list to be established) |
| Cultivated crop type and area map | E.g.: confusion between crop and no-crop | With in-situ data. Confusion matrix (overall accuracy, user’s and producer’s accuracy). Refinement with thematic distance | With existing maps (list to be established) |
| Crop status map | E.g.: confusion between crop and no-crop | With in-situ data | With existing maps (list to be established) |

Table 4-2 : Validation operations for each prototype product

As mentioned in Table 4-2, an additional refinement is foreseen for the statistical accuracy assessment of the cultivated crop type and area maps. The classical confusion matrix does not take into account the thematic distance between different classes. Indeed, in this matrix, a misclassification between a crop class and a no-crop class has the same impact as classifying a winter crop as a summer crop. Of course, these confusions have a totally different weight from the user’s point of view. From both a producer’s and a user’s point of view, we need to present a matrix where misclassifications between similar classes are weighted lower than misclassifications between dissimilar classes.


The similarity will be described as the relative importance of different land cover classes for agriculture monitoring applications, and will be based on the hierarchical legend of the cultivated crop type and area map:
• Crop vs no-crop;
  o Winter crop vs summer crop;
    . Cereal crop vs broadleaved crop;
      • Crop type A vs crop type B.
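One possible way to turn the hierarchical legend above into a weighted agreement figure is sketched below. It is an illustration under stated assumptions, not the method retained by the project: the distance rule (one quarter per legend level), the class paths, the function names and the sample counts are all hypothetical.

```python
# Hypothetical legend: each class is the path from the top of the hierarchy
# (crop / no-crop) down to the crop type. Levels follow the legend above.
LEGEND = [
    ("crop", "winter", "cereal", "wheat"),
    ("crop", "winter", "cereal", "barley"),
    ("crop", "summer", "cereal", "maize"),
    ("no-crop",),
]
DEPTH = 4  # number of levels in the hierarchical legend


def thematic_distance(a, b):
    """Distance in [0, 1]: the higher the level where two classes split, the larger."""
    if a == b:
        return 0.0
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return 1.0 - shared / DEPTH


def weighted_accuracy(matrix, legend):
    """Agreement where each cell is weighted by 1 - thematic distance."""
    total = sum(sum(row) for row in matrix)
    score = 0.0
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            score += count * (1.0 - thematic_distance(legend[i], legend[j]))
    return score / total


# Invented counts; rows = mapped class, columns = reference class.
counts = [[30, 6, 2, 2],
          [4, 16, 0, 0],
          [2, 0, 16, 2],
          [0, 0, 2, 18]]
plain = sum(counts[i][i] for i in range(4)) / 100
weighted = weighted_accuracy(counts, LEGEND)
print(plain, weighted)
```

In this invented example the classical overall accuracy is 0.80, while the weighted figure is higher because most errors are wheat/barley confusions, which sit close together in the legend; crop/no-crop confusions receive no credit at all.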

4.4.5 Demonstration plan (WP4500)

The demonstration plan is prepared at the end of Phase 2 in direct interaction with ESA, taking into account the updated planning of Sentinel-2 data availability and the selection of the final national scale demonstration cases. In order to guide the decision-making process for the latter, a set of criteria has already been identified:
• the commitment, the mandate and the current remote sensing activities of the candidate entity with regard to agriculture monitoring in its country;
• the technical and remote sensing expertise available, as well as the entity’s experience in field data campaigns for validation;
• the representativeness of the selected countries with regard to a global demonstration ambition; this includes the type of agricultural systems to monitor (e.g. crop type diversity, spatial heterogeneity, cropland fragmentation and field size distribution) and the agro-climatic conditions defining the actual availability of valid observations at the critical periods of the growing season;
• the feasibility and the chance of success related to political stability, technical capabilities and institutional capacity; the a priori knowledge and experience of the consortium in the selected countries increase the anticipation and support capabilities;
• the expected added-value of the Sen2-Agri EO products with regard to the products currently available; on the other hand, this has to be considered carefully, as already existing high-quality data sources may serve for the cross-comparison and cost-benefit analysis.
Furthermore, a strategy has to be defined to provide a convincing demonstration. For example, a balance has to be found between “easy” cases, which correspond to users with a strong organization and sufficient means, and more difficult cases, where users are constrained by limited resources. Therefore, the work has to start from the analysis of the situation of the involved entities. A grid of criteria will be built for that purpose and shared with ESA. This will help decide where national coverage and local scale testing should be implemented, keeping in mind that two national sites shall be located in Africa. The interactions with several possible candidates for the national demonstration cases allowed us to identify 4 valuable candidates to be considered in the selection process. These potential demonstration cases at national level are Senegal, Russia (covering only 6 agricultural oblasts), Kenya and Morocco.


We have already received, from three out of the four potential national stakeholders, a letter of commitment in which they express their willingness to invest in such a demonstration, their confidence in the consortium and their great interest in the project (see their letters of interest in section 16). More detailed information is included in section 12.

4.5 Task 5 - Demonstration use cases

This task aims at demonstrating the proposed methodological and software solutions, delivered at the end of Phase 1 and Phase 2 respectively, by processing actual Sentinel-2 time series after the satellite launch and the commissioning phase. This demonstration will be completed at two different levels:
1) the local scale demonstration sites, selected among the test sites used in Phase 1, plus any voluntary test sites interested in joining the test and demonstration effort (a Netherlands partner being already identified and committed - see his letter in section 16);
2) the national scale demonstration sites, also selected among the test sites used for Phase 1 based on a set of criteria; at this stage, four countries have already been identified as potential candidates (see their letters in section 16).
The objective of Task 5 is to generate the Sen2-Agri EO products with the active involvement of operational partners and to assess the performance in various dimensions. At least 8 different stakeholders will be trained and financially supported to collect in-situ data for the product validation, but also to assess the actual added-value of the Sentinel-2 products. For three countries, more specific technical and financial support is allocated to implement large scale processing systems, to maintain technical support throughout the season and to complete a nationwide in-situ data collection campaign. Several national stakeholders have already expressed their strategic interest in this full scale demonstration exercise and their capability to host such processing facilities.
To fulfill all these objectives, Task 5 is divided into 6 sub-tasks that are described hereafter:
• In-situ Sen2-Agri system installation (section 4.5.1);
• Capacity development and training (section 4.5.2);
• Internal Sen2-Agri EO production (section 4.5.3);
• In-situ Sen2-Agri EO production support (section 4.5.4);
• Field data collection in minimum 8 sites (section 4.5.5);
• Use cases validation and assessment (section 4.5.6).

4.5.1 In-situ Sen2-Agri system installation (WP 5100)

The first part of the Task 5 activities is dedicated to the installation of the Sen2-Agri system at user premises (i.e. in-situ). According to ESA specifications, the consortium plans to set up the Sen2-Agri system at user premises for at least 3 different organisations. The selection of these bodies will be made during the project Phase 2 in the frame of WP 4500 (see section 4.4.5) and the final choice will be made in agreement with ESA. Staff from these organisations will be involved in these installations, and the work carried out by the consortium in this frame will include four main components:
• First, some tests and, if necessary, an update of the installation procedure will be done during an internal Factory Deployment Test (at consortium premises);
• Second, the Sen2-Agri system will be installed at user premises (after a complete check of the user hardware and system configuration);
• Third, a set of tests based on the Acceptance Document will be run on the installed system to check its consistency with the internal one;
• Fourth, a Sen2-Agri system installation training (with dedicated user staff) will be planned to enable users to carry out such an installation themselves.
The effort allocated by the consortium to support this installation phase is estimated at one week in-situ for each concerned user organisation. This time frame will be complemented by system support from the consortium in order to answer preliminary or additional questions from the user organisation staff by phone or e-mail. Feedback from the Sen2-Agri system installation could then be collected at each involved organisation in order to contribute to the Exploitation Report (ER).

4.5.2 Capacity development and training (WP 5200)

The aim of this activity is to support users so that they become self-sufficient in the use of the developed system and use it in an effective and sustainable way for their needs. The keyword of the task is therefore “capacity development”, which, according to the UNDP, defines the process through which individuals, organizations, and societies obtain, strengthen, and maintain the capabilities to set and achieve their own development objectives over time. A capacity building plan will be developed. This plan will have to consider the respective situations of the involved entities in terms of, for instance, objectives, area of interest, existing approaches and tools, manpower and skills, and technical facilities (computing and network resources, equipment for ground surveys, power supply and cooling system). The choice of the first users involved in the process will be discussed with ESA, which holds the final decision. As explained for the Demonstration Plan Development, the selection should consider both easy and more challenging cases. It is however clear that the selected entities will have to be remote sensing entities with operational processing experience in agriculture monitoring. This will also help to define the detailed content of the training, which shall take into account the initial situation of the users (e.g. skills, resources), their mandates and their objectives. As an initial proposition, the capacity building will cover three main topics, all combining theoretical background and on-the-job training. The first topic concerns the remote sensing aspects and the related agriculture applications, the second addresses the in-situ data collection and the validation protocol, and the third introduces the software and technical aspects of the system.


More specifically, the training is expected to include:
• Lectures on the scientific and technical basis on which the developed system is built: Sentinel-2 and Landsat 8 characteristics, why pre-processing is needed (geometry and radiometry), which classification algorithms are implemented and why, how vegetation status is evaluated, and which products are delivered by the system;
• Lectures on the in-situ data collection, including sampling strategy, quality control procedure, and validation process to derive the accuracy figures;
• Lectures on the technical aspects: system architecture, main development choices (open-source software, OTB, etc.), data flow, inputs, outputs, tools for further analysis of the products and for building and using multi-year time series;
• Practical training on the system developed, from the Sentinel-2 and Landsat 8 data ingestion to the generation and analysis of the products. As far as possible, this training should use data acquired over the areas of interest for the users, but also data acquired in different areas in order to show how to face the diversity of situations. This practical training will also include some insights into the advanced use of the products, such as using a GIS to aggregate the information at different scales or to build statistical tables.
Staff from the user organisations will receive this training to run the Sen2-Agri system during WP 5200 - Capacity development and training. This training will be carried out once the system has been set up at their premises. The training objective is to make these people able to develop the Sen2-Agri products suitable for their activity by themselves.
The main training sessions will be collocated with two user workshops to save time and money. These training sessions could also welcome a larger audience from the User Workshop participants. The training could be held immediately before the workshop so that the audience would benefit from the feedback of the trained users.
In addition, specific capacity building targeted at the national demonstration cases will take place at the premises of each entity to support the local production and the nationwide field campaign. Another important aspect of the capacity building is the networking between users (south-south and north-south links), and between users and other kinds of organizations, especially research laboratories and service providers. These issues could be one of the topics discussed during the workshop. If required by the users, advanced training may be considered outside the perimeter of the project (e.g. professional or student stays).

4.5.3 Internal Sen2-Agri EO production (WP 5300)

The first part of the EO production activities is dedicated to the generation of Sen2-Agri products at the consortium premises for the local scale demonstration sites, which will receive the final products. Of course, this delivery does not prevent some of these users from also trying the software solution on their own. To successfully manage the fulfilment of the user requirements expressed in the User Requirements Document (URD), collaboration and direct contact with user representatives will be established.


The generation of the Sen2-Agri products will target the areas selected and agreed upon for the 8 use cases established during WP 4500. The input data for this operation will consist of Sentinel-2 EO data. The validation will be performed against Landsat 8 imagery. If the validation of the products shows that corrections are needed, provisions for updating the system have been made. Such potential corrections will trigger a simplified software development lifecycle to avoid regressions and to make sure that the results match the expectations of users. The results of the validation will contribute to the Validation Report (VR). Users will have access to the generated products by means of a dedicated FTP server. The project website developed in the frame of WP 7100 will contain a dedicated section that will point to the location of the physical products.

4.5.4 In-situ Sen2-Agri EO production support (WP 5400)

The second part of the EO production activities is dedicated to the generation of Sen2-Agri products at user premises over the demonstration areas defined in the Demonstration Plan (DP). As explained above, 3 out of the 4 current candidates, i.e. Senegal, part of Russia, Kenya and Morocco, are considered as a preliminary choice. This task is limited to the organisations which have received a full version of the processing system in the frame of WP 5100 - In-situ Sen2-Agri system installation. According to ESA specifications, the consortium plans to set up the Sen2-Agri system at user premises for 3 different organisations. The selection of these bodies will be made during the project Phase 2 in the frame of WP 4500 - Demonstration plan development and the final choice will be made in agreement with ESA. Staff from these organisations will then receive appropriate training to run the Sen2-Agri system during WP 5200 - Capacity development and training. This training will be carried out once the system has been set up at their premises. The training objective is to make these people able to develop the Sen2-Agri products suitable for their activity by themselves. It is therefore assumed that, at the beginning of WP 5400, these people are ready to run the Sen2-Agri system according to their information needs. However, a short reminder of the training content may be necessary before starting the actual production. This could be useful because of the time needed to collect the EO image series required to develop the Sen2-Agri products (at least one crop season), or in the event of a change of the staff assigned to the project in the concerned user organisations. In this case, such a reminder will be given by the consortium at no supplementary cost. Once the user organisation staff is considered ready, the Sen2-Agri EO production will begin.
The work carried out by the consortium in this frame will include two main components:
• Support for the generation of the Sen2-Agri products themselves (Cloud Free Surface Reflectance Composites, Dynamic Cropland Masks, Cultivated Crop Types and Area Extents, Vegetation Status Indicators). These products will be developed by the user organisation staff under the supervision of the consortium, following the specifications expressed in the User Requirements Document (URD) delivered at the end of WP 1000, and in agreement with the Demonstration Plan (DP) prepared in WP 4500. The user organisation staff in charge of this activity will use the Sen2-Agri system implemented in WP 4100 and installed during WP 5100, based on the documentation prepared in WP 4200. The EO data collected over the concerned test areas for this work will consist of:
  o first, Sentinel-2 EO data;
  o second, Landsat 8 imagery (if necessary).
• Support for the validation of the Sen2-Agri products using the in-situ data collected during WP 5500 - Field data collection in minimum 8 sites. The developed products will be analysed together with the corresponding in-situ data. The results will be recorded as a contribution to the Validation Report (VR).
The effort allocated by the consortium to support the production and validation of the Sen2-Agri products is estimated at one week in-situ for each concerned user organisation. This time frame will be complemented by remote support from the consortium in order to answer preliminary or additional questions from the user organisation staff by phone or e-mail. Feedback from the Sen2-Agri production and validation activities will then be collected at each involved organisation in order to contribute to the Exploitation Report (ER). In order to harmonise this feedback and make its analysis easier, it is planned to gather the user comments through a dedicated questionnaire, which will be submitted to ESA for approval prior to its use.

4.5.5 Field data collection in minimum 8 sites (WP 5500)

This activity is closely related to the field data collection carried out during Phase 1 over the test sites. As both local and national demonstration sites were already involved in Phase 1, the Sen2-Agri EO validation activity will first rely on the dataset already built in Phase 1. WP 5500 aims at complementing this dataset if necessary (lessons learned from the prototype products performance assessment in Task 4 - section 4.4.4) and at making it fully suitable for the Sentinel-2 data properties (spatial resolution, revisit frequency, etc.). An additional strategy, specific to this demonstration phase, could be the acquisition of georeferenced large-scale aerial photographs widely distributed over the sites, either by drone or by light aircraft. Such data should allow identifying the crop type and estimating the Green Canopy Cover Fraction, and can be used to validate the green LAI estimates obtained from satellite images (in particular for early growth stages, when the Green Canopy Cover Fraction can serve as an LAI proxy). The use of such aerial coverage is of particular interest for the large scale demonstration case studies because of its cost-benefit ratio.
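As an illustration of how the Green Canopy Cover Fraction could serve as an LAI proxy, the sketch below inverts the widely used Beer-Lambert gap-fraction relation, cover = 1 - exp(-k × LAI). This relation is not stated in the proposal and is introduced here as an assumption; the extinction coefficient k = 0.5 (a common value for a spherical leaf angle distribution viewed at nadir) and the function name are also assumptions. Leaf clumping is ignored, and the estimate becomes unstable as the cover approaches 1, which is consistent with restricting this proxy to early growth stages.

```python
import math


def lai_from_canopy_cover(cover_fraction, k=0.5):
    """Approximate green LAI from the green canopy cover fraction.

    Assumption: Beer-Lambert gap fraction, cover = 1 - exp(-k * LAI),
    with k an assumed extinction coefficient (0.5 by default).
    """
    if not 0.0 <= cover_fraction < 1.0:
        raise ValueError("cover fraction must be in [0, 1)")
    # log1p(-c) equals log(1 - c) and is more accurate for small cover values.
    return -math.log1p(-cover_fraction) / k


# Hypothetical cover fractions read from aerial photographs at early growth stages.
for cover in (0.1, 0.3, 0.5):
    print(cover, round(lai_from_canopy_cover(cover), 2))
```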

4.5.6 Use cases validation and assessment (WP 5600)

Each product will be validated using the list of success criteria defined in the Demonstration Plan (section 4.4.4). The validation of the Sentinel-2 EO agricultural products relies on the same procedure as the one applied to the prototype products and described in section 4.4.4. It will rely on the same 3 complementary pillars: (i) the confidence-building, (ii) the statistical accuracy assessment and (iii) the comparison with existing products.


As for the prototype products, the same validation operations will be applied to each Sentinel-2 demonstration product (Table 4-3).

| Product | Confidence-building procedure to remove macroscopic errors | Statistical accuracy assessment | Comparison with other products |
|---|---|---|---|
| Cloud-free composites | E.g.: spatial or temporal gaps, radiometry anomalies | / | With WELD Landsat mosaics |
| Dynamic cropland mask | E.g.: confusion between crop and no-crop, temporal inconsistency | With in-situ data. Confusion matrix (overall accuracy, user’s and producer’s accuracy) | With existing maps (list to be established) |
| Cultivated crop type and area map | E.g.: confusion between crop and no-crop | With in-situ data. Confusion matrix (overall accuracy, user’s and producer’s accuracy). Refinement with thematic distance | With existing maps (list to be established) and statistics |
| Crop status map | E.g.: confusion between crop and no-crop | With in-situ data | With existing maps (list to be established) |

Table 4-3 : Validation operations for each Sentinel-2 demonstration product

In addition, the cultivated crop type and area map will be assessed through a comparison with existing agricultural statistics. Indeed, statistics are key figures in any agricultural monitoring system. A typical area of focus for statistics is politico-economic and concerns the monitoring and management of the world food market. For example, in Europe, it is of critical importance for the application of the Common Agricultural Policy, which accounts for 40% of the European Union budget. Only information on the major crop production from the main world producers is necessary to achieve this objective. Governments usually compile their results in terms of agricultural statistics by administrative area, which gives no clue as to the exact locations where specific crops are actually grown. This comparison with statistics will be done systematically for the 3 national demonstration sites. It will also be computed for the other local demonstration sites if they are large enough to include administrative units over which statistics are compiled by governments (e.g. NUTS 3 in Europe). This statistics-based comparison will be performed for the different hierarchical levels of the legend. Regression analysis between the existing statistics and the statistics estimated from our Sentinel-2 demonstration products will be used to assess the presence of any bias, offset, under- or over-estimation, etc.

In addition, a user-oriented assessment will be set up in order to assess the utility and benefit of the Sentinel-2 demonstration products. The user assessment is a component of the user dialogue. The primary objective is to work with users to apply the generated products in their applications, in order to provide an evaluation and first indications of the potentials and limitations. This will feed into final discussions with the users for feedback on the products and result in a set of recommendations to further improve agricultural monitoring beyond this part of the project (see section 6). The user assessment after the production and validation of the Sen2-Agri products involves a number of steps:
• Inform users and acquire their direct feedback after the validation and comparison;
• Provide data to users in an appropriate format that can be fed into their applications. An optimal dissemination mechanism to provide dedicated products to the users in a usable format needs to be found (a first discussion can be found in section 4.1.3);
• Use the new Sen2-Agri products in different applications, which will be done by users.
The outcomes and user feedback from these three steps will be synthesized to assess potentials, limitations and adjustments to the Sen2-Agri products, with particular emphasis on the transferability of the developed monitoring capabilities to regions and crops not covered by this project. Two main kinds of users will be considered:
• The data producer, who uses the Sen2-Agri processing system and source code;
• The analyst, who uses the information provided by the Sen2-Agri EO agricultural products to produce bulletins. This second category also has specific users, which can be:
  o Governments, public administrations, companies, etc., which make any kind of decision based on provisional statistics (e.g. logistic decisions);
  o Information brokers;
  o Food security stakeholders, which struggle against famine and food crises in countries at risk with agricultural systems based on subsistence farming.
Adequate knowledge of crop producing areas allows decision makers to locate the populations that are most vulnerable to food insecurity and poverty. In this case, we expect they will be more interested in having a crop map than statistics alone, in order to identify the precise region(s) where the shortage occurred, which often differ from an administrative or statistical region.

4.6 Task 6 - Conclusions and recommendations

This task announces the end of the project and aims at concluding it successfully and preparing for a transfer of the developed service to a sustainable environment. This objective will be achieved through 4 sub-tasks that are described hereafter:
• Data and results dissemination (section 4.6.1);
• Users training on the developed system (section 4.6.2);
• Sentinel-2 products added-value assessment (section 4.6.3);
• Synthesis and recommendations for further improvement and generalization (section 4.6.4).

4.6.1 Data and results dissemination (WP 6100)

This task aims at making all products and documents publicly available to a user community wider than the one already involved in the project. Indeed, in addition to developing and demonstrating the Sentinel-2 contribution to agricultural monitoring, the Sen2-Agri project wants to continuously attract and engage additional users. To this end, the Project Web Site (section 4.7.1) will be the key interface. The following resources should be made available:
• EO time series (L2A products):
  o Test Data Sets over the test sites;
  o Pre-processed Sentinel-2 time series over the demonstration sites;
• EO products (including cloud-free composites, dynamic crop mask, cultivated area and extent map and vegetation status map):
  o Prototype EO agricultural products;
  o Sentinel-2 EO agricultural products;
• In-situ data over the test and the demonstration sites;
• Source code of the Sen2-Agri system.
In order to ensure an easy and correct use of the different products, special attention will be paid to the documentation. The project deliverables (e.g. system documentation, validation reports, etc.) will be provided along with the products and, if necessary, additional quick guides could be written. These would inform users, in a short document, about the content, format, metadata, etc. of the products.

4.6.2 Users training on the developed system (WP 6200)

One major objective of the Sen2-Agri project is to help broaden as much as possible the number of users of Sentinel-2 products and of the developed system. In addition to the promotional activities considered in Task 7 (section 4.7), the diffusion of advanced technical and practical information is needed. This will be done through dedicated training sessions. A training plan will be drafted at the beginning of Phase 3 and progressively improved in order to incorporate the lessons and feedback from the first training sessions and demonstration use cases. Given the variety of demonstration use cases, it is expected that a significant number of practical examples will be accumulated on the use of the system and on the characteristics of the products in many different situations (climate, cropping systems, etc.). Before defining the content of the training material, we will first have to list the kinds of user organizations we target and, inside each organization, the different categories of staff we should consider. We should also consider organizations which may not be direct users of the Sen2-Agri project but which could act as prescribers in the frame of their activities. For example, development banks (World Bank, African Development Bank, Asian Development Bank, etc.) can suggest or even support the use of Sen2-Agri in the frame of the projects they fund. As a very preliminary list, potential user organizations could be for instance:
• Sub-country level: chambers of agriculture, cooperatives, water management authorities, public services (agriculture but also land management, environment), managers of irrigated perimeters, seed and pesticide resellers, insurance companies, biofarming certification firms, and maybe even farmers if some simple interface is provided by added-value companies or public bodies.
• Country level: ministries and their departments (e.g. statistical offices, research institutes) in charge of agriculture, environment, land planning, foreign affairs, trade, research, etc.
• Regional and international level: European Environment Agency, Eurostat, JRC, World Bank, AfricaFAO, UNDP, UNEP, early warning agencies, etc.
Different categories of users are to be identified. For instance, some organizations mainly provide data and products and do not perform any thematic analysis by themselves. They act as nodes of a network, and are mainly data providers. Usually, this kind of organization has personnel accustomed to using satellite data and processing software. At the other end of the value chain, we find organizations that are interested in the information and not in the techniques. More importantly, potential users include both non-profit organizations with public goals and commercial companies (e.g. grain traders). Our understanding is that ESA gives priority to non-profit organizations. This has to be confirmed, and the analysis shall be refined with ESA since some non-profit organizations are providing commercial services. Once the target users are defined, the training material can be identified and developed. Training material will be built on the basis of the demonstration use cases. Its detailed components will be defined in the training plan.
As a first guess, this material could consist in several packages: • Several presentations (Power point or equivalent) dealing with the scientific and technical basis, hardware system and software, products, products generation, products use; • Tutorials (short presentations and user manuals) for using the system, at different level of skills and responsibilities (data acquisition, product use, system operator, etc.); • Example of data sets which can be used for learning how to use the system, from data ingestion to advanced uses. All these materials will be available freely on the Sen2-Agri web site (section 4.7.1) if ESA agrees. These materials can be downloaded by users by who are seeking information. They will also be used in the course of dedicated training sessions. They could be also useful for training organizations (e.g. specialized training courses of the universities). We plan to organize a training session as part of the user demonstration meeting to be held at KO+32. Additional training sessions might be considered and this option could be discussed with ESA in due course.

4.6.3 Sentinel-2 products added-value assessment (WP 6300)

This sub-task is dedicated to the activities that allow assessing and presenting the added value of Sen2-Agri products for operational monitoring systems. It will be a key input for formulating recommendations for improvement of the processing system and demonstrated products (see section 4.6.4). This objective is fully in line with the overall validation strategy presented in section 4.4.4.1 (WP4400), in which a process of comparison with existing products is foreseen. According to the workflow of our proposal, this kind of process will already be partly implemented in the validation activity of task 5 (section 4.5.6). Indeed, it is planned:
• to compare the Sen2-Agri products with other existing maps and statistics;
• to organize a user-oriented assessment to evaluate the utility and benefit of the Sentinel-2 demonstration products, which will consider both data producers and analysts.

This WP will build on these outcomes to provide a clear final evaluation of the Sen2-Agri products. Its objective is to build confidence in the Sen2-Agri EO products. It is, to some extent, driven by the notion of the “best” available product. Therefore, there is a need to quantify and characterize the advantages of the new products both in terms of technical characteristics (spatial resolution, temporal update, thematic accuracy, etc.) and utility (use cases, impact, etc.).

4.6.4 Synthesis and recommendations for further improvement and generalization (WP 6400)

This final task is made of 2 important components: 1) providing a synthesis of the main achievements of the Sen2-Agri project; 2) providing recommendations for the future.

With regard to the first component, the main outcomes in terms of algorithm performance, input datasets, system development, product generation, and product evaluation and validation will be gathered and summarized. The close link between the consortium and the end-users will also receive specific attention.

Second, recommendations for the future will be written considering the points of view of both the consortium and the users. As a result, the recommendations should cover the technical aspects (typically related to the Sentinel-2 processing) and the applications. The transferability of the developed system and services to other regions or countries will be critically evaluated in terms of computing possibilities but also of the generalization of the algorithms to different climatic conditions, landscape patterns, crop types and agricultural systems.

The project will end with a final User Workshop which will bring together the agricultural and the EO communities. Each participant will be invited to present their own experience and recommendations for the future.

4.7 Task 7 - Promotional activities

In order to raise awareness about the Sen2-Agri project and further the dissemination of the developed system among the user community, a specific task is devoted to promotional activities. This task aims at ensuring the promotion of the project and of the corresponding results. This objective will be achieved through 3 sub-tasks that are described hereafter:

• The development and maintenance of a project Website (section 4.7.1);
• The development of promotional material (section 4.7.2);
• The federation of the Sen2-Agri user community and the communication towards the scientific world (section 4.7.3).

4.7.1 Website development and maintenance (WP 7100)

The project website will consist of two distinct entities: the public web site (PWS) and a password-protected web site (PPWS) utilizing Microsoft SharePoint.

The purpose of the PWS is mainly the dissemination of information to the user community (such as updates on the project progress, electronic newsletters and promotional content). The PWS will be accessed through a designated URL (to be agreed upon when the project starts, for example www.sen2agri.eu). The PWS will be organized into sections to facilitate dissemination of information to the various audiences. CS-R is responsible for the design, layout and overall maintenance of the PWS for the whole duration of the project, and for at least one year after the project closure. While CS-R is responsible for the entire site, representatives of the consortium members are responsible for maintaining the content related to their specific activities. Web page content and evaluation must start at the consortium member level and must be approved by the project prime contractor before being published. Subsequent development, construction and maintenance will be performed by CS-R.

The PPWS provides secure access to internal communications and is well suited to collaborative groups in which a number of members are editing documents, posting items or participating in other online activities. Collaboration is not limited to consortium members and ESA representatives. Instead, it is extended to user communities to allow them to easily provide feedback on methods of using remote sensing data for agriculture (processing, validation, protocols, etc.). Users will have the possibility to self-register and subscribe to such discussion lists. In addition to access restriction policies, the PPWS will be secured with a 2048-bit SSL certificate and will be accessible over HTTPS.

Of the many features available in SharePoint products, the following are envisaged as part of the PPWS:
• Document Library – serves as a central repository for project-related documents that are secure but easily accessible to consortium members;
• Resources – serves for storing information that requires access by internal users but is not intended for public dissemination;
• Committees – centralized storage and access to committee notes, agendas, minutes, task lists, discussions, calendars;
• Project Groups – centralized storage and access to project plans, charters, task lists, discussions, minutes, i.e. all documents involved in project management;
• User discussions – forum-like section allowing users to provide feedback and to discuss products and topics related to the Sen2-Agri project.

Most users of the PPWS will have one of the types of access listed below:
• Site Owner (Full Control): has full control and is responsible for defining access policies;
• Member (Contributor): can view, add, update, and delete content;
• View Only: read-only access.

In addition, for users subscribing to the discussion lists, a special role will be defined so that their contribution capabilities are restricted to the discussion area only, where no anonymous posting will be accepted.
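The access types above can be pictured as a small permission table. The following Python sketch is only an illustration of that logic under stated assumptions: it does not use any SharePoint API, and the names `Action`, `Role`, `DISCUSSION_AREA` and `is_allowed` are hypothetical. It also assumes that the special discussion role may only read and post within the discussion area, which the text does not specify in detail.

```python
from enum import Enum, auto


class Action(Enum):
    VIEW = auto()
    ADD = auto()
    UPDATE = auto()
    DELETE = auto()
    MANAGE_ACCESS = auto()  # defining access policies


class Role(Enum):
    SITE_OWNER = "Site Owner (Full Control)"
    MEMBER = "Member (Contributor)"
    VIEW_ONLY = "View Only"
    # Hypothetical name for the special role given to discussion-list subscribers.
    DISCUSSION_SUBSCRIBER = "Discussion subscriber"


# Hypothetical identifier for the "User discussions" section of the PPWS.
DISCUSSION_AREA = "user_discussions"

CONTENT_ACTIONS = {Action.VIEW, Action.ADD, Action.UPDATE, Action.DELETE}


def is_allowed(role, action, area):
    """Return True if `role` may perform `action` in `area`.

    `role` is None for an anonymous visitor, who is always refused.
    """
    if role is Role.SITE_OWNER:
        return True
    if role is Role.MEMBER:
        return action in CONTENT_ACTIONS
    if role is Role.VIEW_ONLY:
        return action is Action.VIEW
    if role is Role.DISCUSSION_SUBSCRIBER:
        # Assumption: subscribers can read and post, but only in the discussion area.
        return area == DISCUSSION_AREA and action in {Action.VIEW, Action.ADD}
    return False
```

For example, `is_allowed(Role.DISCUSSION_SUBSCRIBER, Action.ADD, "document_library")` is refused, while the same action in `DISCUSSION_AREA` is accepted; an anonymous visitor (`None`) is refused everywhere.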

4.7.2 Promotional material development (WP 7200)

The second aspect of promotional activities concerns the development of specific material in various media to support these activities and, in the medium term, to enlarge the Sen2-Agri user community. Following ESA specifications, the consortium plans to work on 3 different types of promotional material: project newsletters, advertising brochures and a video.

• A project newsletter with regular issues. Newsletters will be written each quarter using information regularly collected about the project progress. They will be prepared in English and published using professional editing standards. The target format will be an A4 double-sided document. The content will be delivered in Word format and validated by ESA before each newsletter issue is finalized and delivered as a PDF file. For the first issue, the main addressees will be ESA, the members of the champion user group and the project team. This list will then be enlarged throughout the project. The newsletters will be disseminated to the addressees by e-mail, and will also be made available through the project Website.

• Several advertising brochures, which will form the first part of the Promotional Package (PP) deliverable. These brochures will present the project content and objectives and describe the major activities and findings. The consortium plans to develop one brochure prior to each main project milestone (Users Requirements Review, Qualification Review/Acceptance Review, Final Meeting), i.e. 3 documents for the whole project. As a preliminary assumption, the consortium suggests that:
  o the first brochure presents the project rationale, organization and schedule, as well as the champion user group. It will reflect the Users Requirements Consolidation results;
  o the second one includes additional details about the system development, the Sen2-Agri products to be delivered and the proposed demonstration use cases;
  o the third one is completed with results from the actual demonstration use cases and feedback from the users involved in these activities.

The brochures will be prepared in English and published using professional editing standards. The target format will be a 2- or 3-page A4 double-sided leaflet. The content will be delivered in Word or Publisher format and validated by ESA before the brochure is finalized and delivered as a PDF file. Paper copies of the brochures will also be supplied. The consortium currently plans to provide 300 copies of each of them. The addressee list for the brochures will be defined in agreement with ESA. It will include at least the members of the champion user group, the ESA project team and the consortium. This list will be enlarged throughout the project. The brochures will be disseminated to the addressees as paper copies or by e-mail, and will also be made available through the project Website. Paper copies will also be made available at the conferences or symposia at which the project will be presented.

• A video presenting the project, which will form the second part of the Promotional Package (PP) deliverable. Just like the brochures, this video will present the project content and objectives and describe the major activities and findings, specifically from the user point of view. The video will be developed during Phase 3 in order to include results from the actual demonstration use cases and feedback from the users involved in these activities, for instance as interviews. It will be in English and shot to professional standards. The target duration will be between 5 and 10 minutes. The resulting video will be validated by ESA before it is finalized and delivered in a standard format. It will be made available through the project Website, during the project meetings and at any event attended in relation with the project.

4.7.3 User community federation and scientific communication (WP 7300)

The objective of this task is twofold:
- Ensuring a strong link with the user community (not only the one involved in the project);
- Ensuring the scientific communication for the project through written and oral contributions.

These objectives will be achieved through the organization of Users Workshops that will bring together the agricultural and EO communities. These meetings could be held jointly with the JECAM initiatives. Through the organization of such events, the project aims at attracting and engaging additional users in the project. The concept of voluntary sites (see WP1100) goes in the same direction, and aims at federating users around our products. The scientific outputs of these workshops will be disseminated in the form of presentations, proceedings and/or papers that will be made available through the project website. In parallel, specific attention will be paid to the scientific communication about the main outcomes of the Sen2-Agri project.

4.8 Task 8

The management of the project will be carried out by UCL. The management will be performed in accordance with the rules and procedures defined in the UCL Quality System, which meets the general requirements of the ESA ECSS standards (based on ISO 9001 Quality Management Systems - Requirements). The project manager responsible for a subcontracted activity will ensure that all subcontractors make competent deliveries. This may include an audit of the contractor by the project manager. A Documentation Management Plan is part of the Project Management Plan and has the objective of verifying the application of the quality criteria throughout all the phases of the project.

ESA will be informed about the project in accordance with UCL project control requirements. Communications between the members of the consortium and ESA will be carried out in a manner that improves the flow of information between all project participants. E-mails, teleconferences and faxes will be used as a cost-effective and efficient means for frequent exchange of information. All project documentation and communications will be in English and available through the project web site. Formal communication to ESA will be provided by UCL, in accordance with the project rules as applicable. Any formal correspondence will be sent as hard copy; it may be sent electronically for information only. Informal communications are defined as routine technical correspondence or teleconferences that, upon agreement between Project Managers, can be exchanged between their subordinate key persons or engineers with copy to the respective project hierarchy. The project management approach is presented in detail in Chapter 3 of this proposal.

5 Coordination with Ongoing and Complementary Activities

Ensuring timely, open and free access to reliable, relevant and spatialized data is a necessary condition for the full deployment of a global agricultural monitoring capacity. Therefore, creating an operational global agricultural monitoring system of systems (satellite and in situ) and ensuring its sustained operation is a major ambition, of key importance for projects like Sen2-Agri. Of course, such a global agricultural monitoring system of systems can only be built on many different activities carried out in parallel and must build on international dynamics. Similarly, the completion of the Sen2-Agri objectives is only feasible thanks to multiple converging research and organizational efforts, some started a long time ago, others much more recently. The best example concerns the global scale of the project, which can be tackled because of the GEO Ag Community of Practice, the JECAM network and the GMFS project, the SPOT 4 (Take 5) experiment and the launches of Landsat-8 and Sentinel-2. The strategy of Sen2-Agri is of course to seize this unique momentum and to continue this convergence of efforts. This appears very natural to our consortium, as UCL-Geomatics and CESBIO have played pioneering roles for the GEO Ag CoP and JECAM, and for the SPOT 4 (Take 5) initiative, respectively.

From the research and methodological point of view, a suite of FP-7 agriculture monitoring projects in which UCL-Geomatics has been continuously involved has led to a certain maturity of dynamic cropland mapping. First, the Sen2-Agri approach builds on the Geoland-2 experience and on the results of the FP7-MOCCASIN project carried out with Russian colleagues. Effective synergies are expected with the recently started FP-7 IMAGINES, the EU GIO-Global Land Service and, even more, with the SIGMA project still under negotiation. UCL-Geomatics is a key partner in most of these projects and will coordinate the interactions.

Indeed, past and on-going projects of UCL-Geomatics have been an opportunity both to develop site-specific knowledge over different environments and to compile relevant data sets. Recently, EO data and field observations have been accumulated over vast study sites included in the JECAM network. For instance, in the Russian region of Tula (25 000 km²), two successive field campaigns were organized in 2012 and 2013 to collect crop type information and/or LAI measurements. Moreover, the acquisition of a unique data set of 14 regional coverages in RapidEye and Radarsat-2 (ScanSar Narrow) is currently in progress (April to July 2013). The South African JECAM site located in the Free State is part of the SPOT 4 (Take 5) experiment: from February to May 2013, high resolution images were collected every 5 days. The link between our Sen2-Agri project and the JECAM initiative has been highlighted in various sections of this proposal.

From the software point of view, several developments for Sentinel-2 have already been made on the basis of the Orfeo Toolbox. It is clear that the experience, the knowledge and, whenever contractually possible, the direct interactions will also allow developing significant synergies and complementarities. For instance, the future of the Sen2-Agri solution will probably be linked with the evolution of the Sentinel-2 toolbox as a whole.

Last but not least, the Sen2-Agri project is strongly related to the agriculture monitoring systems already in place among the users involved in the demonstration phase. These systems are, for instance, the EU MARS, the ESA GMFS and the FP7 AGRICAB projects in Senegal; the EU MARS FoodSec and US FEWS Net systems in Kenya; and the FP-7 MOCCASIN project in Russia. These interactions go beyond the project framework and rely on many personal relationships, including almost ten PhD graduates from UCL-Geomatics who are working in the field of agriculture monitoring. Regular interactions with our US, Chinese (Dragon project ongoing) and Russian colleagues keep the expert network very active. More specifically, our consortium has regular interactions with the EU MARS unit at JRC.

In a broader sense, the Land Cover component of the ESA Climate Change Initiative (LC_CCI) is also partly related to this Sen2-Agri project, as it provides valid contextual information to further delineate agricultural lands. The continuously enriched global-scale experience will certainly serve this project. Furthermore, the high quality of the MERIS FRS land surface reflectance composite which will be delivered by the LC_CCI may be of interest for deriving phenological metrics from time series analysis. Conversely, the agricultural experience gained here in further discriminating the landscape will help to better delineate and characterize the croplands in the LC_CCI product, which corresponds to a request of the climate modelling community.

At a different scale, GEO is a potential focal point for the next step of any initiative in the domain. Strengthening these applications and actors, and engaging them through an international collaborative network, could contribute significantly to developing the capacity for increased transparency for agricultural and food commodity markets worldwide. GEO, launched at the 2002 Summit on Sustainable Development in South Africa, is a voluntary partnership of governments and international organizations whose main goal is to support sustainable management of the earth’s resources by making use of remote sensing. Its main vision is to build a Global Earth Observation System of Systems (GEOSS) through the coordination of remote sensing activities around the globe. As one of its strategic targets, GEO aims to expand application capabilities to advance sustainable agriculture, among others.

As explained above, the G20 launched in June 2011 the Global Agricultural Geo-Monitoring (GEOGLAM) and the Agricultural Market Information System (AMIS) initiatives. They asked the GEO Agriculture Community of Practice to implement GEOGLAM with the main objective of improving crop yield forecasts as an input to AMIS, in order to foster the stabilization of markets and increase transparency on agricultural production. The objectives of GEOGLAM are to:
• Enhance national agricultural reporting systems;
• Establish a “global” network of experts in agricultural monitoring;
• Create an operational global agricultural monitoring system of systems based on Earth Observation and in situ data.

In this global crop monitoring community, the Sen2-Agri project could play a predominant role in demonstrating the capabilities of the Sentinel-2 sensor to meet agricultural information needs at high resolution, in an operational context and over large areas.

6 User-oriented approach

6.1 User consultation plan

The SoW (AD.1) presents the organizations that have agreed to take part in the Sen2-Agri user group. They are listed in Table 6-1.

Acronym      Institutions
AAFC         Agriculture and Agri-Food, Canada
ARVALIS      ARVALIS (Institut du Végétal), France
ATIC         Alberta Terrestrial Imaging Centre, Canada
CAS          Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, China
CIRAD        Centre de coopération international en recherche agronomique pour le développement
FAO          Food and Agriculture Organization
FENAREG      Federation of irrigation associations
IFAD         International Fund for Agricultural Development
INTA         Instituto Nacional de Tecnologia Agropecuaria
CDU & REGI   Chouaïb Doukkali University & Réseau National des Sciences et Techniques de la Géo-Information, Morocco
RCMRD        Regional Center for Mapping of Resources for Development
NEOSS        National Earth Observation and Space Secretariat, South Africa
-            Ministry of agriculture and irrigation - Agricultural Statistics Department, Sudan
SRI          Space Research Institute of the National Academy of Science & State Space Agency, Ukraine
UCL          Université catholique de Louvain, Belgium
USGS         US Geological Survey, United States
WFP          World Food Programme

Table 6-1: Participants of the Sen2-Agri user group

As indicated in Table 6-1, several actors and types of users will be involved in the project, representing political, R&D and development institutions. Users associated with the pre-selected test sites (Table 4-1 in section 4.2.1 and Table 8-1 in section 7.1) complete the list. The structure to ensure a continuous dialogue in the different phases of the project includes three main areas:
1) Identification of specific user needs for product specifications (SoW and WP 1100):
   a. Initial users’ requirements established during a consultation exercise organized by ESA in April 2012 with about 50 members of the agricultural user and expert communities;
   b. Users’ requirements consolidation by an additional questionnaire sent to all users to address mainly (i) the main priorities of the user, (ii) the most wanted products and (iii) the technical details of every product, including the delivery mode;
   c. Users’ requirements consolidation by direct contact with the users.
2) Critical user review of the Sen2-Agri production process:
   a. Selected users will be invited to participate in the product development, implementation and validation, and asked to provide input and feedback at different points during the process.
3) User application and feedback mechanism from the users on the use of the products and related potentials and limitations (WP 4400-5500-6300):
   a. Users will use the generated products in their applications to provide first indications on their impact, added value, potentials and limitations;
   b. Final discussions with the users will yield feedback on the products and result in a set of recommendations to further improve agricultural monitoring beyond this part of the project.

6.2 Link with user requirements

The SoW (AD.1) provides the outline users’ requirements, which have been collected during the user consultation process in preparation of the project (Table 6-2). They are the minimum requirements that are to be satisfied by the activities of the project. How the proposal addresses each requirement is presented in the third column of the table.

UR UR detail Proposal compliance ID 1 Dynamic cropland masks The proposal mainly complies these Coverage: regional (e.g. East Africa) requirements (section 2.3) with two special Time period: historical records comments: required for detection of anomalies - some clarifications will be gathered Temporal frequency: seasonal during the Users’ requirements products, up to monthly updates consolidation (see WP1100 in section 4.1.1) Delivery time: 1-2 days after the end of composite period - the generic users requirements in term of timeliness and accuracy reported in Spatial resolution: 10-20 meters the Statement of Work will have to be Geometric accuracy: sub pixel further defined according to the location error respective context, i.e. farming system Thematic accuracy: 5 % (maximum complexity, already existing crop error of omission and commission of information, expected use of the cropland mask), cropland is defined information in the existing decision as currently cultivated land (irrigated making process. In other terms, an and rain-fed) updated version of the users’ Grid/Projection: UTM, WGS84 requirements will be defined (in

UCL\ELI-Geomatics Croix du Sud, 2 bte L7.05.16 B - 1348 Louvain-la-Neuve BELGIUM

Page 120 Sen2-Agri Ref. UCL – Geomatics – 2013 Part 2: Technical proposal Issue 0 Rev. 1 Date 2013/05/21

association with users) to make them specific to each site In any case, the URs expressed in the SoW are considered as real targets.

2 Cultivated crop type and area extent The proposal mainly complies these Coverage: national requirements (section 2.3) with two special Time period: historical records comments: required for detection of anomalies - some clarifications will be gathered Temporal frequency: during the Users’ requirements seasonal/annual products consolidation (see WP1100 in section 4.1.1) Delivery time: 1-2 weeks after end of composite period - the generic users requirements in term of timeliness and accuracy reported in Spatial resolution: 10-20 meters the Statement of Work will have to be Geometric accuracy: sub pixel further defined according to the location error respective context, i.e. farming system Thematic accuracy: 10 % (maximum complexity, already existing crop error of omission and commission of information, expected use of the area extent per crop information in the existing decision class), the area extent product needs making process. In other terms, an to be described with an error updated version of the users’ estimate. To be distinguished requirements will be defined (in crop classes are main crop types or association with users) to make them crop clusters (at least maize, wheat specific to each site and dominating crops of In any case, the URs expressed in the SoW are chosen regions), irrigated and rain- considered as real targets. 
fed crops Grid/Projection: UTM, WGS84 3 Vegetation status indicator The proposal mainly complies these Coverage: national, regional hot-spot requirements (section 2.3) with two special areas comments: Time period: historical records - some clarifications will be gathered required for detection of anomalies during the Users’ requirements Temporal frequency: decadal consolidation (see WP1100 in section products 4.1.1) Delivery time: 24 hours after decadal - the generic users requirements in term composite of timeliness and accuracy reported in the Statement of Work will have to be Spatial resolution: 10-20 meters further defined according to the Geometric accuracy: sub pixel respective context, i.e. farming system location error complexity, already existing crop Thematic accuracy: >85 % accuracy, information, expected use of the possible indicators: FAPAR, information in the existing decision vegetation indices making process. In other terms, an Grid/Projection: UTM, WGS84 updated version of the users’

UCL\ELI-Geomatics Croix du Sud, 2 bte L7.05.16 B - 1348 Louvain-la-Neuve BELGIUM

Page 121 Sen2-Agri Ref. UCL – Geomatics – 2013 Part 2: Technical proposal Issue 0 Rev. 1 Date 2013/05/21

requirements will be defined (in association with users) to make them specific to each site In any case, the URs expressed in the SoW are considered as real targets.

4 Composites of cloud free surface The proposal mainly complies these reflectance requirements (section 2.3) with two special Coverage: national to regional comments: Time period: current - some clarifications will be gathered Temporal frequency: 1-2 weekly during the Users’ requirements composites consolidation (see WP1100 in section 4.1.1) Delivery time: 24h after end of composite period - the generic users requirements in term of timeliness and accuracy reported in Spatial resolution: 10-30 meters the Statement of Work will have to be Geometric accuracy: sub pixel further defined according to the location error respective context, i.e. farming system Thematic accuracy: atmospherically complexity, already existing crop corrected surface reflectance, with information, expected use of the highly accurate cloud and information in the existing decision cloud shadow mask making process. In other terms, an Grid/Projection: UTM, WGS84 updated version of the users’ requirements will be defined (in association with users) to make them specific to each site In any case, the URs expressed in the SoW are considered as real targets.

5 A standard data format is requested The format selection will be done in close for all data and products and shall be collaboration with the users - WP 1100 defined in collaboration with the All products will be delivered to any kind of users users (e.g. GEOTIFF). - WP1100 and 1300 All higher-level products (UR1-3) is requested to be delivered for administrative/national entities 6 All provided products are requested The validation procedure follows international to be validated against in-situ data standards, including a confidence-building and quality controlled. process, an independent quantitative validation Uncertainties of the products shall be based on in-situ data and a comparison with specified and quality flags provided existing products (see sections 4.4.4 and 13.2) as part of the metadata. Products are delivered with quality flags which The documentation of the data characterize the uncertainties (section 2.3) production and validation shall be Documentation of the data production and publicly available. validation is made available, along with the products (WP 6100 in section 4.6.1)


7
Requirement: All data and products are required to be open and freely available, including a straightforward access. Tools and algorithms required for the production of the EO products should be developed under an open source code license.
Proposal response: The proposal complies with these requirements (WP 6100 in section 4.6.1 and Appendix B in section 14).

Table 6-2 : Outline Users’ Requirements consolidated from the initial requirements of the users listed in Table 6-1


7 Specification of satellite and in-situ data

7.1 Satellite data

7.1.1 SPOT

SPOT4-Take5 data will constitute the backbone of phase 1 as regards remote sensing data. With the exception of China, Belgium and Ukraine, at least one cloud free observation per month was obtained from the beginning of February to mid-May for all selected SPOT4-Take5 sites. Other images will be acquired before the end of the experiment on June 20th. Table 7-1 shows the number of cloud free (>80%) images obtained for each site and each month, until May 15th. These images will be provided as L2A products by the French Land Data Center. Figure 7-1 illustrates the SPOT4-Take5 images collected over the Argentina site.

Sites                    February                    March                       April   May
Argentina                3                           3                           4       >=2
Belgium                  0                           0.5                         2       >=0
China                    0 (1 with heavy aerosols)   0 (1 with heavy aerosols)   1       >1
Paraguay (near Iguazu)   2                           2                           4       >=1
Ukraine                  0 (2 with snow)             0 (3 with snow)             3       >=0
South Africa             3                           3                           3       >=2
Morocco (GMFS)           4                           1                           1.5     >=2
Ethiopia                 2                           2                           2       >=1
Madagascar               1                           2*0.5                       3       >=1
Tunisia                  3*0.25                      1                           3       >=3
Midi-Pyrénées            1.5                         1.5                         1.75    2x0.5
Maroc (JECAM)            4                           3                           4       >=2

Table 7-1 : Number of cloud free (>80%) images obtained for each site and each month, until May 15th


Figure 7-1 : Example of thumbnail SPOT4-Take5 images gathered above the Argentina site. The date is provided in the filename.

7.1.2 LANDSAT

In addition to the data collected through the SPOT4-Take5 experiment, Landsat imagery will also be considered to ensure a proper coverage of the Sen-2 Agri test sites. The emphasis will be put on LANDSAT 8 data. Since routine acquisition of these products started in mid-April 2013, they will take over from the Take5 acquisitions where necessary, although with a lower revisit frequency. In addition, Landsat 7 or even Landsat 5 data may also be considered in order to gather enough data to build appropriate Test Datasets (TDS) for the development of those Sen-2 Agri prototype products that require data over more than one crop season. Landsat data are distributed free of charge by USGS. They will be retrieved directly from the USGS archive using the Earth Explorer catalogue.

7.1.3 RapidEye

Beyond its involvement in the SPOT4-Take5 activities, ESA asked RapidEye to task data over the sites selected for this experiment. The underlying idea is to collect RapidEye imagery to fill in the potential gaps within the SPOT4-Take5 time series. As a result, RapidEye data should be available over the Sen-2 Agri sites. If necessary, these data will be ordered through the TPM procedure on a limited basis. For instance, on the Chinese site, 3 clear dates seem to be available (April 2, April 12 and May 5). We would therefore order something like:
• 4 images for the Take 5 sites in China, Belgium and Ukraine;
• 1 or 2 images for each of the other Take 5 sites under consideration in the project.
The estimated total number of requested RapidEye images should therefore be lower than 30.

7.2 In-situ data

The strategy for in-situ data acquisition is described in section 4.2.2 WP2200. This strategy will be applied for the whole project, including phase 3. In-situ data will rely as much as possible on the existing collection performed by users. The role of Sen2-Agri partners, mainly UCL and CESBIO, will first be to gather and organize the collected data which are useful for training and validating the algorithms and the developed system. A dialogue with the users will be established in order to identify possible gaps and to assess the quality of the data. When new data are to be collected, discussions will be held in order to define the protocols for data collection and management. UCL and CESBIO will also participate in some of the field campaigns, in particular in sites where users might need some help for protocol consolidation, use of new tools (e.g. hemispherical photographs), and training on data management software.

The in-situ data needed by the project are of three main types:
• Dynamic cropland mask: it will be based on a combination of ground surveys, ancillary data if they exist (e.g. forest inventories), satellite data, classification algorithms and photo-interpretation.
• Crop type and area: for the croplands and the main crop types, a field visit once a year, or twice for multiple cropping systems, is sufficient to capture the observation. The sampling strategy consists of long transects randomly selected along roads or tracks, where observations are recorded on a regular basis. To capture the diversity of field conditions related to meteorological, soil or topographic effects, several long transects will have to cross the whole region of interest. The use of survey equipment based on GPS will be promoted.
• Crop status: two options are of interest, namely random selection of parcels or, preferably, regular monitoring of a set of well identified parcels covering the whole range of situations. Crop status will be characterized through a set of measurements: Leaf Area Index (LICOR LAI 2000), Green Canopy Cover Fraction (hemispherical photographs), and height (cardboard). Additional qualitative information (e.g. phenological stages) will also be collected.

The management of in-situ data is an important part of the work. A solution successfully implemented by CESBIO is based on the PostgreSQL and PostGIS open source software, and can be interfaced with web-GIS (OGC standards).


Regarding the demonstration use cases with national coverage, a specific validation process will be discussed with the users, since the data to be acquired will depend on the users' main objective. For instance, if national statistics are the objective, the data to be collected will differ from those needed for multiple local or sub-national uses.


8 Preliminary list of test sites

As explained before (section 4.1.2), four different types of sites are currently considered in this proposal:
• During phase 1:
   o A set of 15 test sites for the phase 1, to be used in methods benchmarking and S2 data simulation;
• During phase 3:
   o A subset of at least 5 sites among the 12 for local scale demonstration and in situ validation of the different EO agricultural products;
   o 3 demonstration cases dealing with national coverage and national diversity, 2 of them corresponding to African countries;
   o Additional voluntary sites either for local scale demonstration or nationwide production, "voluntary" meaning that these sites can use the developed S2-Agri processing chain but without receiving any funding (working on a voluntary basis) or technical support (even if their participation in the training workshop could be foreseen).

Table 8-1 presents the preliminary list of sites proposed in this project. A more detailed characterization of these sites can be found in section 4.2.1.

ID   Site name         Location                 In-situ data              Remote sensing data
1    JECAM-1           Argentina                JECAM                     Take5, L8, RE
2    JECAM-2           Belgium                  JECAM                     Take5, L8, RE
3    JECAM-3           China (Shandong)         JECAM                     Take5, L8, RE
4    JECAM-4           Paraguay (near Iguazu)   JECAM                     Take5, L8, RE
5    JECAM-5           Ukraine                  JECAM                     Take5, L8, RE
6    JECAM-6           South Africa             JECAM                     Take5, L8, RE
7    GMFS-1            Morocco                  GMFS-1                    Take5, L8, RE
8    GMFS-2            Ethiopia                 GMFS-2                    Take5, L8, RE
9    JECAM & Take5-2   Madagascar               JECAM                     Take5, L8, RE
10   Take5-2           Tunisia                  -                         Take5, L8, RE
11   -                 JEC CSudmipyO            Submitted as JECAM site   Take5, L8, RE
12   -                 CMaroc                   Submitted as JECAM site   Take5, L8, RE
13   -                 Russia                   Submitted as JECAM site   L8
14   -                 Kenya                    Submitted as JECAM site   L8, Aster, SPOT
15   -                 Senegal                  GMFS site                 L8, SPOT

Table 8-1 : Preliminary list of test sites


9 Preliminary list of algorithms

This section presents a preliminary list of algorithms and strategies which will most likely be used and benchmarked during the project. We recall the distinction between algorithms and strategies made in section 4.3.2.1.2:
• algorithms are elementary building blocks of a processing chain;
• a particular choice of a set of algorithms for the different steps of a processing chain is a strategy.

9.1 Cloud-free composite generation

Cloud free composites of surface reflectance (also called Level 3A products) may be useful for several reasons:
• they cover surfaces larger than that of a single satellite image;
• they can be provided at the same date every year and do not depend on a cloud free acquisition date;
• they enable a data volume reduction compared to the level 2A products (but they also represent a data loss compared to the level 2A);
• many algorithms for classification or segmentation do not easily handle the presence of data gaps in the time series.

Many algorithms may be used to produce a cloud free composite. Early remote sensing work relied on the well-known Maximum NDVI compositing method (RD.89) to obtain composite products from moderate resolution optical satellites. The main advantage of this method was that it selects, among the dates available in the compositing period, the one most likely to be cloud free. This method also did not require heavy computing resources. But this type of composite, which for each pixel selects only one date and discards the others, always results in very noisy surface reflectance composites, because:
• the selected date may have been acquired under a different viewing angle compared to the neighbourhood;
• the selected date may be affected by a cloud shadow or observed in the vicinity of a cloud;
• surface reflectance may have changed with time, and if the date changes from one pixel to the other, the image appears noisy.

More recently, other methods were suggested which use all the valid observations, either by averaging all the valid data during the compositing period (for instance the method used by UCL to produce the Globcover products (RD.90)), or by fitting a directional model to cope with the directional effects problem, such as the CYCLOPES composite (RD.91) developed by CESBIO and CNES, or the MODIS NBAR composite (RD.92). All these methods provide much better results than the basic max-NDVI method (see Figure 9-1).
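The difference between the two families of methods described above can be illustrated with a minimal sketch. It assumes the time series is held as NumPy arrays of shape (dates, rows, columns) together with a boolean validity mask; the function names and array layout are illustrative assumptions, not the implementation of any of the cited products.

```python
import numpy as np

def max_ndvi_composite(red, nir, valid):
    """Per-pixel selection of the single date with the highest NDVI.

    red, nir: float arrays of shape (T, H, W); valid: bool array, same shape.
    Returns the composited red and NIR bands and the index of the selected date.
    Pixels with no valid date fall back to date 0 (not handled further here).
    """
    den = nir + red
    ndvi = np.where(den > 0, (nir - red) / np.where(den > 0, den, 1.0), -np.inf)
    ndvi = np.where(valid, ndvi, -np.inf)      # invalid dates can never win
    best = np.argmax(ndvi, axis=0)             # (H, W) index of selected date
    red_c = np.take_along_axis(red, best[None], axis=0)[0]
    nir_c = np.take_along_axis(nir, best[None], axis=0)[0]
    return red_c, nir_c, best

def mean_composite(band, valid):
    """Per-pixel average of all valid observations (NaN where none is valid)."""
    count = valid.sum(axis=0)
    total = np.where(valid, band, 0.0).sum(axis=0)
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)
```

In the selection approach, the returned date index can differ between neighbouring pixels, which is the source of the noise discussed above; the averaging approach uses every valid observation and has no such per-pixel date switch.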


Figure 9-1 : Comparison of the noise of a Max NDVI composite (left), with a CYCLOPES composite (right), from VEGETATION data. The noise is much higher in the Max-NDVI composite (the black spots on the CYCLOPES composite are due to missing data since the algorithm required at least 3 observations)

Compared to very wide field of view instruments, the problem is considerably simplified in the case of Sentinel-2, for two main reasons:
1. the cloud detection is much more accurate when performed at a high resolution, with images acquired with viewing angles close to nadir, and with a large diversity of spectral bands including the 1.38 µm spectral band, able to detect thin cirrus clouds;
2. the directional effects, although still present with Sentinel-2, are largely reduced thanks to the limited viewing angle of the acquisitions.

However, since the revisit cycle is not daily but 5 days (10 days in the initial phase with one satellite), the number of valid dates is somewhat reduced.

The document (RD.12) suggests three compositing methods:
• priority to the most recent pixel;
• priority to the temporal homogeneity;
• priority to the radiometric quality.

Although based on good principles, these three methods are quite far from the state-of-the-art and do not seem to have been tested on real data. All three methods consist in selecting the best pixel based on various criteria. These methods should result in large discontinuities in areas where the selected date changes. They are therefore very noisy, and they result in a large data loss, since part of the acquired information is not used. Moreover, as each method selects only one date, but may select a different date for each pixel, the centre date of the compositing period does not represent the actual date of the data in the product. CESBIO has already experimented with the first suggestion, as it is the one used for the composite image that serves as reference for the cloud detection and the aerosol estimates in the MACCS processor. In the many composites of this kind that have been inspected, large discontinuities are frequently observed where different dates are used (see Figure 9-2).


Figure 9-2 : Comparison of level 2 product (left) and "most recent pixel" Level 3 composite, at the same date, from SPOT4-Take5 data. Places where clouds or shadows are found in the level 2A are replaced by pixels from an older date, for which the ground was much wetter (after some rain), so that large discontinuities can be observed. These discontinuities are acceptable for cloud detection, but would not be accepted in a Level 3A product.

The (RD.12) methods also do not address the problem of directional effects, which are not completely negligible in the Sentinel-2 case (see for instance the Sentinel-2 symposium conclusions [RD.93]). Although the (RD.12) methods could easily be implemented by the consortium, it is intended to implement a much more accurate method, already partly implemented in the Venµs ground segment. This strategy is made of 2 steps:
1) Produce a Level 3A composite (with directional effect correction) based on a weighted average (implemented in Venµs' ground segment);
2) Use a gap filling method to fill the pixels for which no clear observation is available during the compositing period (implemented in CESBIO's classification processor).

9.1.1 Step 1: compositing

The proposed Level 3 method implements, for each pixel, a weighted average on the valid dates selected within the compositing period. For each pixel, the valid dates are the dates which are not flagged as clouds or cloud shadows by the Level 2A. The weights we already have implemented mainly depend on:
• the optical thickness estimated in the Level 2 product (the lower, the better);
• the distance to a cloud or to a shadow (as the cloud edges are always fuzzy and as the clouds cause large adjacency effects which are difficult to correct);
• the percentage of clouds in the image.

Figure 9-3 shows an example of this weighted average approach.
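The weighted average described above can be sketched as follows. This is only an illustration under stated assumptions: the text names the three quantities that drive the weights but not their functional form, so the product of three linear terms used in `date_weight`, the thresholds `aot_max` and `dist_max`, and all function names are hypothetical. The directional effect correction is not included.

```python
import numpy as np

def date_weight(aot, cloud_dist_px, cloud_fraction, aot_max=0.8, dist_max=50.0):
    """Hypothetical weight in [0, 1] for one observation.

    aot: aerosol optical thickness (lower is better);
    cloud_dist_px: distance to the nearest cloud or shadow, in pixels;
    cloud_fraction: fraction of the image covered by clouds, in [0, 1].
    The functional form and the default thresholds are assumptions.
    """
    w_aot = np.clip(1.0 - aot / aot_max, 0.0, 1.0)
    w_dist = np.clip(cloud_dist_px / dist_max, 0.0, 1.0)
    w_cloud = 1.0 - cloud_fraction
    return w_aot * w_dist * w_cloud

def weighted_composite(refl, valid, weights):
    """Weighted average over the valid dates of each pixel.

    refl, weights: float arrays of shape (T, H, W); valid: bool, same shape.
    Returns the composite (NaN where no valid date exists) and a boolean
    mask telling which pixels received a value.
    """
    w = np.where(valid, weights, 0.0)
    wsum = w.sum(axis=0)
    num = (w * np.where(valid, refl, 0.0)).sum(axis=0)
    filled = wsum > 0
    composite = np.where(filled, num / np.where(filled, wsum, 1.0), np.nan)
    return composite, filled
```

Pixels left unfilled by `weighted_composite` are those for which no clear observation exists in the compositing period; they are the ones a gap filling step would have to address.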


Figure 9-3 : Weighted average composite derived from FORMOSAT-2 data. The two available images are both partly cloudy. The resulting composite shows no artefacts. Some pixels in the composite were flagged as cloudy in both input products. They are flagged as cloudy in the composite, but their reflectance is filled using a minimum blue reflectance criterion.

The method developed for Venµs can be reused for the test data sets but requires the implementation of a directional effect correction. We will thus use the method of Vermote [RD.94]. This correction is being tested in the framework of the NASA-CESBIO collaboration, and has already been successfully applied to compare MODIS and Formosat-2 time series [RD.95]. The method will also be tested on several SPOT4-Take5 sites which have a common zone observed from two adjacent orbits (Maricopa, Sudmipy, BretagneLoire, ProvLanguedoc), and thus under different angles at a one day interval.

9.1.2 Gap filling

The aim of the gap filling is to obtain a reflectance value for the pixels for which no clear observation is available during the compositing period. Gap filling algorithms based on different approaches are readily available at CESBIO and will be assessed.


9.1.2.1 Temporal interpolation approach

The proposed method consists of a combination of linear and Savitzky-Golay interpolation [RD.96], using a weighted version of the smoothing filter, which is particularly well suited to least-squares polynomial fitting on frames of noisy data, as is the case here. The filter is local (of degree k): for each invalid value (as established by the weights), it is applied to a series of at least k+1 points, and it admits unequally spaced time points. The process is illustrated in Figure 9-4 to Figure 9-6 by showing one band of information for a single date, its corresponding cloud mask (detected with the MTCD method [RD.97]), and the resulting band after processing. The process is applied to all spectral bands and to the whole time series in order to produce the final result with the chosen time sampling.

Figure 9-4 : Initial information with clouds and cloud shadows

Figure 9-5 : Cloud data mask obtained with the MTCD method


Figure 9-6 : Resulting cloudless data after interpolation

When the number of available points is lower than the number of points needed to perform the interpolation as defined by the degree, linear interpolation is applied. Two examples of images affected by clouds and cloud shadows are shown in Figure 9-8 and Figure 9-10, and the resulting data after processing are given in Figure 9-9 and Figure 9-11, respectively. A one-pixel example of the interpolation process in time is given in Figure 9-7.
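The principle of this temporal interpolation can be sketched for a single pixel and a single band as follows. This is a simplified illustration, not the CESBIO implementation: it fits a weighted least-squares polynomial of degree k to the valid neighbours of each invalid date (the core idea of a Savitzky-Golay filter, here with unequally spaced dates), and falls back to linear interpolation when fewer than k+1 valid points are available. The function name, the `half_window` parameter and the use of zero weights to mark invalid dates are assumptions.

```python
import numpy as np

def fill_gaps(t, y, weights, degree=2, half_window=3):
    """Replace invalid samples of one pixel time series.

    t: acquisition times (increasing, possibly unequally spaced);
    y: reflectance values; weights: per-date weights, 0 meaning invalid
    (cloud, shadow or missing data). Valid samples are returned unchanged.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    valid = w > 0
    out = y.copy()
    if not valid.any():
        return out                      # nothing to interpolate from
    for i in np.flatnonzero(~valid):
        lo = max(0, i - half_window)
        hi = min(len(t), i + half_window + 1)
        sel = np.flatnonzero(valid[lo:hi]) + lo   # valid dates in the window
        if sel.size >= degree + 1:
            # Local weighted polynomial fit, with time centred on t[i] so that
            # the constant coefficient is the fitted value at the gap.
            coeffs = np.polyfit(t[sel] - t[i], y[sel], degree, w=w[sel])
            out[i] = coeffs[-1]
        else:
            # Not enough points for the chosen degree: linear interpolation.
            out[i] = np.interp(t[i], t[valid], y[valid])
    return out
```

Applying the function to every pixel and band, and then resampling the filled series at the chosen dates, would give the regular time sampling mentioned in the text.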

Figure 9-7 : One-pixel evolution example: data values identified as erroneous or missing (in red, outside the fitted lines) are interpolated to their corresponding new values (in green, on the fitted lines). Reflectance values are plotted against acquisition time, expressed relative to midnight Coordinated Universal Time (UTC) on January 1st 1970 (Unix time)


Figure 9-8 : Initial data affected by clouds

Figure 9-9 : Resulting cloudless information after processing


Figure 9-10 : Initial data affected by clouds

Figure 9-11 : Resulting cloudless information after processing

9.1.2.2 Fusion with mid-resolution imagery

The temporal interpolation approach presented in the previous section has the main advantage of not needing any other imagery than the Sentinel-2 data. However, the interpolation used may not be satisfactory enough in the case of long gaps in the temporal dimension. An interesting approach consists of using imagery coming from sensors with higher temporal revisit, which have however a lower spatial resolution. Time series data fusion methods between high spatial resolution systems and high revisit frequency systems have proved to provide interesting results. The Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM) [RD.98] successfully blends the reflectance of Landsat data (30m) with reflectance of MODIS data (500m). STARFM estimates image reflectance at Landsat resolution with the daily revisit frequency of MODIS.


This method was chosen as one of the fusion techniques applied in CESBIO's Pro-Fusion [RD.99] project, to predict and assess the fusion results of future systems such as PROBA-V and Sentinel-2 for land cover monitoring. In this project, the STARFM algorithm was improved in 3 aspects [RD.100]: first, the number of parameters is reduced; second, the computational efficiency is increased; and third, some edge cases where the original algorithm fails are dealt with. OTB provides other image fusion algorithms, for instance UCL's Bayesian Data Fusion [RD.101].

9.1.2.3 Definition of the compositing period

Once the method is selected, the compositing parameters need to be tuned. The SoW suggests implementing 1-2 weekly composites (in table A2), or monthly composites (in table 1). The early results of SPOT4-Take5 show that even with a 5-day revisit, it is very difficult to obtain at least one clear date per month for each pixel over many sites (even in Tunisia in February). It thus seems difficult to provide weekly composites in phase 3, when only Sentinel-2A will be available (LANDSAT 8 is also available, but its imaging capacity is only one fourth of Sentinel-2's). It is therefore proposed to separate:
• the periodicity of the composites (15 days or 1 month);
• the compositing period (the composite of the 15th of March could use data taken between the 15th of February and the 15th of April).

The baseline is to provide monthly composites obtained using one month of data, but using the SPOT4-Take5 data set, these 2 parameters will be tuned in order to find the most suitable set for the production of land cover maps and biophysical parameters. In this respect, the possibility of implementing a region-specific parameterization of the compositing parameters will also be considered. This approach has been successfully implemented in the previous ESA GlobCover and CCI projects (RD.171 and RD.173). Such an approach defines areas which are found to be homogeneous from the ecological (cloud cover, vegetation seasonality, etc.) and remote sensing (observation conditions) points of view. The processing chain is then run independently for each delineated equal-reasoning area, using the same algorithms for all the areas but with parameters specific to each area. In other terms, such a stratification-based approach allows a regional tuning of the processing parameters (such as the compositing period) while ensuring globally consistent products.

9.2 Crop and vegetation status mapping

This section gives an overview of the algorithms that will be thoroughly analysed during the project in order to produce the requested thematic products. The Sen2-Agri EO products can be generated using Level 2 or Level 3 (cloud-free composites) products. The advantage of using Level 3 products is the availability of a regular sampling in the time dimension, which simplifies the processing chains and enables the reuse of the same classifiers from one season to the next. However, depending on the quality of the Level 3 products, in some cases, and for some particular classes, it may be more robust to use the Level 2 images. In any case, it is proposed to conduct a sensitivity analysis to choose the input image data. After a presentation of the processing chain, this chapter describes the Sen2-Agri EO products, then gives some insight into the different building blocks of the processing chain and finally discusses the need and the modalities for obtaining reference data for learning and validation of the algorithms.

9.2.1 Existing classification system

9.2.1.1 At CESBIO

CESBIO has long experience in the production of land cover maps using multi-temporal high-resolution multispectral optical satellite data. Projects such as the Regional Spatial Observatory are consumers of this kind of mapping, which CESBIO produces using SPOT, Landsat, Formosat-2 and RapidEye time series. Validated processing chains allow the production of very detailed and accurate land-cover maps such as the one illustrated in Figure 9-12. It contains 70 classes and was produced using cloud free Landsat data over 12 months (3 images). This procedure still needs some manual operations for the selection of acquisition dates and of the appropriate training samples.

Figure 9-12 : Land-cover map of the South-West of France

The efficient production of land cover maps for large areas based on the exploitation of the volumes of data produced by Sentinel-2 is challenging in terms of processing costs. In general, conventional methods make use of supervised approaches, target specialised local models for determined problem areas, or include complex physical models. These approaches, expensive in terms of processing time when a generalisation to large areas is envisaged, are thus inefficient for the exploitation of data sets such as those provided by Sentinel-2.


As a consequence, there is a need for the implementation of methods that allow for generalisation to large scale areas with high temporal and spectral resolutions, i.e. methods that are accurate, fast, robust and that require minimal supervision. CESBIO has been preparing the exploitation of Sentinel-2 data for land cover for several years, based on the acquisition of Landsat, SPOT, Formosat-2 and now SPOT4-Take5 time series. The tangible output of this work is a fully automatic mapping chain which integrates several optimisations in order to take into account:
• the spatial variability of landscapes present in a Sentinel-2 scene;
• the presence of cloudy pixels;
• the lack of reference (training) data in terms of quality and quantity;
• the huge volumes of image data to be processed.

Figure 9-13 illustrates the architecture of the processing chain.

Figure 9-13 : Block diagram of CESBIO's classification chain

A key feature of this processing chain is the possibility of using ancillary data both as input features and as a guide for the selection of training samples in order to identify the most pertinent ones [RD.102]. This allows a fully automatic map production work-flow. Figure 9-14 presents an overview of an output of CESBIO's classification processing chain covering an area of about 500x150 km in the South of France from Landsat data acquired in 2009. Figure 9-15 and Figure 9-16 show full-resolution close-ups.


Figure 9-14 : Example of land-cover classification in the South of France

Figure 9-15 : Example of land cover classification


Figure 9-16 : Example of land cover classification

9.2.1.2 At UCL

UCL has been involved in global land cover mapping activities for years, being in charge of the classification pillar of the ESA GlobCover and CCI Land Cover projects. Even if the spatial resolution of the time series used in these projects is far from that of Sentinel-2, these projects have allowed UCL to develop key experience in (i) global classification approaches and (ii) handling large amounts of data. The processing chain developed in the framework of the GlobCover project had to be fully automated and based only on MERIS Full Resolution (FR) time series. It was organized in 4 main steps (Figure 9-17), consisting of:
• A per-pixel spectral classification algorithm;
• A per-cluster temporal characterization;
• A per-cluster temporal classification algorithm;
• An automated labelling procedure.


Figure 9-17 : General scheme of the GlobCover classification methodology

Within the CCI Land Cover project, there was no need for full automation, and time series from different sensors and from multiple years could be used. As a result, the GlobCover classification chain evolved into something a bit more complex. The major update was the decision to use multiple years to produce one product. Classification results were quite promising (Figure 9-18 and Figure 9-19).

Figure 9-18 : Overview of the CCI land cover products, over US


Figure 9-19 : Overview of the CCI land cover products, over Central and South America

This is of course not directly applicable to the Sen2-Agri products, but it provided a great deal of experience in handling very large amounts of data. As an example, Table 9-1 summarizes the jobs and tasks run in the pre-processing chain and provides the corresponding amount of input and output data.

Workflow step                  Bulks   Jobs     Tasks     Inputs    Outputs
Input MERIS FRS+RR 2003-12     -       -        -         150 TB    -
auto-QA + inventory            -       20       210020    210000    20
QL daily                       1       20       217300    210000    7300
QL scenes                      -       20       210000    210000    210000
visual QA screening inputs     -       -        -         7300+     -
AMORGOS geocoding              1       240      210000    210000    210000
Level 2 SDR processing         -       240      210000    210000    210000
Level 3 SR 7-day composites    1       1040     247440    210000    1040000
QL SR                          -       1040     1041040   1040000   1040
visual QA screening outputs    -       -        -         1040+     -
SR result export, packing      (1)     (1040)   -         -         43 TB
Sum                            3       2620     2345800   -         -

Table 9-1: Summary of jobs and tasks managed in the pre-processing chain of the CCI Land Cover project

9.2.2 Products

Three agricultural products will be generated by the processing chains: a dynamic cropland mask, a cultivated crop type and area map and a vegetation status map. 1) Dynamic cropland mask This is a binary mask resulting from the detection of the cropland areas. The product must be delivered before the end of the season using the images of at least one season. The spatial resolution will be 20 m. Target error is between 5% and 10%. It is proposed to deal with this product as a near-real-time deliverable, that is to say, to update it every time a new image (either an acquisition or a cloud-free composite) is available. The consortium’s experience with near-real-time classification shows that in some cases, good performances in term of classification can be obtained early on in the season if a good image time series is available [RD.103]. Other approaches allow detecting croplands before the emergence of the vegetation by detecting the preparation work on the fields (ploughing, tillage) using change detection techniques [RD.104]. It has also been shown that the knowledge of the land-cover map from previous seasons allows very accurately predicting the type of cover for the current season in early steps (about March or April in North hemisphere temperate areas) [RD.105]. Finally, change detection approaches using the synergy between HR and MR imagery (using the methods developed by Robin et al. [RD.106] and available at CESBIO ready to be integrated into OTB) may help to have robust change detection and therefore an accurate cropland mask. Other algorithms for change detection can also be used as for instance [RD.107] or [RD.108]. 
2) Cultivated crop type classification and area

This is a very challenging product since:
• it has to be delivered at 10 m resolution (which is the highest image resolution available for Sentinel-2);
• it is a multi-class product (several crop classes with discrimination of irrigated and rain-fed crops);
• and the target error rate per class is between 5% and 10%, which is more demanding than the best accuracy achieved by state-of-the-art operational systems.


The map production chain will most likely be based on supervised classification approaches (more details are given in section 9.2.4) in order to take into account the phenological variability introduced by landscape and eco-climatic differences between the sites. Nevertheless, fully unsupervised approaches based on spectral rules (decision trees) can be studied. For instance, the approach of Baraldi [RD.109] is integrated in OTB.

Agricultural practices can be modelled, and crop sequences across seasons can be used to dramatically increase the accuracy of land cover maps. This kind of information is available in most European countries. For other countries, global databases are starting to become publicly available. For example, [RD.110] produced a global database of crop planting and harvest dates for 19 major crops at country level (see Figure 9-20, Figure 9-21 and Figure 9-22). Such databases can be combined with other data (soil maps, agricultural statistics such as those collected by the FAO (http://faostat.fao.org), DEMs, etc.) in order to produce crop maps over an entire country. This information needs to be combined with finer resolution data, if they exist, and with the knowledge of local authorities and managers.

Figure 9-20 : Planting dates for winter wheat

Figure 9-21 : Planting dates for maize


Figure 9-22 : Harvest dates for maize

This means that knowledge of the agricultural system of a specific area should be progressively collected, at the finest possible scale, and managed in a geospatial database. This will make it possible, for example, to stratify the territories into regions with rather homogeneous agricultural practices and growth conditions. When combined with easily available information on weather conditions, this will allow the probable status of crops to be estimated (shift in phenological stages, lower LAI as a result of drought, etc.).

3) Vegetation status

Vegetation status can be assessed through the estimation of basic biophysical variables, mainly LAI and FAPAR. These approaches are operational for LR and MR imagery. Land surface indicators related to crop functioning can also be estimated using remote sensing data with high temporal and spatial resolution, such as Sentinel-2, RapidEye and Landsat 8. The consortium has methods and experience for both of these approaches, as detailed in section 9.2.4.3.2.

9.2.3 Needs in terms of reference data and product validation

Although specific WPs for the collection of reference data for the different test sites have been designed, the algorithm selection phase has specific needs which are described here. In addition, the benchmarking exercise (WP 3300, see sections 4.3.3 and 10) will implement product validation tasks which are conceptually different from the validation of products aimed at end users and generated in a (pre-)operational phase.

9.2.3.1 Specific needs of reference data for algorithm selection

The algorithm selection and comparison tasks will need reference data early in the project and should not strongly depend on the data collection task (WP2200, described in section 4.2.2). It is proposed here to apply the approaches implemented for the land cover map production system described in section 9.2.1.1.


• Data collection

Two main strategies for data collection without in-situ campaigns can be used. The first one assumes the existence of ancillary data and digital maps, and the second one is based on photo-interpretation.

o Use available ancillary data

This approach can be implemented where topographical databases are available. This is the case for most European countries, and it is the approach used at CESBIO for France, where a data fusion process is applied using:
• the French Geographical Institute topographical databases, which are used for urban areas, water bodies and infrastructures;
• the Registre Parcellaire Graphique, which provides farmers' declarations of crop types per field;
• and the forest species provided by the Inventaire Forestier National.

An automatic GIS work-flow has been implemented which produces a reference data set for a given year (in order to take into account crop types) over a given area. This work-flow is generic and can be adapted to other sources of ancillary data when available.

Of course, the availability of ancillary data cannot be assumed for many of the proposed test sites before the in-situ data collection work. Furthermore, the reference data generated with the approach presented above may need manual corrections and tuning. This is why it is also proposed to use assisted photo-interpretation for test data production.

o Photo-interpretation

The amount of data needed for the algorithm selection and the associated benchmarks is lower than that needed for the production of maps in an operational context. This makes it feasible to use photo-interpretation (manual sample selection over satellite imagery) to build reference data sets. This is a classical approach in the prototyping phase of a processing chain. Since this is a tedious and error-prone task, several approaches have been proposed in recent years to assist the human operators.

Among those, active learning techniques [RD.111] are very efficient. The main idea of these approaches is to use classification techniques to learn from a reduced set of samples initially selected by the operator, in order to predict (by supervised classification) the most interesting and useful samples. In this way, the operator can quickly validate samples labelled by the classifier. A very interesting aspect of the approach is the possibility of using the classifier to select the samples which are difficult to classify, and therefore to concentrate the work of the human operator on tasks where her skills are needed. Active learning techniques have been implemented in OTB and are therefore ready to be used for this task.
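The sample selection idea behind active learning can be illustrated with a minimal sketch. It is not the OTB implementation: it assumes a simple nearest-centroid classifier, an uncertainty measure based on the margin between the two closest class centroids, and toy (red, NIR) reflectance values. All function and variable names are hypothetical.

```python
import math

def centroids(samples, labels):
    """Mean feature vector of each class in the labelled set."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def margin(x, cents):
    """Gap between the two smallest centroid distances (small = ambiguous)."""
    d = sorted(math.dist(x, c) for c in cents.values())
    return d[1] - d[0]

def query_most_uncertain(labelled, labels, pool, n_queries):
    """Indices of the unlabelled pool samples the operator should label next."""
    cents = centroids(labelled, labels)
    ranked = sorted(range(len(pool)), key=lambda i: margin(pool[i], cents))
    return ranked[:n_queries]

# Toy data (assumed values): (red, NIR) reflectances labelled by the operator.
labelled = [(0.05, 0.45), (0.06, 0.50), (0.20, 0.25), (0.22, 0.28)]
labels = ["crop", "crop", "bare_soil", "bare_soil"]
# Unlabelled pool: samples 1 and 3 lie between the two classes.
pool = [(0.05, 0.48), (0.13, 0.37), (0.21, 0.26), (0.12, 0.36)]
to_label = query_most_uncertain(labelled, labels, pool, n_queries=2)
print(to_label)
```

In an operational loop, the operator would label the returned samples, they would be added to the labelled set, and the selection would be repeated until the classifier is stable.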


• Reference data validation

Reference data has itself to be validated, since it can contain errors and outliers. It is important to distinguish between these two concepts, which are usually treated as a single one, with the risk of introducing biases in the final product validation. Errors in the reference data are incorrectly labelled data. Outliers are correctly labelled data whose behaviour strongly differs from that of the majority of samples of the same class.

Errors can be random or systematic, and their presence depends only on the data collection protocol. Therefore, measures can be taken during reference data collection in order to reduce their number. After this, the remaining errors should be statistically negligible and should therefore not induce errors in the classification.

Outliers can produce bad learning results during the training phase of supervised classifiers and need to be identified, but not eliminated. Indeed, the presence of outliers in the data may indicate a wrong choice in terms of class nomenclature (see footnote 1) or the need to introduce particular decision rules for a given class.
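One possible way to identify outliers without eliminating them is sketched below. It is an illustration under stated assumptions, not a method prescribed by the proposal: each sample is compared with the median of its own class using a robust (median absolute deviation) score, and the cut-off of 3.5 and the toy NDVI values are arbitrary choices.

```python
from statistics import median

def flag_outliers(values, labels, threshold=3.5):
    """Flag samples far from their own class median (robust z-score).

    Samples are only flagged, never removed, so that an analyst can decide
    whether the class nomenclature needs to be revised.
    """
    flags = [False] * len(values)
    for cls in set(labels):
        idx = [i for i, y in enumerate(labels) if y == cls]
        med = median(values[i] for i in idx)
        mad = median(abs(values[i] - med) for i in idx)
        if mad == 0:
            continue  # no spread in this class: nothing can be flagged
        for i in idx:
            # 0.6745 scales the MAD so that the score is comparable to a z-score
            score = 0.6745 * abs(values[i] - med) / mad
            flags[i] = score > threshold
    return flags

# Toy reference samples (assumed values): peak NDVI per field and its label.
ndvi = [0.80, 0.78, 0.82, 0.79, 0.35, 0.15, 0.12, 0.14, 0.13]
labels = ["maize"] * 5 + ["bare_soil"] * 4
flags = flag_outliers(ndvi, labels)
print([i for i, f in enumerate(flags) if f])
```

Here the fifth maize sample is flagged for inspection: it may be a labelling error, or a genuine maize field behaving differently (for instance a failed crop), which would argue for a dedicated image class.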

9.2.3.2 Product validation

The algorithm selection and the subsequent benchmark implementation will need appropriate tools to describe the error budget and performance of the processing chain. It is proposed to apply current best practices from the machine learning and remote sensing literature:
• independent data sets for training and validation, and use of cross-validation approaches;
• implementation of the classical statistical indicators computed from confusion matrices: kappa index, overall accuracy, F-score, producer's and user's accuracy, etc.

We also propose to implement other error measures which are not usually used in academic papers but which are useful for detecting particular kinds of errors. For instance, in order to better understand the categories of errors present in the land-cover maps (and therefore be able to propose enhancements to the work-flow), the following analyses can be implemented:
• Use of stratified sampling for the error budget computation, which allows the influence of the stratification variable on the results to be detected. Stratification can usually be performed using topographic characteristics (slope, aspect, altitude).
• Computation of error indicators at different levels of detail of a hierarchical nomenclature, to assess the usefulness of introducing temporary sub- or super-classes during the classification.
• Sensitivity analysis in terms of the number of available images, the availability of ancillary data and the amount of errors in the reference data.

Finally, cross-site validations may be of interest to evaluate the generalisation capabilities of the processing chains and to gain some insight into the influence of inter-annual variability on a given site.

1 It may be necessary to split a single thematic class into several image classes for the classification and merge them together in a post-processing step.


A selection of a reduced set of indices will also be proposed at the end of this task.
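The classical indicators derived from a confusion matrix can be computed as in the following sketch. The matrix convention (rows are reference classes, columns are mapped classes), the function name and the example counts are assumptions made for illustration; the code also assumes that every class appears at least once in both the reference and the map.

```python
def accuracy_report(matrix):
    """matrix[i][j] = number of samples of reference class i mapped to class j."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]                  # reference totals
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # map totals
    diag = [matrix[i][i] for i in range(n)]                  # correctly mapped

    overall = sum(diag) / total
    # Agreement expected by chance, from the marginal totals
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    producer = [d / r for d, r in zip(diag, row_sums)]       # 1 - omission error
    user = [d / c for d, c in zip(diag, col_sums)]           # 1 - commission error
    fscore = [2 * p * u / (p + u) for p, u in zip(producer, user)]
    return {"overall": overall, "kappa": kappa,
            "producer": producer, "user": user, "fscore": fscore}

# Hypothetical 2-class example: cropland vs. non-cropland validation samples.
report = accuracy_report([[40, 10],
                          [5, 45]])
print(report)
```

Computing the same report separately for each stratum (for instance slope classes) gives the stratified error budget mentioned above.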

9.2.4 Processing chains

This section gives an overview of the main possible choices for each building block of a processing chain, most of which are readily available in OTB.

9.2.4.1 Input data selection

This step of the processing chain is aimed at choosing the data sources to be used for the map production. These data sources include the Sentinel-2 time series, but may also include other satellite image sources similar to Sentinel-2, such as Landsat 8. One can also foresee the use of medium resolution imagery (MODIS, PROBA-V and later on Sentinel-3), which can be useful for filling data gaps. This step is strongly linked to the cloud-free composite data selection processing (section 9.1). Finally, ancillary data such as Digital Elevation Models, meteorological data and existing digital maps (land cover for the previous seasons, soil types, etc.) can be integrated in the map production processing chain in order to introduce prior knowledge.

9.2.4.2 Feature extraction and selection

Once the input data (satellite images and possibly ancillary data) have been selected, there are several ways of using them. In the case of Satellite Image Time Series (SITS), one can simply stack the reflectance values for the different available dates or derive features which may be more suitable for the subsequent classification or estimation processes.

9.2.4.2.1 Feature extraction

Although, from the point of view of information theory, deriving new features from the input data does not increase the amount of knowledge available about the problem at hand, it is often useful to transform the input data in order to make the remaining processing steps easier. For instance, applying a non-linear transformation to the spectral bands (for instance, by computing a vegetation index like the NDVI) may allow a simpler classifier (a linear separator) to be used than if the raw reflectance values were used. Feature extraction therefore removes some of the burden from the classifier by introducing some prior knowledge about the phenomena at hand as a pre-processing step. The result is a system which is simpler in terms of processing complexity (i.e. a classifier with fewer coefficients) but also easier for a human user to understand.
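As a small illustration of this point, the NDVI, defined as (NIR - red) / (NIR + red), turns two reflectance bands into a single feature on which a plain threshold already acts as a classifier. The reflectance values and the threshold of 0.4 below are assumed for the example and are not taken from the proposal.

```python
def ndvi(red, nir):
    """Normalised Difference Vegetation Index from red and NIR reflectances."""
    return (nir - red) / (nir + red)

# Toy surface reflectances (assumed values): (red, NIR) per pixel.
pixels = {
    "dense_crop": (0.05, 0.45),
    "sparse_crop": (0.10, 0.30),
    "bare_soil": (0.20, 0.25),
}

# A single threshold on the derived feature replaces a decision rule
# that would otherwise have to combine the two raw bands.
VEGETATION_THRESHOLD = 0.4  # hypothetical value, to be tuned per site
is_vegetation = {name: ndvi(red, nir) > VEGETATION_THRESHOLD
                 for name, (red, nir) in pixels.items()}
print(is_vegetation)
```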


The features which can be computed from the S2 high resolution multispectral time series can be classified in 3 families:

1) Radiometric features, which perform arithmetic combinations of pixel values.
• Spectral indices are combinations of reflectances which enhance the ability to detect the presence of some materials like vegetation, water, bare soil, etc.
• Textures are local statistics of first (mean, variance) or second (Haralick-type) order which highlight the degree of variability of the pixels in a neighbourhood as well as their spatial organisation.

2) Temporal features, which exploit the fact that multi-temporal acquisitions are available and therefore that time profiles for the pixels or the image objects can be obtained.
• Time descriptors are features or statistics computed on a time profile, like its mean, its variance or even complex transforms based on Fourier or wavelet-like decompositions. Sometimes, these transformations can be flexible enough to take into account time distortions or time shifts [RD.112].
• Phenological indicators are time descriptors based on prior knowledge about vegetation (the main focus of the land cover maps), like onset and senescence dates, maximum vegetation index, etc. Figure 9-23 illustrates some of the indicators which can be derived from a vegetation index temporal profile. These are often obtained by fitting the observed data to analytical evolution models [RD.113]. More details are given in section 2)

3) Object features, which can be extracted after the images have been segmented. This allows Object-Based Image Analysis (OBIA) to be performed. Different kinds of features can be extracted:
• Radiometric features, which can be of the same kind as those computed at the pixel level, but averaged at the object level. Higher order statistics on these features (e.g. variance) can also be used in order to take into account within-object variability.
• Shape features, which characterise the size, the perimeter and other characteristics associated with the spatial aspect of the region.
• Adjacency features, which relate the object to its neighbouring context.


Figure 9-23 : Example of phenological descriptors which can be extracted from a temporal profile

Of course, these 3 types of object features can be extended to the multi-temporal dimension. OTB provides all the above-mentioned radiometric and object-based features and some tools for fitting time profiles, which will allow the computation of temporal features.

4) Ancillary data

Although not extracted from the Sentinel-2 satellite images, ancillary data (Digital Elevation Models, soil maps, cadastral information, roads and any other kind of topographical digital database) can be useful as input data for a classifier. For instance, croplands are seldom present at altitudes higher than a certain threshold. Other similar relations may hold with respect to the presence of roads or urban areas. Knowledge about the recent history of the area is often useful to increase classification accuracy, as for instance in [RD.105], where the use of agricultural databases of the 3 previous years noticeably improves the land-cover maps.

Since the available ancillary data is not the same in every country and its accessibility may not be guaranteed, the use of external data must not be mandatory in an operational system. However, global, freely (as in gratis and as in freedom) available databases exist, for instance the SRTM DEMs (for latitudes below 60 degrees) and the OpenStreetMap digital maps. The use of this kind of data will be evaluated.
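The temporal features listed under item 2) above can be illustrated with the following sketch. It deliberately replaces the model fitting mentioned in the text with a simpler technique: onset and senescence dates are taken as the linearly interpolated dates at which the profile crosses a fixed fraction of its amplitude. The function names, the fraction of 0.5 and the NDVI profile are assumptions.

```python
from statistics import mean, pvariance

def crossing(t0, v0, t1, v1, level):
    """Linearly interpolate the time at which the profile reaches `level`."""
    return t0 + (level - v0) * (t1 - t0) / (v1 - v0)

def temporal_features(days, values, fraction=0.5):
    """Time descriptors and simple phenological indicators of one profile."""
    vmin, vmax = min(values), max(values)
    level = vmin + fraction * (vmax - vmin)
    peak = values.index(vmax)
    onset = senescence = None
    for i in range(peak):                       # rising limb, first crossing
        if values[i] < level <= values[i + 1]:
            onset = crossing(days[i], values[i], days[i + 1], values[i + 1], level)
            break
    for i in range(len(values) - 1, peak, -1):  # falling limb, scanned from the end
        if values[i] < level <= values[i - 1]:
            senescence = crossing(days[i - 1], values[i - 1], days[i], values[i], level)
            break
    return {"mean": mean(values), "variance": pvariance(values),
            "max": vmax, "day_of_max": days[peak],
            "onset": onset, "senescence": senescence}

# Toy NDVI profile of one pixel (assumed values), indexed by day of year.
days = [60, 90, 120, 150, 180, 210, 240]
profile = [0.2, 0.3, 0.6, 0.8, 0.7, 0.4, 0.2]
features = temporal_features(days, profile)
print(features)
```

With real Sentinel-2 series the profile would first need gap filling and smoothing, which is where the fitted evolution models of [RD.113] come in.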

9.2.4.2.2 Image segmentation

As stated above when introducing the object-based features, image segmentation can be a useful processing step for the production of meaningful features for the classification.


The high resolution of Sentinel-2 images will give access to intra-field variability, which can be very useful for the discrimination of crop types. The relative position of objects and the spatial relations among them [RD.114], [RD.115] can also be useful for the correct interpretation of the image content. These approaches, available in OTB, have been successfully used in land cover map classifications [RD.116].

Image segmentation is understood as the partitioning of an image into spatially contiguous and disjoint regions which are considered to be homogeneous with respect to some criterion of interest (spectral, shape, texture, etc.). In optical remote sensing, segmentation can be seen as one possible strategy to model spatial relationships and dependencies. Accordingly, the extraction of image regions as primary visual components can support landscape change detection or land use/cover classification techniques.

In recent years, the availability of high resolution imagery has made a better interpretation of the remote sensing scene possible, since most objects have become larger than the image pixel size [RD.117]. Accordingly, developments in image segmentation have increased considerably over the last decades. Nevertheless, new challenges have appeared as more and more types of objects can be interpreted in the scene. Because more details can be distinguished in the image, the internal spectral variation within objects has increased, and the definition of the homogeneity criterion used to partition the image has become more complicated. The increased resolution of Sentinel-2 imagery with respect to its current counterparts (Landsat, SPOT) will give access to more detail at the individual field level.

Traditional image segmentation techniques are commonly divided into pixel-based, edge-based and region-based methods.
Beyond this categorisation, numerous other segmentation techniques (Markov Random Fields, fuzzy techniques, neural networks, etc.) are found in the remote sensing literature, as reviewed in [RD.118].

In pixel-based methods, the most elementary unit of the image, i.e. the pixel, is processed individually. These techniques make decisions based on local pixel information such as the intensity value. Processing the image at the pixel level can have important limitations, since a single pixel provides extremely local information. In the case of high resolution remote sensing imagery, numerous works have shown that the pixel-based approach can be insufficient because it cannot handle the internal variability of complex scenes [RD.119], [RD.120], [RD.121].

The second important family of methods consists of the edge-based techniques, which aim to find edges between regions and define the segments as the regions enclosed by these edges. From this point of view, edges are regarded as boundaries between image objects and are located where changes in values occur. In this context, one of the most popular algorithms is the Canny detector, which can be very useful for images containing few features. Unfortunately, in the case of high resolution imagery, these methods have important limitations because of the large amount of detail in the data. For this reason, methods such as the Canny or Sobel detectors are not commonly used nowadays for remote sensing segmentation. Nevertheless, this family of techniques, which is included in the Orfeo Toolbox, can be very useful for pre-processing stages such as linear feature extraction.


The last category involves region-based segmentation techniques, which overcome the limitations of the local decisions taken by pixel-based methods. The significant improvement introduced by these methods is explained by their use of the local properties of neighbouring pixels. Starting from a seed component (pixels or already existing small regions), region-based approaches test whether neighbouring components are similar to it. Following this strategy, region growing techniques such as the well-known watershed transformation [RD.122] have received a great deal of attention in remote sensing. Given the widespread use of this classical segmentation technique, OTB also includes this algorithm. Despite its popularity, this traditional technique lacks accuracy because it tends to produce over-segmented results. Accordingly, this approach is used less than other region-based segmentation methods. However, it can be very useful as a pre-processing step. For example, an initial over-segmentation of the image obtained by [RD.122] can be the first step of a region merging algorithm.

In recent years, region merging techniques have played one of the most important roles in the segmentation of remote sensing images. Starting from individual pixels or any other initial partition, region merging is an iterative process in which regions are merged according to a homogeneity criterion. In this category, one of the first methods was the ECHO (Extraction and Classification of Homogeneous Objects) algorithm [RD.123], whose merging criterion is based on a statistical likelihood homogeneity test. One of the main limitations of this approach is that deciding whether this test is satisfied requires setting a threshold. Thus, the accuracy of the segmentation results strongly depends on the selected threshold.
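The iterative merging principle and its dependence on the threshold can be illustrated on a one-dimensional profile. This is a minimal sketch and not the ECHO algorithm: the homogeneity criterion is assumed to be a plain difference of region means instead of a statistical likelihood test, and the function name and values are hypothetical.

```python
def region_merging(values, threshold):
    """Greedily merge adjacent 1-D regions whose means differ by less than `threshold`."""
    regions = [[v] for v in values]           # start from one region per pixel
    while len(regions) > 1:
        means = [sum(r) / len(r) for r in regions]
        gaps = [abs(means[i + 1] - means[i]) for i in range(len(means) - 1)]
        best = min(range(len(gaps)), key=gaps.__getitem__)
        if gaps[best] >= threshold:
            break                             # no sufficiently similar pair is left
        # Replace the most similar pair of neighbours by their union
        regions[best:best + 2] = [regions[best] + regions[best + 1]]
    return regions

# Toy transect (assumed values) crossing three fields with distinct NDVI levels.
transect = [0.10, 0.12, 0.11, 0.40, 0.42, 0.41, 0.80, 0.78]
fine = region_merging(transect, threshold=0.05)
coarse = region_merging(transect, threshold=0.35)
print(len(fine), len(coarse))
```

The stricter threshold recovers the three fields, while the looser one merges the first two, which is the sensitivity to the threshold described above. The successive merges also form a hierarchy of partitions from fine to coarse.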
Other, more robust region merging strategies are the Fractal Net Evolution Approach (FNEA) [RD.124] and the Mean Shift (MS) [RD.125] algorithms. The FNEA, used in the commercial eCognition software and recently implemented in OTB, initially assumes that each pixel represents one region and then iteratively merges adjacent regions satisfying a similarity criterion until convergence is reached. Similarly, the Mean Shift (MS) segmentation procedure available in OTB is a non-parametric technique which detects the modes of a density function and partitions an image into clusters, assuming each mode to be the centre of the corresponding cluster. In order to measure the similarity between neighbouring regions, both methods take the spectral information into account in their merging criteria. In addition, FNEA can use the scale, shape and compactness parameters, whereas MS takes into account the spatial distance between the regions.

At this point, it should be noted that these segmentation methods only take into account some of the possible features in their merging criteria. Other interesting features could be used to measure the similarity between regions: for instance, textural, temporal or semantic prior knowledge patterns could also be included in the merging criterion. In the case of texture features, although their characterisation is not simple and has a high computational cost, OTB enables the computation of the classical textural features detailed in [RD.126], which can easily be incorporated into MS segmentation results. The inclusion of temporal characteristics in the merging criterion can also capture significant patterns in some specific applications such as crop mapping. In this particular application, some crops may only have specific features during a short period of the year. Accordingly, images from more than one date are required in order to obtain an appropriate segmentation. As regards the inclusion of temporal information, the MS segmentation algorithm has shown its ability to handle remote sensing image time series [RD.127].

The introduction of prior knowledge features in the merging criterion is not straightforward. The main reason is that it requires photo-interpreter knowledge of the regions/objects of the image, which is not always available. On the other hand, if this information is available, OTB provides a number of heavily documented functionalities to integrate data from GIS and mapping systems.

Despite the promising improvements that can be made to region merging criteria, current region merging algorithms already obtain acceptable results. Furthermore, these popular techniques can address the scale-space image analysis dilemma. In the real world, it is very difficult to construct the best image partition directly (if there is one), given the huge number of applications potentially considered for a given image. Accordingly, interpreting an image at different scales of analysis requires multiple segmentations. This is known as multi-scale image representation, and it can be obtained directly from the hierarchy of partitions built during the region merging process. Hence, the multi-segmentation property is one of the main reasons why region merging segmentations have received a lot of attention in recent years [RD.128]. The second important characteristic of these techniques is that their results can be processed following the Object-Based Image Analysis (OBIA) approach [RD.129].
This strategy has emerged under the assumption that individual pixels of high and very high resolution remote sensing images cannot be interpreted by themselves [RD.119]. In this context, OBIA has been presented as a solution which proposes to work with regions representing objects in the image. Objects can be considered as aggregations of pixels which play a key role in image and scene analysis. The external relations of the regions in the image (adjacency, inclusion, similarity of properties, etc.) as well as their internal properties (colour, texture, shape, etc.) are extremely important for nearly every image analysis task [RD.130]. Hence, there is wide agreement in the literature that describing images with regions forming objects is beneficial for interpreting information. When working with regions representing objects, spectral, spatial, textural or temporal characteristics can easily be assessed in OTB. In our case, processing images with an OBIA strategy can be a promising solution to handle large images.

Despite the potential of region merging algorithms, some challenges remain for the automatic segmentation of remote sensing data. Some parameters need to be set in the FNEA and MS algorithms to obtain one segmentation level: for example, a region homogeneity criterion for FNEA or the spectral/spatial bandwidths for the MS approach. The performance of both segmentation techniques therefore strongly depends on the selected parameters. For instance, depending on the selected parameters, the objects forming the resulting partitions may all have similar sizes, which is rarely true in the real world.

Given our Global Land Cover application, the manual selection of these parameters can be an obstacle to achieving an automatic segmentation algorithm. The wide diversity of agricultural fields and other objects in images around the world makes it difficult to set the required parameters automatically. For this reason, exploratory work is needed in order to select the parameters automatically.

The traditional procedure for identifying parameters is trial and error, where the parameters are arbitrarily assigned to segment the image until an operator validates the visual results. Users therefore have to find useful segmentation levels by trial and error. As a consequence, FNEA or MS requires the user to know the scale of the objects of interest in order to select appropriate segmentation heuristics.

A possible solution for automatic parameter selection relies on processing the results contained in the segmentation hierarchy. A mechanism for scaling between hierarchical levels or image objects can offer various advantages compared to the strategy of creating a single segmentation level. For this reason, some recent works process hierarchical remote sensing segmentation results by representing them as a tree structure [RD.131], [RD.132]. In the case of MS results, the processing of several segmentation results to extract regions from different levels could be performed using OTB. Following the same strategy as for the watershed algorithm, the ITK library used in OTB enables the generation of a tree structure from the region merging results. Accordingly, the use of OTB opens the door to exploring the segmentation results obtained with different input parameters. The goal is to process the results obtained with different parameters in order to detect automatically the best segmentation result (according to our application).
Hence, the use of the MS implementation included in OTB could reduce the influence of the spectral/spatial bandwidth parameters, providing almost automatic results.

The segmentation of large remote sensing images is a generic problem because of the required computation time and memory resources (see section ). In image segmentation, an important difficulty is that algorithms are global rather than local. Hence, segmenting a large remote sensing image in one pass can be infeasible. To handle these problems, the image segmentation task needs to be broken into smaller pieces. OTB provides a solution by dividing large images into smaller overlapping tiles. These tiles are segmented separately and the results are finally stitched together. This strategy has been applied to the MS tool provided by OTB. Consequently, large remote sensing images can be segmented with good quality, with the result delivered as a vector layer.
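The tiling step can be sketched as follows. This only illustrates how overlapping windows covering an image might be computed; it is not the OTB implementation, the stitching of segments across tile borders is not shown, and the function name, tile size and overlap are assumptions.

```python
def tile_windows(width, height, tile, overlap):
    """Return (x0, y0, x1, y1) windows covering the image, with overlapping margins."""
    step = tile - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    windows = []
    # A new tile is only started where the previous one does not reach the border.
    for y0 in range(0, max(height - overlap, 1), step):
        for x0 in range(0, max(width - overlap, 1), step):
            windows.append((x0, y0, min(x0 + tile, width), min(y0 + tile, height)))
    return windows

# Hypothetical 1000 x 600 pixel image cut into 512-pixel tiles sharing 64 pixels.
windows = tile_windows(width=1000, height=600, tile=512, overlap=64)
for window in windows:
    print(window)
```

Each window would then be segmented independently, and the overlap gives the stitching step enough context to reconcile objects cut by a tile border.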

9.2.4.2.3 Feature selection

Once the features have been extracted from the input data, it is useful to perform a selection step for several reasons. Often several features carry the same kind of information and are therefore redundant. Also, some of the features may be completely irrelevant for the problem at hand. Therefore, a feature selection procedure can be implemented in order to keep only relevant and non-redundant features [RD.133]. One side effect of feature selection is the reduction of the computation cost: there are fewer features to compute in the first place, and the downstream steps of the processing chain are also simplified. This reduction of processing costs is crucial in the case of large data volumes such as those produced by Sentinel-2 (high resolution, multi-spectral time series). The feature analysis performed during the feature selection step can also be useful to gain further understanding of the problem under analysis, and may therefore help to devise new strategies if needed.
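A simple way to keep relevant and non-redundant features is sketched below. It is only one possible filter, not necessarily the procedure of [RD.133]: relevance is assumed to be the absolute Pearson correlation with a binary class label, redundancy the absolute correlation with an already selected feature, and both cut-offs as well as the toy values are arbitrary.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)

def select_features(features, target, min_relevance=0.3, max_redundancy=0.9):
    """Greedy filter: most relevant first, skipping irrelevant or redundant features."""
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    for name in sorted(features, key=relevance.get, reverse=True):
        if relevance[name] < min_relevance:
            continue                          # irrelevant for this problem
        if all(abs(pearson(features[name], features[s])) < max_redundancy
               for s in selected):
            selected.append(name)             # brings new information
    return selected

# Toy training samples (assumed values): 1 = cropland, 0 = other.
target = [1, 1, 1, 0, 0, 0]
features = {
    "ndvi_max": [0.8, 0.7, 0.9, 0.2, 0.3, 0.1],
    "ndvi_mean": [0.42, 0.35, 0.44, 0.12, 0.15, 0.06],  # nearly duplicates ndvi_max
    "texture": [0.3, 0.5, 0.2, 0.6, 0.9, 0.7],
    "noise": [0.5, 0.1, 0.3, 0.4, 0.2, 0.3],            # unrelated to the label
}
selected = select_features(features, target)
print(selected)
```

Only one of the two NDVI features is kept, together with the texture feature, while the unrelated feature is discarded.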

9.2.4.3 Estimation/decision

This step of the processing chain is the core of the system for the generation of the final products.

9.2.4.3.1 Image classification

The core of the crop map production system (for either the cropland mask or the crop type map) will be a classification procedure with the appropriate nomenclature. Classification approaches can be either supervised or unsupervised, and both may be suitable for the required products. Supervised classification uses a set of training examples (pairs composed of the input features and the corresponding class label) to learn a decision function, also known as a classification model, which can then be used to produce the class labels for all the input objects (pixels or regions) given the computed features (reflectances, vegetation indices, etc.). Typical examples of supervised classifiers include some types of neural networks (multilayer perceptrons [RD.134]), Support Vector Machines [RD.135] and decision trees. Unsupervised classification (or clustering) uses the input features only to learn a decision function which assigns each object to a particular set (cluster) on the basis of similarity. It does not establish the correspondence between the clusters and the thematic classes (crop types, for instance), so a class recognition phase is needed in order to produce land cover maps. Typical examples of unsupervised classification are K-Means [RD.136], Kohonen’s Self Organising Map [RD.137] and ISODATA [RD.138]. Both approaches can be combined to reduce the complexity of the problem, for instance by using an unsupervised classification to compress the data and a supervised classification to perform the recognition step. In this compression step, rule-based classifiers [RD.109] can be used.
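The combination of clustering with a class recognition phase can be sketched as follows. This is a minimal illustration, assuming K-Means with user-supplied initial centroids and a majority vote over a few labelled samples as the recognition step; the function names and the sample values are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def kmeans(points: Sequence[Point], centroids: Sequence[Point], iterations: int = 20) -> List[Point]:
    """Plain K-Means starting from the given centroids (no labels involved)."""
    centroids = list(centroids)
    for _ in range(iterations):
        groups: List[List[Point]] = [[] for _ in centroids]
        for point in points:
            groups[nearest(point, centroids)].append(point)
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(dim) / len(group) for dim in zip(*group)) if group else old
            for group, old in zip(groups, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def label_clusters(centroids: Sequence[Point], labelled: Sequence[Tuple[Point, str]]) -> Dict[int, str]:
    """Recognition phase: give each cluster the majority label of its reference samples."""
    votes: Dict[int, Counter] = {}
    for point, label in labelled:
        votes.setdefault(nearest(point, centroids), Counter())[label] += 1
    return {cluster: counter.most_common(1)[0][0] for cluster, counter in votes.items()}


if __name__ == "__main__":
    pixels = [(0.1, 0.1), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8)]
    centres = kmeans(pixels, [(0.0, 0.0), (1.0, 1.0)])
    names = label_clusters(centres, [((0.15, 0.1), "bare soil"), ((0.85, 0.85), "crop")])
    print(names[nearest((0.7, 0.7), centres)])
```

Clusters that receive no reference sample stay unlabelled in this sketch, which is one reason why the recognition phase needs reference data covering every cluster.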
In recent years, the remote sensing literature has singled out SVM [RD.139], Random Forests [RD.140] and Deep Learning [RD.141] as the preferred approaches to supervised classification, as the results of the IEEE GRSS Data Fusion Contests [RD.142], [RD.143] indicate. All the above-mentioned algorithms for supervised and unsupervised classification are available in OTB.

1) Steps of the classification

The design of a classification machine may involve more than the training and application of the obtained model and we may distinguish 5 main tasks, namely:
1. Model learning and validation
2. Choice of the degree of supervision
3. Classifier decision fusion
4. Model learning

The model learning phase is not as trivial as feeding the learning algorithm with the input data (supervised and unsupervised cases) and the example labels (supervised case only). The selection of the training samples can be optimised (to reduce the number of samples needed or to increase their pertinence), for example by using stratification [RD.102] based on ancillary data. The initialisation of the learning algorithm may also be tuned, which is of particular interest for incremental learning (near-real-time approaches where newly available images are incorporated in the processing). Estimating the quality of the learned model requires a reference data set from which the appropriate statistics (confusion matrix, overall accuracy, kappa index, etc.) can be computed. Ideally, this reference data set is different from the one used for training in the supervised case, so that it gives a straightforward estimate of the quality of the learning. When reference data are scarce, learning and validation can be performed with cross-validation approaches.

5. Wrapping

Obtaining training samples is always costly and the number of reliable samples is always too low. Several techniques exist to increase the number of training samples available to supervised classification methods. The best-known technique is the so-called transductive learning [RD.144], which starts with a small set of reliable samples and uses the knowledge obtained after the first learning iterations to add unlabelled samples to the training set. Although not fully automatic like the transductive case, active learning [RD.111] follows a similar approach by selecting dubious samples which are presented to an operator for manual labelling. This approach can be used in cases where nearly no reliable reference data exist.
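The validation statistics named in the model learning step (confusion matrix, overall accuracy, kappa index) can be computed as in the following sketch. It assumes that rows of the matrix hold the reference classes and columns the predicted classes; the function names are illustrative.

```python
from typing import List, Sequence


def confusion_matrix(reference: Sequence[str], predicted: Sequence[str], classes: Sequence[str]) -> List[List[int]]:
    """matrix[i][j] = number of samples of reference class i predicted as class j."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix


def overall_accuracy(matrix: Sequence[Sequence[int]]) -> float:
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def kappa(matrix: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa: agreement corrected for the agreement expected by chance."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    ref = ["crop"] * 4 + ["other"] * 4
    pred = ["crop", "crop", "crop", "other", "other", "other", "other", "crop"]
    m = confusion_matrix(ref, pred, ["crop", "other"])
    print(m, overall_accuracy(m), kappa(m))
```

For this small example the matrix is [[3, 1], [1, 3]], giving an overall accuracy of 0.75 and a kappa of 0.5.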
Since the proposed land-cover map production system will continue to operate through the years and over different areas of the globe, it may be useful to assess domain adaptation techniques [RD.145], [RD.146]. These techniques are designed to port a classifier from one data set to another. For instance, a classifier can be trained on an area for which good reference data exist and then adapted to a similar area where no reference data are available and where no training would therefore be possible.

2) Pixel vs object-based approaches

The description of the classification steps in the previous section made no reference to the objects which are to be classified. High resolution images such as those provided by Sentinel-2 can benefit from the image segmentation techniques described in section 9.2.4.2.2, and the use of image regions instead of individual pixels as the objects to be classified can therefore be foreseen.

Three different approaches thus need to be compared in order to exploit the full potential of Sentinel-2 images efficiently: (i) pixel-based approaches, (ii) classification of a segmentation and (iii) object-based classification. The first approach is the classical one and does not need further description. The second approach uses the segmentation as a simplification of the image into homogeneous areas; the classification technique is the same as the pixel-based one, except that the radiometric features are the same for all the pixels of a given region, so the computations can be sped up. Object-based classification came to the remote sensing scene with the advent of HR (SPOT5) and VHR (Ikonos, Quickbird) imagery, starting about 2001 [RD.119]. At this point the term OBIA, for Object-Based Image Analysis, was coined. It was popularised by routines available in the eCognition image processing software. Object-based classification goes beyond the simple classification of a segmentation and uses shape features such as those described in section 9.2.4.2.1 to characterise the regions of the image. Adjacency features are usually used as well. Still richer are the possibilities of spatial reasoning [RD.114], [RD.115], which are not available in commercial software but are implemented in OTB. OBIA also allows user-friendly implementations of active learning, as is the case in OTB’s Object Labelling application.
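Approach (ii), the classification of a segmentation, can be sketched as follows: the features are averaged per region and the classifier is called once per region instead of once per pixel. The data layout (flat lists of per-pixel features and segment identifiers) and the threshold classifier in the usage example are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence


def region_means(pixels: Sequence[Sequence[float]], segment_ids: Sequence[int]) -> Dict[int, List[float]]:
    """Average the feature vectors of all pixels belonging to each segment."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for features, segment in zip(pixels, segment_ids):
        accumulator = sums.setdefault(segment, [0.0] * len(features))
        for i, value in enumerate(features):
            accumulator[i] += value
        counts[segment] = counts.get(segment, 0) + 1
    return {seg: [v / counts[seg] for v in acc] for seg, acc in sums.items()}


def classify_regions(
    pixels: Sequence[Sequence[float]],
    segment_ids: Sequence[int],
    classify: Callable[[List[float]], str],
) -> List[str]:
    """Classify each segment once and propagate its label to its pixels."""
    labels = {seg: classify(mean) for seg, mean in region_means(pixels, segment_ids).items()}
    return [labels[seg] for seg in segment_ids]


if __name__ == "__main__":
    # One feature per pixel (e.g. NDVI); the threshold rule is a toy classifier.
    ndvi_pixels = [[0.8], [0.7], [0.3], [0.2], [0.1]]
    segments = [1, 1, 1, 2, 2]
    print(classify_regions(ndvi_pixels, segments, lambda f: "crop" if f[0] > 0.5 else "other"))
```

In the usage example the third pixel would be labelled "other" by a pixel-based threshold, but it takes the label of its region because the region mean exceeds the threshold.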

9.2.4.3.2 Biophysical parameter estimation

1) LAI, FAPAR and FCOVER Estimations

The estimation of biophysical variables such as the GAI (Green Area Index, green part of the plant canopy) and the FAPAR (Fraction of Absorbed Photosynthetically Active Radiation) from remote sensing data has been investigated by the scientific community since the launch of the first space missions (Landsat, in 1972). New space missions such as Venµs and Sentinel-2 will offer new perspectives for the monitoring of land surfaces. For the first time, the user community will benefit from satellite data with a high spatial and temporal resolution, acquired with a constant viewing angle. These characteristics are essential in order to map and model the functioning of continental surfaces more reliably. Biophysical variables can be estimated from remote sensing data using empirical or physical methods. Retrieval with empirical relationships based on spectral indices such as the NDVI has shown its benefits but also its limitations. On the one hand, these methods require the acquisition of in situ data, which represents considerable time and human costs. On the other hand, these relationships, calculated from reflectances (BRDF: Bidirectional Reflectance Distribution Function), are sensitive to the conditions of data acquisition (viewing and illumination angles). The scientific community has therefore worked on the development of radiative transfer models able to simulate the BRDF by taking into account the biophysical settings of the land surface as well as the conditions of remote sensing data acquisition. The first model that simulated radiative transfer in vegetation was the SAIL model (Scattering by
Arbitrarily Inclined Leaves), developed by Suits [RD.147] and improved by Verhoef [RD.148]. The low number of input parameters explains the wide use of this model by the remote sensing community. This model was afterwards coupled with a model simulating the optical properties of leaves, the PROSPECT model [RD.149]. The resulting model is the PROSAIL model [RD.150], which is one of the main tools used today for the characterisation of vegetation [RD.151]. The BV-NET tool, able to invert the PROSAIL model, was then developed by the researchers of the EMMAH laboratory (Avignon). BV-NET has already been used for neural network learning in the operational processing chains of global LAI and FAPAR products (CYCLOPES [RD.152] and GEOLAND) for the VEGETATION and AVHRR sensors, as well as for the MERIS data processing [RD.153] and within the framework of the MARS project for the AVHRR processing chain. These large-scale products were compared with other existing products (MODIS, GLOBCARBON, JRC FAPAR) and showed very good performance, in particular when compared with field data [RD.154], [RD.155] and [RD.156]. The tool was also used to process high spatial resolution data (10-20m) such as airborne POLDER data or satellite data (e.g. SPOT). Duveiller et al. [RD.157] showed the good performance of BV-NET on SPOT data acquired over wheat crops during the whole vegetation cycle. Rivalland et al. [RD.158] applied BV-NET to three sensors (airborne POLDER, SPOT-HRV, LANDSAT) on different crops (wheat, alfalfa, sunflower) and the results showed good coherence between sensors. Bsaibes et al. [RD.159] and Verger [RD.160] also proposed a validation of the BV-NET tool with FORMOSAT and CHRIS-PROBA data. In Martin Claverie’s PhD thesis [RD.161], recently defended at CESBIO (2012), the validity of the BV-NET tool was evaluated against in-situ measurements of GAI, FAPAR and FCOVER carried out during 4 years (from 2006 to 2010) on corn, sunflower and soybean crops, during the whole vegetation cycle.
This work was carried out over 3 study sites (South-West of France), with Formosat-2 images acquired between 2006 and 2010 (20 images per year on average). The validation procedure relies on the VALERI methodology (http://www.avignon.inra.fr/valeri), which consists of sampling in-situ areas corresponding to pixel units. The in-situ GAI and FAPAR were estimated by processing hemispherical photographs with the CAN-EYE software [RD.162]. The results demonstrated the validity of the BV-NET algorithm which, contrary to the empirical methods, does not require in-situ measurements for its calibration. In this project, we propose to use the BV-NET tool to derive the biophysical parameters (GAI, FAPAR) of crops from remote sensing data sets with high spatial and temporal resolution. To do so, we plan to use the SPOT images from the SPOT4-Take5 experiment combined, if necessary, with LANDSAT8 and Rapid-Eye images in order to provide regular BRDF time series similar to the future Sentinel-2 data sets. The multi-temporal GAI and FAPAR maps provided by BV-NET will be validated against in-situ GAI and FAPAR measurements over different sites. CESBIO has already acquired such data sets for several years in Midi-Pyrénées (France, South-West/Take5-7 site) and Morocco (SudMed/ site), and carries out these measurements during the Take5 experiment.

The BV-NET tool may not be available as open-source software, but OTB contains all the building blocks needed for an open source implementation:
• PROSPECT+SAIL for the simulation of reflectances from biophysical parameters
• 6s radiative transfer code if atmospheric effects need to be taken into account
• Multi-layer perceptron neural networks for the inversion of the simulation model.
PROSPECT+SAIL and 6s in their OTB implementation have been successfully used in [RD.163] and [RD.164].
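The principle of inverting a simulation model (simulate the signal for known biophysical values, then learn or tabulate the inverse mapping) can be illustrated with a deliberately simplified sketch. It is not PROSPECT+SAIL and not BV-NET: the forward model below is a toy exponential saturation of NDVI with GAI whose coefficients are assumptions, and a nearest-neighbour look-up table replaces the multi-layer perceptron to keep the example short.

```python
from math import exp
from typing import List, Tuple


def toy_ndvi(gai: float, ndvi_soil: float = 0.15, ndvi_max: float = 0.9, k: float = 0.5) -> float:
    """Toy forward model (assumed): NDVI saturates exponentially with GAI."""
    return ndvi_max - (ndvi_max - ndvi_soil) * exp(-k * gai)


def build_lut(steps: int = 60, gai_max: float = 6.0) -> List[Tuple[float, float]]:
    """Simulate NDVI on a regular GAI grid; each entry is (simulated NDVI, GAI)."""
    grid = [gai_max * i / steps for i in range(steps + 1)]
    return [(toy_ndvi(gai), gai) for gai in grid]


def invert(ndvi_observed: float, lut: List[Tuple[float, float]]) -> float:
    """Return the GAI whose simulated NDVI is closest to the observation."""
    return min(lut, key=lambda entry: abs(entry[0] - ndvi_observed))[1]


if __name__ == "__main__":
    table = build_lut()
    print(invert(0.60, table))
```

A neural network trained on the same simulated pairs would play the role of `invert`, with the advantage of interpolating between table entries and of handling several input bands and acquisition angles at once.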

Figure 9-24: Results of the BV-NET evaluation [RD.161]. X-axis: in-situ GAI, FAPAR and FCOVER estimated with hemispherical photographs and the CAN-EYE software. Y-axis: variables estimated with the BV-NET tool.

2) Crops phenological indicators

Crop yield forecasting and water resource management, for example, need information on land cover and on crop growth. Remote sensing data make it possible to acquire information on land use over large areas, which is essential for policy managers who have to make decisions at regional scales. In addition, the future availability of Sentinel-2 data with high spatial and temporal resolution will provide information on crop growth and thus on crop functioning. NDVI time series have been widely used in phenology studies to estimate patterns of vegetation dynamics. The approach consists of deriving seasonal satellite metrics from NDVI temporal profiles. Examples of these metrics include the rates of increase and decrease of NDVI and the dates of the beginning, end and peak of the growing season [RD.165], [RD.166] and [RD.167]. Several methods have been developed to interpolate NDVI time series, usually on a daily basis, from discrete satellite-derived vegetation index values. The strength of this approach is that it provides a rather simple and straightforward means of comparing crop status in different areas and of analysing crop status in near real time by comparison with the behaviour of crops during previous years. This approach can also be applied to biophysical variables such as LAI and FAPAR. When weather data are available, it can be coupled with crop models of different complexities, the simplest being degree-day models, which predict the main phenological stages of crops (e.g. flowering).
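The derivation of seasonal metrics from an NDVI temporal profile can be sketched as follows. The example assumes linear interpolation to a daily step and a start/end of season defined by a threshold at a fixed fraction of the seasonal amplitude; both choices, as well as the function names, are illustrative assumptions rather than the method retained in the project.

```python
from typing import Dict, Sequence


def ndvi(nir: float, red: float) -> float:
    """Normalised Difference Vegetation Index from NIR and red reflectances."""
    return (nir - red) / (nir + red)


def daily_profile(days: Sequence[int], values: Sequence[float]) -> Dict[int, float]:
    """Linearly interpolate observations (day of year, NDVI) to every day."""
    profile: Dict[int, float] = {}
    observations = list(zip(days, values))
    for (d0, v0), (d1, v1) in zip(observations, observations[1:]):
        for day in range(d0, d1):
            profile[day] = v0 + (v1 - v0) * (day - d0) / (d1 - d0)
    profile[days[-1]] = values[-1]
    return profile


def season_metrics(profile: Dict[int, float], fraction: float = 0.5) -> Dict[str, float]:
    """Peak date, amplitude, and start/end dates at a fraction of the amplitude."""
    peak_day = max(profile, key=profile.get)
    low, high = min(profile.values()), profile[peak_day]
    threshold = low + fraction * (high - low)
    ordered = sorted(profile)
    start = next(day for day in ordered if profile[day] >= threshold)
    end = next(day for day in reversed(ordered) if profile[day] >= threshold)
    return {"start": start, "peak": peak_day, "end": end, "amplitude": high - low}


if __name__ == "__main__":
    obs_days = [100, 120, 140, 160, 180]
    obs_ndvi = [0.2, 0.3, 0.9, 0.3, 0.2]
    print(season_metrics(daily_profile(obs_days, obs_ndvi)))
```

The same functions can be applied to LAI or FAPAR profiles, since they only assume a single value per observation date.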

Finally, fitting smoothing functions to NDVI is a way to interpolate NDVI over large areas and over time in order to make spatial and temporal comparisons easier. Cloud cover is usually uneven over large areas such as those observed by Sentinel-2 and varies from year to year. A given pixel or region of interest can therefore be observed at very different dates, which makes the analysis complex. The proposed approach will facilitate the work of agriculture managers when analysing crop status over large areas or when comparing crop growth in different years. Sentinel-2 will acquire data with a constant view angle for a given pixel location, except within the areas observed from two different orbits. In addition, the acquisition geometry differs between Sentinel-2 and Landsat-8, for instance. In order to maximise the usefulness of the satellite data, for example at high latitudes with Sentinel-2, it is therefore necessary to normalise NDVI for solar and viewing angle effects. For that purpose we will use the BV-NET tool, which is able to simulate reflectances and NDVI for different acquisition configurations. Note that this approach also makes it possible to spectrally normalise the NDVI derived from reflectances acquired by instruments with slightly different spectral bands. The results of BV-NET will be used to build look-up tables for a fast application of the approach. The proposed approach paves the way for more complex approaches in the future, where crop growth models are driven by satellite and ancillary (e.g. weather) data.
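To illustrate how irregular, cloud-dependent observation dates can be brought onto a common time grid, the sketch below applies a Gaussian kernel smoother. The text does not specify which smoothing function will be fitted; the kernel smoother, its bandwidth and the function name are assumptions chosen for brevity.

```python
from math import exp
from typing import List, Sequence


def kernel_smooth(
    days: Sequence[float],
    values: Sequence[float],
    grid: Sequence[float],
    bandwidth: float = 10.0,  # assumed smoothing scale, in days
) -> List[float]:
    """Gaussian-weighted average of the observations at each grid date.

    Observations close in time to a grid date dominate its estimate, so two
    pixels observed on different dates end up on the same regular time axis.
    Grid dates several hundred days away from every observation would make
    all weights underflow to zero; the sketch does not handle that case.
    """
    smoothed = []
    for target in grid:
        weights = [exp(-0.5 * ((target - day) / bandwidth) ** 2) for day in days]
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / sum(weights))
    return smoothed


if __name__ == "__main__":
    # Two pixels observed on different cloud-free dates, compared on one grid.
    common_grid = list(range(100, 181, 10))
    pixel_a = kernel_smooth([102, 127, 141, 166], [0.22, 0.45, 0.85, 0.35], common_grid)
    pixel_b = kernel_smooth([110, 135, 150, 178], [0.25, 0.70, 0.80, 0.20], common_grid)
    for day, a, b in zip(common_grid, pixel_a, pixel_b):
        print(day, round(a, 2), round(b, 2))
```

A parametric function (for example a double logistic curve) could be fitted instead when the shape of the crop cycle is known in advance.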

9.2.4.4 Post-processing and fusion

Although this step of the processing chain can also be applied to biophysical parameter estimation, it is more usual to apply it to the results of a land-cover map classification. As anticipated in the SoW, it is important to compare (benchmark) the performance of different choices for different parts of a processing chain. The existing literature and our experience at CESBIO show how difficult it is to implement a land cover map production system whose performance is the best for every land-cover class of a target nomenclature. In this kind of situation it is often useful to take advantage of the best results of different classifiers by using fusion techniques. Classifiers such as AdaBoost or random forests already run a high number of weak classifiers and combine them through a kind of voting mechanism. The idea here, however, is to apply specific high-level fusion rules as a post-processing step. These rules can take advantage of the measured performance of the different classifiers, for instance by using Bayesian or evidential (i.e. Dempster-Shafer) decision rules established from information derived from the confusion matrix of each classifier. In addition to these generic fusion approaches, site-specific rules can also be implemented. These can take the form of post-processing that makes the results coherent with ancillary data. A classifier fusion framework has recently been integrated in OTB.
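A very simple fusion rule derived from confusion matrices can be sketched as follows. It is a weighted vote in which each classifier's vote counts as much as its measured precision for the class it predicts; this is a simplified stand-in for the Bayesian or Dempster-Shafer rules mentioned above, not the OTB fusion framework, and all names and matrices are illustrative.

```python
from typing import Dict, List, Sequence


def precisions(matrix: Sequence[Sequence[int]], classes: Sequence[str]) -> Dict[str, float]:
    """Per-class precision from a confusion matrix (rows: reference, columns: predicted)."""
    result = {}
    for j, name in enumerate(classes):
        predicted_total = sum(row[j] for row in matrix)
        result[name] = matrix[j][j] / predicted_total if predicted_total else 0.0
    return result


def fuse(predictions: Sequence[str], weights: Sequence[Dict[str, float]]) -> str:
    """Weighted vote: each classifier supports its predicted class with its precision."""
    scores: Dict[str, float] = {}
    for label, classifier_weights in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + classifier_weights[label]
    return max(scores, key=scores.get)


if __name__ == "__main__":
    classes = ["wheat", "maize"]
    matrices: List[List[List[int]]] = [
        [[18, 1], [2, 19]],  # classifier A, reliable
        [[4, 6], [6, 4]],    # classifier B, weak
        [[5, 5], [5, 5]],    # classifier C, weak
    ]
    weights = [precisions(m, classes) for m in matrices]
    # A single reliable classifier outweighs two weak ones that disagree with it.
    print(fuse(["maize", "wheat", "wheat"], weights))
```

With these matrices the fused label is "maize" (0.95 against 0.4 + 0.5 for "wheat"), whereas an unweighted majority vote would have returned "wheat".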

10 Algorithm benchmarking concept and plan

Task 3 (and more precisely sub-task 3.3) of the Sen2-Agri project involves an “algorithms inter-comparison, benchmarking and selection” activity, which shall allow the selection of the “best” algorithms or strategies for fulfilling the user requirements and the product specifications to the maximum extent (section 4.3.3). The activity should help to improve the understanding of algorithm performance by enabling inter-comparison. The “best” algorithms or strategies are selected following algorithm inter-comparison and product assessment exercises that pay critical attention to the user focus. To this end, the performance of the algorithms and the quality of the products are evaluated against a list of objective criteria. The entire effort should be as transparent as possible. The scope of the exercise, the input dataset, the expected output, the validation dataset and the evaluation methodology need to be defined in advance.

10.1 Benchmarking exercise within the project framework

The Sen2-Agri project aims at developing, demonstrating and facilitating the contribution of Sentinel-2 time series to the satellite EO component of agriculture monitoring for many agricultural systems distributed all over the world. The overall objective is to provide the international user community with validated algorithms, open source code and best practices to process Sentinel-2 data in an operational manner for major agriculture systems that are representative worldwide. To this end, a prototype system will be developed, based on a set of pre-processing, compositing and image interpretation algorithms. The pre-processing module starts from L1A imagery and produces L2A products (i.e. ortho-rectified products expressed in surface reflectance, provided with a cloud mask). This module is run in Task 2 and generates the TDS that will serve as input to the benchmarking exercise. The benchmarking exercise will be dedicated to testing the algorithms previously identified from the state of the art in the literature (WP 3200 - section 4.3.2). A minimum of 5 algorithms needs to have been selected for each EO product; these products consist of cloud free composites, a dynamic cropland mask, a cultivated crop type and area map and a vegetation status map. The articulation between these products and the associated processing modules is shown in Figure 10-1.

Figure 10-1: Schematic illustration of the articulation of the processing modules that will be tested in the benchmarking exercise

10.2 Benchmarking plan

For each product, the benchmarking exercise can be viewed as composed of:
• an input dataset, which can be either the TDS (for the compositing module and possibly some agricultural products) or, in the other cases, the multi-date cloud-free composites;
• a set of alternative processing algorithms;
• different output products to compare;
• a methodology for comparison (e.g. graphical visualisation methods and/or statistical analyses), along with an appropriate validation dataset made of in-situ data.
In order to make this exercise as transparent as possible, the following information needs to be available in advance:
• the specifications of the input products:
  o The TDS (which is the input at least for the compositing module) will be characterized during the WP 2600 (see section 4.2.6), when validating the developed pre-processing chain;
  o The multi-date cloud-free composites (which are possible inputs for the other processing modules) will be characterized within the benchmarking exercise, when evaluating the performances of the different tested compositing algorithms.
• the specifications of the expected output products: this will be done in the WP 3100 of the same task (section 4.3.1);
• the detailed characterization of each test site: this information will be defined during Task 2 (WP 2100 - section 4.2.1), when making the final selection of the test sites. This information is critical to guide the comparison and validation processes and to ensure the selection of the “best” algorithm(s);
• the analytical approach for products inter-comparison: it is part of this benchmarking activity. Starting from the users’ requirements defined in the URD and from the derived products specifications, a list of objective criteria (and possibly of baseline products against which the products can be compared) needs to be established. The capacity to be adapted to the Sentinel-2 capabilities (multi-temporal revisit, specific spectral bands, the amount of data) and to be globally consistent (i.e. to deal with varying cropping systems, landscape patterns and agro-meteorological contexts) will be additional key criteria to consider;
• the characterization of the in-situ data, which will be done along with the collection process during Task 2 (WP 2200 - section 0).
The results of this benchmarking exercise, along with all important design choices, trade-offs and analyses, will be documented in the Design Justification File, in the form of an Algorithm Theoretical Basis Document.
More precisely, for each benchmarked algorithmic step, the following information will be reported:
• The input (EO and validation) dataset description;
• The characteristics of the test site that make the test particularly relevant (if any);
• The design choice (statement of the issue that leads to the design choice);
• The solution (proposed best solution to the design decision);
• The alternative solutions considered (with a substantial description) and the reasons for non-selection;
• The analyses that support the selection of the best solution and the non-selection of the alternatives;
• The (positive and negative) impact(s) of the best solution on the whole Sen2-Agri system and products;
• Additional comments (if any).

11 Technical description of tools and analysis system

This section describes the solution proposed for Sen2-Agri software development. It is composed of the following elements:
• Solution overview (section 11.1);
• Solution description (section 11.2);
• Presentation of proposed re-used software (section 11.3);
• Global architecture (section 11.4).
As required by the STC and SoW, the use of open-source software has been favoured, and all constraints and possible license conditions linked to its usage are clearly identified and taken into account in the architecture definition in order to respect the Sen2-Agri software constraints expressed in the ITT.

11.1 Solution overview

The Sen2-Agri software proposed by the consortium is composed of:
• S2Agri-SC: processing components which build image processing chains to obtain the products required by the users;
• S2Agri-SC Orchestrator: a tool which launches the different processing chains following a data-driven process and monitors the different chains.
For the processing components, two tools have been selected for their quality, their degree of re-use and the thorough command that the project partners have of them:
• Orfeo ToolBox (OTB): for composite generation and for the generation of biophysical and added value products
• Sentinel Exploitation Tools (BEAM ToolBox + S2PAD module): for atmospheric corrections and cloud detection.
For the S2Agri-SC Orchestrator, we have selected the SLURM tool to manage the resources of the S2Agri system. First, this tool allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so that they can perform work. Second, it provides a framework for starting, executing and monitoring work (typically a parallel job) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work. Even though this tool is mainly used for cluster task management, it is also very useful on a single platform with multiple cores. The data-driven mechanism will be implemented by a main script which scans the input directory. The criteria defining our solution are the software requirements of the SoW document as well as those specified in the STC. We propose a fully open source solution that performs well and is mastered by the partners. All the proposed software is open source. The prototype activities will be done in OTB which will
be re-used for the system implementation. This dual aspect decreases the risk and the cost of the system implementation. It is important to note that those libraries are themselves based on other widely used open source software. OTB relies mainly on OSSIM and GDAL/OGR, whose functionalities will also be re-used in the S2Agri-SC development. This solution is the best answer to the Sentinel-2 Agriculture requirements concerning:
• Schedule compliance: The Sentinel-2 Agriculture project is an important opportunity for ESA to contribute to enhancing agriculture monitoring. We propose to re-use existing open source solutions which are easily modifiable and perfectly known by the partners of the project. The re-used libraries have already been used in similar contexts: the Venus L2/L3 chain, the MACCS software and the radiometric part of the S2-IPF made by CS with OTB, and a large set of research projects made by CESBIO with OTB. This reduces development risks and supports compliance with the schedule. All prototypes will be done with OTB and can be easily integrated into the operational system, which will add interface support and optimize the processing.
• Performance: The huge data volume generated by Sentinel-2 time series at large scale imposes a strong constraint on the performance of the components. The proposed solutions are well-tested software, used in operational systems and by a large user community. Moreover, any optimization that may be required depends on source code availability. The selected libraries provide multi-threading and streaming capabilities for processing huge images. Indeed, OTB was developed to facilitate the use of high resolution images from Pléiades and CSK (http://smsc.cnes.fr/PLEIADES/Fr/lien3_vm.htm, http://blog.orfeo-toolbox.org/news/jpeg2000-and-pleiades-data-support-in-otb).
• Meeting the user requirements: The Sen2Agri-SC components are in charge of generating products, from L1c to added value products, with high accuracy. The software proposed for re-use consists of image processing libraries specialized in remote sensing image processing and atmospheric corrections. They have already been used in image processors for similar sensors. S2PAD has been used for L2/L3 S2 product generation, which mostly concerns atmospheric aspects. ORFEO toolbox has been used in various contexts by CESBIO to monitor agricultural activities at large scale.
• Capacity building and maintainability by users: The Sentinel-2 constellation should have a lifetime of at least 15 years, which makes maintainability an important issue. Open source code simplifies and improves maintainability: as long as the source code is available, it is possible to check and modify it. Moreover, the selected software is of high quality in terms of code development practice (OTB: C++ templates, generic programming), tests (OTB: ~3000 nightly tests on 3 platforms) and documentation, which is very good and in English (http://www.orfeo-toolbox.org/packages/OTBSoftwareGuide.pdf). In addition, a large community of users and developers can provide support to end users.
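The data-driven mechanism of the S2Agri-SC Orchestrator described in section 11.1 (a main script that scans the input directory and hands work to SLURM) could look like the sketch below. The `.ready` marker suffix, the chain script name `run_chain.sh`, the job name prefix and the number of CPUs are hypothetical; only the `sbatch` options `--job-name` and `--cpus-per-task` are standard SLURM options. The sketch builds the command but does not submit it.

```python
from pathlib import Path
from typing import List, Set


def find_new_products(input_dir: Path, processed: Set[str], suffix: str = ".ready") -> List[Path]:
    """List the entries of the input directory that have not been processed yet.

    The suffix is a hypothetical marker meaning that a product is complete.
    """
    return sorted(
        entry
        for entry in input_dir.iterdir()
        if entry.name.endswith(suffix) and entry.name not in processed
    )


def sbatch_command(product: Path, chain_script: str, cpus: int = 4) -> List[str]:
    """Build the SLURM submission command for one product (not executed here)."""
    return [
        "sbatch",
        f"--job-name=s2agri-{product.stem}",
        f"--cpus-per-task={cpus}",
        chain_script,   # hypothetical shell script wrapping a processing chain
        str(product),   # passed to the script as its first argument
    ]


if __name__ == "__main__":
    already_done: Set[str] = set()
    for product in find_new_products(Path("."), already_done):
        command = sbatch_command(product, "run_chain.sh")
        # On a system with SLURM installed, the job could be submitted with
        # subprocess.run(command, check=True).
        print(" ".join(command))
        already_done.add(product.name)
```

Keeping the set of processed names in memory is the simplest possible bookkeeping; a persistent record would be needed for the script to survive a restart.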

11.2 Solution description

FOSS components are used throughout the processing chain. They ensure that IPR holders cannot apply any restriction to the use of the system. ESA and end-users do not need to negotiate any special conditions with the original providers, either for the current system or for the reuse of the components in future systems if desired. The selected libraries are also recognized state-of-the-art components, already used in many systems (including operational space systems: the Venus ground segment for OTB, the S2 commissioning phase for S2PAD). They are widely deployed and it is easy to find skilled engineers to integrate or maintain them throughout the lifetime of the project. Source code is readily available for all of them. A typical example of such components is Orfeo Toolbox, which was created by CNES to support the Pleiades and Cosmo-Skymed systems and which we plan to re-use for the S2Agri. It is a widely renowned component which is freely available for everyone to use, regardless of domain or country and without any time limitation. The solution we propose is to use several different FOSS libraries to build the various components of the system. Great care has been taken in selecting the libraries, considering both their technical capabilities (scope, validation, support ...) and their licensing terms. All the selected components allow full modification and redistribution rights; hence they fully meet the project requirements about IPR. Some of the components belong to the permissive category, some to the weak copyleft category and some to the strong copyleft category. However, all can be licensed with a GPL license as required in the SoW. The requirements related to re-used software are covered in Annex B of the technical proposal, corresponding to the SRF. A description of this software and some details about the parts of the S2Agri components where it will be used are provided in the following sections.

11.3 Presentation of proposed re-use software

11.3.1 Sentinel Exploitation Tools (BEAM + S2PAD)

The Sentinel Exploitation Tools are considered as a CFI by the consortium, as strongly recommended by the SoW. They are based on the BEAM toolbox, which integrates robust algorithms to manipulate and pre-process EO data from various sensors, and on the S2PAD processor, which produces the Sentinel-2 L2A product.

The ESA BEAM toolbox is a Java library well known in the domain of EO applications. It was developed to support the ENVISAT mission and to pre-process and visualize data for scientific users. Sentinel-2 ingestion has recently been integrated into the software, and we will use it once this new version is released. When the new Sentinel-2 Toolbox is released, we will evaluate its performance as a replacement for the BEAM toolbox.

The S2PAD processor has been developed in Python and is based on the ATCOR philosophy. It provides aerosol estimation based on the DDV method, a cloud detection mechanism and a general land cover map. It has been tested and validated at European latitudes with a rural aerosol model. It is of interest for the Sentinel-2 Agriculture project, and we will use it because it will be available as open source in the future Sentinel-2 Toolbox. However, the fact that it has not been validated at African latitudes could be problematic and is a risk for the project. Nevertheless, we will use it for the following reasons:
• It is the only FOSS solution for Sentinel-2 atmospheric correction;
• We can check the results of S2PAD against the closed-source MACCS solution at the CESBIO facility;
• We know that it will be enhanced during the S2-MPC contract managed by CS-SI.

These two elements will be used in the atmospheric correction components of Sentinel-2 Agriculture.

11.3.2 Orfeo ToolBox

11.3.2.1 General Presentation

In the frame of the Methodological Part of the ORFEO Accompaniment Program, set up to prepare, accompany and promote the use and exploitation of the images acquired by the Pleiades (PHR) and Cosmo-Skymed (CSK) satellites, the French Space Agency (CNES) decided to develop the Orfeo Toolbox (OTB): a set of algorithmic components, adapted to large remote sensing images, which makes it possible to capitalize on methodological know-how and therefore to use an incremental approach to benefit from the results of methodological research (extracted from http://www.orfeo-toolbox.org/otb/).

The Orfeo Toolbox (OTB) is distributed as an open source library of image processing algorithms. As its motto goes, Orfeo Toolbox is not a black box: OTB encourages full access to the details of all the algorithms. OTB is based on the medical image processing library ITK and offers particular functionalities for remote sensing image processing in general and for high spatial resolution images in particular. Targeted algorithms are available for high resolution optical images (SPOT, Quickbird, Worldview, Landsat, Ikonos), hyperspectral sensors (Hyperion) and SAR (TerraSarX, ERS, Palsar). OTB is distributed under the free software license CeCILL (similar to GPL) to encourage contributions from users and to promote reproducible research. The library is intensively tested on several platforms such as Linux, Unix and Windows. Most features are also adapted to process huge images without the need for a supercomputer, using streaming and multi-threading as often as possible (extracted from http://www.orfeo-toolbox.org/otb/).

The main reasons to use it in the project are:
• Re-use: the library contains a large number of image processing algorithms and provides very good coverage of the algorithms required to produce the UR products.
• Development framework: OTB is a C++ library whose code characteristics allow it to be used as a framework for developing both prototypes and operational chains. The use of a common development framework will simplify the development phases.
• Open source and optimization: OTB is distributed by CNES under the CeCILL license. Source code availability makes it possible to optimize algorithm implementations in order to meet performance requirements. Moreover, OTB has been successfully used for L2 and L3 mission chains, fully satisfying their performance requirements.
• Multithreading and streaming: these mechanisms are native in OTB and allow huge images to be processed quickly. The same OTB code can be run on one or several processes. These mechanisms were implemented by CNES in order to improve the processing of high-resolution Pleiades images. They will reduce the processing time of the S2Agri-SC components when dealing with huge data volumes at national scale.
• Code and documentation quality: since CNES distributes the source code and documentation, efforts have been made to provide very good documentation in English and to release high-quality software (generic programming, templates, etc.).
• Tested and validated: OTB has been used in several operational software systems, is tested nightly on several platforms and has been distributed for more than six years. OTB has a large user community which provides daily evaluation of the library.
• Already used in a similar context by CESBIO and CS, for example for the automatic processing of a large volume of Pleiades data requested by thematic users of the ORFEO program.
• Well known: CS is the main developer of the Orfeo ToolBox and knows the library perfectly. CESBIO is also one of the main contributors to the library via different research activities.

11.3.2.2 Description and utilization

Built on the medical image processing library ITK, OTB provides its users with an extensive set of algorithms and functionalities dedicated to remote sensing data exploitation. More specifically, OTB embeds efficient approaches to handle large data using advanced streaming and multi-threading strategies. OTB-based processing chains thus take advantage of both optimized input/output access and streamed, multi-threaded filtering to perform efficient processing. The library also offers a powerful development framework, with extensive basic building blocks, to build processing chains efficiently and, where a specific functionality is not already available, to develop high-end filters. The functionality already included covers, for example, fully documented radiometric correction and orthorectification tools.

The library documentation is available online (in English) to give the user a better understanding of the details of all the algorithms. It is divided into the class documentation, the software guide and a cookbook with use cases.

Because OTB is open source, its continuous development and testing benefit from its community of users and contributors through its bug tracker and mailing lists. Nightly testing is performed on multiple platforms and operating systems to ensure the portability, consistency and integrity of the code. Moreover, strict coding and architectural rules are applied to ensure both the modularity and the durability of the library.

One of the fundamental aspects of the OTB library is that both the library itself and all of its components are open source. Table 11-1 gives an overview of some component licenses.

Component    License                      Website
OTB          CeCILL
ITK          BSD-like                     http://www.itk.org/
GDAL         X/MIT                        http://www.gdal.org/
OSSIM        GNU LGPL                     http://trac.osgeo.org/ossim/wiki
OpenJPEG     BSD                          http://www.openjpeg.org/
Zlib         zlib/libpng                  http://www.zlib.net/
Libpng       zlib/libpng                  http://www.libpng.org/
Libtiff      BSD                          http://www.libtiff.org/
Libgeotiff   X/MIT                        http://trac.osgeo.org/geotiff/
Libjpeg      JPEG                         http://www.ijg.org/
Boost        Boost Software License 1.0   http://www.boost.org/

Table 11-1: OTB third party

11.3.2.3 Quality

The source code is publicly available and can be downloaded from the official repository: http://hg.orfeo-toolbox.org/OTB/. To ensure the quality of the code, the coding rules are inherited from the ITK library and checked at the repository level; configuration files for popular IDEs are also publicly available for potential users and contributors. In addition, the library is built and tested on a nightly basis. The results are publicly available at http://dash.orfeo-toolbox.org/index.php?project=OTB. This ensures multi-platform consistency and continuous validation.

Moreover, the OTB documentation is extensive, written in English and made up of three parts:
• the class documentation, available both online (http://www.orfeo-toolbox.org/doxygen-current/classes.html) and offline;
• the software guide, downloadable as a PDF;
• a cookbook providing use cases.

In addition, the mailing lists are another way to get information about OTB. The version history is available through the Mercurial repository: http://hg.orfeo-toolbox.org/OTB/summary.

Re-used software item: OTB

Criterion                              Status
Source code availability               YES
Coding rules/quality                   Inherited from ITK; configuration files available
Documentation availability/quality     Publicly available online; in English; extensive; including use cases
Version history                        YES
Validation tests & coverage            Available online
Operational uses                       Venµs ground segment: L2, L3 chains; Sentinel-2: MACCS, IPF radiometric components

Table 11-2: Availability and quality status

11.3.2.4 Development

The Orfeo ToolBox will be re-used for several S2Agri-SC components. This section presents the OTB framework in more detail, together with an example of the use of OTB in a ground segment. OTB relies on the basic components presented in Figure 11-1.

Figure 11-1: Orfeo ToolBox ecosystem

The following essential concepts underpin the design and implementation of ITK, and therefore also of OTB:
• Generic programming: a method of organizing libraries consisting of generic software components. Generic programming is implemented in C++ with the template mechanism and the use of the Standard Template Library (STL);
• Object factories, built on standard C++ capabilities. Object-oriented programs use inheritance and virtual functions to achieve powerful abstractions and good modularity;
• Smart pointers and memory management, to improve the robustness of pointer and memory handling;
• Multi-threading: multithreading is handled in OTB through ITK’s high-level design abstraction. This approach provides portable multithreading and hides the complexity of the differing thread implementations on the many systems supported by OTB.

For more details, please refer to the OTB Software Guide (in PDF or HTML format), available on the official OTB web site. Some OTB functionalities (non-exhaustive list) are:
• Filtering: optical/SAR, morphological operations, denoising, wavelets
• Segmentation: MeanShift, Watershed, Connected Component, LevelSet, etc.
• Registration: transformations, interpolators, optimizers
• Learning, classification: Markov, KMeans, SVM, SOM, Random Forest, etc.
• Similarity measures: CC, NCC, Mutual Information, Kullback-Leibler
• Geometrical corrections: projections, sensor modeling, P+XS fusion
• Change detection: MAD/MAF, MeanRatio/MeanDifference, CBA-MI, etc.
• Radiometric correction: TOA, TOC
• Indices: vegetation, soil, water, etc.
• Feature extraction: radiometric/geometric moments, textures, SIFT/SURF, line/right angle detection, etc.
• Classifier fusion
• Dimensionality reduction
• Hyperspectral image analysis: dimensionality estimation, endmember extraction, unmixing, anomaly detection
• Temporal interpolation
• Object Based Image Analysis
• Spatial reasoning
• SAR polarimetry, soon interferometry
• Sensor simulation: vegetation modeling, sensor spectral response, MTF, etc.

Together, these mechanisms and basic functionalities give the developer a solid set of building blocks.

11.3.2.5 Example of OTB use

The OTB framework has been used to develop the Venµs L2/L3 ground segment and its derived project MACCS (Multi-mission Atmospheric and Cloud Correction Software). More recently, the radiometric components of the S2-IPF chain have also used OTB.

The MACCS project is a relevant example of the use of OTB in a context combining research algorithms, strong validation constraints and operational requirements. Indeed, for the development of the Venµs L2/L3 ground segment chain, the main requirements of CNES were:
• Quality and robustness of the algorithms of the chain;
• Performance of the chain (in memory and time).

The decisive aspects in choosing OTB for the Venµs L2/L3 ground segment (and MACCS) were:
• Re-use of image processing algorithms;
• Strong development framework;
• Open source software;
• Performance and validation environment.

For the development of the Venµs L2/L3 chains and MACCS, the following image processing elements available in OTB were re-used:
• OTB filters: statistics, basic filters, resampling, interpolators, reading and writing of TIF/JPG/J2K/HDF image data and XML data, DTM reading, etc.
• OTB algorithm frameworks: correlation, interpolation, composite filters, IO factories, basic image filters and function filters, etc.

They helped to implement the following more complex algorithms:
• Aerosol LUT and extraction algorithms;
• Cloud detection;
• Cirrus correction and cirrus detection;
• Atmospheric absorption correction;
• Shadow detection;
• Rain detection;
• Water detection;
• Rayleigh correction;
• Snow detection (for S2 images);
• Scattering correction;
• Estimation of environment effects.

Using a strong open source solution decreased the development cost of the Venµs chain through the re-use of existing code. The robustness and reliability of the OTB library decreased the development risk (for the CNES and CS teams), and the project can take advantage of OTB upgrades and of new image processing algorithms. The design of the Venµs L2/L3 chains and of the MACCS software has used the best of OTB to:
• Get a generic implementation of multi-spectral camera capability via the factory mechanism available in OTB (C++). With this mechanism it is possible to manage (read/write) products from several spectral cameras: Formosat, Venµs, Landsat, Sentinel-2 and other future spectral cameras such as Landsat 8, etc.;
• Have a multiplatform system (Linux, Mac, Windows);
• Stream and multithread the algorithms used.
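The factory mechanism mentioned above can be illustrated outside OTB with a small registry: each sensor registers a reader class, and the chain asks the registry for a suitable reader without knowing the concrete type. This Python sketch only mirrors the idea; OTB implements it in C++, and the class names and the file-name prefixes used here to recognise a sensor are hypothetical.

```python
class ProductReader:
    """Base class: one subclass per sensor (illustrative only)."""
    prefix = ""  # hypothetical file-name prefix identifying the sensor

    @classmethod
    def can_read(cls, filename):
        return filename.startswith(cls.prefix)

    def describe(self, filename):
        return f"{type(self).__name__} reads {filename}"


_READERS = []  # the factory registry


def register(reader_cls):
    """Class decorator adding a reader to the factory registry."""
    _READERS.append(reader_cls)
    return reader_cls


@register
class VenusReader(ProductReader):
    prefix = "VENUS_"


@register
class Sentinel2Reader(ProductReader):
    prefix = "S2_"


def create_reader(filename):
    """Return an instance of the first registered reader accepting the file."""
    for reader_cls in _READERS:
        if reader_cls.can_read(filename):
            return reader_cls()
    raise ValueError(f"no reader registered for {filename!r}")


reader = create_reader("S2_example_product")
print(reader.describe("S2_example_product"))
```

Supporting a new sensor then amounts to registering one more reader class, without touching the code that calls `create_reader`.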

The performance requirements set by CNES for the Venµs L2/L3 chains have been met. For example, the chain produces an L2 Venµs image product (in INIT mode) in about 30 minutes, with less than 4 GB of RAM and on a single core. Moreover, the tiling (split) method and the multithreading configuration make it possible to tune the processing to the capabilities of the host machine in order to reach the best performance. A continuous integration environment has been set up for the Venµs L2/L3 chains and the MACCS project. As for OTB, daily and nightly tests are run on the supported operating systems to ensure the consistency and integrity of the code. Test reports are sent to an online dashboard, as shown in Figure 11-2.

Figure 11-2: Dashboard of the MACCS project

A set of baseline data for each class and for all algorithms is used to monitor the quality of the processing. Moreover, the validation and improvement process is supported by the user community: operational users at CNES, scientific users involved in the Venµs project, and ESTEC during the Sentinel-2 commissioning. To conclude, the use of OTB for the development of the Venµs L2/L3 and MACCS chains is a success shared by CNES and CS. We think that the same result can be achieved with OTB during the Sentinel-2 Agriculture project.

11.3.3 GDAL/OGR library

The GDAL/OGR library is a common C/C++ library used by many software packages and systems to read and write various types of raster and vector data. It provides a common interface to handle these types of data. Moreover, it provides some functionalities of particular interest for the Sentinel-2 Agriculture project:
• Handling VRT (virtual raster) files, which describe a virtual GDAL dataset composed from other GDAL datasets. Additional information can be found at http://www.gdal.org/gdal_vrttut.html;
• Creating such VRT files programmatically through the GDAL API;
• Performing basic operations on polygon data, such as intersection, aggregation and union.

The first point is extremely useful to mosaic all the tiles of a Sentinel-2 product into a single dataset for the processing. The second makes it possible to create the dataset automatically when a new product becomes available. CS-SI regularly handles this type of operation, for example when creating a Sentinel-2 catalogue viewer for the Sentinel-2 MPA project.

Concerning the geometric operations on the GML masks included in the Sentinel-2 products, CS-SI uses GDAL/OGR and the GEOS library in the Sentinel-2 IPF project. They are very useful to manipulate vector data and to rasterize them if necessary. Conversely, the GDAL polygonize functionality transforms a label map resulting from a segmentation into a vector data file. The GDAL functionalities are also used in several projects at CS-SI, for example the Pleiades Thematic platform, to handle Pleiades data before adding them to the catalogue. The OTB library uses the GDAL API to read the great majority of data formats and to handle rasterization and polygonization operations.

Concerning JPEG2000 support, GDAL embeds several drivers: JasPer, OpenJPEG, ECW and Kakadu. The last two are commercial solutions and cannot be used in the Sentinel-2 Agriculture project. JasPer is an outdated open source library which has no support for large files and is not well optimized for remote sensing applications. OpenJPEG, on the other hand, supports large files and can deal efficiently with tiled data to avoid large memory consumption. It is used by OTB to read Pleiades and SPOT 6 data. The OTB team has contributed strongly to the development of version 2.0 of the OpenJPEG library. We propose to use this driver in GDAL and to make some optimizations if necessary to handle Sentinel-2 data. Having a FOSS solution to manipulate JPEG2000 data will be a great benefit for end-users.
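As an illustration of the VRT mechanism described above, the following minimal sketch writes a mosaic VRT for a set of tiles using only the Python standard library. The tile names, sizes and offsets are hypothetical. A real product reader would more likely rely on the GDAL API or the gdalbuildvrt utility, and would also record georeferencing information, which is omitted here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Tile:
    """One source raster and its position in the mosaic (values are hypothetical)."""
    filename: str
    x_off: int   # column offset of the tile in the mosaic, in pixels
    y_off: int   # row offset of the tile in the mosaic, in pixels
    x_size: int  # tile width in pixels
    y_size: int  # tile height in pixels


def build_vrt(tiles, data_type="UInt16", band=1):
    """Return VRT XML (as a string) that mosaics one band of the given tiles."""
    width = max(t.x_off + t.x_size for t in tiles)
    height = max(t.y_off + t.y_size for t in tiles)
    root = ET.Element("VRTDataset", rasterXSize=str(width), rasterYSize=str(height))
    vrt_band = ET.SubElement(root, "VRTRasterBand", dataType=data_type, band="1")
    for t in tiles:
        src = ET.SubElement(vrt_band, "SimpleSource")
        name = ET.SubElement(src, "SourceFilename", relativeToVRT="1")
        name.text = t.filename
        ET.SubElement(src, "SourceBand").text = str(band)
        # Read the whole tile...
        ET.SubElement(src, "SrcRect", xOff="0", yOff="0",
                      xSize=str(t.x_size), ySize=str(t.y_size))
        # ...and place it at its offset in the mosaic.
        ET.SubElement(src, "DstRect", xOff=str(t.x_off), yOff=str(t.y_off),
                      xSize=str(t.x_size), ySize=str(t.y_size))
    return ET.tostring(root, encoding="unicode")


# Two hypothetical tiles placed side by side.
tiles = [
    Tile("tile_A_B04.jp2", 0, 0, 1000, 1000),
    Tile("tile_B_B04.jp2", 1000, 0, 1000, 1000),
]
vrt_xml = build_vrt(tiles)
print(vrt_xml)
```

Because the VRT only references the source files, creating the mosaic does not copy any pixel data; the tiles are read on demand when the virtual dataset is opened.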

11.3.4 SLURM

SLURM is an open-source solution to manage job scheduling on a large cluster or on a single machine with multi-threading capabilities. It monitors the system in order to allocate the best resources to a list of pending jobs, and it can dispatch and execute these jobs on several threads. Last but not least, it provides high-level monitoring of the state and progress of each job. CS-SI has already used this type of software (more precisely the TORQUE solution) for job management in the Pleiades Thematic processing server. SLURM is organized as a client/server framework which can be adapted to a single machine. The tools available in SLURM will be the central part of the S2Agri orchestrator, providing a fully automatic system driven by input data. For example, SLURM offers the possibility to log the events related to previously launched jobs.
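To make job submission more concrete, here is a minimal sketch of how an orchestrator could assemble an sbatch command line for one processing step. The job name, executable name, paths and resource figures are hypothetical. Only standard sbatch options are used (--job-name, --cpus-per-task, --mem, --output, --wrap), and the command is built but not executed.

```python
import shlex


def build_sbatch_command(job_name, executable, args, cpus=4, mem_gb=4,
                         log_dir="/var/log/s2agri"):
    """Assemble (without running) an sbatch command for one processing step."""
    # The wrapped command is a single shell string, so quote each argument.
    wrapped = " ".join(shlex.quote(part) for part in [executable, *args])
    return [
        "sbatch",
        f"--job-name={job_name}",
        f"--cpus-per-task={cpus}",
        f"--mem={mem_gb}G",
        # %j is replaced by SLURM with the job id.
        f"--output={log_dir}/{job_name}_%j.log",
        f"--wrap={wrapped}",
    ]


cmd = build_sbatch_command(
    "composite_tile_A",
    "s2agri_composite",  # hypothetical S2Agri-SC executable
    ["--input", "/data/l2a/tile A.vrt", "--output", "/data/l3/tile_A.tif"],
)
print(" ".join(shlex.quote(c) for c in cmd))
```

On a machine where SLURM is installed, the resulting list could be passed to `subprocess.run(cmd, check=True)` to queue the job.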

Our solution is based solely on open-source software. Some of these libraries have already been used in operational Earth observation programs, software or systems. From a technical point of view, the selected free software components already cover a large part of the system.

11.4 Global architecture

According to software design standards, the following aspects should be considered, as a whole, to ensure a cost-effective design, implementation and maintenance of an operational system:
• Configurability: in terms of functionality (e.g. to be modular and parameter driven) to facilitate usage, accommodate evolution and enable re-use across environments;
• Scalability: to facilitate expansion of the hardware or software configuration of the system without major redesign. The system must be expandable to support additional future processing needs;
• Portability: to reduce the cost of transferring the system to new computer platforms or operating systems. This is especially important to cope with the obsolescence of hardware or software;
• Openness: to facilitate usage, integration of new functionality and interfacing with other systems without major redesigns;
• Re-usability: to permit re-usability across environments. Customization to specific environments should not involve massive modifications;
• Standards: wherever possible, widely used (including de-facto) standards should be utilized.

With the above in mind, the principal drivers affecting the choice of the S2Agri architecture are as follows:
• Performance: this tends to be one of the main goals of any development, but for S2Agri it is critical to explore the means of obtaining the best possible performance when handling large data volumes. Achieving high processing performance where large data volumes and complex algorithms are involved requires a multiprocessing environment with efficient management of processes, storage and data transfer. This is why we decided to deploy the system in a hosting facility. Moreover, we decided to select a high-performance hardware configuration and a well-known resource manager used in high-performance computing contexts. Last but not least, we decided to use a library (Orfeo ToolBox) which has been designed to handle Pleiades data in user contexts similar to that of Sen2Agri (cartography update, precision farming, change detection);
• Flexibility: given the nature of the Sen2Agri objectives, the system should allow an easy and fast development cycle to deal with updates of user requests and with the assessment needed at each phase (algorithm prototyping, prototype product generation). This can be achieved by means of high modularity, efficient programming methods and open source tools mastered by all partners;
• Expandability: to achieve a high degree of flexibility and be able to cope with increasing user demands, the system must be expandable. The flexible multiprocessing system selected and the tools used must be readily expandable if user requirements increase;
• Usability: this is important from the users’ point of view. The system and, in particular, its user interfaces must facilitate configuration, management, control and assessment of the system in the best possible way. The design should give the ordinary user a fully automatic system with a useful analysis facility providing system logs and error reports. The authorized user, on the other hand, can choose different settings, enabling and disabling parts of the software to get better control over the system.

To achieve the above-mentioned goals, we need to address two key points in the design phase:
• Modular design and standardization of interfaces;
• Efficient resource monitoring and management.

A modular design is perhaps the most important means to achieve the characteristics described above. It is needed to accommodate changes: processing algorithms will be constantly evolving, and the design should be able to absorb such evolution. Orfeo ToolBox offers this advantage through its pipeline mechanism, which connects filters following a standard pattern. This gives great flexibility in the design of processing chains. The streaming and threading mechanisms of OTB make it possible to adapt the processing to the hardware capacity.

One of the key factors of Sentinel-2 processing is the need to deal with large data volumes. Efficient resource management will be required mainly from the perspective of performance, although it is also linked to expandability and flexibility. Efficiency implies that the system puts as little demand as possible on the hardware resources. The approach will be to load as much data as needed for optimal processing while keeping I/O to a minimum. For optimal performance, disk swapping must be avoided and efficient disks (SAS) must be used. The streaming capacity of OTB must be exploited to the maximum. For efficient memory usage, there should be no memory leaks in the system. The use of well-tested libraries such as OTB, which relies on the smart pointer principle, should help in this respect, but the system should still be checked exhaustively for memory leaks with appropriate tools such as Valgrind.

As per the SoW, the S2Agri system can be broken down into two main components that will be further decomposed in the following sections:
• A set of S2Agri-SC: each S2Agri-SC is an independent executable that represents an algorithm or a set of algorithms;
• An S2Agri-Orchestrator: the main component, used to manage the above S2Agri-SC on the system, i.e. to monitor and execute processing jobs.
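The streaming principle mentioned above can be illustrated with a short sketch: an image is processed strip by strip so that only a bounded amount of data is held in memory at any time. This is not OTB code; the memory budget, the image dimensions and the function names are assumptions chosen for illustration.

```python
def strip_height(width, bands, bytes_per_sample, memory_budget_bytes):
    """Largest number of rows whose pixels fit in the memory budget (at least 1)."""
    bytes_per_row = width * bands * bytes_per_sample
    return max(1, memory_budget_bytes // bytes_per_row)


def iter_strips(height, rows_per_strip):
    """Yield (first_row, row_count) pairs covering the image from top to bottom."""
    for first_row in range(0, height, rows_per_strip):
        yield first_row, min(rows_per_strip, height - first_row)


# Hypothetical 10 000 x 10 000 image, 4 bands of 16-bit samples, 64 MiB budget.
rows = strip_height(width=10_000, bands=4, bytes_per_sample=2,
                    memory_budget_bytes=64 * 1024 * 1024)
strips = list(iter_strips(height=10_000, rows_per_strip=rows))
print(rows, len(strips))
```

Changing only the memory budget adapts the same processing to a smaller or larger machine, which is the property the design relies on.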

A global view of the proposed system is provided in Figure 11-3.

Figure 11-3: Proposed architecture

11.4.1 S2Agri-SC component

As stated above, two tools have been selected for this processing on the basis of their quality, their degree of re-use and the project partners' thorough command of them:
• Orfeo ToolBox: for composite processing and User product generation;
• S2PAD: for atmospheric corrections and cloud detection.

This section describes the architecture of the S2Agri components, excluding atmospheric correction and cloud detection, which will be handled by the Sentinel Exploitation Tools as recommended by ESA in the SoW. The general view of an S2Agri component is presented in Figure 11-4.

Figure 11-4: General organization of an S2Agri software component

The error and log reporters are common to all S2Agri-SC, whereas the product reader and writer are designed for each S2Agri-SC. However, some generic operations will be stored in a small common library to enhance re-use. The error and log manager takes the log and error output of the OTB applications and formats it following the S2A system specifications. The product reader module aggregates the different tiles of the Sentinel-2 product, for example to provide a VRT file to the OTB applications. In this way we can mosaic the different tiles and the bands needed for the processing. The same is done for the masks and all auxiliary data. This module typically uses the GDAL/OGR library and an XML parser. The product writer will format the output of the OTB applications following the current interface requirements and store the data in the final directory if necessary.

11.4.2 Orchestrator component

The orchestrator component is fully based on the SLURM tools to manage the pending processing jobs. Pending jobs are created when a new L1C product is delivered by the ngEO downloader into a specific repository. With this data-driven strategy, the computation activity of the system can be optimized.
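The data-driven triggering described above can be sketched as a simple scan of the delivery repository: every product directory not yet seen yields a pending job. The directory layout, the product naming, the step name and the function names are hypothetical, and the sketch only lists the jobs; submitting them to SLURM is left out.

```python
import tempfile
from pathlib import Path


def find_new_products(repository, already_seen):
    """Return product directories in `repository` that are not in `already_seen`."""
    found = sorted(p.name for p in Path(repository).iterdir() if p.is_dir())
    return [name for name in found if name not in already_seen]


def create_pending_jobs(repository, already_seen):
    """Create one pending job per new product and remember the product as seen."""
    jobs = []
    for name in find_new_products(repository, already_seen):
        jobs.append({"product": name, "step": "atmospheric_correction"})
        already_seen.add(name)
    return jobs


# Demonstration on a temporary directory standing in for the delivery repository.
with tempfile.TemporaryDirectory() as repo:
    seen = set()
    (Path(repo) / "L1C_product_001").mkdir()
    first = create_pending_jobs(repo, seen)
    (Path(repo) / "L1C_product_002").mkdir()
    second = create_pending_jobs(repo, seen)

print(first)
print(second)
```

Keeping track of the products already seen ensures that each delivery triggers its processing exactly once, even if the repository is scanned repeatedly.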

12 Preliminary proposal for use case studies

This section provides a detailed discussion of the Use Case where in-situ production will be implemented (sections 12.1 to 12.4) and a more general presentation of the 5 other Use Cases for which the S2-Agri EO products will be generated at the consortium’s premises (section 12.5). It is worth mentioning that the proposed approach has been convincing enough to already obtain several letters of interest from users wishing to become a national demonstration case. These letters, sent by key national operators in agriculture monitoring, are included in section 19 (appendix F) and concern Senegal, Russia, Kenya and Morocco.

12.1 In-situ 1: Senegal

Senegal is a Sahelian country with 9.5 million ha of agricultural area and is one of the world's top food importers. Agriculture employs around 75 % of the working population and accounts for 17 % of Gross Domestic Product (GDP). Groundnuts, cotton, gum arabic and sugarcane are the primary cash crops. Millet, corn, sorghum and rice are the main food crops. With groundnut production accounting for 40 % of cultivated land and cotton production for another 33 %, cash crops dominate agriculture. The vast majority of crops are rain-fed, making water availability one of the country's biggest agricultural challenges. Rising food prices have hit Senegal hard, and the country has seen protests and riots as food has become increasingly unaffordable. The government responded with an ambitious national strategy to become food self-sufficient by 2015 but, faced with enduring problems of drought and poor soils, increasing production sustainably will be a considerable challenge.

12.1.1 User presentation

The Centre de Suivi Ecologique (CSE), located in Dakar, is an autonomous center established in 1986 by the national authorities with the support of the United Nations Sudano-Sahelian Office (UNSO); today it is directly linked to the Ministries. For decades, it has combined satellite observation with ground measurements to deliver relevant information about the growing season in a timely manner. It has set up a system based on satellite remote sensing to identify the location of zones affected by drought. From the beginning of May to the end of October, a multidisciplinary working group meets every 10 days to deliver bulletins to decision makers, ministries and farmers’ organizations (RD.214). UCL-Geomatics has worked closely with the CSE for many years, and some of its specialists were trained on the job in the UCL laboratory.

12.1.2 Expected result and impact

The monitoring of the growing vegetation is currently carried out from SPOT-VGT and MODIS time series and delivered to the users on a decadal (10-day) basis. The coarse spatial resolution of these sensors makes it impossible to separate the natural vegetation pattern from agricultural land. However, cropland develops much more slowly than natural vegetation and is more vulnerable to drought during certain periods of the growing cycle.

The Sen2-Agri EO vegetation status product will allow crop-specific monitoring of the priority crops, i.e. millet, corn, sorghum, rice and groundnuts. Starting with the dynamic cropland mask, this monitoring will combine the regularly updated mask and then the crop type map along the season, so as to focus progressively on each crop using the vegetation status product. The CSE's existing field campaign should save some effort in collecting validation data at the national scale. Every year during the growing season, the CSE publishes a mid-season bulletin about crop growth as well as a final bulletin available in September (Figure 12-1).

Figure 12-1: Agro-pastoral bulletin of August 2012 as delivered by CSE (http://svr-web.cse.sn/IMG/pdf/Suivi_de_la_campagne_agro-pastorale_2012_bilan_de_fin_de_saison_des_pluies-2.pdf)

12.1.3 Work to be performed

The overall work will be performed according to the Demonstration Plan described in section 4.4.5 (WP 4500). It relates to the Phase 3 description provided in section 4.5: system implementation, training, in-situ data collection, technical support (WP 5400), and validation and performance assessment.

12.1.3.1 Technical point of view

As the CSE has been an operational entity active in agriculture monitoring for many years, the local technical and thematic expertise is relevant and functional. The growing season for rain-fed crops (most crops are indeed not irrigated) runs from May to October. At this stage we propose to focus mainly on the cereals, i.e. millet, sorghum, rice and corn.

12.1.3.2 System point of view

The system to be implemented at the CSE premises should be integrated into the CSE workflow and computing facilities. It seems appropriate to exclude the rangeland part of the country, both to reduce the processing load and to limit the validation effort.

12.1.3.3 User interaction point of view

The CSE currently uses the NDVI and associated indices such as the Vegetation Condition Index (VCI) and the Indice de Croissance Normalisé (ICN). It could be of interest for the Sen2-Agri project to deliver these indices as well, so that the products fit into the CSE decision system, even if this is sub-optimal from a remote sensing point of view.
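As a point of reference for the indices named above, the NDVI and the VCI can be sketched as follows. The VCI formulation used here is the commonly published one (the current NDVI rescaled between the historical minimum and maximum for the same period); whether the CSE uses exactly this variant is an assumption. The ICN is not sketched because its definition is not given in this document. Function names and sample values are illustrative only.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index from red and near-infrared reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # convention chosen here for empty pixels (assumption)
    return (nir - red) / total


def vci(ndvi_now, ndvi_history):
    """Vegetation Condition Index in percent.

    ndvi_history holds the NDVI values observed for the same 10-day period
    in previous years (hypothetical input layout).
    """
    lowest, highest = min(ndvi_history), max(ndvi_history)
    if highest == lowest:
        raise ValueError("historical NDVI range is zero; VCI is undefined")
    return 100.0 * (ndvi_now - lowest) / (highest - lowest)


# Illustrative values only (not CSE data).
history = [0.20, 0.35, 0.50]
current = ndvi(red=0.10, nir=0.30)
print(f"NDVI = {current:.2f}, VCI = {vci(current, history):.0f} %")
```

A VCI close to 0 % means the vegetation is near its historical worst for that period, and a value close to 100 % means it is near its historical best.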

12.1.4 Interaction with other project activities

The ESA GMFS project has developed various activities in Senegal, mostly related to agro-meteorological modeling. The CGMS software developed by the EU MARS project is currently used as the agro-meteorological model, for input and results visualization, in the framework of the FP7 AGRICAB project. The production of vegetation status from Sentinel-2 time series is expected to improve yield forecasting. The mandate for producing agricultural statistics belongs to the DAPSA (Direction de l’Analyse de la Prévision et des Statistiques Agricoles) of the Ministère d’Agriculture et de l’Equipement Rural (MAER). The current hot issue is to investigate how remote sensing could improve the agricultural statistics, so the Sen2-Agri project is very timely in this regard. Currently, maps are produced with a point-interpretation method, and the alternative maps produced are tested and compared.

12.1.5 Expected value of the demonstration

Senegal is an emblematic Sahelian country, located in one of the most food-insecure regions of the world. The US FEWS NET project is also a partner of the CSE and could be influenced by the results obtained here. These results will be immediately transferable to all the countries of the Comité Permanent Inter Etats de lutte contre la Sécheresse (CILSS), and in particular to Mali, Niger and Burkina Faso, because of the great similarity of agricultural landscapes and cropping practices. Once the demonstration is completed with the support of the CSE expertise, it would probably be feasible, if political stability returns, to transfer a similar system to the AGRHYMET regional center.

12.2 In-situ 2 : Kenya

Kenya is a densely populated country where only 20 % of the land supports significant crop cultivation. Maize and wheat are the main staple foods, accounting for over 80 % of the total cereals used at household level; rice is the third most consumed cereal. Each year, the Food Steering Committee (FSC) of the “Office of State - Special Programmes” requires information on area, yield and production, and each year Kenya has to import very large quantities of maize and wheat.

12.2.1 User presentation

The Department of Resource Survey and Remote Sensing of the Ministry of the Environment has built up long-standing and strong expertise in area frame sampling, which it uses to derive annual agricultural statistics from aerial photography. More recently, it has also gained good experience in crop growth monitoring by satellite remote sensing.

12.2.2 Expected result and impact

The Sen2-Agri EO cropland, crop type and area products will be validated against the comprehensive aerial survey conducted on an annual basis. This is probably the best demonstration case for crop type mapping, because it allows direct comparison with an alternative system. The analysis should address the accuracy, the cost-benefit ratio and the timeliness of both systems. In addition, the mapping products will support the Department's own vegetation status monitoring approach. They will also allow crop-specific monitoring of the priority crops, i.e. wheat, corn and rice. Starting with the dynamic cropland mask, this monitoring will combine the regularly updated mask and, later on, the crop type map along the season, so as to focus progressively on each crop using the vegetation status product. The existing field campaign should save some effort in collecting validation data at the national scale.

12.2.3 Work to be performed

The overall work will be performed according to the Demonstration Plan described in section 4.4.5 (WP 4500). It relates to the Phase 3 description provided in section 4.5: system implementation, training, in-situ data collection, technical support (WP 5400), and validation and performance assessment. Because of the user's routine activities in crop monitoring and its well-trained staff, we do not anticipate any particular concern for the in-situ implementation.

12.2.4 Interaction with other project activities

The EU MARS FoodSec team has produced a monthly bulletin for Kenya since May 2007. The bulletin monitors both agricultural and pastoral vegetation in Kenya based on SPOT-VGT time series and modeling data (in particular on NDVI). Similarly, the US FEWS NET system regularly reports on the status of the on-going season.

These two major operators will be very interested in following this demonstration in a country where they are both actively involved, knowing that, unlike in most African countries, a very precise source of information already exists, which allows the Sen2-Agri outcomes to be assessed in great detail. For instance, it will be feasible to investigate the sources of discrepancy using the aerial photographs, the comparison of results at different levels of aggregation, etc.

Figure 12-2: Point sampling frame for visual interpretation of annual aerial coverage of the country in order to deliver area statistics (C. Situma, AGRISAT 2010, Brussels)

12.2.5 Expected value of the demonstration

As explained above, Kenya is an ideal demonstration country because very precise information is available on an annual basis. Furthermore, the country hosts the Regional Center for Mapping of Resources for Development (RCMRD), an intergovernmental institution with 18 African member countries. Its main mission is to support sustainable development through the provision of geo-information services that include EO data, products, training, project implementation and advisory services, including for agricultural applications and early warning systems. Thanks to its training facilities and regional mandate, the RCMRD could report widely on the Sen2-Agri demonstration, in particular if it proves successful in Kenya.

12.3 In-situ 3 : Russia

Russia, with more than 200 million ha of agricultural land, is a major exporter on the international market. However, Russian agriculture went through a severe transition decline in the early 1990s as it struggled to transform from a command economy to a market-oriented system. Following the breakup of the Soviet Union in 1991, large collective and state farms had to contend with the sudden loss of state-guaranteed marketing and supply channels and with a changing legal environment that created pressure for reorganization and restructuring.

12.3.1 User presentation

The Space Research Institute of the Russian Academy of Sciences (IKI) is one of the leading organizations in Russia, and is internationally recognized, with a strong focus on the development of highly automated methods and technologies for land cover mapping and monitoring based on satellite EO data. IKI is strongly involved in agricultural monitoring R&D and provides near-real-time information on arable land and crops to end-users such as governmental agencies, food producers and insurance companies. In particular, IKI has developed automated methods based on MODIS time series for annual arable land mapping and for winter and summer crop recognition, along with monitoring of their intra-seasonal development in order to detect anomalies caused by unfavorable conditions such as extreme weather.

12.3.2 Work to be performed

As for the other national demonstration cases, the overall work will be performed according to the Demonstration Plan described in section 4.4.5 (WP 4500). It relates to the Phase 3 description provided in section 4.5: system implementation, training, in-situ data collection, technical support (WP 5400), and validation and performance assessment.

12.3.2.1 Technical point of view

The size of the Russian territory rules out a nationwide approach for a demonstration case. Instead, we propose to focus on 6 to 8 Oblasts in the main grain-producing regions located in the Southern, North Caucasian and Volga Federal Districts. The target size of 500 000 sq. km will be reached in any case. The main crops will be winter wheat, winter barley, sunflower and corn. Some specific aspects, such as the impact of winterkill, will be tackled in early spring if appropriate data are available. The most challenging aspect of this demonstration case is the spatial heterogeneity of fields that are cultivated very extensively.

12.3.2.2 System point of view

IKI has its own computing facilities, which can already handle all the Landsat data available over Russia. Further discussion is therefore needed to determine whether the software solution should run directly in the existing IKI environment or be set up as a stand-alone system.

12.3.2.3 User interaction point of view

The current operational system produces cropland masks from Landsat imagery and crop type maps from MODIS time series. Despite the parcel size, this does not allow a precise estimation of the agricultural statistics, whereas the Sen2-Agri products should fulfil all requirements.

Figure 12-3: Example of IKI output for a Southern part of Russia (RD.208)

12.3.3 Interaction with other project activities

IKI is involved in a number of international projects with partners from the European Union and the USA, and participates in key international initiatives such as GOFC-GOLD and GEOGLAM. For instance, the FP7 MOCCCASIN project aims to develop the agro-meteorological CGMS system for part of Russia and to detect the impact of frost kill on winter wheat from MODIS time series. IKI has completed several agricultural monitoring applications based on its own technical developments, such as:
• satellite data pre-processing algorithms for cloud/cloud-shadow screening and temporal compositing of cloud-free images;
• a locally-adaptive classification algorithm for automated land cover recognition over large territories, taking into account the spatial variations of the classes’ features without any preliminary geographical stratification of the mapping area;
• methods for operating super-large archives of satellite Earth observation data, allowing fully automated management of distributed archives of the original satellite data and of the results of their processing;
• a technology for creating web-based map interfaces that give users quick access to satellite data and their processing results.
Such a level of expertise allows the existing technical solutions available at IKI to be compared with the proposed Sen2-Agri open source solution.

12.3.4 Expected value of the demonstration

The interest of this demonstration case is twofold:
• on the one hand, the proposed solution may be rapidly adopted by the national partners in place of their Landsat-based system, as they already master this level of resolution very well; indeed, the spectral resolution of Sentinel-2 imagery is expected to be better suited than that of Landsat to capturing the very large heterogeneity between and within fields;
• on the other hand, there is keen interest in improving the precision of the Russian agricultural figures because of their critical importance in reducing the volatility of international market prices.

12.4 In-situ 4 : Morocco

Morocco covers 710,850 km² and its climate is rather diverse, from Mediterranean in the north to arid in the south. The Atlas Mountains play an important role in the water supply of the irrigated plains. Agriculture accounts for only around 17% of GDP but employs 40-45% of the Moroccan working population. The crop area is estimated at 95,000 km². Agricultural production is mainly based on wheat, barley and other cereals, orchards (citrus, olives) and vegetables (tomatoes, potatoes…). Agricultural production, and especially cereal yield, is strongly dependent on weather conditions. To cope with this issue, large irrigated perimeters have been developed by the government. The total water storage capacity, mainly devoted to irrigation, increased from 2.3 × 10⁹ m³ in 1967 to 16 × 10⁹ m³ in 2004.

12.4.1 User presentation

The users would be the nine “Regional Offices of Agricultural Development” (ORMVA, for “Office Régional de Mise en Valeur Agricole” in French), which are in charge of managing the irrigated perimeters of the country. The ORMVAs, which are financially autonomous public institutions under the Ministry of Agriculture and Agricultural Development, are responsible for the planning and management of water resources for agricultural use and for the design, construction and management of large hydraulic perimeters. They are also responsible for the small and medium hydraulic perimeters within their geographic jurisdictions. They play a major role in the Moroccan agriculture sector. The ORMVAs would need Sen2-Agri products to manage water allocation from both a tactical (short-term management) and a strategic perspective.

If national coverage of Morocco is decided in the frame of Sen2-Agri, the location of the production entity shall be discussed with local partners and policy-makers. It could be, for example, the CRTS (Centre Royal de Télédétection Spatiale), the Morocco Met Office, or even a new dedicated department in a university. Given the excellent relationships and partnerships with ORMVAH (the ORMVA of the Haouz plain, around Marrakech) and the Morocco Met Office, we believe it will be quite easy to push the idea at the upper level of the government. The most difficult part will be the negotiation over the choice of the local operator.

12.4.2 Expected result and impact

CESBIO has been working with ORMVAH for more than 10 years. Depending on the year, 3 to 5 CESBIO scientists are posted in Marrakech, where they carry out academic and applied research as well as Master's and PhD training in close collaboration with the Cadi Ayyad University, the Tensift Hydraulic Basin Agency and ORMVAH. CESBIO also collaborates closely with the Morocco Met Office. Compared to other countries in Africa, Morocco has a sound organization of the agricultural sector. This sector is facing a number of challenges, from weather fluctuations to the threat of climate change. Morocco also has strong training programmes and well-trained personnel. Agriculture is a strategic sector for Morocco, one of the reasons being that it provides about 40% of the country's employment. Another reason is the will of the government to ensure food self-sufficiency as far as possible (currently around 60%). Morocco defined its new strategy for agriculture in the “Plan Vert (Green Plan)” issued in 2008. This plan foresees major evolutions of the agricultural sector, which will benefit from the Sen2-Agri system. More information about this “Plan Vert” can be found at http://www.ada.gov.ma/Plan_Maroc_Vert/plan-maroc-vert.php.

12.4.3 Proposed approach

The proposed approach is to take the opportunity of Sen2-Agri Phase 1 and Phase 2 to demonstrate the strong interest of Sentinel-2/Landsat 8, of the system developed and of the products. One or two Take-5 sites in Morocco could be used for such a demonstration, in close collaboration with the ORMVAs and other local partners. Additional high spatial resolution data, including time series, are available over the Haouz area for several years (SPOT, Formosat, ERS, Landsat), and the acquisition of a time series of SPOT data is planned for 2014 (using CESBIO's own resources). The results over these test sites will form the basis for promotional activities. At the same time, informal discussions could take place in order to prepare Phase 3 and a possible national coverage of Morocco.

12.5 Local production

The preliminary list of sites selected for the local demonstration cases should be as widely distributed as possible and should allow learning from these very first results. This is why we kept both sites managed by consortium members. The list is as follows:
• Midi-Pyrénées (managed by CESBIO);
• Belgium (managed by UCL);
• China;
• Morocco or Tunisia;
• Argentina.
It is also foreseen that a site could be proposed in Romania for Phase 3; it is not appropriate for Phase 1 in the absence of an existing data set. Romania is an important European producer and the national authorities could be interested in being part of the local demonstration cases.

13 Proposed approach for the Sentinel-2 EO product processing

13.1 Sentinel-2 EO product processing

13.1.1 Huge data volume to consider

The huge data volume induced by large S2 time series requires careful attention to performance and to the capacity of the facilities. Considering the existing UR products under some hypotheses (6-month seasons, a revisit period of five days, an internal raster format in lossless JPEG2000, an output format in GeoTIFF, and five biophysical products), the processing system must deal with:
• at regional scale (290 km x 290 km), around 230 GBytes for one site;
• at national scale (around 500 000 km²), around 1.4 TBytes for one site.

These preliminary estimates have a strong impact on the hardware configuration, which must handle large data volumes. The hardware and the software must therefore be chosen carefully.
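The two volume figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes that the volume scales linearly with the covered area and that a 6-month season is 182 days long; both are assumptions made here for illustration, not the method used to derive the original estimates.

```python
REGIONAL_SIDE_KM = 290        # side of the regional site quoted above
REGIONAL_VOLUME_GB = 230      # volume quoted above for one regional site
NATIONAL_AREA_KM2 = 500_000   # national-scale area quoted above

SEASON_DAYS = 182             # assumption: a 6-month season taken as 182 days
REVISIT_DAYS = 5              # revisit period quoted above


def scale_volume_by_area(volume_gb, from_area_km2, to_area_km2):
    """Scale a data volume linearly with the covered area (assumption)."""
    return volume_gb * to_area_km2 / from_area_km2


regional_area_km2 = REGIONAL_SIDE_KM ** 2
national_volume_gb = scale_volume_by_area(
    REGIONAL_VOLUME_GB, regional_area_km2, NATIONAL_AREA_KM2)
acquisitions = SEASON_DAYS // REVISIT_DAYS
per_date_gb = REGIONAL_VOLUME_GB / acquisitions

print(f"regional area: {regional_area_km2} km2")
print(f"national volume by area scaling: {national_volume_gb / 1000:.2f} TB")
print(f"acquisition dates per season: {acquisitions}")
print(f"regional volume per acquisition date: {per_date_gb:.1f} GB")
```

Under these assumptions the area-scaled national volume comes out close to the 1.4 TBytes quoted above, which suggests the two figures are mutually consistent.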

13.1.2 Hardware configuration

We estimate that a hardware system with high multi-threading capability (at least 8 threads available) and a large memory (at least 24 GBytes) will be the most efficient for the national scale. For the regional scale, a high-end commercial PC is the minimum. The users' hardware configurations will be investigated when selecting the user premises.

One point common to all these facilities will be the data storage and the input data collection. The data will be retrieved from the ESA facilities over the Internet (automatically via the ngEO downloader if possible) and need to be stored on disks with efficient I/O access (for example SAS disks). To obtain the best Internet bandwidth (10 Gbps to the Internet and 300 Mbps from the Internet), we propose to use a dedicated hosting service to host the main Sentinel-2 Agriculture production facility (see Figure 4-1 in section 4.1.3). This will improve the availability of data to end-users through the website. Moreover, a hosted machine offers the guarantee that the hardware is always available during the service subscription without additional cost.

We can offer end-users the choice between hosting their own system at their premises and hosting it close to an Internet node to facilitate our support. This choice depends on the technical level of the users and on their facilities (Internet bandwidth can be low in some African countries, for example).

Efficient disk I/O is also critical for the processing chain, because time lost in disk I/O operations must be avoided. We therefore propose to use SAS disks to store intermediate products and SATA disks to store the final user products. Moreover, to avoid backing up all data and the system on a third server, we propose to configure the data disks of the processing server in RAID 6. The operating system of the processing server will be backed up on the website server and, conversely, the critical parts of the website server will be backed up on the processing server.
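To give an order of magnitude for the bandwidth and storage choices discussed above, the following sketch computes a transfer time and a RAID 6 usable capacity. It assumes that the nominal bandwidth is sustained for the whole transfer, and the disk count and disk size are hypothetical values chosen for illustration; none of these numbers is a measured or specified figure of the project.

```python
def transfer_hours(volume_gb, bandwidth_mbps):
    """Hours needed to move volume_gb (decimal gigabytes) at bandwidth_mbps (megabits/s)."""
    bits = volume_gb * 1e9 * 8
    seconds = bits / (bandwidth_mbps * 1e6)
    return seconds / 3600


def raid6_usable_tb(disk_count, disk_tb):
    """Usable capacity of a RAID 6 array: the equivalent of two disks holds parity."""
    if disk_count < 4:
        raise ValueError("RAID 6 needs at least four disks")
    return (disk_count - 2) * disk_tb


# 1.4 TB is the national-scale volume quoted in section 13.1.1; using it as
# the amount to transfer is an upper-bound illustration (assumption).
print(f"1.4 TB at 300 Mbps: {transfer_hours(1400, 300):.1f} h")
# Hypothetical array of eight 2 TB disks.
print(f"RAID 6, 8 x 2 TB: {raid6_usable_tb(8, 2)} TB usable")
```

The RAID 6 layout tolerates the loss of any two disks, which is why it is a reasonable substitute for a full copy on a third server as far as disk failures are concerned.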

13.1.3 System processing

The proposed approach for Sentinel-2 EO product processing will be based on the Sen2-Agri system and on the Sen2-Agri software components, both described in detail in section 9. For the sake of clarity, the main message of that section is recalled here. The Sen2-Agri system can be broken down into 2 main components:
• a set of Sen2-Agri_SC: each S2Agri_SC is an independent executable that represents an algorithm or a set of algorithms;
• a Sen2-Agri-Orchestrator: the main component, which is used to manage the above S2Agri-SC on the system, i.e. to monitor and execute processing jobs.

The global view of the proposed system is provided in Figure 11-3 (section 11.4) and the general view of a Sen2-Agri component is presented in Figure 11-4 (section 11.4.1). The error and log reporter is common to all Sen2Agri-SC, whereas the product reader and writer are designed for each Sen2Agri-SC. However, some generic operations will be stored in a small common library to enhance re-use.

The error and log manager takes the log and error output of the OTB applications and formats it following the S2A system specifications.

The product reader module aggregates the different tiles of the Sentinel-2 product, for example to provide a VRT file to the OTB applications. In this way the different tiles and the bands needed for the processing can be mosaicked. The same is done for the masks and all auxiliary data. This module typically uses the GDAL/OGR library and an XML parser.

The product writer formats the output of the OTB applications following the current interface requirements and stores the data in the final directory if necessary.

The orchestrator component is a tool that launches the different processing chains in a data-driven manner and monitors them. This component manages the pending processing jobs, which are created when a new L1C product is delivered by the ngEO downloader into a specific repository. This strategy optimizes the computation activity of the system.
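The data-driven behaviour of the orchestrator described above can be illustrated with a minimal sketch. It is not the Sen2-Agri orchestrator design: the class, the stub processing steps and the use of a `.SAFE` suffix to recognize delivered products are all assumptions made for this example.

```python
import tempfile
from collections import deque
from pathlib import Path


class Orchestrator:
    """Toy data-driven orchestrator: one pending job per newly delivered product."""

    def __init__(self, repository, chain):
        self.repository = Path(repository)
        self.chain = chain        # ordered callables, one per software component
        self.seen = set()         # products already turned into jobs
        self.pending = deque()    # jobs waiting to be processed
        self.log = []             # (product name, status) records

    def scan(self):
        """Create a pending job for every product not seen before."""
        for product in sorted(self.repository.glob("*.SAFE")):
            if product.name not in self.seen:
                self.seen.add(product.name)
                self.pending.append(product)

    def run_pending(self):
        """Run the whole chain on each pending product and record the outcome."""
        while self.pending:
            product = self.pending.popleft()
            try:
                data = product
                for step in self.chain:
                    data = step(data)
                self.log.append((product.name, "done"))
            except Exception as exc:  # report the error and keep the system running
                self.log.append((product.name, f"failed: {exc}"))


def l2a_stub(product):
    """Hypothetical stand-in for the L2A processing component."""
    return product


def composite_stub(product):
    """Hypothetical stand-in for the composite processing component."""
    return product


def demo():
    """Deliver one fake product into a temporary repository and process it."""
    with tempfile.TemporaryDirectory() as repo:
        (Path(repo) / "S2_L1C_demo.SAFE").mkdir()
        orchestrator = Orchestrator(repo, chain=[l2a_stub, composite_stub])
        orchestrator.scan()
        orchestrator.run_pending()
        orchestrator.scan()       # a second scan creates no duplicate job
        orchestrator.run_pending()
        return orchestrator.log


print(demo())
```

Because jobs are created only when a product appears in the repository, the system stays idle between deliveries and each product is processed once.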

13.1.4 Sentinel-2 EO product dissemination

The general issue of product dissemination is described in more detail in section 4.1.3, which covers the development of the Sentinel-2 exploitation scenario. As the product writer of each Sen2-AgriSC stores the final product data in the final directory, there are two ways to download the products, depending on the user case:
• for at least 3 user cases, in which the users will have a Sen2-Agri system installed at their premises and can produce their own products themselves, the products can be downloaded very easily from that final directory;
• for user cases in which the Sentinel-2 EO products are processed at the consortium premises, final product access and dissemination can be done by delivering the data over secure FTP (SFTP) or through automatically generated direct download links, which can be sent by e-mail when products are available.

13.2 Sentinel-2 EO product validation

There are several definitions of validation available from various agencies. We propose to adopt the one given by the Committee on Earth Observation Satellites Working Group on Calibration and Validation (CEOS-WGCV): “The process of assessing, by independent means, the quality of the data products derived from the system outputs”. The Sen2-Agri project will produce a set of output products (in particular the Sentinel-2 products) that require validation. Ideally, the validation process should follow clearly defined protocols and should be independent from the production process. The independence of the validation process rests on three requirements:
1. The Sen2-Agri project shall use, for validation, in situ or other suitable reference datasets that have not been used during the production of its products.
2. The Sen2-Agri project teams shall consider the independence of the geophysical process and ensure that, if a particular auxiliary dataset is used in the production of their products, the same dataset is not used in the validation.
3. The Sen2-Agri project teams shall ensure that the validation is carried out by staff not involved in the final algorithm selection; ideally, the validation of the Sen2-Agri products should be carried out by external parties, i.e. by staff / institutions not involved in the production of the products.
The adopted validation strategy (presented in section 4.4.4) adheres to the above three requirements regarding independence. In particular, the independence of the in-situ data is stated as a requirement for the collection design (see section 4.2.2), and all validation activities in the project (test dataset validation, prototype products validation and Sentinel-2 products validation) are carried out by UCL, which is not directly involved in the production.
The validation of the Sentinel-2 EO products will rely on 3 complementary pillars (see Figure 4-8): (i) confidence building, (ii) statistical accuracy assessment and (iii) comparison with existing products. In addition, a user-oriented assessment will be set up in order to assess the utility and benefit of the Sentinel-2 demonstration products. This approach has the advantages of:
• reinforcing the overall acceptance of the products by users by removing macroscopic errors;
• providing accuracy figures obtained from an independent quantitative validation in line with current standards;
• characterizing the strengths and weaknesses of the new Sen2-Agri EO products with respect to other existing agricultural products;
• establishing a real dialogue with users, in which their feedback feeds into the final discussions and recommendations.
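As an illustration of the statistical accuracy assessment pillar, the sketch below derives the overall, producer's and user's accuracy from paired reference and map labels. It is a simple count-based computation that ignores any area weighting a stratified sampling design would require; the function name and the sample labels are hypothetical and do not come from project data.

```python
from collections import Counter


def accuracy_report(reference, predicted):
    """Overall, producer's and user's accuracy from paired class labels."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("need two non-empty label lists of equal length")
    cells = Counter(zip(reference, predicted))   # confusion matrix cells
    reference_totals = Counter(reference)
    map_totals = Counter(predicted)
    classes = sorted(set(reference) | set(predicted))
    correct = sum(cells[(c, c)] for c in classes)
    report = {"overall": correct / len(reference)}
    for c in classes:
        report[c] = {
            # share of reference samples of class c that the map labels as c
            "producer": cells[(c, c)] / reference_totals[c] if reference_totals[c] else None,
            # share of map samples labelled c that really are c
            "user": cells[(c, c)] / map_totals[c] if map_totals[c] else None,
        }
    return report


# Illustrative labels only: independent in-situ reference versus crop type map.
reference = ["maize", "maize", "maize", "wheat", "wheat", "rice"]
predicted = ["maize", "maize", "wheat", "wheat", "wheat", "rice"]
for name, value in accuracy_report(reference, predicted).items():
    print(name, value)
```

In line with the independence requirements listed above, the reference labels passed to such a function should come from samples that were not used to produce the map.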

14 Appendix A - Draft software development plan

14.1 Introduction

14.1.1 Purpose of the annex

The aim of this document is to define the management approach and the software development strategy, methodology, and means to produce the Sentinel-2 Agriculture operational system required by the European Space Agency (referred to as "ESA" or "the Agency"). The content of this Software Development Plan is compliant with [AD.3], as tailored in [AD.1]. It is provided as a draft document included in the Sen2-Agri consortium proposal in response to ESRIN ITT ref. AO/1-7455/13/NL/BJ “Sentinel-2 Agriculture”. It will be updated and re-delivered at PDR. In this draft version, to avoid redundancies, some sections refer to other volumes of our proposal, mainly:
• Volume 3: Management & Administrative Proposal;
• Volume 4.1: Implementation Proposal;
• Volume 4.2: WBS and WP Description.
However, the content of most of these sections will be integrated in the issue delivered at PDR.

14.1.2 Structure of the annex

The document is structured as described below:
• This section gives the purpose and the structure of the document (applicable and reference documents, acronyms and abbreviations being given at the beginning of this document in section 1.3).
• Section 14.2 gives a description of the component being produced and of the scope of the project.
• Section 14.3 presents the management approach by referring to the Project Management Plan and other management documentation.
• Section 14.4 presents the software development approach.

14.2 Project overview

The overall objective of the contract is to develop the Sen2-Agri system such that it is compliant with its software requirements and interface specifications defined with the support of end-users.

S2 is a multispectral imaging mission, jointly implemented by ESA and the EC (European Commission), for global land observation at high resolution with high revisit capability. It provides data on vegetation, soil and water cover for land, inland waterways and coastal areas, as well as data for atmospheric absorption and distortion corrections. The objective of the Sen2-Agri project is to define and prototype a Sentinel-2 Agriculture Prototype (S2AgriP), composed of four modules which process data:
• from Level 1c to Level 2a (L2a processing module);
• from Level 2a to Level 3 (composite processing module);
• from Level 3 to the vegetation status indicator;
• from Level 3 to the L3 agriculture products.

14.3 Management approach

The management approach for Sentinel-2 Agriculture is defined in the Project Management Plan (PMP) and associated Work Breakdown Structure & WorkPackage Description (PMP-WBS-WPD). These documents cover the following items:
• Management objectives and priorities;
• Master schedule;
• Assumptions, dependencies and constraints;
• Work Breakdown Structure and work packages descriptions;
• Monitoring and controlling mechanisms;
• Staffing plan, roles and responsibilities and key personnel;
• Procurement.
Procurement and supplier management are covered in the project Product Assurance Plan (PAP). In addition, risk management issues are covered in the project Risk Management Plan (PAP-RP) and associated Risk register (PAP-RR). Moreover, a list of procured software is included in the project Software Reuse File (Appendix B in Section o).

14.4 Development approach

14.4.1 Software development strategy

14.4.1.1 Key aspects

The key strategic design concepts that drive the software development approach for the Sen2-Agri project are:
• Reusability: the development of Sen2-Agri components is to be based on existing tools;
• Source availability: the existing software to be re-used for Sen2-Agri development is open source, well documented, widely recognized and well suited to efficient development;
• Adaptability: Sen2-Agri is able to integrate changes resulting from the assessment of prototype products by the user group;
• Validation approach: Sen2-Agri algorithm benchmarking and prototype product assessment are the key issues within the project. The final quality of the data processing software components shall rest on the accuracy and reliability of the product assessment and benchmarking processes.

In addition, the key criteria for the implementation of the Sen2-Agri software components are:
• Performance requirements: in terms of both the huge data volume and computing time;
• Delivery of source code;
• Continuity of developments: incremental development in line with the iterative approach, based on end-user validation at each iteration.

14.4.1.2 Performance

One of the key points of the Sen2-Agri development is meeting performance requirements in terms of computing time, given the huge data volume and the complexity of the algorithms which can be used (segmentation and classification at large scale). To meet this requirement, we will address this issue as early as possible, during algorithm selection and benchmarking. We therefore propose to develop Sen2-Agri mainly on the basis of a single high-performance software library, the Orfeo ToolBox, which on the one hand has proven to be efficient and on the other hand is used for both prototyping (during Task 3) and development activities (during Task 4). The research (CESBIO and UCL) and development (CS-F and CS-R) teams have strong experience with this tool. In addition, all the components used in this project are open source.

The main re-used software, Orfeo ToolBox (mainly for UR-1 to UR-3 and for the generation of composites) and Sentinel-2 ToolBox/Sentinel Exploitation Tools (mainly for the L2A processing), provides streaming and multi-threading mechanisms that reduce computing time and memory consumption. These tools have been designed to process large images (HR Pleiades images for Orfeo ToolBox; S2 and various EO products for the BEAM toolbox) and have been used in operational contexts similar to the Sentinel-2 framework (ESA Land Cover CCI project for the BEAM ToolBox; Venµs L2&L3 and S2-IPF chains for OTB). Moreover, the availability of the source code (since these are open source components) makes code optimization possible. Thanks to its expertise in image processing, its very good knowledge of Sentinel-2 issues, and its experience in software development and optimization, our consortium is able to take on optimization tasks on the basis of the re-used libraries to satisfy the users’ requirements.


Moreover, the expertise of the proposed team in the development and use of the Orfeo ToolBox and BEAM (core of the Sentinel Exploitation Tools/S2 Toolbox) libraries will reduce the risks and secure the schedule.

14.4.1.3 Strategy

The development strategy proposed by the CS SI consortium is based on a classic V lifecycle model. In this model, phases are organised sequentially and the completion of a phase is formalized by a review of the deliverable outputs and their approval. This approach provides regular visibility on software development, which leads to better quality and a better fit with needs and objectives.

However, the Sen2-Agri project has specificities mentioned in the Statement of Work [AD.1], such as (i) schedule constraints, (ii) study and benchmarking activities and (iii) iterative delivery of the product prototype to user groups. To take these specificities into account, the consortium proposes to tailor the V lifecycle with regard to the following topics:
• EO products specification and algorithms design
This phase is not included in a classic V cycle model. It encapsulates several activities, some related to testing and some to algorithm analysis and improvement. It starts at the URR and runs in parallel with the Technical Specification phase, the Architecture and Interface Design phase and part of the Software Design and Development phase.
• User oriented and incremental validation
In order to increase the robustness and reliability of the EO products and, more generally, of the Sen2-Agri software chain, the consortium creates as early as possible a “STUB” version of the Sen2-Agri software to validate the output format against the ICD and the user requirements. The algorithms will be integrated into this version once the output format and means of distribution have been validated by users. At each step, the packages are made available to the end-users.
• Verification and validation strategy
As required in the Statement of Work, the validation procedure should be guided by users. This approach is more effective both for defining how requirements are validated and for detecting anomalies. It will be led by UCL for the outputs of the pre-processing chains, the algorithm selection and the Sentinel-2 EO products.
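As an illustration of the “STUB” idea, the minimal Python sketch below chains four placeholder modules whose interfaces are fixed while their cores are dummy. The four transformations follow the processing modules listed for the prototype (Level 1c to Level 2a, Level 2a to Level 3, Level 3 to vegetation status indicator, Level 3 to agriculture products). The `Product` type, the function names and the level labels are assumptions made for this example and are not part of the Sen2-Agri design.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Product:
    """Hypothetical product descriptor exchanged between modules."""
    level: str            # e.g. "L1C", "L2A", "L3" (labels assumed for the sketch)
    tile_id: str
    is_stub: bool = False


def make_stub(name: str, input_level: str, output_level: str) -> Callable[[Product], Product]:
    """Build a stub module: the interface is checked, the core is a dummy."""
    def module(product: Product) -> Product:
        if product.level != input_level:
            raise ValueError(f"{name}: expected {input_level}, got {product.level}")
        # Dummy core: no image processing, only an output of the agreed type.
        return Product(level=output_level, tile_id=product.tile_id, is_stub=True)
    module.__name__ = name
    return module


# Hypothetical module names, one per transformation.
l2a_stub = make_stub("l2a_stub", "L1C", "L2A")
composite_stub = make_stub("composite_stub", "L2A", "L3")
vegetation_status_stub = make_stub("vegetation_status_stub", "L3", "VEG_STATUS")
agriculture_products_stub = make_stub("agriculture_products_stub", "L3", "L3_AGRI")


def run_chain(l1c: Product) -> List[Product]:
    """Run the whole stub chain so that only the interfaces are exercised."""
    l2a = l2a_stub(l1c)
    l3 = composite_stub(l2a)
    return [vegetation_status_stub(l3), agriculture_products_stub(l3)]


if __name__ == "__main__":
    for output in run_chain(Product(level="L1C", tile_id="TILE_001")):
        print(output)
```

With such a skeleton, each dummy core can later be replaced by the selected algorithm without changing the signatures that users have already validated.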

14.4.2 Development life cycle

14.4.2.1 Software development life cycle identification

As explained in the previous section, the consortium proposes a tailored V lifecycle. This tailored model identifies the following phases:
• User requirements consolidation;
• Technical specification;
• Architecture and interface design;
• Algorithm selections and benchmarking;
• Software design and development phase;
• User oriented verification and validation phase;
• Software deployment and production on use cases;
• User support.

The section hereunder describes the activities that are conducted during each phase of the development of Sentinel-2 Agriculture software.

14.4.2.2 Activities and tasks description

14.4.2.2.1 User requirements consolidation

The initial user requirements were established during a consultation exercise organized by ESA in April 2012 with about 50 members of the agricultural user and expert communities. The initial requirements focus on dynamic crop masks, crop area extent and type, as well as vegetation status indicators. These requirements specify, amongst others, the needed coverage, delivery time, spatial resolution and thematic accuracies. Additional requirements concern data format, provision of quality flags, and access to the data, products and software.

14.4.2.2.2 Technical specifications

The requirements analysis aims to consolidate the needs expressed by the users through the various documents provided as inputs (mainly the final URD). This step is based on the documentation and CFI software available at the URR meeting of the project. It focuses primarily on resolving as many TBC/TBD items as possible in order to finalize a Software Requirements Specification (SRS) that is as clear and unambiguous as possible. This phase is particularly important for the development of the overall system or when the initial specifications are not completely finalized. In summary, the software requirements analysis consists in:
• Analysing the Statement of Work and the final URD requirements;
• Defining and documenting software requirements using input documentation and CFI software;
• Building an implementation-independent model of software requirements;
• Identifying each requirement;
• Evaluating the software requirements;
• Creating the Software Requirements Specification (SRS) draft document.

This phase ends with the first organized Progress Meeting. The aim is that ESA informally reviews a draft of the Software Requirements Specification document before the beginning of the Architecture and Interface Design phase.


This allows ESA to check the software requirements in anticipation of the PDR. Thus, if remarks or changes are to be considered, they could be taken into account as soon as possible to minimize the impact on the other phases.

14.4.2.2.3 Architecture and Interface Design

The purpose of this phase is to produce and refine the top-level architectural design of the software product, i.e. the top-level structure and software components meeting the software requirements. A top-level design for external interfaces (i.e. to other software pieces or systems) and internal interfaces (i.e. between software components of the software product) is identified. The activities performed in this phase apply to the design of the entire Sentinel-2 Agriculture processing chains, even if the algorithms are analysed in parallel with this phase. Indeed, the definition of the processing chain will not impact the design but only the core of the process.

a) Preliminary design
The activities run during this phase are:
• Management and analysis of ESA remarks and changes following the ESA review of the Software Requirements Specification draft document;
• Consolidation of the Software Requirements Specification document;
• Progressive definition of the software structure, the software components (including data converters), the functions or objects that compose it, as well as the links between them;
• Initialization of the Design Definition File (DDF);
• Definition of the data, especially their meaning and their electronic organization;
• Production of the Interfaces Control Document draft (ICD);
• Specification of the validation strategy based on TDS and production of associated documentation (ATD);
• Identification and definition of tools and means for Sentinel-2 Agriculture verification and validation.

Note: The hardware platform definition study is part of the Sentinel-2 Agriculture exploitation scenario activities.

At the end of this phase, the Preliminary Design Review (PDR) is held in order to review the software architecture and interfaces, and to review the verification and validation plans.

b) Detailed design
The purpose is to detail the design until it is ready for coding. Each software component is refined into lower levels containing software units that can be coded, compiled, and tested. Another main purpose is to finalise the Software Requirements Specification and the Interfaces Control Document. In summary, the detailed design of each software item consists in:
• Design of each software component;
• Production of the Design Definition File to deliver a final version;
• Development and documentation of low-level interfaces (including minor last changes in external interfaces when required);
• Update of the Interfaces Control Document (ICD) to deliver a final version;
• Definition and set-up of the tools to be used for unit testing and continuous integration.

14.4.2.2.4 Algorithm selections and benchmarking

This phase is crucial for the Sentinel-2 Agriculture project and is not included in the classic V lifecycle model. It starts after the URR, at the same time as the Technical specification phase, and ends before the system implementation task. Its aim is to perform the activities needed to define the algorithms used in the processing chain of the operational system. We include in this phase the generation of relevant TDS for each test site. The inputs of this phase are the reference documents attached to the ESA contract and related to algorithms for atmospheric corrections and cloud-free composite generation: [RD7, RD12]. The activities run during this phase are:
• Analysis of the state of the art and review of best practices to select the algorithms to be benchmarked;
• Building TDS from EO and in-situ collected data;
• Benchmarking of the selected algorithms with the TDS;
• Definition of the final algorithms and creation of the Sentinel-2 Agriculture Algorithms Theoretical Basis Document (ATBD). A draft version (plan design) of this document is delivered at the PDR;
• Validation of the defined solution using the reference data and images, and reporting of the results in the Report describing the findings and justifying the choices made (DJF). A draft version (plan design) of this Report is delivered at the PDR;
• Study of the characteristics of the final Sentinel-2 Agriculture hardware platform according to the performance requirements and algorithms defined, then proposal and justification of a hardware platform.

During this phase, the consortium also analyses the performance aspects of the architecture (CPU load, memory usage, timing of tasks ...) and produces a first issue of the System Technical Budget Document (BGT), including a performance benchmark. At the end of this phase, a specific progress meeting, named Progress Meeting Design Review and anticipating the CDR, is held in order to review the deliverables provided during this definition phase.
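The performance figures mentioned above (timing of tasks, memory usage) can be collected with a small harness that runs every candidate algorithm on every test data set. The Python sketch below is only one possible way to do so, using the standard `time` and `tracemalloc` modules; the candidate functions and the toy data sets are placeholders invented for the example, and `tracemalloc` only reports memory allocated by the Python interpreter, not by native libraries.

```python
import statistics
import time
import tracemalloc
from typing import Callable, Dict, List, Sequence


def benchmark(candidates: Dict[str, Callable[[Sequence[float]], object]],
              datasets: Dict[str, Sequence[float]]) -> List[dict]:
    """Run each candidate on each data set and record time and peak memory."""
    rows = []
    for algo_name, algo in candidates.items():
        for data_name, data in datasets.items():
            tracemalloc.start()
            start = time.perf_counter()
            algo(data)
            elapsed = time.perf_counter() - start
            _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
            tracemalloc.stop()
            rows.append({
                "algorithm": algo_name,
                "dataset": data_name,
                "seconds": elapsed,
                "peak_bytes": peak,
            })
    return rows


if __name__ == "__main__":
    # Placeholder candidates and data sets, for illustration only.
    candidates = {"mean": statistics.mean, "median": statistics.median}
    datasets = {"site_a": [0.1, 0.4, 0.3], "site_b": [0.7, 0.2, 0.9, 0.5]}
    for row in benchmark(candidates, datasets):
        print(row)
```

The resulting rows could feed a table of the kind expected in a technical budget document, with one line per algorithm and test site.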


14.4.2.2.5 Software Design and Development phase

The software design and development engineering process is subdivided into two sets of activities: coding and testing, and software integration.

a) Coding and testing (unit tests and continuous integration)
This stage concerns the coding of the software components of the system. The developers perform coding, unit tests on these components and continuous integration. Thus, the integration is conducted in a natural way by bottom-up assembly of components, in order to identify potential problems as early as possible. The implementation activities of the Sentinel-2 Agriculture project begin immediately at the end of the Software Design phase. Naming conventions for source code and coding standards are fully applied according to the project’s rules. Once the source code of the different code units is complete, the source files are generated, compiled, linked, unit tested and integrated within a module. Code units are tested as soon as they are coded. Unit tests are kept so that they can be executed at any time to allow regression testing. In summary, the coding and testing of each software component consists in:
• Development and documentation of software units (coding), unit test procedures and unit test data based on TDS;
• Integration of the prototype reviewed and selected during Phase 1;
• Preparation of the test environment;
• Testing of each software unit;
• Progressive bottom-up integration;
• Integration tests execution (if the results obtained do not conform, carry out the necessary developments and re-run the test);
• Evaluation of the software code and test results.

b) Processing chain integration
Figure 14-1 shows the principle of operation between the Sen2Agri partners for the development and integration steps:


Figure 14-1: Principle of incremental integration

The goal is to increase the robustness and reliability of the algorithms and, more generally, of the Sen2-Agri software chain. The Sen2-Agri software is tested by the consortium, and more precisely by UCL for the assessment of the product prototype. To this end, during the development, Sen2-Agri products and software packages shall be made available to partners, in order to iterate and ensure the validation of (i) the input and output interfaces, (ii) the good progress of the modules and (iii) the implementation of the algorithms with respect to the prototype definition. The packages are made available on the main Sen2-Agri server. The following steps are carried out:
(1) CS-R produces a “stub” SL2P version;
(2) CS-R submits the source code of the Sen2-Agri software to the server;
(3) CS-F performs gradual technical validation and the production of prototype products;
(4) UCL performs thematic analysis of the prototype products with the end-users;
(5) UCL submits product problem reports;
(6) CS-F, CS-R and CESBIO analyse these SPR and provide a new version of the Sen2-Agri software.

The main steps are described below.

o STUB SL2P version
All components (L2a, L2b, L3, L3Agri modules) are created; the core functionality is dummy but the interfaces meet the specifications. Figure 14-2 shows the Sen2-Agri software in its “STUB” version:


Figure 14-2: First V0 “STUB” version of the Sen2Agri Software

o Developing Sen2Agri V1
Afterwards, the developers fill in the different modules for V1, as follows (Figure 14-3):

Figure 14-3: Submission of source code based on ATBD and prototype source code

o Upgrade the Sen2Agri V1.x
Finally, the latest version V2 is developed by including the feedback given by the product assessments (Figure 14-4).


Figure 14-4: Regular submission of source code for the updated V1 modules

o Possible provision of preliminary versions to ESA
The consortium extends and shares the dedicated server working space with ESA, so that ESA can download preliminary versions of the Sen2-Agri software. In this case, for instance, ESA could install and test the interfaces of the Sen2-Agri software and in this way check the software specifications as soon as possible. If ESA takes this opportunity, ESA is not expected to raise SPR on those preliminary versions, and the consortium does not commit to take into account potential SPR raised by ESA on those versions. ESA could also be involved in assessing the validity of the prototype products.

o Additional remark
The aim of this incremental integration is to integrate the different modules of the processing chain in order to verify the correct execution flow and data exchange between modules according to the applicable interfaces. It does not include a full validation of the processing chain. It is carried out with the datasets available at that time, which do not fully match the datasets used for validation on use cases.

c) Additional activities

o Platform procurement
In this phase, the proposed final Sen2-Agri hardware platform is consolidated (a first description and justification has been delivered at the PDR). The System Technical Budget Document (BGT) is updated to reflect the performance of the complete system, including the selected algorithm.

o Documentation production
In parallel with the production/update of the design documentation (DDF, ICD), the following documentation activities are performed:
▪ A full description of the verification and validation tests, previously identified in the ATD, is produced and included in the Prototype Validation and Assessment Report (PVAR). If necessary, the ATD is updated.
▪ The Software User Manual (SUM) is produced.

d) End of the Development phase
At the end of this phase, a Qualification Review (QR) is held in order to review the processing system, its architecture and interfaces based on the Acceptance Test Document. All relevant information and issues raised during this meeting will be documented in the Qualification Review Report (QRR). If problems are detected, a new iteration of the implementation phase is started and a new review is planned. Further description is given in the software deployment phase.

14.4.2.2.6 User oriented Verification and Validation Phase

The verification and validation phase has the following objectives:
• To produce prototype products;
• To ensure, in addition to the various tests performed at component level, a comprehensive verification of the prototype products with regard to the Sen2Agri specification and to the end-users.

14.4.2.2.7 Software deployment and production on use cases

One version of the SL2P system is delivered to ESA. For each delivered version, the corresponding documentation and Test Data Set are also supplied. In addition, the final hardware platform will be open to selected end-users. To obtain ESA acceptance of the deliveries, the software should pass the following process:
• A Factory Acceptance Tests (FAT) session, aiming at validating the Sen2Agri software on the target platform before authorizing its delivery;
• A Qualification Review (QR) aiming at reviewing the FAT results and authorising end-user access to the first prototype product;
• The product delivery;
• An Acceptance Review (AR) formalizing the acceptance by ESA of the Sen2Agri prototype product based on TDS data;
• A special progress meeting to review the first final products based on Sentinel-2 products.

a) FAT
To verify a product before its delivery, Factory Acceptance Tests (FAT) are executed at CS-R premises, CS-R being responsible for the development. A FAT is identified for each delivered version. Each FAT is carried out on the Sen2Agri target hardware platform, procured by the consortium and available at CS-R premises.


This organization helps to ensure the success of the tests that will be performed on the production site. FAT are performed by the CS-F team with the support of UCL, which coordinates the users’ feedback. ESA can participate in this activity in a witness role. FAT aim at:
• Testing and demonstrating the closure of problems reported in previous phases;
• Verifying by inspection, analysis and tests that the Sen2-Agri version to be delivered meets all requirements associated with this version;
• Reporting test results in the PVAR document.

During the previous phases, everything was done to ensure the success of the FAT. Barring unforeseen events, only some minor anomalies or comments are raised during a FAT. They are then managed and followed up by CS-F (under Gforge). If software anomalies are raised during the Sen2-Agri V1 FAT, ESA and the consortium will agree on the priority of the anomalies.

b) QR
A Qualification Review is scheduled before each software delivery. This review aims at:
• Reporting on the FAT results;
• If necessary, agreeing on the anomalies correction plan;
• Verifying that all required conditions are set for the OSAT;
• Authorizing OSAT.

At QR, ESA has to give the authorisation for the shipment of the target platform to be installed at the user premises.

c) OSAT
After QR, On-Site Acceptance Tests (OSAT) are performed by CS-F at user premises on the reference platform provided by the consortium. End-users run the acceptance tests and check that suitable documentation is provided with the software. If required, a member of the consortium will execute these tests, but end-users will keep the responsibility for this task. OSAT aim at:
• Installing and generating the test product and verifying the associated procedures;
• Verifying by inspection, analysis and tests that the delivered Sen2-Agri version meets all requirements of the end-users;
• Reporting test results in the Software Verification and Validation reports document.

The software anomalies raised during the Sen2Agri V1 OSAT will be managed during the Verification and Validation phase.


d) AR
An Acceptance Review is scheduled for each software version. This review aims at:
• Reporting on the OSAT results;
• Reporting on the anomalies (if any) raised during the OSAT and agreeing on an anomalies correction plan;
• Formally concluding on the acceptance of the delivery:
o Accepted
o Accepted with restriction
o Rejected

14.4.2.2.8 User support

The purpose of this section is to give an overall description of the activities performed in the frame of the user support tasks. These tasks are dedicated to:
• investigating and explaining the cause of any errors found in the usage of the Sen2-Agri system, resolving them and delivering a new version;
• providing training to use and maintain the deployed system;
• enhancing the capacity of the users to deal with EO data (Sentinel-2 and LandSat-8) in the field of agriculture monitoring.

Remark: Anomaly correction deadlines usually depend on the severity and/or urgency of the anomaly. When an anomaly has a minor impact on the software, its correction could be included in a patch grouping several anomaly corrections, with the patch delivery date agreed with ESA. This process will be discussed and agreed with ESA at the AR meeting.

a) Technical user support
A full description of the activities covered in this task will be provided at the AR meeting through a final technical support plan. The technical support plan describes the following items:
• Scope of the technical user support;
• Identification of the initial status of the software product;
• Support organisation;
• Means of access to the technical support;
• User records and reports.


Because maintenance tasks are not tailored, the technical support will be provided to users according to their place in the project.

b) Capacity building plan
The aim of this activity is to support users so that they become self-sufficient in the use of the developed system and use it in an effective and sustainable way for their needs. A capacity building plan will be developed. This plan will have to consider the respective situations of the involved entities in terms of, for instance, objectives, area of interest, existing approaches and tools, manpower and skills, and technical facilities (computing and network resources, equipment for ground surveys, power supply and cooling system). As an initial proposition, the capacity building will cover three main topics, all combining theoretical background and on-the-job training. The first topic concerns the remote sensing aspects and the related agriculture applications, the second addresses the in situ data collection and the validation protocol, and the third introduces the software and technical aspects of the system.

c) Training activities
A training plan will be drafted at the beginning of Phase 3 and progressively improved in order to incorporate the lessons and feedback from the first trainings and demonstration use cases. Given the variety of demonstration use cases, a significant number of practical examples are expected to accumulate on the use of the system and on the characteristics of the products in many different situations (climate, cropping systems, etc.). Different categories of users are to be identified. Once target users are defined, the training material can be identified and developed. Training material will be built on the basis of the demonstration use cases. Its detailed components will be defined in the training plan. As a first guess, this material could consist of several packages:
• Several presentations (PowerPoint or equivalent) dealing with the scientific and technical basis, hardware system and software, products, product generation and product use;
• Tutorials (short presentations and user manuals) for using the system, at different levels of skills and responsibilities (data acquisition, product use, system operator, etc.);
• Example data sets which can be used for learning how to use the system, from data ingestion to advanced uses.

14.4.2.3 Relationship with the system development cycle

The milestones defined by ESA ensure the compliance with the system development cycle. The consortium is compliant with these milestones.

14.4.2.4 Reviews and milestones identification and associated documentation

The reviews and milestones are defined in Volume 3 of our proposal: Management & Administrative Proposal.


The associated documentation is defined in Volume 4.1 of our proposal: Implementation Proposal.

14.4.2.5 Expected CFI

Expected CFI are defined in Volume 4.1 of our proposal: Implementation Proposal.

14.4.2.6 Software deliveries

Software deliveries are defined in Volume 4.1 of our proposal: Implementation Proposal.

14.4.3 Software engineering standards and techniques

14.4.3.1 Coding languages and standards

Languages used to develop SL2P are:
• The C++ language, mainly used for the L2b module and conversion tools;
• The Python language, mainly used for the launcher and the GUI;
• Shell for scripts used in the project.

CS-F and CS-R apply the following coding rules:
• Their internal guide for good use of the C++ language. This guide is strongly inspired by “C++ Coding Standards” by Herb Sutter and Andrei Alexandrescu, published by Addison-Wesley, and includes elements from the C++ FAQ of the Developpez.com site. Its main purpose is to establish a set of best practices critical to development in C++. The first part provides a set of mainly stylistic conventions; the second part provides a way to design and develop in C++ to increase robustness and maintainability;
• Their internal standard for good use of the Python language. This guide includes coding rules, naming conventions and a code presentation model.

14.4.3.2 Naming conventions

For the C++ language, in general:
• Names are constructed by using case change to indicate separate words, as in TimeStamp (versus Time Stamp);
• Underscores are not used;
• Variable names are chosen carefully with the intention to convey the meaning behind the code;
• Names are generally spelled out; use of abbreviations is discouraged. (Abbreviations are allowable when in common use, and should be in uppercase as in RGB.) While this does result in long names, it self-documents the code.


For class names, file names, variable names and other names, the naming conventions defined in http://www.itk.org/Wiki/ITK_Coding_Style_Guide#Naming_Convention are applicable to Sen2-Agri development (replacing the "itk" keyword with "s2agri"). The naming conventions applied for the Python language are defined in the corresponding CS SI internal standard.

14.4.4 Software development and software testing environment

Project dedicated facilities are defined in Volume 3 of our proposal: Management & Administrative Proposal [MNGT].

14.4.5 Software documentation plan

14.4.5.1 Software documentation identification

For the sake of consistency, CS-R will use the ESA documentation identification rules. If no such rules are defined, CS-R proposes the following rules:

14.4.5.1.1 Document name

A reference number never changes. The reference for Sen2-Agri documents is composed as follows: S2AGRI-XXXX, where XXXX specifies the document type. For document types of which several documents are to be issued during the project, such as the Technical Note (TN), the Report (RP), the Progress Report (PRPT) and the Minutes of Meeting (MMM), a four-digit ordering number is added (numbering is specific to each type of document).

Examples:
- S2AGRI-TN-0001: first technical note edited on the project;
- S2AGRI-PRPT-0004: fourth progress report edited on the project.
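The reference rule above can be sketched in Python as follows. This is only an illustrative sketch: the function name `build_reference` and the constant names are assumptions introduced for this example and do not correspond to any existing project tool.

```python
from typing import Optional

# Hypothetical helper illustrating the reference rule; not project tooling.
PROJECT_PREFIX = "S2AGRI"

# Document types that carry a four-digit ordering number, as listed in the text.
NUMBERED_TYPES = {"TN", "RP", "PRPT", "MMM"}


def build_reference(doc_type: str, number: Optional[int] = None) -> str:
    """Build a document reference such as S2AGRI-TN-0001."""
    if doc_type in NUMBERED_TYPES:
        if number is None or not 1 <= number <= 9999:
            raise ValueError(f"{doc_type} requires an ordering number in 1..9999")
        return f"{PROJECT_PREFIX}-{doc_type}-{number:04d}"
    if number is not None:
        raise ValueError(f"{doc_type} is not a numbered document type")
    return f"{PROJECT_PREFIX}-{doc_type}"
```

For instance, `build_reference("TN", 1)` gives the reference of the first technical note, and `build_reference("PMP")` gives a reference without an ordering number.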

14.4.5.1.2 Issue and revision

The documentation versioning is Vxx.yy, where xx is a two-digit counter increased for each document version, and yy is a two-digit counter increased for each document revision. A “draft” annotation (with a number if necessary) can be added at the end of the document version when relevant. The issue evolves as follows:
- when being written: issue = DR (draft);
- from the distribution of the first version of the document: first issue = 01, the following ones = 02, 03, …;
- when a new version is being written: issue = Vxx.yydraft, where xx is the number of the version being written.

Examples:
- V01.00: first version of a given document;
- V04.00draft2: second draft of the fourth version of a given document.
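As an illustration, the Vxx.yy labelling with its optional draft annotation can be sketched in Python. The function name `format_version` and its parameters are assumptions made for this example only.

```python
from typing import Optional


def format_version(version: int, revision: int,
                   draft: bool = False,
                   draft_number: Optional[int] = None) -> str:
    """Format a document version label such as V01.00 or V04.00draft2.

    Hypothetical helper: `draft` marks an unnumbered draft, while
    `draft_number` adds the optional draft number mentioned in the text.
    """
    for name, value in (("version", version), ("revision", revision)):
        if not 0 <= value <= 99:
            raise ValueError(f"{name} must fit in two digits")
    label = f"V{version:02d}.{revision:02d}"
    if draft or draft_number is not None:
        label += "draft"
        if draft_number is not None:
            label += str(draft_number)
    return label
```

Calling `format_version(4, 0, draft_number=2)` reproduces the second example given above.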

14.4.5.1.3 Document file name

The document file naming scheme is defined as follows: <Reference>[-<Comment>]_V<xx>.<yy>[.d<n>].<extension>, where “Comment” is optional and used if needed to further clarify the content of the document (spaces are avoided in this field), and “.d<n>” identifies a draft number.

Examples:
- S2AGRI-TN-01_V01.00.doc;
- S2AGRI-PMP_V01.01.pdf;
- S2AGRI-TN-0005-PrototypeResults_V01.00.doc: first issue of the fifth technical note, which presents the results of the prototyping activity;
- S2AGRI-PRPT-0004-Oct2012_V01.00.d2.doc: second draft of the first issue of the fourth progress report, which is edited at the end of October 2012.
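The file naming pattern visible in the examples above can be sketched in Python. The function name `build_file_name` and its parameter names are assumptions for illustration; the pattern itself is inferred from the listed examples.

```python
from typing import Optional


def build_file_name(reference: str, issue: int, revision: int, extension: str,
                    comment: Optional[str] = None,
                    draft: Optional[int] = None) -> str:
    """Assemble a document file name from its parts (hypothetical helper)."""
    if comment is not None and " " in comment:
        # The text states that spaces are avoided in the comment field.
        raise ValueError("spaces are avoided in the comment field")
    name = reference
    if comment:
        name += f"-{comment}"
    name += f"_V{issue:02d}.{revision:02d}"
    if draft is not None:
        name += f".d{draft}"
    return f"{name}.{extension}"
```

For example, `build_file_name("S2AGRI-PRPT-0004", 1, 0, "doc", comment="Oct2012", draft=2)` reproduces the last file name listed above.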

14.4.5.2 Documentation deliveries

The list of deliverable documents is given in Volume 4.1 of our proposal: Implementation Proposal.

14.4.5.3 Software documentation standards

The deliverables provided by the CS SI consortium are compliant with the applicable ECSS standards. When ECSS DRDs are provided, they are applied, thus reinforcing the compliance with ECSS standards. For documents where no ECSS DRDs are provided, CS SI will propose document templates to ESA, unless ESA provides CS SI with specific templates. In both cases, compliance with ESA documentation requirements is guaranteed.

The electronic formats used are the following:
- PDF for the signed documents;
- Word, Excel or PowerPoint for native files of text, spreadsheet or presentation types;
- JPEG or TIFF for the images;
- Zip for the multi-document volumes.

All the documents are delivered:
- on computer-readable media (CD-ROM, DVD-ROM if needed);
- plus two paper copies.

The content of the project documentation is managed in the Configuration Repository associated with GForge, using SVN, in the same way as the software.

15 Appendix B - Software Reuse File

15.1 Introduction

15.1.1 Purpose of the annex

This document, the Software Reuse File (SRF), describes every item of software proposed for reuse. Its content is that defined by the ECSS-DRD, adapted to the requirements of the STC of the Sen2-Agri ITT. It explains why the software is proposed for reuse, where and to what extent the software would be integrated in the software deliverables, the ownership of the software item, and the license conditions under which the software could be used by the Tenderer, ESA, the Sen2-Agri users or a third party during the contract and after the contract’s conclusion.

15.1.2 Scope of the annex

The document describes the software to be re-used for the development of the Sen2-Agri components: L2a processor, L3a Cloud-Free generator, L2b Sen2-Agri bio-physical generator and L3b Sen2-Agri products generator. The software proposed for re-use is open source software, in order to meet the requirements in terms of BIPR and the operational software constraints of the Sen2-Agri development framework.

15.1.3 Structure of the annex

The document is structured as described below:
- This section gives the purpose, the scope and the structure of the document (applicable and reference documents, acronyms and abbreviations being given at the beginning of this document in section 1.3).
- Section 15.2 gives a description of the component being produced and of the scope of the project.
- Section 15.3 presents the conclusion of the annex.

15.2 Third Party products and required Software licenses

This section provides an analysis of the 3rd party products proposed for re-use in the Sen2-Agri SoW [AD.1] and STC. For each of these products, the following characteristics have been analysed:
- Sen2-Agri functionalities coverage;
- Source code availability;
- Documentation quality;
- Versions evolution;
- Code covered by validation and testing;
- Operational use case.

These items are an adaptation of the table in Annex 4 required in the STC document. The characteristics analysed are those available during the ITT answer phase, prior to the Sen2-Agri development. This chapter also identifies the required Software Licenses, and lists all the development and documentation production tools.

15.2.1 Software Licence and Intellectual property on the proposed solution

This chapter also provides an analysis of the technical solution proposed for re-use with regard to the intellectual property requirements described in the contract draft and in Part II, Option A of the GCCs. It also identifies and describes the Software Licenses required for the Sen2-Agri development, based on the OTB applications solution and the future ESA-S2-ToolBox. Before describing these elements, however, we give a reminder on free license products, in order to categorize the impact on ESA, Sen2-Agri users and other third-party use during and after the project.

15.2.1.1 Free license categorization and meaning

Free licenses are generally classified into the following three main categories, in order of increasing permissiveness:
- Strong copyleft licenses (GPL, CeCILL);
- Weak copyleft licenses (LGPL);
- Permissive licenses (BSD, MIT, and Apache).

All these license categories share some general features. They all allow free use regardless of domain or country (see footnote 2). They all allow redistribution. They all allow modification. They all allow distribution of the modifications (these are known as the four freedoms of free software). The categories differ in how redistributed code can be licensed if someone decides to exercise the right to redistribute.

Strong copyleft licenses like GPL or CeCILL mandate that derived products are redistributed under the same terms as the original FOSS component used to build the product. This means that an image processing filter built using a CeCILL-licensed library will also be subject to the same CeCILL license. This characteristic of the strong copyleft licenses is sometimes known as a "reciprocal" property: if I use code from someone under a copyleft license for building a product, I will also distribute this product under the same license so other people can also build something else on top of it.

2 In practice, some licences have been adapted to the respective national laws, like CeCILL, which is a French adaptation of the GPL licence. The warranty clauses may be amended by local laws.

Weak copyleft licenses like LGPL, EPL or CeCILL-C are similar in spirit, but the license-spreading feature can be limited to modifications of the original code. As an example, if an image processing filter uses an LGPL-based library and is linked to it using dynamic linking only, then only the changes to the library must be distributed under the terms of the LGPL, and the complete program can be distributed under other license terms if desired. The term "weak" thus refers to the fact that license reciprocity is more limited.

Permissive licenses like MIT, BSD or Apache do not mandate any licensing terms for derived products. This means that an image processing filter built using an Apache-licensed library can be distributed under any license terms, even if the original Apache code itself has been modified.

All these licenses share one common point: the original copyright notice and license must be respected; they cannot be modified and must be mentioned in the derived product.

As seen, the notion of copyleft concerns the distribution agreement. It is best described as a “reciprocal” effect, although it is sometimes negatively referred to as “viral propagation”, “infection” or “contamination”. For instance, consider a project which includes some amount of source code from a free-licensed product “A”. Some part of this source code needs to be changed, which gives additional code “A’ ”, and a wrapping layer “B” needs to be added; “A” + “A’ ” + “B” together form a new “Alpha” application.

The distribution license of the “Alpha” product can be chosen only according to the original license of “A”, as explained hereafter:
- Whenever “A” is distributed under the terms of a strong copyleft license, all new or modified pieces of code (“A’ ” in this example) and derived work (“B”) become subject to the terms of the original license.
- Whenever “A” is distributed under the terms of a weak copyleft license, in some cases only the modified work becomes subject to the terms of the original license. Thus, whereas “A” and “A’ ” will be subject to the terms of the original license of “A”, “B” may be placed under another kind of license. Some conditions must be fulfilled in such a case: if both “A” and “A’ ” are part of a dynamically linked library and the final user is given the capability of replacing “A+A’ ” in order to introduce his own modification “A+A’ +A’’ ”, then “B” may be distributed under a different license. On the other hand, if both “A” and “A’ ” are part of a statically linked library, then “B” should be distributed under the terms of the same license. The exact conditions of distribution are described within the license terms themselves. Careful attention must be paid to the version of the distribution license, either LGPL v2.1 or LGPL v3, as they differ on this point.
- Whenever “A” is distributed under the terms of a permissive license, then “A’ ” and “B” may be distributed under any kind of license; in fact, even “A” can be relicensed if needed.

The applicable licensing obligations depend on two major characteristics. First, they depend on whether or not the product is intended to be distributed to third parties. Thus, whenever developers use a given product, even modified, for private usage (private may be understood as within a firm), the derived product may be kept private and secret and needs no specific license itself. However, whenever the product is intended to be distributed to third parties, then both updated and derived products should be distributed under free licenses as well. Moreover, whenever one intends to distribute pieces of code under the terms of a copyleft license, this distribution may be strictly limited to the product recipient alone. It is not mandatory to publish the code on the internet or to deliver it back to the original authors or community of the product. Exceptions may nevertheless be found in some cases (see later).
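The “Alpha” example above can be summarised as a small decision rule. The following Python sketch encodes only the simplified reasoning given in this section; the names `LicenseFamily` and `wrapper_must_inherit_license` are hypothetical, and the sketch is an illustration of the text, not legal advice or a substitute for the license terms themselves.

```python
from enum import Enum


class LicenseFamily(Enum):
    STRONG_COPYLEFT = "strong copyleft"  # e.g. GPL, CeCILL
    WEAK_COPYLEFT = "weak copyleft"      # e.g. LGPL
    PERMISSIVE = "permissive"            # e.g. BSD, MIT, Apache


def wrapper_must_inherit_license(family: LicenseFamily,
                                 dynamically_linked: bool = False,
                                 user_can_replace_library: bool = False) -> bool:
    """Return True when the wrapping layer "B" must be distributed under
    the original license of "A", following the simplified rules above."""
    if family is LicenseFamily.STRONG_COPYLEFT:
        # Modified code and derived work both fall under the original license.
        return True
    if family is LicenseFamily.WEAK_COPYLEFT:
        # "B" may use another license only if "A" + "A'" form a dynamically
        # linked library that the final user is able to replace.
        return not (dynamically_linked and user_can_replace_library)
    # Permissive licenses impose no licensing terms on derived products.
    return False
```

For example, `wrapper_must_inherit_license(LicenseFamily.WEAK_COPYLEFT, dynamically_linked=True, user_can_replace_library=True)` returns `False`, which corresponds to the dynamic-linking case described above.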

15.2.1.2 Impact of free licenses on customers

Customers may be led to change their distribution policy for some products originally developed for their own internal use, with no initial intention of sharing or publishing them. Whenever they later decide to distribute these products, for instance to other space agencies or to industry, they have to reconsider the licensing terms of the included free software components. One way to keep this kind of policy change possible is to avoid using strong copyleft components even for internal products. This avoids costly later developments to remove some restrictive components (COTS for instance) and replace them with more permissive equivalents. Using weak copyleft licensed products without modifying them guarantees that no code developed within the project falls under distribution rules that might be incompatible with the related intellectual property requirements.

15.2.1.3 Free license products in the proposed solution

There are two very important things to understand about copyleft type licenses. The first one is that they apply to the derived software product. They do not apply to the data processed by the software. Moreover, they do not apply to any separate works (not linked against the software, or launching it remotely). The second one is that license terms apply only downstream, i.e. from developer to users, not upstream towards the original developers.

In our case concerning image processing, the images themselves are not subject to any of the licenses, regardless of whether the processing programs used are subject to the terms of GPL, LGPL, Apache or any proprietary license. A system that simply runs an executable built from a copyleft program is also not considered to be a derived product; it is "simple use". As a concrete example, if we consider a processing script in Python that schedules several executables, neither the processing chain framework, nor the scripts that launch the various executables, nor the processed data is subject to copyleft, even if one or two of the executables are copyleft because they use a copyleft library. We can use any license.

The main software to be re-used for the Sen2-Agri development has the following licenses:
- Orfeo ToolBox (http://www.orfeo-toolbox.org/otb/): OTB is distributed under the strong copyleft free software license CeCILL. This licence was created by CNRS, INRIA and CEA in order to have a license that is a complete equivalent of the GNU GPL and is compatible with French law (cf. http://www.cecill.info/licences/Licence_CeCILL_V2-en.html). We consider CeCILL to be similar to GPL in the rest of the document;
- New S2-ToolBox, which will be based on the BEAM ToolBox (http://www.brockmann-consult.de/cms/web/beam/): we assume that it will be distributed under the strong copyleft free software license GPL, according to the ESA Intended Invitation To Tender 13.155.13;
- GDAL/OGR library (http://www.gdal.org/), which is distributed under an X/MIT style Open Source license by the Open Source Geospatial Foundation;
- OpenJPEG, which is distributed under the permissive free software license New BSD;
- SLURM, which is distributed under the GPL license.

As the license applies to the program only, and in fact applies only when the program is distributed to another legal entity, there is no need to automatically publish the source code to everyone, or even to provide it to the original upstream library developer. This means that if an image processing filter program is built using a library under the strong copyleft GPL license, this program will be subject to the GPL license. If ESA decides to distribute this program to an end-user entity, then this program will be bound to the GPL license, and the organization that receives the binary (and only this organization) may ask for its source code too.

If some contributions to the original projects are necessary to enhance some functionality and to ensure maintainability, they will be made in compliance with the existing licences. For other questions concerning the OTB license, see the FAQ page on the OTB web site (http://www.orfeo-toolbox.org/FAQ/OTB-FAQ.html#SECTION00030000000000000000).

Concerning the new ESA-S2-ToolBox, ESA should provide more information about the license used by this project as soon as possible. We will consider in the rest of the document that the ESA-S2-ToolBox is provided by ESA under the GPL license and with no copyright issues.
For GDAL, all information can be found in the FAQ (http://trac.osgeo.org/gdal/wiki/FAQGeneral#WhatlicensedoesGDALOGRuses).
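The "simple use" case described in this section, a Python script that schedules several executables, can be sketched as follows. This is a minimal illustration only: the function name `run_chain` is hypothetical, and the steps in `EXAMPLE_CHAIN` are stand-ins that call the Python interpreter itself rather than real processing executables. Each step runs as a separate process, so the launcher is not linked against any of the programs it starts.

```python
import subprocess
import sys
from typing import List, Sequence


def run_chain(commands: Sequence[Sequence[str]]) -> List[int]:
    """Run each command as a separate process, in order.

    Stops at the first step that fails and returns the exit codes of the
    steps that were actually run.
    """
    return_codes = []
    for command in commands:
        completed = subprocess.run(list(command), check=False)
        return_codes.append(completed.returncode)
        if completed.returncode != 0:
            break
    return return_codes


# Stand-in steps (assumption for this example): each one simply starts a new
# Python interpreter. In a real chain these would be the processing executables.
EXAMPLE_CHAIN = [
    [sys.executable, "-c", "print('step 1 done')"],
    [sys.executable, "-c", "print('step 2 done')"],
]
```

Calling `run_chain(EXAMPLE_CHAIN)` runs the two stand-in steps one after the other and returns their exit codes.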

15.2.1.4 Background Intellectual property in the proposed software solution

The only code that neither ESA nor the S2-Agri consortium owns is the original underlying code of the existing third party products (Orfeo ToolBox, GDAL/OGR library, OpenJPEG and S2-ToolBox). This code already exists; it is distributed under the terms of the CeCILL, X/MIT, New BSD or GPL license, and the IPR is owned respectively by CNES (and other third parties), the OSGeo Foundation, UCL and other partners, and ESA, not by the S2-Agri consortium. Using the GPL license for the derived developments allows us to comply with all requirements from ESA and to remain compliant with the licenses of OTB, GDAL/OGR, OpenJPEG and S2-ToolBox. There is really no limitation for the S2-Agri consortium: we have the code of this FOSS software, so we can distribute derived developments as described in the SoW and the STC under a GPL licence. Concerning existing processing chains available in the consortium, their copyrights will be clearly identified and respected if they are used in the Sentinel-2 project. In this case, they will be distributed under a GPL licence.

15.2.1.5 Intellectual property in the proposed software solution

Given that Clause 39 of the ESA General Clauses and Conditions document will be applicable to this project, the Sen2-Agri consortium claims the IPR ownership of the code developed during the project. This claim is also compatible with the free and open source licence used in this project. The copyright of the code developed in this project could be shared with ESA if the agency requests it, because the Sen2-Agri solution is based on open source code. In any case, the use of a free and open source licence enables any third party to use, distribute and modify the code. If third parties modify the source code after its release, they can keep their copyright on these modifications.

15.2.1.6 Conclusion

The solution proposed by the S2-Agri consortium is completely based on Free and Open-Source Software (FOSS), so ESA and end-users have complete control over the solution in the long term. From both technical and legal points of view, ESA or end-users have the possibility to change the code, either for long-term feature evolutions or for urgent and critical bug fixes. Provided that the license of the S2-ToolBox is compatible with the project, there are no IPR issues, neither for the code developed as part of the study nor for the existing tools and libraries that will be delivered. For all these items, ESA, end-users and the S2-Agri consortium will have full freedom to use, modify and distribute the products, or integrate them in their own systems.

15.2.2 Presentation of the software intended to be reused

Following the requirement described in section 4.5.2 of the STC, the re-used software items are justified and described (specification, design, source code, heritage and validation status) in this preliminary version of the SRF-Sen2Agri. The description covers why the software is proposed for reuse, where and to what extent the software would be integrated in the software deliverables, the ownership of the software item, and the licence conditions under which the software could be used by the Tenderer, ESA, the Sen2-Agri users or a third party during the contract and after the contract’s conclusion. The Background Intellectual Property Rights (BIPR) connected to the proposed reuse software must not impose any restrictions which would limit the free use which ESA and the Sen2-Agri users intend to make of the software deliverables. The next sections present the software to be re-used, as a preliminary version of the SRF. The content is that defined by the ECSS-DRD, adapted to the requirements of the STC of the Sen2-Agri ITT. The list will be updated incrementally, following the Sen2-Agri development phases.

15.2.3 [Orfeo ToolBox]

Name: OTB
Main Features: Image processing library adapted for huge image processing from L0 to added-value products
Developer/Ownership: Developer: CS / Owner: CNES
Licencing conditions: Open-source under CeCILL v2 licence agreement
Industrial Property Constraints: The source code is freely available and can be re-used, modified and distributed by the Tenderer, ESA, the Sen2-Agri users or any third party during and after the contract.
Applicable dispositions for maintenance, installation and training: The library benefits from:
- a bug tracker and two mailing lists to provide efficient support
- extensive nightly testing on various platforms and OSs to ensure validation and consistency
- active maintenance from the community
- regular training courses
Commercial SW needed for execution: None, the OTB library is based on open-source components
Development and execution environment: The library is multi-platform and portable (available under multiple operating systems)
Version and components: OTB version 3.16, based mainly on ITK, GDAL, OSSIM, OpenJPEG, Boost
Language: C++
Size: 259000 lines

15.2.4 [Sentinel2 ToolBox and Sentinel Exploitation Tools]

This table has been completed with our current knowledge of the Sentinel Exploitation Tools developed on BEAM and the future S2 ToolBox based also on BEAM.

Name: S2 ToolBox / Sentinel Exploitation Tools
Main Features: L2A pre-processing of data products from the Sentinel-2 optical and high spatial resolution mission. Support of other optical missions by reusing existing ESA software and tools, and of additional new operational optical missions. Integration in the toolbox of the Sentinel-2 Level-2A processor and of a tool for converting reflectance values into radiances and providing the radiometric uncertainties associated with Sentinel-2 top-of-atmosphere images.
Developer/Ownership: Developer: unknown at the ITT submission / Owner: ESA
Licencing conditions: Open-source under GPL licence agreement
Industrial Property Constraints: The source code is freely available and can be re-used, modified and distributed by the Tenderer, ESA, the Sen2-Agri users or any third party during and after the contract.
Applicable dispositions for maintenance, installation and training: The library benefits from the strong support offered by the BEAM community and all the advantages offered by a mature open source solution.
Commercial SW needed for execution: None; to our knowledge, the Sentinel-2 toolbox library will be based on open-source components (BEAM toolbox)
Development and execution environment: The library is multi-platform and portable (available under multiple operating systems), as BEAM is.
Version and components: S2 ToolBox version 1.0, based mainly on BEAM / released version of the Sentinel Exploitation Tools
Language: Java
Size: Re-use from BEAM provides 387000 lines

15.2.5 [Geospatial Data Abstraction Library]

Name: GDAL
Main Features: I/O management of various raster and vector data formats
Developer/Ownership: Developer: GDAL community / Owner: OSGeo foundation
Licencing conditions: Open-source under X/MIT licence
Industrial Property Constraints: The source code is freely available and can be re-used, modified and distributed by the Tenderer, ESA, the Sen2-Agri users or any third party during and after the contract.
Applicable dispositions for maintenance, installation and training: The library benefits from:
- a bug tracker and two mailing lists to provide efficient support
- extensive release testing on various platforms and OSs to ensure validation and consistency
- active maintenance from the community
- regular training courses
- online documentation and wiki pages
Commercial SW needed for execution: None, the GDAL/OGR library is based on open-source components
Development and execution environment: The library is multi-platform and portable (available under multiple operating systems and compilers):
- http://trac.osgeo.org/gdal/wiki/FAQGeneral#WhatoperatingsystemsdoesGDAL-OGRrunon
- http://trac.osgeo.org/gdal/wiki/SupportedCompilers
Version and components: GDAL version 1.10, based mainly on various raster (TIFF, PNG, JPEG2000) I/O libraries and vector manipulation functionalities (GEOS library)
Language: C++
Size: 907000 lines

15.2.6 [OpenJPEG]

Name: OpenJPEG
Main Features: I/O management of the JPEG2000 format
Developer/Ownership: Developer: OpenJPEG community / Owner: UCL, IntoPix, CNES, CSSI
Licencing conditions: Open-source under New BSD licence
Industrial Property Constraints: The source code is freely available and can be re-used, modified and distributed by the Tenderer, ESA, the Sen2-Agri users or any third party during and after the contract.
Applicable dispositions for maintenance, installation and training: The library benefits from the strong support offered by the OpenJPEG community and all the advantages offered by an open source solution.
Commercial SW needed for execution: None, the OpenJPEG library is based on the C library and has no dependencies. It will be used as a GDAL driver
Development and execution environment: The library is multi-platform and portable (available under multiple operating systems and compilers) thanks to the C library.
Version and components: OpenJPEG 2.0
Language: C
Size: 30000 lines

15.2.7 [SLURM]

Name: Simple Linux Utility for Resource Management
Main Features: Job scheduler
Developer/Ownership: Developer: SLURM community (primarily by Lawrence Livermore National Laboratory) / Owner: UCL, IntoPix, CNES, CSSI
Licencing conditions: Open-source under GPL license
Industrial Property Constraints: The source code is freely available and can be re-used, modified and distributed by the Tenderer, ESA, the Sen2-Agri users or any third party during and after the contract.
Applicable dispositions for maintenance, installation and training: The library benefits from the strong support offered by the SLURM community and all the advantages offered by an open source solution. It is used in several computer clusters in major projects around the world.
Commercial SW needed for execution: None
Development and execution environment: The library has been developed for Linux, but it is also supported on the Solaris and MacOSX operating systems.
Version and components: SLURM v2.5
Language: C
Size: 1000000 lines

15.2.8 Others

Existing processing chains developed by CESBIO on OTB and selected during the analysis and benchmarking steps will be released under a CeCILL license. Other software, such as the Python or Bash languages, will be re-used for the Sen2-Agri Orchestrator. They are not described here since they are standard open source languages. An associated detailed description will be added during the project if necessary. The ngEO download manager is considered as a CFI provided by ESA to optimize the download management of the Sentinel-2 L1c products.

15.2.9 Compatibility of existing software items with project requirements

15.2.9.1 Functions implemented

This section describes which parts of the project requirements (RB) are intended to be implemented through software re-use.

Component: Sen2-Agri Orchestrator
Re-used software: SLURM
Re-used modules: Job scheduling and monitoring
Improvement of existing functionalities: -
New development: X
Developer: CS-R with support of CS-F

Component: Mosaicing and I/O management
Re-used software: GDAL/OGR and OpenJPEG
Re-used modules:
- I/O classes (especially for JPEG2000 via OpenJPEG)
- Polygonize and rasterize operations
- OGRGeometry classes and functionalities: intersection, aggregation, polygon simplification
Improvement of existing functionalities: If necessary, in OpenJPEG, to ensure the stability and the performance of the JPEG2000 I/O.
New development: -
Developer: CS-R with support of CS-F

Component: S2 Atmospheric corrections
Re-used software: Sentinel Exploitation Tools
Re-used modules: S2PAD module
Improvement of existing functionalities: -
New development: -
Developer: CS-R will only deploy the Sentinel Exploitation Tool on the OS

Component: Cloud free generator (UR-4)
Re-used software: OTB
Re-used modules:
- Base filters: thresholding, convolution, interpolation, rescaling, morphological operations
- Functor based image filters
- Iterators
- Composite filter design (Based on the [RD-12])
- GML and XML data access
- Auxiliary and Ancillary data access
- Polygonization and rasterization
- Venµs L3 generator classes if necessary
Improvement of existing functionalities: X
New development: X
Developer: CS-R with support of CS-F

Component: Biophysical indicator generator (UR-3)
Re-used software: OTB
Re-used modules:
- Existing vegetation indices
- Threshold filters
- Functor based image filters
- Iterators
- Mathematical operators and BandMath functionalities
- GML and XML data access
- Auxiliary and Ancillary data access
- Internal CESBIO code about simulation and segmentation/classification methods
Improvement of existing functionalities: X
New development: X
Developer: CS-R with support of CS-F

Component: Dynamic cropland product generator (UR-1)
Re-used software: OTB
Re-used modules:
- Features extraction framework: radiometric indices, moment computation on neighbourhoods, texture analysis, edge detection, LSD, …
- Segmentation methods: Mean Shift, morphological profiles
- Machine Learning framework: SVM, K-Means, SOM, Random Forest, KNN, …
- Fusion of classifier framework: majority voting, Dempster-Shafer framework
- Object Based Image Analysis framework: label statistics computation, connected component analysis, vector output management
- Temporal analysis
- Internal CESBIO code about simulation and segmentation/classification methods
Improvement of existing functionalities: X
New development: X
Developer: CS-R with support of CS-F

Component: Cultivated crop type and area extent product generator (UR-2)
Re-used software: OTB
Re-used modules:
- Features extraction framework: radiometric indices, moment computation on neighbourhoods, texture analysis, edge detection, LSD, …
- Segmentation methods: Mean Shift, morphological profiles
- Machine Learning framework: SVM, K-Means, SOM, Random Forest, KNN, …
- Fusion of classifier framework: majority voting, Dempster-Shafer framework
- Object Based Image Analysis framework: label statistics computation, connected component analysis, vector output management
- Temporal analysis: ??? (JIA)
- Internal CESBIO code about simulation and segmentation/classification methods
Improvement of existing functionalities: X
New development: X
Developer: CS-R with support of CS-F

Table 15-1: Functions implemented
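Table 15-1 lists majority voting among the classifier fusion methods to be re-used. As an illustration of the principle only, the following Python sketch fuses per-pixel labels from several classifiers. It does not use the OTB API: the function name `majority_vote`, the flat list representation of a label map and the rule that ties are set to a `nodata` label are all assumptions made for this example.

```python
from collections import Counter


def majority_vote(label_maps, nodata=0):
    """Fuse per-pixel class labels from several classifiers by majority voting.

    label_maps: list of equally sized lists, one per classifier (hypothetical layout).
    nodata: label written where the vote is tied (assumed convention).
    """
    if not label_maps:
        raise ValueError("at least one label map is required")
    size = len(label_maps[0])
    if any(len(m) != size for m in label_maps):
        raise ValueError("all label maps must have the same length")

    fused = []
    for votes in zip(*label_maps):
        counts = Counter(votes).most_common()
        best_label, best_count = counts[0]
        # A tie between the two most frequent labels is left undecided.
        if len(counts) > 1 and counts[1][1] == best_count:
            fused.append(nodata)
        else:
            fused.append(best_label)
    return fused


# Three hypothetical classifier outputs over four pixels.
svm_labels = [1, 1, 2, 3]
rf_labels = [1, 2, 2, 1]
knn_labels = [1, 2, 3, 2]
fused_labels = majority_vote([svm_labels, rf_labels, knn_labels])
```

In this example the last pixel receives three different votes, so it is assigned the `nodata` label; a Dempster-Shafer fusion, also listed in the table, would instead weight each classifier by its estimated reliability.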


15.2.9.2 Availability and quality status

The following table describes the availability (of source code) and the quality status of the software intended for re-use. The quality criteria are the availability and quality of the documentation (in particular for development), the history of code evolution, the quality of validation and examples of operational use.

Re-used software items | Source code availability | Coding rules/quality | Documentation avail./quality | Version history | Validation tests & coverage | Operational uses
OTB & integrated libs | Y | C++ | User guide & Doxygen / Very good | Y | High, around 3000 nightly tests | Y
Sentinel-2 ToolBox and Sentinel Exploitation Tools | Y | Java and Python for S2PAD software | Unknown at the ITT submission | Y | Unknown at the ITT submission | -
GDAL/OGR | Y | C++ | Wiki pages & Doxygen / Very good | Y | Fuzzy testing at each release | Y
OpenJPEG | Y | C | Wiki pages & Doxygen / good | Y | High, around 300 nightly tests | Y
SLURM | Y | C | Online documentation and tutorials | Y | - | Y

Table 15-2: Availability and quality status

15.3 Conclusion

The following table summarises the re-use strategy:

SW element | Reuse justification
Orfeo ToolBox | High performance and high re-use for processing: vegetation indices, segmentation methods, classification methods, radiometric corrections and resampling
Sentinel-2 ToolBox and Sentinel Exploitation Tools | Ingestion of S2 L1c product and production of L2a product with the S2PAD component. Integration of the S2Agri standalone component into the S2 ToolBox
GDAL/OGR | I/O management of raster and vector data
OpenJPEG | I/O operations on the JPEG2000 format
SLURM | Main component of the S2Agri Orchestrator (job scheduling and monitoring)

Table 15-3: Summary of SW reuse

UCL-Geomatics, Earth and Life Institute, Université catholique de Louvain, Croix du Sud, 2 bte L7.05.16, B - 1348 Louvain-la-Neuve, BELGIUM. Phone +32 (0) 10 47 23 74 - Fax +32 (0) 10 47 88 98


16 Appendix C - Document Requirement Definition

This section provides the Document Requirement Definition (DRD), detailing the Table of Contents (TOC) for each contractually deliverable document that will be provided in PDF format to ESA. The list of deliverables included in this annex is presented in Table 16-1.

ID | Acronym | Name
D.1 | URD | User Requirements Document
D.3 | TS | Technical Specifications
D.4 | DJF | Design Justification File
D.5 | DDF | Design Definition File
D.6 | ATD | Acceptance Test Document
D.8 | QRR | Qualification Review Report
D.9 | PVAR | Prototype Validation and Assessment Report
D.10 | DP | Demonstration Plan
D.11 | CB | Capacity Building Plan
D.12 | VR | Validation Report
D.14 | ER | Exploitation Report
D.15 | FR | Final Report

Table 16-1: List of deliverables described in the Document Requirement Definition

The following deliverables (Table 16-2) are not documents and are therefore not included in this section.

ID | Acronym | Deliverable name or description
D.2 | TDS | Test Data Sets
D.7 | - | Accepted processing system and source codes (including tools)
D.13 | - | EO products delivered within the use case
D.16 | PW | Project Website
D.17 | PP | Promotional Package

Table 16-2: List of deliverables not described in the Document Requirement Definition

©UCLouvain – Geomatics 2013 This document is the property of the Sentinel-2 Agriculture consortium, no part of it shall be reproduced or transmitted without the express prior written authorization of UCLouvain-Geomatics (Belgium)


16.1 User Requirements Document (URD)

This document constitutes the deliverable D.1 for the Sen2-Agri project.

Deliverable: User Requirements Document (URD)

Description and objective: The URD defines the consolidated user requirements for agricultural EO products, the test sites and the demonstration cases of the Sen2-Agri project.

Related documents:
- ESA Sentinel-2 agriculture SoW

Preliminary contents:
1. Introduction
2. Applicable and reference documents
3. Terms, definitions and abbreviated terms
4. Summary of initial requirements of the Users
5. Consolidated Sen2-Agri users’ requirements
   - Champion user group presentation
   - User consultation methodology
   - Results
6. Selection of potential sites and main crops
   - Criteria
   - Analysis and results
7. Sentinel-2 exploitation scenario
   - Constraints
   - Scenarios
   - Resulting requirements
8. Summary

Owning Work Package ID: WP 1000

Update methodology: The URD is delivered at the end of the User Requirements activity (WP 1000), in advance of milestone 1 (User Requirement Review - URR). Comments arising from this meeting shall be addressed to update and close this document. Once the URD is accepted, no further updates are envisaged.

16.2 Technical Specifications (TS)

This document constitutes the deliverable D.3 for the Sen2-Agri project.

Deliverable: Technical Specifications (TS)

Description and objective: The TS defines all Sen2-Agri EO products and services, including data formats and metadata, and proposes a development and implementation plan for the processing system. It aims at providing a technical answer to the D.1-URD: it details the technical specifications of the four S2-Agri EO products (dynamic cropland masks, cultivated crop type and area extent, vegetation status indicator and composites of cloud-free surface reflectance) and presents how these products are delivered to the end-users (data formats, metadata, data access, service access). It also describes the functional and non-functional requirements applicable to the system items and all (preliminary and updated) external interfaces.

Related documents:
- ESA Sentinel-2 agriculture SoW
- D.1 - URD
- Minutes of the User Requirement Review (URR) meeting
- S2 and S2 toolbox documentation

Preliminary contents:
1. Introduction
2. Applicable and reference documents
3. Terms, definitions and abbreviated terms
4. Synthesis of the users’ requirements
5. Sen2-Agri products definition
   - Cloud free composites
   - Dynamic cropland mask
   - Cultivated crop type and area
   - Vegetation status
6. Sen2-Agri products validation plan
   - Validation dataset
   - Methodology
7. Sen2-Agri products technical specification
   For each product:
   - Naming convention
   - Spatial extent, spatial resolution, temporal resolution
   - Projection
   - Layers description
   - Format
   - Metadata
   - Estimated size
   - Data access and distribution
8. System requirements specification
   - System overview
   - System requirements
   - Validation requirements
   - Traceability
   - Logical model description
9. Interface Control Document
   - System overview
   - Requirement and design
   - Validation requirements
   - Traceability
10. Link between URD and TS

Owning Work Package ID: WP 3100

Update methodology: The TS is delivered at the beginning of the EO Products specification and algorithm design activity (WP 3000) for the first Progress Meeting (PM1), in advance of milestone 2 (Critical Design Review - CDR). Comments arising from this meeting shall be addressed to update and close this document. Once the TS is accepted, no further updates are envisaged.

16.3 Design Justification File

This document constitutes the deliverable D.4 for the Sen2-Agri project.

Deliverable: Design Justification File (DJF)

Description and objective: The DJF summarizes the algorithm review and presents the results of the benchmarking exercise. It describes all important design choice justifications, trade-offs and feasibility analyses, as well as the supporting technical assessment (test procedure, results analysis and evaluation) showing that the products and system meet the requirements. All elements of the system testing activities (code, unit testing and integration testing) are also included.

Related documents:
- ESA Sentinel-2 agriculture SoW
- D.1 - URD
- D.3 - TS

Preliminary contents:
1. Introduction
2. Applicable and reference documents
3. Terms, definitions and abbreviated terms
4. For each benchmarked algorithmic step:
   - Design choice (statement of the issue that leads to the design choice)
   - Solution (proposed best solution to the design decision)
   - Alternatives (alternative solutions considered and reasons for non-selection, supported by analyses)
   - Impact
   - Comment
   - Analyses (to support the selection of the best solution and the non-selection of the alternatives)
5. System design
   - Verification plan
   - Validation plan
   - Unit and integration test plan
   - Validation specification
   - Reuse files
   - Verification report

Owning Work Package ID: WP 3300 & 3400

Update methodology: The DJF is delivered during the EO Products specification and algorithm design activity (WP 3000) for the Preliminary Design Review (PDR), in advance of milestone 2 (Critical Design Review - CDR). The DJF is updated not only on the basis of the comments arising from this meeting, but also at all stages of the development and review processes of the project.

16.4 Design Definition File

This document constitutes the deliverable D.5 for the Sen2-Agri project.

Deliverable: Design Definition File (DDF)

Description and objective: The DDF is a supplier-generated document. It documents the design engineering results at all levels, together with the description of the final algorithms (in the form of an Algorithm Theoretical Basis Document) and processing chains.

Related documents:
- ESA Sentinel-2 agriculture SoW
- D.3 TS
- D.4 DJF

Preliminary contents:
1. Introduction
2. Applicable and reference documents
3. Terms, definitions and abbreviated terms
4. Sen2-Agri system overview
5. Algorithm Theoretical Basis Document
   - Pre-processing
   - Processing chain for time series of cloud-free composites
   - Processing chain for dynamic cropland mask
   - Processing chain for cultivated crop type and area product
   - Processing chain for vegetation status product
   The description of each algorithmic step will include: (i) the physics of the problem, (ii) the scope of the algorithm, (iii) the detailed mathematical presentation of the algorithm, (iv) the algorithm assumptions and limitations and (v) the definition of the input data required and of the output generated.
6. System design
   - Design documentation
   - Configuration file
   - Release document
   - User manual
   - Source codes and media labels

Owning Work Package ID: WP 3400

Update methodology: The DDF is delivered at the end of the EO Products specification and algorithm design activity (WP 3000), in advance of milestone 2 (Critical Design Review - CDR). Comments arising from this meeting shall be addressed to update and close this document. Once the DDF is accepted, no further updates are envisaged.

16.5 Acceptance Test Document

This document constitutes the deliverable D.6 for the Sen2-Agri project.

Deliverable: Acceptance Test Document (ATD)

Description and objective: Defines all relevant tests and a validation procedure to ensure the full functionality of the final processing system.

Related documents:
- ESA Sentinel-2 agriculture SoW
- D.1 URD
- D.3 TS
- D.5 DDF

Preliminary contents:
1. Introduction
2. Applicable and reference documents
3. Terms, definitions and abbreviated terms
4. Sen2-Agri system overview
5. Acceptance test approach
6. Acceptance criteria
7. Required configuration settings
8. Unit testing procedure
9. Integration testing procedure
10. Validation testing procedure
    - Method and tools
    - Validation datasets
11. Validation of the testing procedures
12. Schedule
13. Validation test scripts
    - For the installation qualification
    - For the operational qualification

Owning Work Package ID: WP 3500

Update methodology: The ATD is delivered at the end of the EO Products specification and algorithm design activity (WP 3000), in advance of milestone 2 (Critical Design Review - CDR). Comments arising from this meeting shall be addressed to update and close this document. Once the ATD is accepted, no further updates are envisaged.

16.6 Qualification Review Report

This document constitutes the deliverable D.8 for the Sen2-Agri project.

Deliverable: Qualification Review Report (QRR)

Description and objective: Documents the review of the processing system that takes place during the Qualification Review (QR), following the procedures defined in the ATD. It aims at assessing that the system meets its requirements and is ready to be installed and to undergo acceptance testing at users’ premises.

Related documents:
- ESA Sentinel-2 agriculture SoW
- D.6 ATD

Preliminary contents:
1. Introduction
2. Applicable and reference documents
3. Terms, definitions and abbreviated terms
4. System identification
5. System overview
6. Qualification test results
7. Recommendations
8. Test operations
9. Test data analysis

Owning Work Package ID: WP 4200

Update methodology: The QRR is delivered during the System Development activity (WP 4000), in advance of milestone 3 (Qualification Review - QR). Comments arising from this meeting shall be addressed to update and close this document. Once the QRR is accepted, no further updates are envisaged.

16.7 Prototype Validation and Assessment Report

This document constitutes the deliverable D.9 for the Sen2-Agri project.

Deliverable: Prototype Validation and Assessment Report (PVAR)

Description and objective: Documents the results of the validation and assessment process of the implemented Sen2-Agri system, following the procedures defined in the TS. It gives a complete report of the activities executed to assess the quality of the generated Sen2-Agri prototype products and system, and of the results achieved. It aims at assessing that the prototype products and system meet their requirements and are ready to be implemented with Sentinel-2 imagery, over larger extents and in closer interaction with users.

Related documents:
- ESA Sentinel-2 agriculture SoW
- D.3 TS

Preliminary contents:
1. Introduction
2. Applicable and reference documents
3. Terms, definitions and abbreviated terms
4. System identification
5. System overview
6. Description of in-situ data
7. Description of alternative products (for inter-comparison)
8. Qualitative validation
9. Quantitative validation
10. Test operations and results
    - Products inter-comparison
    - Products assessment from the user point of view
11. Recommendations
    - For fixing errors and/or improving the overall products quality
    - For users

Owning Work Package ID: WP 4400

Update methodology: The PVAR is delivered at the end of the System Development activity (WP 4000), in advance of milestone 3 (Qualification Review - QR). Comments arising from this meeting shall be addressed to update and close this document. As the validation process is an integral part of the system development and prototype production itself, this report will be used to generate iterative and improved versions of the prototype products and processing system. Iterative versions of the PVAR can thus exist. However, once the PVAR is accepted at milestone 3, no further updates are envisaged.

16.8 Demonstration Plan

This document constitutes the deliverable D.10 for the Sen2-Agri project.

Deliverable: Demonstration Plan (DP)

Description and objective: Documents the selected use cases and includes a dedicated implementation plan. It aims at preparing and guiding the execution of task 5, with the demonstration cases.

Related documents:
- ESA Sentinel-2 agriculture SoW

Preliminary contents:
1. Introduction
2. Applicable and reference documents
3. Terms, definitions and abbreviated terms
4. Lessons learned from the test cases
5. Use cases description
   - User presentation
   - Agro-ecological context
   - Landscape patterns and agriculture practices
   - Crop types
   - Actual satellite observation conditions
6. Updated user requirements
7. Data acquisition
   - EO time series
   - In-situ data
8. Critical success criteria
9. Validation plan

Owning Work Package ID: WP 4500

Update methodology: The DP is delivered at the end of the System Development activity (WP 4000), in advance of milestone 4 (Acceptance Review - AR). Comments arising from this meeting shall be addressed to update and close this document. Once the DP is accepted, no further updates are envisaged.

16.9 Capacity Building Plan

This document constitutes the deliverable D.11 for the Sen2-Agri project.

Deliverable: Capacity Building Plan (CB)

Description: Describes the planned training courses and capacity building activities tailored to the respective use cases and the user entities involved.

Related documents:
- ESA Sentinel-2 agriculture SoW

Preliminary contents:
1. Introduction
2. Applicable and reference documents
3. Terms, definitions and abbreviated terms
4. User entities characterization
   For each entity:
   - Specific products and system requirements
   - Specific working environment
   - Specific needs in terms of capacity building
5. Capacity building plan
   - Computing infrastructures
   - Facilitation to data access
6. Training courses
7. Workshops

Owning Work Package ID: WP 5200

Update methodology: The CB is delivered during the Demonstration Use Case activity (WP 5000) for the third Progress Meeting (PM3), in advance of milestone 5 (Final Meeting - FM). Comments arising from this meeting shall be addressed to update and close this document. Iterative versions of the CB could be generated during WP 5000.


16.10 Validation Report

This document constitutes the deliverable D.12 for the Sen2-Agri project.

Deliverable: Validation Report (VR)

Description and objective: Documents the EO and in-situ data acquired for the use cases and, in particular, reports the performance of the delivered EO products.

Related documents:
- ESA Sentinel-2 agriculture SoW

Preliminary contents:
1. Introduction
2. Applicable and reference documents
3. Terms, definitions and abbreviated terms
4. Description of EO time series
5. Description of in-situ data
6. Description of alternative products (for inter-comparison)
7. Qualitative validation
8. Quantitative validation
9. Products inter-comparison
10. Recommendations for use

Owning Work Package ID: WP 5600

Update methodology: The VR is delivered at the end of the Demonstration Use Cases activity (WP 5000) for the User Demonstration meeting, in advance of milestone 5 (Final Meeting - FM). Comments arising from this meeting shall be addressed to update and close this document. Once the VR is accepted, no further updates are envisaged.

16.11 Exploitation Report

This document constitutes the deliverable D.14 for the Sen2-Agri project.

Deliverable: Exploitation Report (ER)

Description and objective: Documents the implementation, the results and a utility assessment of each use case.

Related documents:
- ESA Sentinel-2 agriculture SoW
- D.10 - DP

Preliminary contents:
1. Introduction
2. Applicable and reference documents
3. Terms, definitions and abbreviated terms
4. Summary of users’ requirements and success criteria
5. Assessment procedure
6. Results
7. Utility and benefit

Owning Work Package ID: WP 5600

Update methodology: The ER is delivered at the end of the Demonstration Use Cases activity (WP 5000), in advance of milestone 5 (Final Meeting - FM). Comments arising from this meeting shall be addressed to update and close this document. Once the ER is accepted, no further updates are envisaged.

16.12 Final Report

This document constitutes the deliverable D.15 for the Sen2-Agri project.

Deliverable: Final Report (FR)

Description and objective: Final report of the Sen2-Agri project. The public part of the document summarizes the data set, algorithms, products and final services achieved within the Sen2-Agri project. An internal part of the document provides the Executive with recommendations for future activities to further support the development and uptake of agricultural products based on the Sentinel-2 missions.

Related documents:
- All project deliverables

Preliminary contents:
1. Introduction
2. Applicable and reference documents
3. Terms, definitions and abbreviated terms
4. Sen2-Agri project objectives
5. Overview of the project activities and results
   - Cloud free composites
   - Dynamic cropland mask
   - Cultivated crop type and area
   - Vegetation status
   For each product:
   - Product description
   - Input dataset
   - Algorithms and processing chain
   - Product development, evaluation and validation
6. Assessment and recommendations
   - From the consortium (external and internal sections)
   - From the users
   - Summary for a sustainable and efficient agricultural service for the future

Owning Work Package ID: WP 6400

Update methodology: The FR is produced at the end of the project and is reviewed by ESA. Once accepted, it is made available on the Sen2-Agri web portal. This document is produced following professional standards since it is planned to distribute it to a wider audience.


17 Appendix D - Risk register

The Sen2-Agri science leader (with support from the Sen2-Agri Project Manager) is responsible for ensuring that the activities are completed to ESA's satisfaction, on time and to specification. Day-to-day responsibility will reside with the Project Manager, who will provide inputs to the overall Sen2-Agri risk register to aid in controlling priorities and allocation of resources.

Risk Management is an iterative process, involving the following steps:
- Risk Management Planning - a robust plan will be put in place to form a reference point for all personnel working on the project.
- Risk Identification - identify all potential Risks.
- Qualitative Risk Assessment - analyse each potential Risk in order to assess the likelihood of the Risk occurring and the Impact (on Cost, Schedule and/or Performance) if it does occur. Likely mechanisms or events that trigger the risk events should be identified where possible.
- Risk Response Planning - for each Risk that has a significant impact, devise an effective risk handling option designed to prevent the risk developing and/or to minimise the impact if it does occur. Where a risk handling option is likely to require significant effort or resources, these must be identified, as must the potential effectiveness of the risk response. Secondary Risks, arising from the initial risk response, need to be considered. Risk Owners will be identified.
- Risk Monitoring & Control - draw up plans to routinely monitor, control and report risks. These plans must involve the identification of new risks as well as the re-assessment of known risks.

After the initial risks have been assessed and a plan produced to manage the most significant risks, the results are summarized in a Project Risk Register. This will contain Risk Owners, Probability and Impact figures, and risk reduction actions. The Risk Register will be updated on a monthly basis as part of the monthly reporting process. It will be maintained throughout all phases of the program and presented alongside the monthly reports at Review meetings. The current status of the Project Risk Register is provided in the following table.


ID | Risk description | Mitigation | Probability (H/M/L) | Impact (H/M/L)
Sen2-Agri 1 | Existing open-source libraries do not fully satisfy the project requirements. This would lead to an increase of the development time. | Adapt the used libraries to project needs. | H | H
Sen2-Agri 2 | During validation testing it is found that a single server deployment model does not satisfy the performance requirements. | Make sure that the system design allows for easily scaling out the hardware resources. | L | M
Sen2-Agri 3 | Differences between development platform (CS RO) and integration platform (CS) may result in different software behaviour. | Adapt the development platform to match the integration platform as closely as possible. Perform regular deployments and tests on the integration platform to catch potential misbehaviours as early as possible. | M | M
Sen2-Agri 4 | Final users do not have an adequate hardware platform for system deployment. | (1) Make provisions for additional hardware needs at user premises or (2) select another user that possesses the adequate hardware infrastructure. | M | H
Sen2-Agri 5 | Sentinel-2 toolbox interface requirements and/or specification are not available in due time. | Reschedule, together with ESA representatives, the Sentinel-2 toolbox integration. | M | L
Sen2-Agri 6 | During the project execution, one or more key persons become unavailable. | Train additional resources on the project domain to be able to take over if such a situation arises. | L | M
Sen2-Agri 7 | Proposed engineers are not qualified for the project. | CS engineers have a broad expertise in space projects. A training plan can be conceived to train the CS RO engineers. | L | H
Sen2-Agri 8 | Some tasks may take more time than initially scheduled. | Appropriate margin should be included in the schedule. Activities should be parallelized as much as possible. | M | L
Sen2-Agri 9 | Quality of atmospheric correction module of S2PAD | Comparison with MACCS processing or in situ data, and previous evolution of S2PAD from MPC (maintenance) | M | H
Sen2-Agri 10 | Unavailability of Toolbox for interface development | Preliminary test done by CS to connect BEAM to OTB via GPT framework | M | M
Sen2-Agri 11 | High delay of S2 launch and problems during commissioning | - | M | H
Sen2-Agri 12 | New and complex URP to generate | Promote software re-use to decrease the cost impact | M | M
Sen2-Agri 13 | Lack of technical documentation for S2PAD (need DPM) and its connection with BEAM | - | M | M
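The Probability and Impact ratings of the register lend themselves to a simple prioritisation when the register is reviewed each month. The following Python sketch is illustrative only: the numeric scale (L=1, M=2, H=3), the probability-times-impact score and the names `Risk`, `LEVELS` and `rank` are assumptions made for this example and are not part of the proposal. Only the ratings themselves are copied from the register.

```python
from dataclasses import dataclass

# Assumed ordinal scale for the H/M/L ratings used in the register.
LEVELS = {"L": 1, "M": 2, "H": 3}


@dataclass(frozen=True)
class Risk:
    risk_id: str
    probability: str  # "H", "M" or "L"
    impact: str       # "H", "M" or "L"

    def score(self) -> int:
        """Assumed criticality score: probability level times impact level."""
        return LEVELS[self.probability] * LEVELS[self.impact]


def rank(risks):
    """Return the risks ordered from most to least critical.

    The sort is stable, so risks with equal scores keep their register order.
    """
    return sorted(risks, key=lambda r: r.score(), reverse=True)


# Ratings copied from four entries of the register.
register = [
    Risk("Sen2-Agri 1", "H", "H"),
    Risk("Sen2-Agri 2", "L", "M"),
    Risk("Sen2-Agri 4", "M", "H"),
    Risk("Sen2-Agri 5", "M", "L"),
]

for risk in rank(register):
    print(risk.risk_id, risk.probability, risk.impact, risk.score())
```

Under these assumptions Sen2-Agri 1 is ranked first and Sen2-Agri 4 second. A multiplicative score is only one possible convention; a lookup matrix agreed with ESA could replace `score` without changing the rest of the sketch.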


18 Appendix E - Traceability and compliance matrix

This appendix is in Chapter 3 of this proposal.


19 Appendix F - Letters of support
