National Technical University of Athens Master

National Technical University of Athens School of Electrical and Computer Engineering MSc in Data Science and Machine Learning Paddy Rice Mapping based on Multi-temporal Sentinel-1 and Sentinel-2 Data in a High Performance Data Analytics Environment Master Thesis KOUKOS ALKIVIAIDS-MARIOS Supervisor: Karantzalos Konstantinos Associate Professor NTUA Athens, November 2020 National Technical University of Athens School of Electrical and Computer Engineering MSc in Data Science and Machine Learning Paddy Rice Mapping based on Multi-temporal Sentinel-1 and Sentinel-2 Data in a High Performance Data Analytics Environment Master Thesis KOUKOS ALKIVIAIDS-MARIOS Supervisor: Karantzalos Konstantinos Associate Professor NTUA Approved by the three-member examination committee on 12th November 2020. (Signature) (Signature) (Signature) ........................ ........................ ........................ K. Karantzalos C. Kontoes G. Goumas Assoc. Professor NTUA Reearch Director NOA Assistant Professor NTUA Athens, November 2020 (Signature) ......................................... Koukos Alkiviadis-Marios © 2020 – All rights reserved National Technical University of Athens School of Electrical and Computer Engineering MSc in Data Science and Machine Learning Copyright ©–All rights reserved Koukos Alkiviadis-Marios, 2020. It is forbidden to copy, store and distribute this work, all or part of it, for commercial purposes. Reprinting storage and distribution is allowed, for non - profit, educational or of a research nature, provided that the source is indicated and that this message is retained. Questions regarding the use of work for profit should be addressed to the author. The aspects and the conclusions contained in this document are those of the author and should not be construed as representing the official positions National Technical University of Athens. Abstract Over the last years, the continuous increase of global population, together with the climate change, is expected to affect the food sector significantly. Rice is a main source of nutrition for more than half of world’s population and it contributes significantly to food security, global economy and climate change. Towards the efficient rice growth and yield monitoring, the main objective of this thesis is to develop a generic and transferable model for rice crop mapping based on Sentinel-1 and Sentinel-2 imagery. Specifically, the present thesis deals with agriculture monitoring challenges, for the purposes of food security monitoring in South Korea. South Korea’s food security problems include the overproduction of rice, which consequently leads to low self-sufficiency in the pro- duction of other major crops. For this reason, the accurate and large scale mapping of the paddy rice offers valuable information for the high-level decision-making related to food security. Therefore, in order to address this problem, a big data paddy rice mapping application was developed. In this regard, a set of processes has been implemented, using the computing framework Apache Spark and the Hadoop Distributed File System (HDFS) in a High Per- formance Data Analytics (HPDA). The input data comprise of long time-series of Sentinel-1 and Sentinel-2 images, but also pertinent vegetation indices produced from them. At first, a 2-step data interpolation methodology was implemented to create a robust feature space with fixed timestamps. Furthermore, an unsupervised pixel-based technique, which utilizes the K-means algorithm, was approached for creating trustworthy and close-to- reality training data. Finally, a Random Forest model was trained using the generated training data in order to classify paddy rice in every pixel of the Area of Interest. The proposed paddy rice classification method achieves an accuracy of more than 92%, from as early as the end of July, for a study area in Northwestern South Korea. Keywords Sentinel, Big Data, Remote Sensing, Earth Observation, Machine Learning, High Performance Data Analytics, Hadoop, HDFS, Spark, Rice Mapping, Food Security 1 PerÐlhyh Ta teleutaÐa χρόnia, h συνεχόμενη αύξηση tou παγκόσμιou πληθυσμού, se συνάρτηση me thn κλιματική αλλαγή, anamènetai na επηρεάσουν σημαντικά ton episitisτικό tomèa. To ρύζι apoteleÐ κύριo προιόn gia pollèc q¸rec ghc kai trèfei periσσόtero από ton miσό πληθυσμό thc. Epomènwc, h parakoλούθησή tou suneisfèrei σημαντικά ston èlegqo thc episitisτικής ασφάλειας, thc παγκόσμιας oikonomÐac kaj¸c thc κλιματικής αλλαγής. Me σκοπό thn apo- τελεσματική parakoλούθηση thc ανάπτυξης kai παραγωγής tou ruziού, o βασικός stόqoc thc παρούσας diplwmatikής ergasÐac eÐnai na anaπτύξει èna genikeumèno kai χωρικά metαβιβάσιμο montèlo, me σκοπό th qartoγράφηση thc kallièrgeiac tou ruziού, κάνοntac χρήση dedomènwn Sentinel-1 kai Senintel-2. Sugkekrimèna, h παρούσα ergasÐa asqoleÐtai me προκλήσειc parakoλούθησης thc gewrgÐac, gia touc σκοπούς thc parakoλούθησης thc episitisτικής ασφάλειας sth Νόtia Korèa. Ta pro- βλήματa episitisτικής ασφάλειας thc Νόtiac Korèac, αφορούν sthn υπερπαραγωγή ruziού, h opoÐa katά sunèpeia odhgeÐ se υψηλά κόστη αποθήκευση tou pleoνάσματoc. Epiplèon, αυτή h υπερπαραγωγή συνεπάγετai χαμηλή autάρκεια sthn παραγωγή άλλων shmantik¸n kalliergei- ¸n. Gia autόn ton λόgo, h υψηλής akrÐbeiac kai μεγάλης klÐmakac qartoγράφηση tou ruziού prosfèrei poλύtimec plhroforÐec gia th λήψη αποφάσεων υψηλού epipèdou, όσοn αφορά thn episitisτική ασφάλεια. Epomènwc, gia na antimetwpisteÐ autό to πρόβλημα, αναπτύχθηκε mia efarmoγή μεγάλης dedomènwn gia th qartoγράφηση oruz¸nwn. Pio αναλυτικά, èqei ulopoihjeÐ èna σύνοlo diadikasi¸n, pou κάνει χρήση thc upologisτικής πλατφόρμας katanemhmènhc epe- xergasÐac Apache Spark kai tou katanemhmènou apojhkeυτικό σύσthma Hadoop (HDFS) se èna upologisτικό σύσthma uyhl¸n epιδόsewn (High Performance Data Analytics - HPDA). Ta dedomèna eiσόδου apoτελούντai από μεγάλες qronoseirèc εικόnwn Sentinel-1 kai Sentinel-2, kaj¸c kai σχετικούς deÐktec βλάσthshc pou paράγοntai από autèc. Αρχικά, eφαρμόστηκe mia mejodologÐa paremboλής dedomènwn 2 βημάτwn gia th dhmiourgÐa ενός dunamikού q¸rou qarakthristik¸n me stαθερό qroνικό βήμα. Epiplèon, υλοποιήθηκε me mia mèjodoc mh επιβλεπόμενης μάθησης se epÐpedo eikonostoiqeÐou, qrhsimopoi¸ntac ton αλγόριjmo twn (K-means), gia th dhmiourgÐa αξιόπιstwn dedomènwn ekpaÐdeushc. Tèloc, ekpαιδεύτηκε èna montèlo Random Forest qrhsimopoi¸ntac ta παραγόμενα dedomèna ekpaÐdeushc, prokeimènou na taxiνομήσει to ρύζι se κάθε eikonostoiqeÐo thc perioχής endiafèrontoc. H proteiνόμενη mèjodoc taxiνόmhshc ruziού epiτυγχάνει akrÐbeia άνω tou 92%, από to tèloc IoulÐou, gia mia perioχή melèthc sth boreiοδυτική Νόtia Korèa. 3 4 PerÐlhyh Lèxeic Kλειδιά Μεγάλα Dedomèna, Sentinel, Thlepiσκόπηση, Παρατήρηση Ghc, Μηχανική Μάθηση, Upologi- sτικό Σύσthma Uyhl¸n Επιδόσεων , Hadoop, HDFS, Spark, Qartoγράφηση Ruziού Acknowledgments I would like to express my sincere gratitude to my supervisor, Prof. Dr. Konstantinos Karantzalos, for his persistent support and supervision, as well as his engagement through the learning process of this thesis. I am also equally thankful to my colleague and friend Vasileios Sitokonstantinou for his continuous guidance and technical support. I am also deeply thankful to my colleague and friend Thanasis Drivas for his valuable advice and constant help. Their contribution to this work was very crucial and the result would not be the same if is wasn’t for their help. I am deeply grateful to my supervsors Dr. Charalampos (Haris) Kontoes and Dr. Ioannis Papoutsis, with whom I collaborated excellently through the entire process of this study and their contribution has been more than worth mentioning. I consider myself truly lucky to had had the opportunity to work with all the aforementioned people, who have mentored me, guided me, supported me and most importantly inspired me. I am also thankful to my friend Antonis Koutroumpas, with whom I had been working side by side through the entire duration of this study. I would also like to express my gratitude to Assistant Prof. Dr. Georgios Goumas, member of the evaluation committee. Moreover, I am also very grateful to my beloved, Maria, for her unconditional support, love and patience, and for always being there. Finally, I would like to thank all of my friends for all the emotional support. A special appreciation should be expressed to Giannis, who is like family to me and has been nothing less than a continuous source of support and motivation for several years now. Finally, I would like to thank my family who has supported and provided me with countless opportunities, every step of the way. This work has been supported by the EOPEN project, which has been funded from the European Union’s Horizon 2020 research and innovation programme under grant agreement 776019. Author acknowledge also the Copernicus Open Access Hub and the Hellenic National Sentinel Data Mirror Site for providing free access to Sentinel-1 and Sentinel-2 images. 5 Contents Abstract 1 PerÐlhyh 3 Acknowledgments 5 Contents 8 List of Figures 10 List of Tables 11 1 Introduction 13 1.1 Problem Statement . 14 1.2 Thesis Objectives and Contributions . 14 2 Literature Review 17 2.1 Background on rice cultivation . 17 2.2 Remote Sensing . 18 2.2.1 Earth Observation . 18 2.2.2 Multispectral remote sensing . 19 2.2.3 SAR remote sensing . 19 2.2.4 Satellites in this study . 19 2.2.5 Sentinel-2 . 20 2.3 Rice Classification in Earth Observation . 21 2.4 Big Data technologies for remote sensing . 22 3 Study Area, Datasets and Data Analytics Framework 27 3.1 Area of Study . 27 3.2 Data

National Technical University of Athens Master

Conservation Studies of Korean Stone Heritages

Community Adaptation to the Hebei-Spirit Oil Spill

A Collaborative Trans-Regional R&D Strategy for the South Korea Green New Deal to Achieve Future Mobility

Case Study of Seosan Smart Water Management

Democratic People's Republic of Korea

Count of Members by Females & Males in Clubs

Chungcheong Region Coursea. Daejeon

I Love Korea!

Fertility and the Proportion of Newlyweds in Different Municipalities

Activity-Based Exposure Levels and Cancer Risk Assessment Due to Naturally Occurring Asbestos for the Residents Near Abandoned Asbestos Mines in South Korea

The Hydrogen Economy South Korea Market Intelligence Report January 2021 Forewords

Korean Red List of Threatened Species Korean Red List Second Edition of Threatened Species Second Edition Korean Red List of Threatened Species Second Edition