
CUBESAT CLOUD: A FRAMEWORK FOR DISTRIBUTED STORAGE, PROCESSING AND COMMUNICATION OF REMOTE SENSING DATA ON CUBESAT CLUSTERS

By

OBULAPATHI NAYUDU CHALLA

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
2013

© 2013 Obulapathi Nayudu Challa

I dedicate this to my family, my wife Sreevidya Inturi, my mother Rangamma Challa, my father Ananthaiah Challa, my sister Sreelatha Chowdary Lingutla, my brothers-in-law Ramesh Naidu Lingutla and Sreekanth Chowdary Inturi, my father-in-law Sreenivasulu Chowdary Inturi, my mother-in-law Venkatalakshmi Inturi, my brothers Akshay Kumar Anugu and Dheeraj Kota, and my uncle Venkatanarayana Pattipaati, for all their love and support.

ACKNOWLEDGMENTS

It has been a great experience being a student of Dr. Janise Y. McNair for the last five and a half years. There was never a time that I did not feel cared for, thanks to her constant support and guidance. I would like to thank Dr. Xiaolin (Andy) Li, Dr. Norman G. Fitz-Coy and Dr. Haniph A. Latchman for agreeing to serve on my committee and for providing valuable feedback in completing my dissertation. Thanks to the professors at the University of Florida, Dr. Patrick Oscar Boykin, Ms. Wenhsing Wu, Dr. Ramakant Srivastava, Dr. Erik Sander, Dr. A. Antonio Arroyo, Dr. Jose A. B. Fortes, Dr. John M. Shea, Dr. Greg Stitt, Dr. Sartaj Sahni and Dr. Shigang Chen, for teaching me all that I know today. Thanks to the staff at the University of Florida, Ray E. McClure II, Jason Kawaja, Shannon M Chillingworth, Cheryl Rhoden and Stephenie A. Sparkman, for their patience with my countless requests and administrative questions. I would like to

take this opportunity to thank all my Wireless and Mobile Group colleagues, past and present, for being there with me and helping me all along in one way or other. I would

like to thank Alexander Verbitski for his mentorship during my internship. I would like to thank my teachers Sreedevi, Uma Kantha, Nalini Sreenivasan, K.

Ramakrishna, K. Bhaskar Naidu, Sambasiva Reddy, A Koteswar Rao, A. K. Rama Rao, Dr. Vijay Kumar Chakka, Dr. Gautam Dutta and Dr. Prabhat Ranjan, who greatly influenced my life. Linux and Open Source have made this world a true Vasudhaika Kutumbam for me. I would like to thank Linus Torvalds, creator of Linux; Richard Matthew Stallman, founder of GNU; Vint Cerf, father of the Internet; Tim Berners-Lee, inventor of the World Wide Web; Guido van Rossum, creator of the Python programming language; Satoshi Nakamoto, inventor of Bitcoin; Masashi Kishimoto, creator of Naruto; Mark Shuttleworth, founder of Ubuntu; and Tim O'Reilly, founder of O'Reilly Media. Life at the University of Florida has always been fun and exciting, thanks to the wonderful friends around here: Dan Trevino, Dante Buckley, Gokul Bhat, Hrishikesh

Pendurkar, Jimmy (Tzu Yu) Lin, Karthik Talloju, Krishna Chaitanya, Kishore Yalamanchili,

Madhulika Dandina, Manu Rastogi, Paul Muri, Rakesh Chalasani, Ravi Shekhar, Seshu Pria, Shruthi Venkatesh, Subhash Guttikonda, Udayan Kumar, Vaibhav Garg, Vijay

Bhaskar Reddy and Vivek Anand. I would like to thank Mr. Iqbal Qaiyumi, Dr. Shaheda Qaiyumi, Mr. Jagat Desai and Mrs. Vatsala Desai for taking care of me like their son. Thanks to my long-distance friends Bhargavi Vanga, Praveen Kumar, Radha Vummadi,

Uday Kumar, Uzumaki Naruto and Vijay Kumar, who have been close even when they were far.

Lastly, I would like to thank my family - my wife Sreevidya Inturi, my mother Rangamma Challa, my father Ananthaiah Challa, my sister Sreelatha Chowdary Lingutla, my brothers-in-law Ramesh Naidu Lingutla and Sreekanth Chowdary Inturi, my father-in-law Sreenivasulu Chowdary Inturi, my mother-in-law Venkatalakshmi Inturi, my brothers Akshay Kumar Anugu and Dheeraj Kota, and my uncle Venkatanarayana Pattipaati. Their endless love and support throughout the years have meant more to me than words can express. I would like to dedicate my dissertation to them.

TABLE OF CONTENTS

ACKNOWLEDGMENTS ...... 4 LIST OF TABLES ...... 10

LIST OF FIGURES ...... 11 ABSTRACT ...... 13

CHAPTER 1 INTRODUCTION ...... 15

1.1 CubeSat Cloud ...... 17 2 BACKGROUND ...... 20

2.1 Remote Sensing ...... 20 2.2 Evolution of CubeSat Networks ...... 21 2.2.1 Summary and Limitations of CubeSat Communications ...... 24 2.3 Distributed Satellite Systems ...... 25 2.4 Classification of Distributed Satellite Systems ...... 27 2.5 Related Work ...... 27 2.5.1 Distributed Storage Systems ...... 27 2.5.2 Distributed Computing Techniques ...... 30

3 NETWORK ARCHITECTURE OF CUBESAT CLOUD ...... 34 3.1 Components of the CubeSat Network ...... 35 3.1.1 Space Segment ...... 35 3.1.2 Ground Segment ...... 37 3.2 System Communication ...... 38 3.2.1 Cluster Communication ...... 38 3.2.2 Space Segment to Ground Segment Communication ...... 39 3.2.3 Ground Segment Network Communication ...... 40 3.3 CubeSat Cloud ...... 40 3.3.1 Storage, Processing and Communication of Remote Sensing Data on CubeSat Clusters ...... 40 3.3.2 Source Coding, Storing and Downlinking of Remote Sensing Data on CubeSat Clusters ...... 41

4 DISTRIBUTED STORAGE OF REMOTE SENSING IMAGES ON CUBESAT CLUSTERS ...... 45 4.1 Key Design Points ...... 45 4.1.1 Need for Simple Design ...... 45 4.1.2 Low Bandwidth Operation ...... 45

6 4.1.3 Network Partition Tolerant ...... 45 4.1.4 Autonomous ...... 46 4.1.5 Data Integrity ...... 46 4.2 Shared Goals Between CDFS, GFS and HDFS ...... 46 4.2.1 Component Failures are Norm ...... 46 4.2.2 Small Number of Large Files ...... 46 4.2.3 Immutable Files and Non-existent Random Read Writes ...... 47 4.3 Architecture of CubeSat Distributed ...... 47 4.3.1 File System Namespace ...... 49 4.3.2 Heartbeats ...... 49 4.4 File Operations ...... 50 4.4.1 Create a File ...... 50 4.4.2 Writing to a File ...... 51 4.4.3 Deleting a File ...... 51 4.5 Enhancements and Optimizations ...... 52 4.5.1 Bandwidth and Energy Efficient Replication ...... 52 4.5.1.1 Number of nodes on communication path = replication factor ...... 54 4.5.1.2 Number of nodes on communication path >replication factor ...... 54 4.5.1.3 Number of nodes on communication path

5 DISTRIBUTED PROCESSING OF REMOTE SENSING IMAGES ON CUBESAT CLUSTERS ...... 60 5.1 CubeSat MapMerge ...... 60 5.2 Command and Data Flow during a CubeSat MapMerge Job ...... 61 5.3 Fault Tolerance, Failures, Granularity and Load Balancing ...... 63 5.3.1 Fault Tolerance ...... 63 5.3.2 Master Failure ...... 64 5.3.3 Worker Failure ...... 64 5.3.4 Task Granularity and Load Balancing ...... 64 5.4 Simulation Results ...... 64 5.5 Summary of CubeSat MapMerge ...... 65

6 DISTRIBUTED COMMUNICATION OF REMOTE SENSING IMAGES FROM CUBESAT CLUSTERS ...... 66

6.1 CubeSat Torrent ...... 66 6.2 Command and Data Flow During a Torrent Session ...... 67 6.3 Enhancements and Optimizations ...... 67 6.3.1 Improve Storage Reliability and Decrease Storage Overhead . . . 67 6.3.2 Using Source Coding to Improve Downlink Time ...... 69 6.3.3 Improving the Quality of Service for Real-time Traffic Applications Like VoIP ...... 70 6.4 Fault Tolerance, Failures, Granularity and Load Balancing ...... 71 6.4.1 Fault Tolerance ...... 71 6.4.2 Master Failure ...... 72 6.4.3 Worker Failure ...... 72 6.4.4 Task Granularity ...... 72 6.4.5 Tail Effect and Backup Downloads ...... 73 6.5 Simulation Results and Summary of CubeSat Torrent ...... 73 7 SIMULATOR, EMULATOR AND PERFORMANCE ANALYSIS ...... 75

7.1 Hardware and Software of Master and Worker CubeSats for Emulator . . 75 7.2 Hardware and Software of Server and Ground Station for Emulator .... 77 7.3 Network Programming Frameworks ...... 77 7.3.1 Twisted ...... 77 7.3.2 Eventlet ...... 77 7.3.3 PyEv ...... 78 7.3.4 Asyncore ...... 78 7.3.5 Tornado ...... 78 7.3.6 Concurrence ...... 78 7.4 Twisted Framework ...... 78 7.5 Network Configuration ...... 79 7.6 CubeSat Cloud Emulator Setup ...... 79 7.7 CubeSat Cloud Simulator Setup ...... 79 7.8 CubeSat Reliability Model ...... 82 7.9 Simulation and Emulation Results ...... 82 7.9.1 Profiling Reading and Writing of Remote Sensing Data Chunks on Raspberry Pi ...... 82 7.9.2 Processing, CubeSat to CubeSat and CubeSat to Ground Station Chunk Communication Time ...... 83 7.9.3 Storing Remote Sensing Images using CubeSat Cloud ...... 84 7.9.4 Processing Remote Sensing Images using CubeSat Cloud .... 85 7.9.5 Speedup and Efficiency of CubeSat MapMerge ...... 86 7.9.6 Downlinking Remote Sensing Images Using CubeSat Cloud .... 87 7.9.7 Speedup and Efficiency of CubeSat Torrent ...... 88 7.9.8 Copy On Transmit Overhead ...... 89 7.9.9 Source Coding Overhead ...... 89

7.9.10 Metadata and Control Traffic Overhead ...... 90 7.9.11 Comparison of CDFS with GFS and HDFS ...... 90 7.9.12 Simulator vs Emulator ...... 91 7.10 Summary of Simulation Results ...... 95 8 SUMMARY AND FUTURE WORK ...... 96 8.1 Future work ...... 97

REFERENCES ...... 98 BIOGRAPHICAL SKETCH ...... 101

LIST OF TABLES

Table page

2-1 CubeSat data speeds and downloads ...... 25

LIST OF FIGURES

Figure page

1-1 CubeSat ...... 16 2-1 Generations of CubeSat networks ...... 23

2-2 Architecture of GENSO ...... 24 2-3 Architectural overview of the Google File System ...... 29 2-4 Architectural overview of the Hadoop Distributed File System ...... 31

3-1 Architecture of a CubeSat network ...... 34 3-2 Architecture of a CubeSat cluster ...... 35

3-3 A blown up picture of ESTCube-I CubeSat, showing its subsystems ...... 36 3-4 Ground station ...... 38

3-5 Ground station antenna ...... 39 3-6 Overview of CubeSat Cloud and its component frameworks ...... 42 3-7 Integration of CubeSat Distributed File System and CubeSat Torrent ...... 44

4-1 Architecture of CubeSat Distributed File System ...... 48 4-2 Bandwidth and energy efficient replication ...... 53

4-3 Copy on transmit ...... 55 5-1 Example of CubeSat MapMerge ...... 61

5-2 Overview of execution of CubeSat MapMerge on CubeSat cluster ...... 62 6-1 Overview of CubeSat Torrent ...... 68 7-1 Raspberry Pi mini computer ...... 76

7-2 CubeSat Cloud emulator ...... 80 7-3 CubeSat Cloud simulator ...... 81

7-4 Lifetimes of CubeSats ...... 83 7-5 Read and write times of a chunk ...... 84

7-6 CubeSat to CubeSat and CubeSat to ground station chunk communication profiling ...... 85 7-7 File distribution time for various file sizes and cluster sizes ...... 86

7-8 File processing time for various file sizes and cluster sizes ...... 86

7-9 Speedup of CubeSat MapMerge ...... 87 7-10 Efficiency of CubeSat MapMerge ...... 88

7-11 File downlinking time for various file sizes and cluster sizes ...... 89 7-12 Speedup of CubeSat Torrent ...... 90 7-13 Efficiency of CubeSat Torrent ...... 91

7-14 Bandwidth overhead due to replication ...... 92 7-15 Bandwidth overhead due to source coding ...... 92

7-16 Bandwidth and energy overhead ...... 93 7-17 Bandwidth consumption of CDFS vs GFS and HDFS ...... 93

7-18 Write time of CDFS vs GFS and HDFS ...... 94 7-19 Energy consumption of CDFS vs GFS and HDFS ...... 94 7-20 Simulator vs emulator ...... 95

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

CUBESAT CLOUD: A FRAMEWORK FOR DISTRIBUTED STORAGE, PROCESSING AND COMMUNICATION OF REMOTE SENSING DATA ON CUBESAT CLUSTERS

By

Obulapathi Nayudu Challa December 2013

Chair: Janise Y. McNair
Major: Electrical and Computer Engineering

CubeSat Cloud is a novel vision for a space based remote sensing network that includes a collection of small satellites (including CubeSats), ground stations, and a server, where a CubeSat is a miniaturized satellite with a volume of a 10x10x10 cm cube and a weight of approximately 1 kg. The small form factor of CubeSats limits their processing and communication capabilities. Implemented and deployed CubeSats have demonstrated about 1 GHz processing speed and 9.6 kbps communication speed.

A CubeSat in its current state can take hours to process a 100 MB image and more than a day to downlink the same, which prohibits remote sensing, considering the limitations in ground station access time for a CubeSat. This dissertation designs an architecture and supporting networking protocols to create CubeSat Cloud, a distributed processing, storage and communication framework that will enable faster execution of remote sensing missions on CubeSat clusters. The core components of CubeSat Cloud are CubeSat Distributed File System, CubeSat

MapMerge, and CubeSat Torrent. The CubeSat Distributed File System has been created for distributing large amounts of data among the satellites in the cluster. Once the data is distributed, CubeSat MapMerge has been created to process the data in parallel, thereby reducing the processing load for each CubeSat. Finally, CubeSat

Torrent has been created to downlink the data at each CubeSat to a distributed set of ground stations, enabling faster asynchronous downloads. Ground stations send the

downlinked data to the server to reconstruct the original image and store it for later

retrieval. Analysis of the proposed CubeSat Cloud architecture was performed using a custom-designed simulator called CubeNet and an emulation test bed using Raspberry Pi devices. Results show that for cluster sizes ranging from 5 to 25 small satellites, download speeds 4 to 22 times faster can be achieved when using

CubeSat Cloud, compared to a single CubeSat. These improvements are achieved at an almost negligible bandwidth and memory overhead (1%).

CHAPTER 1
INTRODUCTION

A CubeSat is a miniaturized satellite primarily used for university space research [1]. It has a volume of exactly one litre, weighs no more than one kilogram and is built using commercial off-the-shelf components [2]. Future satellite systems are envisioned as clusters or constellations of smaller satellites like CubeSats, complementing large monolithic satellites and together forming a distributed space network. However, the weight, volume, power and geometry constraints of CubeSats must be overcome in order to provide the required processing, storage and communication capabilities. Figure

1-1 shows a picture of a CubeSat. A CubeSat has only about a 1 GHz processor, 1 GB of RAM, 32 - 64 GB of flash memory and a 9.6 kbps communication link [2] [3]. On the other hand, remote sensing missions like weather monitoring, flood monitoring and volcanic activity monitoring require intensive processing or downlinking of large amounts of data. With its limited resources, a CubeSat can take hours to process one remote sensing image and days to downlink the same [4] [5]. Thus, processing and communication systems have become bottlenecks for employing CubeSats on remote

sensing missions.

The advantages of CubeSats are their low cost, low round-trip communication time to ground stations, and the ease with which they can be used for experimentation. The manufacturing cost of a typical large satellite weighing about 1000 kg is on the order of hundreds of millions of dollars [6] because all the components are custom made and need to be tested extensively before launch. However, most of the components of a CubeSat are commercial off-the-shelf (COTS) components; only the payload is custom designed from the ground up. Thus, CubeSats can be engineered at a price of about half a million to a few million dollars, which is orders of magnitude less than the cost of a typical large satellite [7] [3]. Moreover, several CubeSats can be launched together at one time.

Figure 1-1. CubeSat

Image courtesy of NASA. Picture by Paul Adams.

Large satellites are launched into geostationary Earth orbit (GEO), highly elliptical orbit or high Earth orbit (HEO), at altitudes on the order of 36,000 km to 50,000 km. As a result of the long distance between Earth and satellite, the one-way signal propagation delay is about 120 ms and the round trip time (RTT) is roughly 250 ms. CubeSats are launched into low Earth orbit (LEO), about 600 - 800 km from Earth. As a result, the RTT reduces to about 10 ms, which can provide better quality of service for applications like real-time tracking and voice, compared to GEO or HEO satellites.
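
The delay figures above follow directly from the distances involved. The short sketch below reproduces the arithmetic; the speed-of-light constant is the only value not taken from the text, and the script is purely illustrative.

```python
# Rough propagation-delay estimate for the orbital altitudes quoted above.
# Only the speed of light is assumed; the altitudes come from the text.
C_KM_PER_S = 300_000  # speed of light, km/s (approximate)

def one_way_delay_ms(altitude_km):
    """One-way signal propagation delay in milliseconds."""
    return altitude_km / C_KM_PER_S * 1000

for name, altitude in [("GEO", 36_000), ("LEO", 700)]:
    print(f"{name}: one-way ~{one_way_delay_ms(altitude):.1f} ms, "
          f"round trip ~{2 * one_way_delay_ms(altitude):.1f} ms")
# GEO: one-way ~120.0 ms, round trip ~240.0 ms
# LEO: one-way ~2.3 ms, round trip ~4.7 ms (well under the ~10 ms cited)
```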

Finally, since a CubeSat mission costs half a million to a few million dollars and CubeSats can be launched in large numbers using a single rocket, mission failure is not fatal. Since mission failure is not fatal and costs are lower, new technologies can be easily inserted into an existing space network via CubeSats. Nevertheless, CubeSats have very limited resources with which to accomplish meaningful remote sensing missions. A typical CubeSat has about a 1 GHz processor and 1 GB of RAM. As a result, the computational power of a CubeSat is not sufficient for executing

processing intensive remote sensing missions. CubeSats use structurally simple, low gain antennas like monopoles or dipoles and have a limited power budget of about 500 mW for the communication system. The typical communication data rate between a CubeSat and a ground station is about 9.6 kbps [5]. As a result, large amounts of data cannot be downloaded to ground stations in a reasonable amount of time. CubeSats have low memory, processing, battery power and communication capabilities. Timing constraints are too tight to allow long communication windows. Each CubeSat is controlled individually; currently, there is no meaningful way of controlling multiple CubeSats using a unified control mechanism. As a result, a single CubeSat cannot perform processing and communication intensive remote sensing missions in a meaningful time.
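
To make the communication bottleneck concrete, the sketch below estimates how long a single 9.6 kbps link needs to move a 100 MB remote sensing product. The data size and link rate are the figures quoted in this chapter; the ~25 minutes of daily ground station contact comes from the generation 1 discussion in Chapter 2, and the helper itself is only an illustrative estimate, not part of the original analysis.

```python
# Why a single 9.6 kbps CubeSat link cannot move remote sensing data quickly.
# 100 MB image, 9.6 kbps link and ~25 min/day of contact are figures quoted
# in the text; the calculation is a rough illustration only.
IMAGE_MB = 100
LINK_KBPS = 9.6
CONTACT_MIN_PER_DAY = 25

bits_to_send = IMAGE_MB * 8 * 10**6
link_seconds = bits_to_send / (LINK_KBPS * 1000)      # continuous transmit time
contact_seconds_per_day = CONTACT_MIN_PER_DAY * 60
days_needed = link_seconds / contact_seconds_per_day  # with limited contact

print(f"Continuous link time: {link_seconds / 3600:.1f} hours")
print(f"Calendar time at {CONTACT_MIN_PER_DAY} min/day: {days_needed:.0f} days")
# Continuous link time: ~23.1 hours
# Calendar time at 25 min/day: ~56 days (mitigated in practice by using many
# ground stations and many CubeSats, which is exactly what CubeSat Cloud does)
```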

1.1 CubeSat Cloud

In this work we propose CubeSat Cloud, a framework for distributed storage, processing and communication of remote sensing data. We demonstrate that CubeSat Cloud can store remote sensing data on a CubeSat cluster in a distributed fashion to allow the possibility of distributed computation and communication, speeding up remote sensing missions. For distributing remote sensing data, CubeSat Cloud uses CubeSat

Distributed File System. For distributed processing and communication, CubeSat Cloud uses CubeSat MapMerge and CubeSat Torrent, respectively. We reduce the bandwidth and energy consumption through energy-efficient replication and linear block source coding.

In Chapter 2, we outline the evolution of CubeSat networks and present relevant background on distributed satellite systems, storage systems and computing techniques.

In Chapter 3, we describe the architecture of the CubeSat network, which consists of two segments, namely a space segment and a ground segment. The space segment is designed to be a CubeSat cluster with a radius of about 100 km. It consists of Sensor nodes and Worker nodes which are inter-connected using high speed communication links. A Sensor CubeSat has a sensing subsystem and acts as the Master of the cluster while executing remote sensing missions. Worker nodes are typical 1U CubeSats (10 x 10 x

10 cm cube) with standard subsystems. The ground segment is made up of a ground station server and several ground stations. Ground stations are connected to the server via the Internet and act as relays between the ground station server and the CubeSats.

On top of the described network architecture, we build the CubeSat Cloud platform. In Chapters 4 through 6, we describe its three core components, namely the CubeSat Distributed File System (CDFS), CubeSat MapMerge and CubeSat Torrent.

CubeSat Distributed File System is used for distributing the remote sensing data to the nodes in the CubeSat cluster. Once the remote sensing data is distributed, CubeSat

MapMerge and CubeSat Torrent are used for processing and downlinking the remote sensing data.

CDFS splits large remote sensing data files into chunks and distributes the chunks to the Worker nodes in the cluster. CDFS uses "Copy-On-Transmit" for creating replicas with very low bandwidth and energy overhead. Source coding is used for reducing the storage and bandwidth overhead for missions which require only downlinking of remote sensing data. We demonstrate that CDFS can store data reliably, without loss of any data, even if a limited number of CubeSats go offline. Distributing the remote sensing data to nodes in the cluster allows the possibility of distributed processing and distributed communication for speeding up remote sensing missions. In Chapter 5, we describe the working of CubeSat MapMerge in detail. CubeSat MapMerge is a distributed processing framework inspired by Google MapReduce [8].

Worker nodes process the chunks stored with them in parallel. Failures are detected using the Heartbeat mechanism and failed executions are re-scheduled on other worker nodes. We demonstrate that CubeSat MapMerge can speed up the processing of large remote sensing data sets on CubeSat clusters by a factor of the cluster size (i.e., the number of CubeSats in the cluster) and is resilient to worker and communication link failures.

In Chapter 6, we explain the downlink process. Multiple raw or processed chunks

are downlinked in parallel to ground stations. Once a chunk is downlinked to a ground station, it is forwarded to the ground station server. After receiving all chunks, the Server

uses the chunks to reproduce the sensor image. We demonstrate that CubeSat Torrent can speed up the downlinking of large files by a factor of the cluster size (the number of CubeSats in the cluster) and is resilient to worker and communication link failures.

To test the performance of the system, we built a CubeSat Cloud simulation framework and a CubeSat Cloud testbed for emulation. We describe the testbed and simulation setup in detail in Chapter 7. CubeSats are emulated using Raspberry Pi mini-computers, while the terrestrial Server and ground stations are emulated using standard desktop computers. CubeSat Cloud is written in the Python programming language using Twisted, an event-based asynchronous network programming framework. Simulation results indicate that CubeSat MapMerge and CubeSat Torrent, with cluster sizes in the range of 5 - 25 CubeSats, together enable 4.75 - 23.15 times faster processing and downlinking of large remote sensing data sets compared to a single CubeSat. This speedup is achieved at an almost negligible bandwidth and memory overhead (1%). Emulation results from the CubeSat Cloud testbed agree with the simulation results and indicate that our proposed CubeSat Cloud can speed up remote sensing missions by a factor of the size of the CubeSat cluster with minimal overhead, while achieving asynchronous downloads with short communication windows.
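
Since the implementation is described as Python code on top of Twisted, and failures are detected with heartbeats, the sketch below shows what a minimal Twisted-based heartbeat sender could look like. The address, node identifier and interval are invented for illustration; this is a sketch of the idea, not the dissertation's actual implementation.

```python
# Minimal heartbeat-sender sketch using Twisted (the framework named in the
# text). MASTER_ADDR, NODE_ID and the interval are illustrative assumptions.
from twisted.internet import protocol, reactor, task

MASTER_ADDR = ("192.0.2.1", 9000)   # hypothetical master CubeSat address
NODE_ID = b"worker-07"              # hypothetical worker identifier
HEARTBEAT_INTERVAL = 10.0           # seconds, illustrative

class HeartbeatSender(protocol.DatagramProtocol):
    """Periodically tells the master that this worker is still alive."""

    def startProtocol(self):
        # Fire a heartbeat immediately, then every HEARTBEAT_INTERVAL seconds.
        self.loop = task.LoopingCall(self.send_heartbeat)
        self.loop.start(HEARTBEAT_INTERVAL)

    def send_heartbeat(self):
        self.transport.write(b"HEARTBEAT " + NODE_ID, MASTER_ADDR)

if __name__ == "__main__":
    reactor.listenUDP(0, HeartbeatSender())  # ephemeral local UDP port
    reactor.run()
```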

CHAPTER 2
BACKGROUND

The CubeSat concept was initiated by Professor Twiggs as a teaching tool to help students learn the process of developing, launching and operating satellites.

CubeSats are currently designed for low Earth orbits. They are well suited for distributed sensing applications and low data rate communications applications. Unlike large monolithic satellites, CubeSats are built to a large degree using commercial off-the-shelf (COTS) components. Engineering CubeSats using COTS equipment and following standards in design and development have shortened development cycles and reduced costs. CubeSats are typically launched and deployed using a mechanism called the P-POD [9], developed and built by Cal Poly. P-PODs are mounted to a launch vehicle carrying CubeSats and deploy them once the proper signal is received from the launch vehicle. The P-POD Mk III has a capacity of three 1U CubeSats: it can deploy three 1U CubeSats, or one 1U and one 2U CubeSat, or one 3U CubeSat. CubeSats carry one or two scientific payloads such as a magnetic field sensor, an image sensor or an ion concentration finder. Several companies and research institutes offer regular launch opportunities in clusters of several cubes. CubeSat, as a specification for constructing and deploying pico satellites, accomplishes the following goals:

1. Encapsulation of the launcher-payload interface: the CubeSat standard eliminates a significant amount of managerial work and makes it easy to mate a piggyback satellite with its launcher.

2. Unification among payloads and launchers: satellites adhering to the CubeSat standard can be interchanged quickly for one another, which enables the utilization of launch opportunities on short notice.

3. Simplification of pico satellite infrastructure: the CubeSat standard makes it possible to design and produce an operational small satellite at a very low cost.

2.1 Remote Sensing

Acquiring information about an object without making physical contact with it is called remote sensing. Usually it refers to gathering information about the atmosphere and the Earth's surface using satellites. Remote sensing can be performed using either passive or

active sensors. Passive remote sensors use natural radiation reflected by the object under observation. Film photography, infrared sensors, charge-coupled devices and radiometers are examples of passive remote sensors. Active remote sensors make use of a radiation source and observe objects using the scattered or reflected radiation. Examples of active remote sensors include RADAR and LiDAR. It is easy to collect data from inaccessible and dangerous places using remote sensing. Weather monitoring, deforestation monitoring, glacial activity monitoring, volcano monitoring, and flood and other disaster monitoring are some examples of remote sensing applications. Each data point collected by remote sensors is typically anywhere between 10 MB and 100 MB. The resolution of a remote sensor varies from 1 m to 1000 m per pixel depending on the sensor. Remote sensing data is immutable; it does not change after acquisition. With its given resources, a single CubeSat can take about 10 hours to process a remote sensing image and 2 days to downlink the same [5]. Execution of remote sensing missions using CubeSats can be sped up by parallelizing the processing and downlinking of the remote sensing images using a cluster of CubeSats.

2.2 Evolution of CubeSat Networks

Since the launch of the first CubeSat into space in 2003, CubeSat communication networks have evolved in several ways. Very early CubeSats communicated with their home ground station only, as shown in Figure 2-1. These networks can be classified as generation 1 CubeSat networks. A typical CubeSat in a 600 - 800 km orbit has a communication window of about 8 minutes and comes into contact with its ground station about

4 times a day. This limited the communication window to about 25 minutes per day. The first generation CubeSats operated at speeds of 1.2 kbps. This limited the total downlink capacity to about 1.8 MB. However, no CubeSat achieved even this downlink or uplink capacity, due to various limitations including inefficient protocols, large amounts of beacon data, power constraints and unreliable on-board communication systems. As a

result, most missions collected a modest amount of data, a few MB (<12 MB), over their whole lifetime [10].

With the introduction of MoreDBs (Massive operations, recording, and experimentation

Database System) [11], CubeSat networks made the next significant step in their evolution. MoreDBs is a system to manage all data generated by Cal Poly small satellites. It is an attempt to consolidate all satellite information into a single, readily

accessible location to make data analysis more efficient. Using networks like MoreDBs, mission controllers can collect the beacons from their small satellites received by other

amateur radio operators, as shown in Figure 2-1. This significantly increased the amount of small satellite health information available and also served to track the

whereabouts of the small satellites. These efforts brought CubeSat networks into generation 2. However, MoreDBs has the following two significant limitations. First, the MoreDBs architecture

requires mission specific software to be developed and distributed to the ground station facilities. Second, any modification to the packet format requires an upgrade of software

at all the ground stations. This is cumbersome and error prone. As a result of these limitations, the solution is not scalable to a large number of

CubeSats. In order to overcome these limitations, the Space Systems Group (SSG) [12] and the Wireless and Mobile Lab [13] at the University of Florida developed an architecture for spacecraft telemetry collection (T-C3) [14], a scalable and flexible

means of collecting telemetry data. T-C3 is an effort to improve MoreDBs and make it a universal telemetry decoding solution for CubeSats. Instead of decoding the

received beacon at the amateur radio station directly, T-C3 forwards the beacon to the T-C3 central server which fingerprints the beacon and decodes the beacon using that

satellite's telemetry format, making it a much more scalable and flexible solution [14].

GENSO (Global Educational Network for Satellite Operations) [15] is the next significant milestone in the evolution of CubeSat networks. GENSO was founded to

Figure 2-1. Generation 1 (a) and Generation 2 (b) CubeSat Networks

Image courtesy of Space Systems Group. Picture by Tzu Yu (Jimmy) Lin.

create a network of amateur radio stations around the world to support the small satellite operations of various universities. GENSO has been designed as a distributed system connected via the Internet, as shown in Figure 2-2. A satellite can communicate with its main base station through any arbitrarily available relay station. With a single ground station, a university can gather about 25 minutes of data from a CubeSat in a day. Using the GENSO network, mission controllers can gather hours' worth of data per day by receiving data via hundreds of networked radio stations around the world. It also allows them to command their spacecraft from other ground stations.

GENSO and other similar efforts can be classified as generation 3 CubeSat networks. GENSO plans to have a built-in database of all the satellites. This database can be used to predict and automate the tracking of the satellites to collect the telemetry in an efficient way. Once the data is downlinked, the data will be provided to respective mission controllers.

Figure 2-2. Architecture of GENSO

2.2.1 Summary and Limitations of CubeSat Communications

Table 2-1 summarizes the data speeds and data downloads of CubeSat communication systems. Typical characteristics of a CubeSat communication subsystem can be summarized as follows: the data rate is 9600 baud, the power rating is 500 mW with an efficiency of about 25%, and a total download of only about 12 MB has been achieved so far, across 13 satellites over a period of 5 years. As one can see, communication is the primary bottleneck for emerging remote sensing missions. In order to improve the downlink speed, we developed CubeSat Torrent, a distributed communications framework for

CubeSat clusters. This dissertation envisions the next generation of CubeSat networks: the distributed satellite system.

Table 2-1. CubeSat data speeds and downloads
Parameter       Min       Max        Average
Speed           1200 bps  38.4 kbps  9600 bps
Power           350 mW    1500 mW    500 mW
Frequency       433 MHz   900 MHz    NA
Total download  320 KB    6.77 MB    0.5 - 5 MB

2.3 Distributed Satellite Systems

A distributed system is a collection of independent components that work together to perform a desired task and appears to the end user as a single coherent system. Examples of distributed systems include the World Wide Web (WWW), clusters,

networks of workstations, embedded systems, the Cell processor, etc. These distributed systems are fueled by the availability of powerful and low cost microprocessors and high speed communication technologies like the Local Area Network (LAN). As the price to performance ratio of microprocessors drops and the speed of communication networks increases, distributed computing systems achieve a much better price-performance ratio than a single large centralized system. As more and more CubeSats are launched, it is becoming apparent that some space research needs may be better met by a group of small satellites, rather than by a single large satellite. This is akin to the paradigm shift that happened in the computer industry a few decades ago: a shift of focus from large, expensive mainframes to smaller, cheaper, more adaptable sets of distributed computers for solving challenging problems [16].

Distributed satellite systems have their own advantages and challenges. Due to advances in modern VLSI technology that produce integrated circuits with lower power consumption and smaller size, and due to subsystems like the RelNav Software Defined Radio [17] that have enabled high speed satellite communication, distributed satellite systems

25 potentially have a much better price to performance ratio than a single large monolithic satellite. Applications like weather monitoring and tracking are inherently distributed in nature and may be better served by a distributed system than a centralized system.

Monolithic satellite architecture requires that each satellite must have all of the sensing, processing, storage and communication peripherals on board. Distributed satellite systems can share resources like sensing, memory, processing and communications, as well as information. The multiplicity of sensors, storage devices, processors and communication devices means there is no single point of failure. Critical information can be duplicated allowing the system to continue to work even if some components fail. Similarly, distributed satellite systems may have better availability than centralized satellite systems, due to the ability of the system to work at reduced capacity when components fail. Finally, distributed satellite systems enjoy the advantage of incremental growth.

The functionality of a distributed satellite system can be gradually increased by adding more satellites as and when the need arises. Distributed small satellite systems rely on what is called horizontal scaling, where one employs more satellites to serve an increased need.

On the other hand, distributed satellite systems are more complex and difficult to build than monolithic satellite systems. Several challenges such as orbit planning, resource management, communication and data management, and security must be addressed [16]. There is very little or no support for distributed data storage, processing or communications for distributed satellite systems. Distributed satellite systems need a fast and low power backbone network for data and control information exchange. The backbone network must be reliable and should prevent problems such as message loss, overloading and saturation. Distributed systems store data at several places, which provides more access points to critical information; as a result, additional security measures need to be taken to safeguard data and systems. Finally, finding problems

in distributed satellite systems and troubleshooting them requires detailed analysis of each satellite and of the communication between them.

2.4 Classification of Distributed Satellite Systems

Constellation, Formation Flying and Swarm / Cluster are three main types of distributed satellite systems [16]. A group of satellites in similar orbits with coordinated ground coverage complementing each other is called a Constellation. Satellites in a constellation do not have on-board control of their relative positions and are controlled separately from ground control stations. Iridium and Teledesic are well known examples of satellite constellations. A group of satellites with coordinated motion control, based on their relative positions, to preserve the topology is called a Flying Formation. Position of a satellite in a flying formation is controlled by onboard closed-loop mechanism.

Satellites in a flying formation work together to perform the function of a single, large, virtual instrument. TICS, F6 and Orbital Express are well known examples of flying formations. A group of satellites, without fixed absolute or relative positions, working together to achieve a joint goal is called a Cluster or Swarm. More about satellite clusters is presented in Chapter 3.

2.5 Related Work

2.5.1 Distributed Storage Systems

Below we present an overview of related work done in the fields of distributed storage, processing and communications. We surveyed some well known distributed file systems such as the Google File System (GFS) [18], the Hadoop Distributed File System

(HDFS) [19], and others [20] [21]. Owing to their simplicity, fault tolerant design and scalability, the architectures of GFS and HDFS suit distributed storage on CubeSat clusters well. Below we present an overview of these two distributed storage systems, GFS and HDFS. The Google File System is the major storage engine for large scale data at Google. A brief summary of GFS is as follows. The architecture of the Google File System is shown in

Figure 2-3. GFS consists of two components: a master and one or more chunk servers.

27 GFS functions similarly to a standard POSIX file library but is not POSIX compatible.

Each file is split into fixed size blocks called chunks. Each chunk is 64 MB in size by default. Chunks are stored on chunk servers. Metadata such as the constituent chunks of a file, the file to chunk mapping and the chunk to chunk server mapping is stored on the master. Chunk servers store the actual data in the form of chunks. When clients interact with the Google File System, the large share of communication is between the clients and the chunk servers. This avoids the master becoming a bottleneck for transferring large files in and out of the Google File System.

A client machine communicates with the Google File System through the client class library, which translates the open, read, write and delete file system calls into Google File

System calls. The client library communicates with the master for metadata operations and with chunk servers for actual data operations. The interface of the GFS client is very similar to that of a POSIX file system. To work with GFS, no knowledge about distributed systems is required; the GFS client abstracts away all the required distributed systems knowledge. However, some chunk locality information is used for scheduling MapReduce jobs on nodes to improve efficiency. Each of the open, read, write and delete operations is implemented in the following way. The GFS client requests metadata from the master, including the file to chunk mapping and the chunk to chunk server mapping, and then communicates with the chunk servers for the actual data. Once the operation is complete, the metadata on the master is updated to reflect the new state of the file system.
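
The read path just described can be pictured with a toy sketch: the master answers metadata queries only, and the client fetches chunk data directly from chunk servers. The class names and in-memory "chunk servers" below are illustrative assumptions, not GFS's real interfaces.

```python
# Toy sketch of the GFS-style read path described above: the master holds
# only mappings, chunk servers hold the bytes. Names are illustrative.
class ToyMaster:
    def __init__(self):
        self.file_to_chunks = {}     # filename -> [chunk ids]
        self.chunk_locations = {}    # chunk id -> chunk server id

    def lookup(self, filename):
        """Metadata-only answer: which chunks, and where they live."""
        chunks = self.file_to_chunks[filename]
        return [(c, self.chunk_locations[c]) for c in chunks]

def read_file(master, chunk_servers, filename):
    # Client asks the master for metadata once, then pulls data directly
    # from chunk servers, keeping the master off the data path.
    data = b""
    for chunk_id, server_id in master.lookup(filename):
        data += chunk_servers[server_id][chunk_id]
    return data

# Example: one file split across two chunk servers.
master = ToyMaster()
master.file_to_chunks["image.raw"] = ["c1", "c2"]
master.chunk_locations.update({"c1": "cs-a", "c2": "cs-b"})
servers = {"cs-a": {"c1": b"half1-"}, "cs-b": {"c2": b"half2"}}
assert read_file(master, servers, "image.raw") == b"half1-half2"
```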

The Hadoop Distributed File System (HDFS) is an open-source implementation of the Google File System. Hadoop is written in the Java programming language. It can be interfaced with C++, Python, Ruby and many other programming languages using its Thrift [22] interface. It is designed for storing hundreds of gigabytes or even petabytes of data and for fast streaming access to the application data. Similar to GFS, HDFS supports write-once-read-many semantics on files.


Figure 2-3. Architectural overview of the Google File System

HDFS also uses a master/slave architecture. The architecture of the Hadoop Distributed File

System is shown in Figure 2-4. In HDFS, the NameNode plays the role of the GFS master node. It controls the namespace and implements the access control mechanism

for data stored in HDFS. DataNodes take care of managing the local storage hardware. In order to bring the cost of the implementation down, these nodes run an open source operating system, typically GNU/Linux. When a file is copied into HDFS, it is split

into blocks and distributed to DataNodes. Fault tolerance of the stored data is achieved through replication; each block is replicated 3 times by default. HDFS documentation

is available at their website [23]. There are several limitations of GFS and HDFS for storing remote sensing data

on CubeSat clusters. Unlike wired communication channels, wireless communication channels in space are more unreliable. Communication links break often leading to network partitions. GFS and HDFS are not partition tolerant. GFS and HDFS are

not optimized for power consumption, and for CubeSat clusters power is a very scarce resource. Also, the cost of communication is significantly higher for wireless links than for wired links. GFS and HDFS are designed as generic data storage platforms; they are not tailored for storing remote sensing data, and their generic design is too complex for

CubeSat clusters. Using GFS or HDFS causes a lot of overhead in terms of processing, memory, bandwidth and power. We designed the CubeSat Distributed File System to overcome the above mentioned problems and tailored it for storing remote sensing data

on CubeSat clusters by using large chunk sizes and load balancing.

2.5.2 Distributed Computing Techniques

Distributed computing is a form of computing where processing is performed

simultaneously on many nodes. The key principle behind distributed computing is that most of the large problems can be divided into smaller problems, which can be solved concurrently. Cheap computing nodes are connected using high speed backbone

network to form a cluster to execute the smaller problems. We surveyed distributed


Figure 2-4. Architectural overview of the Hadoop Distributed File System

computing techniques such as the Common Object Request Broker Architecture (CORBA),

Web services, Remote Procedure Call (RPC), Remote Method Invocation (RMI) and MapReduce that are used for distributed processing on computing machines. Below we present a brief overview of them. The distributed objects technique involves distributed objects communicating via messages. The Common Object Request Broker Architecture (CORBA), Java Remote

Method Invocation (RMI), IBM Websphere MQ, Apple’s NSProxy, Gnustep, Microsoft’s Distributed Component Object Model (DCOM) and .Net are well known examples of this model. Owing to its platform independence and interoperable nature, CORBA programs can work together regardless of the programming languages used. Java RMI,

IBM Websphere MQ, Apple’s NSProxy, Microsoft’s DCOM and .Net are proprietary technologies. They are not independent of the programming language and are not quite versatile.

Web services are the way in which web based applications operate via the HTTP protocol. Web services use the Simple Object Access Protocol (SOAP) for exchanging structured information between web components; JavaScript Object Notation (JSON) to exchange data between web services in a human readable format;

the Web Services Description Language (WSDL) to provide a machine-readable description of a web service; and Universal Description, Discovery and Integration (UDDI) for describing web services.

Message Passing Interface (MPI), Open Message Passing Interface (OpenMPI), Open Multi-Processing (OpenMP) and Parallel Virtual Machine (PVM) are the prevalent technologies in the message passing category. These technologies are used in massively parallel applications and supercomputing when data needs to be distributed and communicated efficiently. Sockets are a popular option for client-server based architectures, like mail servers and web servers. Their availability on any system equipped with a TCP/IP stack makes them

32 an attractive option. Traffic can easily be re-routed to different ports using secure shell

(SSH), Secure Sockets Layer (SSL) or virtual private network (VPN) connections.

MapReduce is a recent development in the field of distributed computing.

Introduced by Google Inc. in 2004 [18], its design simplicity, fault tolerance and ease of implementation make it an attractive candidate for large scale distributed processing. It is based on the map and reduce primitives of functional languages like Lisp.

MapReduce programs are highly parallelizable and thus can be used for large-scale data processing by employing a large cluster of computing nodes. Companies like Google,

Yahoo! and Facebook use MapReduce to process many terabytes of data on a large cluster containing thousands of cheap computing machines. MapReduce performs large-scale distributed computation while hiding the complications of parallelization, data distribution, synchronization, locking, load balancing and fault tolerance.
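
A minimal, single-process illustration of the map and reduce primitives (the classic word count) is shown below, only to make the programming model concrete; a real MapReduce system runs the same two phases across a cluster and handles all of the distribution and fault tolerance mentioned above.

```python
# In-process illustration of the map and reduce primitives described above
# (classic word count). This sketch only shows the programming model.
from collections import defaultdict

def map_phase(document):
    """Emit (word, 1) pairs, one per occurrence."""
    return [(word, 1) for word in document.split()]

def reduce_phase(pairs):
    """Sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

documents = ["cubesat cluster", "cubesat torrent cubesat"]
intermediate = [pair for doc in documents for pair in map_phase(doc)]
print(reduce_phase(intermediate))  # {'cubesat': 3, 'cluster': 1, 'torrent': 1}
```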

We studied in detail the advantages and disadvantages of the above mentioned distributed computing techniques. They do not account for the salient and unique features of CubeSats and CubeSat clusters, such as power, memory and communication constraints, unreliable wireless communication, the high cost of communication and the need for tight locality optimization of data storage and operations. Owing to its simplicity and fault tolerant design, MapReduce is well suited to large scale distributed processing of remote sensing data on CubeSat clusters. To overcome the above mentioned limitations, we therefore designed CubeSat MapMerge, based on MapReduce and tailored for processing remote sensing data on CubeSat clusters.

CHAPTER 3
NETWORK ARCHITECTURE OF CUBESAT CLOUD

The architecture of the CubeSat network is shown in Figure 3-1. The CubeSat network consists of a space segment and a ground segment. The space segment is a CubeSat cluster.

The architecture of a CubeSat cluster is shown in Figure 3-2. A CubeSat cluster has a radius of about 25 km. It consists of Sensor nodes and Worker nodes inter-connected using high speed communication links. Worker nodes are CubeSats with storage, processing, communication and other standard subsystems. In addition to the standard subsystems, a Sensor CubeSat has a sensing subsystem. Sensor nodes act as the Master of the cluster while orchestrating remote sensing missions. The ground segment is composed of a Server and several ground stations. CubeSat to CubeSat communication links are short distance, reliable, directional, low power and high speed. CubeSat to ground station communication links are long distance, high power, low speed and unreliable.

Each CubeSat is connected to a ground station. Ground stations are connected to the Server via the Internet. Ground stations act as relays between Server and CubeSats.

Figure 3-1. Architecture of a CubeSat network

Figure 3-2. Architecture of a CubeSat cluster

3.1 Components of the CubeSat Network

3.1.1 Space Segment

A Worker CubeSat is a typical 1U CubeSat that has dimensions of 10 cm x 10 cm x 10 cm, a volume of exactly one litre, and a weight of about one kilogram. However, it does not need to be a 1U CubeSat only; it can be 2U, 3U or any other form factor. Worker CubeSats need to have storage, processing and communication subsystems. Other standard subsystems include the satellite bus, electrical power, structural and thermal, and attitude determination and control subsystems. A Worker CubeSat has about a 1 GHz processor, 1 GB of RAM and 32 - 64 GB of memory. The CubeSat to ground station communication speed is about 9.6 kbps. Sensor CubeSats have a sensing module in addition to the above mentioned subsystems. Figure 3-3 shows a blown up view of the ESTCube-I CubeSat and its various

subsystems. The Sensor node is equipped with sensing hardware and performs the sensing operation (taking an image or performing a radar scan). While orchestrating a mission, a Sensor node acts as the Master node of the CubeSat cluster; when not orchestrating a mission, it performs the role of a Worker node. The Master node is the primary center for receiving

Figure 3-3. A blown up picture of ESTCube-I CubeSat, showing its subsystems

Image courtesy of University of Tartu. Picture by Andreas Valdmann.

commands from the server and issuing subcommands to the Worker CubeSats in the cluster. It keeps track of all the metadata related to the mission, including the list of participating nodes, their resource capabilities, the map, merge and downlink jobs, and their status. It also keeps track of all the resources available in the cluster and their state on each node, and it is responsible for scheduling decisions such as which job needs to be scheduled on which node and when. Worker nodes have the limited role of executing the processing and downlinking jobs assigned to them by the Master node.
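
As a rough illustration of this bookkeeping, the sketch below models the Master's job and resource tables with plain dictionaries and a trivial least-loaded scheduling choice. The field names and the scheduling policy are assumptions made for illustration, not the actual CubeSat Cloud data structures.

```python
# Illustrative sketch of the Master-node bookkeeping described above:
# jobs, their assigned workers, and a trivial least-loaded scheduling choice.
# Field names and the policy are assumptions, not the dissertation's design.
class MasterState:
    def __init__(self, worker_ids):
        self.workers = {w: {"online": True, "assigned": 0} for w in worker_ids}
        self.jobs = {}  # job id -> {"type": ..., "worker": ..., "status": ...}

    def schedule(self, job_id, job_type):
        """Assign the job to the least-loaded online worker."""
        candidates = [w for w, s in self.workers.items() if s["online"]]
        worker = min(candidates, key=lambda w: self.workers[w]["assigned"])
        self.workers[worker]["assigned"] += 1
        self.jobs[job_id] = {"type": job_type, "worker": worker,
                             "status": "RUNNING"}
        return worker

master = MasterState(["w1", "w2", "w3"])
print(master.schedule("map-001", "map"))            # w1
print(master.schedule("downlink-001", "downlink"))  # w2
```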

3.1.2 Ground Segment

A ground station, or amateur radio station, is an installation that enables communication with a CubeSat. Figure 3-4 shows ground station control equipment and Figure

3-5 shows the highly directional Yagi antenna used for communicating with satellites. A ground station contains high gain directional antennas, such as Yagi or parabolic dish antennas, communication equipment such as modems, and computers to send, capture and analyse the received data.

There are several types of amateur radio stations, including fixed ground stations, mobile stations, space stations, and temporary field stations. Most radio stations are established for educational and recreational purposes, providing technical expertise, skills and volunteer staffing, and promoting public participation in and education about communications.

The ground station server is a dedicated computer system that is connected to the ground stations through the Internet. It receives commands from the Administrator and uplinks the commands to the CubeSats. Once the mission is executed, the resulting data is downlinked to the Server and stored on its local storage disk. The Server acts as the command center for

the CubeSat network. The Administrator issues commands to the Server, which then forwards them to the Master CubeSat through a ground station. The Server stores all the downlinked mission data and thus acts as the storage node for data downlinked from the CubeSat cluster. Ground stations act as relays between the CubeSats and the Server.

Figure 3-4. Ground station

Image courtesy of Gator Amateur Radio Club. Image by Tzu Yu (Jimmy) Lin.

They downlink data from the CubeSats and send it to the Server, and they uplink commands and data from the Server to Worker CubeSats, which forward them to the Master CubeSat.

3.2 System Communication

3.2.1 Cluster Communication

CubeSats are connected to each other through a high speed (>1 Mbps), low power backbone network. High gain directed antennas, such as patch antennas, or lasers [24] are used for inter-cluster communication. Vescent Photonics [25] is

developing extremely small and low power optical communication modules for CubeSats. There has been research on using tethers for short distance, high speed communication between satellites. RelNav demonstrated a spacecraft subsystem called the

Figure 3-5. Ground station antenna

Image courtesy of Gator Amateur Radio Club. Image by Tzu Yu (Jimmy) Lin.

SWIFT Software Defined Radio (SDR) [26] [17] that will enable a flock of satellites. The SWIFT SDR subsystem demonstrated by RelNav provides the following services:

• 1 Mbps inter-satellite communication link for data exchange between CubeSats.

• Relative position and orientation for formation flight.

• Cluster synchronization and timing for coordinated operations and coherent sensing.

3.2.2 Space Segment to Ground Segment Communication

CubeSat geometry prohibits the use of complex antennas [27]. As a result,

CubeSats are connected to ground stations through simple antennas like monopoles or dipoles. Coupled with stringent power constraints and distances on the order of 600 - 800 km,

this results in low speed links between CubeSats and ground stations. The typical CubeSat to ground station speed is about 9.6 kbps [10].

39 3.2.3 Ground Segment Network Communication

The ground stations and the Server are connected via the Internet, which provides a high

speed (10 Mbps) and reliable wired communication medium between the Server and ground stations. Power is not a constraint for the Server and ground stations as they are

connected to the electrical grid.

3.3 CubeSat Cloud

We propose CubeSat Cloud, a framework for distributed storage, processing and communication of remote sensing data on CubeSat Clusters. CubeSat Cloud uses

CubeSat Distributed File System for distributed storage of remote sensing data on CubeSat Clusters. CubeSat MapMerge is the distributed processing framework used

for processing remote sensing data stored in CDFS. CubeSat Torrent is the distributed communications framework used for downlinking raw or partially processed remote sensing data from CubeSat Clusters. Below we describe how remote sensing missions

are executed using the CubeSat Cloud framework.

3.3.1 Storage, Processing and Communication of Remote Sensing Data on CubeSat Clusters

CubeSat Cloud, as a generic framework, can be used for storing, processing

and downlinking of remote sensing data from CubeSat clusters. Once a remote sensing operation is performed, the obtained sensor data is stored on the CubeSat cluster using the

CubeSat Distributed File System. After storing the data on the cluster, it is processed using CubeSat MapMerge and obtained results are downlinked using CubeSat Torrent. Figure 3-6 shows the overview of CubeSat Cloud framework consisting of CubeSat

Distributed File System, CubeSat MapMerge and CubeSat Torrent. Below is a detailed description of how a remote sensing mission is executed using CubeSat Cloud.

1. The Server sends the SENSE and STORE command to the Master. Upon receiving the SENSE and STORE command, the Master performs the remote sensing operation and stores the sensor data on its local file system. The Server and the Master do not need to have a direct communication link; the command is relayed through the ground station network and the space segment.

2. The Master node then splits the file into chunks C1, C2, C3, ... Cn. The size of each chunk is about 64 Kb. The Master node distributes the chunks to the Worker nodes. The file to chunk mapping, chunk to Worker mapping and other metadata are stored on the Master. Splitting the remote sensing data into chunks, distributing them and storing them on Worker nodes is achieved using the CubeSat Distributed File System. Distributing the data across the Worker nodes in the cluster allows the possibility of processing and downlinking the data in a distributed fashion.

3. The Server sends the PROCESS command to the Master. The Master CubeSat commands the Worker CubeSats to process the stored chunks to produce partial results. The obtained partial results are stored on the local file systems of the Worker nodes.

4. Server sends the DOWNLINK command to the Master, which then commands the worker nodes in the cluster to downlink the processed chunks to ground stations. Downlinking the processed chunks to the Server is achieved through CubeSat Torrent.

5. Once the Server receives all the processed chunks, it stitches them into the full solution. The processing of chunks to produce partial results on the Worker nodes and the stitching of partial results into the complete solution on the Server together constitute CubeSat MapMerge.

3.3.2 Source Coding, Storing and Downlinking of Remote Sensing Data on CubeSat Clusters

A large number of missions require only downlinking of the remote sensing data without processing it. For these missions, Worker nodes do not require access to the raw data. This can be used as an opportunity to optimize CubeSat Torrent missions by improving the quality of service and reducing the storage overhead. Below we describe in detail how we utilize source coding to improve the quality of service and reduce the storage overhead. Figure 3-7 shows an overview of how a downlink-only remote sensing mission is executed using CubeSat Cloud; the process is described below in detail.

1. The Server sends the SENSE, CODE and STORE command to the Master.

2. Upon receiving the SENSE, CODE and STORE command from the Server, Master performs remote sensing operation and stores data from the sensor on the local file system.


Figure 3-6. Overview of CubeSat Cloud and its component frameworks

3. The Master node splits the remote sensing data file into chunks C1, C2, C3, ... Cn. The size of each chunk is about 64 Kb. Then, based on the required redundancy, it creates coded chunks C1', C2', C3', ... Cm', where m > n.

4. Master distributes coded chunks C1’, C2’, C3’ ...Cm to the Worker nodes. Master stores metadata, which includes file to chunk mapping, chunk to Worker mapping and chunk status. Splitting the remote sensing data into chunks, performing coding, distributing them and storing them on Worker nodes is performed by CubeSat Distributed File System.

5. The Server then sends the DOWNLINK command to the Master, which then commands the Worker nodes in the cluster to downlink the coded chunks to the ground stations. Downlinking the coded chunks to the Server is performed by CubeSat Torrent.

6. Once the Server receives n out of the m coded chunks, it stitches them into the original file. As long as any n out of the m chunks are available, the original data can still be recovered, as sketched below. Details of the performance analysis of source coding are discussed in Chapter 7.
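
The sketch below illustrates only the bookkeeping behind step 6: the Server waits until any n distinct coded chunks out of the m have arrived before attempting reconstruction. The decoding itself is left as a placeholder, since the specific linear block code is not reproduced here; the chunk names and the (n, m) values are illustrative.

```python
# Sketch of the server-side rule in step 6: with source coding, any n of the
# m coded chunks are enough to rebuild the file. The decode step is a
# placeholder; the actual linear block code is not reproduced here.
def collect_until_recoverable(received_chunks, n):
    """Return True once n distinct coded chunks have arrived."""
    return len(set(received_chunks)) >= n

n, m = 10, 12                       # illustrative: 2 redundant coded chunks
arrived = ["c3", "c7", "c1", "c7"]  # a duplicate of c7 does not help
print(collect_until_recoverable(arrived, n))  # False: only 3 distinct chunks

arrived += [f"c{i}" for i in range(10, 17)]   # 7 more distinct chunks arrive
print(collect_until_recoverable(arrived, n))  # True: decoding can start
```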


Figure 3-7. CubeSat Cloud: Integration of CubeSat Distributed File System and CubeSat Torrent

CHAPTER 4
DISTRIBUTED STORAGE OF REMOTE SENSING IMAGES ON CUBESAT CLUSTERS

The CubeSat Distributed File System (CDFS) is built for storing large remote sensing files on small satellite clusters in a distributed fashion. While satisfying the goals of scalability, reliability and performance, CDFS is designed for CubeSat clusters, which use a wireless backbone network, are partition prone and have severe power and bandwidth constraints. CDFS meets the scalability, performance and reliability goals while adhering to the constraints posed by the harsh environment and limited resources. It is used as the storage layer for distributed processing and distributed communication on CubeSat clusters. In this chapter, we present the architecture, the file system design and several optimizations.

4.1 Key Design Points

4.1.1 Need for Simple Design

A typical CubeSat has about 1 GHz of processing capability, 1 GB of RAM, 32 GB of flash storage, a 1 Mbps inter-cluster communication speed, a 9.6 kbps CubeSat to ground station data rate and a 2 W power generation capability [5]. For CubeSats, processing, bandwidth and battery power are scarce resources, so the system design needs to be simple.

4.1.2 Low Bandwidth Operation

The CubeSat network is built using long distance wireless links (about 10 km for inter-cluster links and 600 km for CubeSat to ground station links). As a result, the cost of communication is very high, and data and control traffic need to be reduced as much as possible.

4.1.3 Network Partition Tolerance

The backbone communication medium is wireless and the space environment is harsh. The high velocity of satellites in LEO (relative to ground stations) makes satellite to ground station link failures very common. The topology of a CubeSat cluster is also very dynamic, causing inter-satellite links to break frequently. Sometimes nodes go into sleep mode to conserve power. All of the above factors can cause frequent breaking of communication links. As a result, if a node is temporarily unreachable, the system should not treat it as a node failure; it should be tolerant to temporary network failures and partitions.

4.1.4 Autonomous

Most of the time, individual CubeSats and the whole CubeSat cluster are inaccessible to human operators, so the software design should handle all failure scenarios. A reset mechanism, at both the node and the network level, should be provided; if all the fault tolerance mechanisms fail, the system undergoes a reset and starts working again. In short, the distributed file system should be able to operate completely autonomously, without human intervention.

4.1.5 Data Integrity

Memory failures are fatal for satellite missions. Even though memory chips for satellites are radiation hardened, high energy cosmic rays can sometimes cause trouble. For example, the Mars rover Curiosity suffered a significant setback because of damage to the memory of its primary computer caused by a high-energy particle. Hence, data integrity must not be violated.

4.2 Shared Goals Between CDFS, GFS and HDFS

In addition to the design points above, CDFS shares several design points with GFS and HDFS, which are highlighted below.

4.2.1 Component Failures are the Norm

Given a large number of CubeSats and communication links, failures are the norm rather than the exception. Therefore, constant monitoring, error detection, fault tolerance, and automatic recovery must be integral to the system.

4.2.2 Small Number of Large Files

Files are huge by traditional standards. Images and remote sensing data generated by satellites tend to be on the order of hundreds of megabytes.

4.2.3 Immutable Files and Non-existent Random Read Writes

Random writes within a file are practically non-existent. Once written, the files are only read, and often only sequentially. This kind of access pattern is common for imaging and remote sensing missions, and for programs like MapReduce that process this data and generate new data. CDFS shares the goals of availability, performance, scalability and reliability with GFS and HDFS. However, owing to its radically different operating environment, the design points and constraints are very different for CDFS. GFS and HDFS were designed for non-power-constrained clusters of computers connected using high speed wired media.

CDFS is meant for distributed data storage on CubeSat clusters, which use a wireless communication medium for exchanging data and have severe power and bandwidth constraints. The design of CDFS should be simple, operate with very low bandwidth consumption and operate autonomously, without the need for human intervention. It should be tolerant to network partitions, temporary link failures and node failures, and it should preserve the integrity of the data stored.

4.3 Architecture of CubeSat Distributed File System

Figure 4-1 shows the architecture of CDFS. A CDFS cluster consists of Sensor CubeSats and Worker CubeSats. Sensor nodes are equipped with a sensing module and thus perform sensing. While orchestrating a mission, a Sensor node plays the role of the Master (M) node. Worker nodes aid the Master node in processing or downlinking large files. Here is how CDFS stores a file on the cluster. The Administrator issues a remote sensing command to the central server (as shown in Figure 4-1). The central server transmits the command to a relay ground station, which uplinks it to the Master CubeSat. Upon receiving the command, the Master CubeSat performs sensing, like taking an image or doing a radar scan. The sensing operation generates a large amount of data (about 100 MB), which is stored into a local file on the Master node.


Figure 4-1. Architecture of CubeSat Distributed File System

The Master node (M) splits this file into blocks called chunks and stores them on Worker CubeSats. Each chunk is identified using a unique chunk id. For reliability, each chunk is replicated on multiple workers. By default, CDFS creates two replicas (a primary replica and a secondary replica), along with an implicit replica stored on the Master node, so in effect there are three replicas. Along with the implicit replicas, the Master CubeSat holds all metadata for the file system. Metadata includes the mapping from files to chunks, the locations of these chunks on the various workers, the namespace and access control information. The workers store the actual data; worker nodes store chunks as regular files on local flash memory. As shown in the figure, the cluster is organized as a tree with the Master node as the root.

4.3.1 File System Namespace

CDFS supports a traditional hierarchical file organization in which a user or an application can create directories and store files inside them. The file system namespace hierarchy is similar to that of Linux file systems [28]. The root directory is "/" and is empty by default. One can create, rename, relocate, and remove files. CDFS supports the concept of hidden files and directories in a way similar to that of Linux file systems. Hidden files or directories start with "." (period) and contain metadata, error detection and correction information, configuration information and other miscellaneous information required by CDFS. These hidden files are stored as regular files rather than distributed files, since they are very small and are used by the system locally. One can refer to the files stored on a server using the notation "://server/filepath", where filepath looks like "/directory1/directory2/ . . . /filename".

4.3.2 Heartbeats

Several problems can cause loss of data or of connectivity between the Master and Worker nodes. Problems are diagnosed using Heartbeat messages. Once every 10 minutes, worker nodes send a Heartbeat message to the Master node. The Heartbeat message contains the worker's current status and problems, if any. The Master periodically examines the received Heartbeat messages to detect problems and rectify them if possible. If a worker does not send a Heartbeat message within 10 minutes, the Master marks the node as a temporary failure. If there is no Heartbeat message from the worker within 30 minutes, the Master marks the worker as a permanent failure. When a node is marked as a temporary failure, the data chunks assigned to the worker are not replicated on other nodes; instead, the secondary replicas are marked as primary. After a permanent failure, the Master node marks the node as dead. When a node contacts the Master after recovering from a permanent failure, the Master refreshes its metadata to reflect the change.
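A minimal sketch of the Master-side bookkeeping implied by these timeouts is shown below. The class and attribute names are hypothetical; the 10 and 30 minute thresholds are taken from the description above.

# Hypothetical sketch of Heartbeat bookkeeping on the Master; names are illustrative.
import time

TEMP_FAILURE_S = 10 * 60      # no Heartbeat for 10 minutes -> temporary failure
PERM_FAILURE_S = 30 * 60      # no Heartbeat for 30 minutes -> permanent failure

class HeartbeatMonitor:
    def __init__(self):
        self.last_seen = {}       # worker id -> time of last Heartbeat
        self.status = {}          # worker id -> "alive" | "temporary" | "dead"

    def on_heartbeat(self, worker_id):
        self.last_seen[worker_id] = time.time()
        self.status[worker_id] = "alive"

    def check(self):
        now = time.time()
        for worker_id, seen in self.last_seen.items():
            silent = now - seen
            if silent > PERM_FAILURE_S:
                self.status[worker_id] = "dead"       # permanent failure
            elif silent > TEMP_FAILURE_S:
                self.status[worker_id] = "temporary"  # promote secondary replicas
        return self.status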

4.4 File Operations

CDFS has a simple interface. CDFS supports the file operations create, write, read and delete. In the following subsections we describe in detail what happens when each of these operations is performed.

4.4.1 Create a File

Once the Master performs a remote sensing operation (takes an image or does a radar scan), it generates a large amount of sensor data. Initially this data is stored into a local file on the Master node; the typical size of this file is about 100 MB. The Master stores this file on the CubeSat cluster using CDFS to perform distributed processing or distributed downlinking. The following actions are performed in sequence when a CDFS file is created by the Master node. The create operation takes the filename and, optionally, the chunk size as parameters; by default, the chunk size is 64 KB. A minimal sketch of this procedure follows the list of steps.

1. Master calculates the number of chunks based on the file size and chunk size. (Number of chunks = file size / chunk size).

2. Master generates chunk identifiers called chunk ids and assigns one to each chunk. Chunk id is an immutable id.

3. Master assigns the chunks to worker nodes. Each chunk is assigned to one worker node in a round robin fashion. A copy of the chunk stored at the selected node is called primary replica of the chunk.

4. Master stores the above metadata (filename, number of chunks, chunk to chunk id mapping and chunk id to worker node mapping) in its permanent storage and communicates the same to the backup Master.
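The sketch below illustrates steps 1-4 under the parameters described above (64 KB chunks, round-robin assignment); the function and field names are hypothetical.

# Hypothetical sketch of CDFS file creation (steps 1-4 above); names are illustrative.
import math
import uuid

def create_file(filename, file_size, workers, chunk_size=64 * 1024):
    # 1. Number of chunks from the file size and chunk size.
    num_chunks = math.ceil(file_size / chunk_size)

    # 2. Immutable chunk ids, one per chunk.
    chunk_ids = [uuid.uuid4().hex for _ in range(num_chunks)]

    # 3. Round-robin assignment of chunks to worker nodes (primary replicas).
    assignment = {cid: workers[i % len(workers)] for i, cid in enumerate(chunk_ids)}

    # 4. Metadata to be stored on the Master and sent to the backup Master.
    return {
        "filename": filename,
        "num_chunks": num_chunks,
        "chunk_ids": chunk_ids,
        "chunk_to_worker": assignment,
    }

# Example: a 100 MB file on a 10-worker cluster.
meta = create_file("image.raw", 100 * 1024 * 1024, ["worker%d" % i for i in range(10)])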

4.4.2 Writing to a File

The write operation is performed by the Master node when it wants to copy a local file (on the Master) to CDFS. Files in CDFS are immutable: they can be written only once, after they are created. The inputs for writing a file are the source filename on the Master and the destination filename on CDFS. The following actions happen in sequence when the Master writes a local file to a CDFS file.

1. For each chunk Master performs the actions described in steps 2, 3, 4, and 5.

2. Master looks up the metadata of the destination file on CDFS to find out the worker node responsible for storing the chunk.

3. Master determines the transmission path (from the Master to the worker node) using a tree-based routing algorithm.

4. From the nodes on the transmission path (excluding the Master and the destination worker node), Master randomly picks a node to be the secondary replica of the chunk and notifies it.

5. Master transmits the chunk to the primary replica node. While the chunk is being transmitted to the primary replica node, the secondary replica node copies the chunk and stores it in its local storage.

6. After storing all the chunks on the cluster, Master commits the metadata to its memory and communicates the same to the Server.

4.4.3 Deleting a File

The following actions are performed in sequence when a file is deleted.

1. Administrator issues delete file command to the Server.

2. Server uplinks the command to the Master CubeSat through a relay ground station.

3. Master node looks up the metadata for the file and sends the delete chunk command to all the primary and secondary replica nodes.

4. Once a worker deletes the chunks, it sends an ACK to the Master.

5. Once the ACKs are received from all worker CubeSats, Master deletes the metadata for the file.

6. The Master CubeSat sends a SUCCESS message to the Server through a relay ground station.

4.5 Enhancements and Optimizations

CDFS serves well as distributed data storage on CubeSat clusters. However, CubeSats have stringent energy constraints and CubeSat clusters have severe bandwidth constraints, so there is a dire need to reduce energy and bandwidth consumption. Below we describe the methods we employ for reducing energy and bandwidth consumption.

4.5.1 Bandwidth and Energy Efficient Replication

To ensure the reliability of the stored data, CDFS uses redundancy. Each chunk has three replicas stored on three different nodes, called replica nodes. But creating replicas consumes both energy and bandwidth, and for a CubeSat cluster both are precious. In order to reduce energy and bandwidth consumption, the Master node (the source node) can be used as the Super Replica Node (a node which stores replicas of all chunks). Since the Master node performs sensing and has all the data initially, the implicit replicas on the Master node are created without any energy or bandwidth consumption. Using the Master node as a Super Replica Node essentially means that CDFS needs to create only two additional replicas. It also means that the Master node must be equipped with enough storage to hold all chunks, but this is a small cost compared to the energy and bandwidth saved. The data from the source node is accessed only if the other two replicas are not available, in order to conserve the power of the source node. For the two additional replicas, any random selection of worker nodes will achieve reliability. But if the replica nodes are carefully selected, energy and bandwidth consumption can be significantly reduced. Consider the two scenarios A and B depicted in Figure 4-2. In scenario A, the chunk is replicated on nodes M (the Master node), A and B2. In scenario B, the chunk is replicated on nodes M, B and B1. The cost of communication (bandwidth and energy consumed) in the first scenario is 3 times the average link communication cost (from M to A and from M to B to B2). In the second case, the cost is only 2 times the average link communication cost (from M to B to B1). Storing a chunk on nodes that are on the same communication path, or on nodes located close to each other, yields the best energy and bandwidth efficiency. Exploiting this observation, we designed a novel method for providing reliability with low power and bandwidth consumption. This technique is called Copy-on-transmit.

Figure 4-2. Bandwidth and energy efficient replication

When the source node transmits the data to a destination, it goes through multiple hops. Selected nodes on the communication path copy the data while it is being transmitted. This method is very convenient for replicating data in wireless networks without incurring additional energy or bandwidth consumption. Consider the scenarios shown in Figure 4-3. In all cases the source node M transmits data to the destination node Z. Below we describe how we replicate data using copy-on-transmit for different communication path lengths, for a replication factor of 3 (1 implicit replica and 2 explicit replicas).

4.5.1.1 Number of nodes on communication path = replication factor

In this case, we replicate the chunk on all nodes along the path, including the source and destination. When the chunk is being transmitted from Node M to Node Z through Node A, Node A makes a copy of the chunk and stores it in its memory. Now the chunk has three replicas, one each at M, A and Z.

4.5.1.2 Number of nodes on communication path > replication factor

In this case, we replicate the chunk on the Master node M, the destination node Z and a random node on the path. When the chunk is being transmitted from Node M to Node Z through Nodes A, B, C, D, and E, Node C makes a copy of the chunk and stores it in its memory. Now the chunk has three replicas, one each at M, C and Z.

4.5.1.3 Number of nodes on communication path < replication factor

In this case, we replicate the data on all nodes on the communication path (Node M and Node Z) and on some additional nodes. This scenario has two sub-scenarios: (a) the destination node is not a leaf node (has children) and (b) the destination node is a leaf node (no children). These two sub-scenarios are discussed as Case 3(a) and Case 3(b) below.

Case 3(a) Destination node is not a leaf node: In this case, we first replicate the data on all nodes along the path (Node M and Node Z). In order to meet the replication requirement, the communication path is extended beyond the destination node Z to store the data on Node A. This ensures that there is the required number of replicas.

Case 3(b) Destination node is a leaf node: In this case, we first replicate the data on all nodes along the path (Node M and Node Z). In order to meet the replication requirement, one more replica of the chunk needs to be created. Master randomly selects another node A and stores the chunk on it, ensuring that there is the required number of replicas. A sketch of this placement logic follows.
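A minimal sketch of the replica placement decision for the three cases above is shown below; the helper names and the way the routing path and children are represented are assumptions for illustration only.

# Hypothetical sketch of copy-on-transmit replica placement; names are illustrative.
import random

REPLICATION_FACTOR = 3

def choose_replica_nodes(path, children_of_destination, all_workers):
    """path is the routing path [M, ..., Z] from the Master to the destination."""
    if len(path) == REPLICATION_FACTOR:
        # Case 1: replicate on every node along the path.
        return list(path)
    if len(path) > REPLICATION_FACTOR:
        # Case 2: Master, destination, and one random intermediate node on the path.
        intermediate = random.choice(path[1:-1])
        return [path[0], intermediate, path[-1]]
    # Case 3: the path is too short; extend beyond Z (3a) or pick a random extra node (3b).
    replicas = list(path)
    if children_of_destination:                      # Case 3(a)
        replicas.append(random.choice(children_of_destination))
    else:                                            # Case 3(b)
        extra = [w for w in all_workers if w not in replicas]
        replicas.append(random.choice(extra))
    return replicas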

Figure 4-3. Copy on transmit

4.5.2 Load Balancing

The goal of load balancing is to distribute data to the nodes in the cluster in order to balance one or several criteria such as storage, processing, communication and power consumption. When a file is created, the number of chunks assigned to a worker node is proportional to the value of LBF(node), where LBF is the load balancing function. Below we explain how we determine the load balancing function for uniform storage, proportional storage and several other criteria. A custom load balancing function can be used to perform load balancing according to the user's wishes. However, it should be noted that distributing data for uniform storage might result in uneven load balancing for processing or communication, and vice versa. In the following, N is the total number of worker nodes in the cluster and LBF is the load balancing function. The following load balancing functions are available in CDFS:

• Uniform storage / processing / communications per node: LBF(node) = 1 / N

• In proportion to storage capacity of node: LBF(node) = Storage capacity of the node / Total storage capacity of the Cluster

• In proportion to processing capacity of node: LBF(node) = Processing power of node / Total processing power of the Cluster

• In proportion to communication capacity of node: LBF(node) = Communication speed of node / Total communication speed of the cluster.

• In proportion to power generation capacity of node: LBF(node) = Power generation capability of node / Total power generation capability of the cluster

• Hybrid: LBF(node) = a * LBF(node) for storage + b * LBF(node) for processing + c * LBF(node) for communication + d * LBF(node) for power, where a, b, c and d are normalized proportion coefficients and a + b + c + d = 1.

For missions that are processing intensive, it is desirable that the number of chunks stored on a node is proportional to the node's processing power. For communication intensive missions, it is desirable that the number of chunks stored on a node is proportional to the communication capabilities of the node. For missions that are both processing and communication intensive, the hybrid function can be used. Additionally, in order not to overload nodes, a cap on the number of chunks stored per node per file is suggested. A sketch of these functions is given below.
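The minimal sketch below expresses the load balancing functions listed above; the node attribute names are hypothetical.

# Hypothetical sketch of the CDFS load balancing functions; attribute names are illustrative.

def lbf_uniform(node, cluster):
    return 1.0 / len(cluster)

def lbf_storage(node, cluster):
    return node["storage"] / sum(n["storage"] for n in cluster)

def lbf_processing(node, cluster):
    return node["cpu"] / sum(n["cpu"] for n in cluster)

def lbf_communication(node, cluster):
    return node["link_speed"] / sum(n["link_speed"] for n in cluster)

def lbf_power(node, cluster):
    return node["power"] / sum(n["power"] for n in cluster)

def lbf_hybrid(node, cluster, a=0.25, b=0.25, c=0.25, d=0.25):
    # a + b + c + d must equal 1.
    return (a * lbf_storage(node, cluster) + b * lbf_processing(node, cluster)
            + c * lbf_communication(node, cluster) + d * lbf_power(node, cluster))

def chunks_per_node(num_chunks, cluster, lbf):
    # The number of chunks assigned to each node is proportional to LBF(node).
    return {n["id"]: round(num_chunks * lbf(n, cluster)) for n in cluster}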

4.5.3 Chunk Size and Granularity

By splitting files into a large number of chunks, granularity is improved. Small chunks ensure better storage balancing, especially for small files. However, as the number of chunks increases, so do the amount of metadata, the number of metadata operations and the number of control messages, which decreases system performance. In order to strike a balance between the advantages of large chunks and the advantages of granularity, we selected the chunk size to be about 64 KB.

4.5.4 Fault Tolerance

CDFS is designed to be tolerant of temporary and permanent CubeSat failures, and its performance degrades gracefully with component, machine or link failures. A CubeSat cluster can contain up to about a hundred CubeSats, interconnected with roughly the same number of high speed wireless links. Because of the large number of components and the harsh space environment, some CubeSats or wireless links may face intermittent problems, and some may face fatal errors from which they cannot recover unless hard reset by a ground station. The source of a problem can be in the application, operating system, memory, connectors or networking. So failures should be treated as the norm rather than the exception. In order to avoid system downtime or corruption of data, the system should be designed to handle failures, and its performance should degrade gracefully with failures. Below we discuss how we handle these errors when they come up.

The Master node stores metadata, which consists of the mapping from files to chunks and from chunks to worker nodes. If the Master node fails, the mission will fail. In order to avoid mission failure in case of a Master failure, metadata is written to the Master's non-volatile memory, like flash, and the same is communicated to the Server. If the Master reboots because of a temporary failure, a new copy is started from the last known state stored in the Master's non-volatile memory. In case of failure of the Master, worker nodes wait until a new Master resumes.

Worker nodes send Heartbeat messages to the Master once every 10 minutes. If a worker reports a fatal error, the Master marks the worker node as failed. If a worker does not send a Heartbeat message within 10 minutes, the Master marks the node as a temporary failure. If the Master does not receive a Heartbeat message within 30 minutes, it marks the worker as failed. Once the failed node comes back online, the Master's metadata is refreshed to account for the change.

The harsh space environment and cosmic rays lead to frequent memory corruption. One of the computer systems of the Mars rover Curiosity had a memory problem due to high-energy particles, which resulted in a major setback for the mission. Thus, ensuring the integrity of data stored on CDFS is of paramount importance. CDFS uses checksums of the data for detecting bad data. Performing data integrity operations on an entire chunk is inefficient: if a chunk is found to be corrupt, discarding the whole chunk leads to a lot of wasted IO, and it also requires a lot of time and memory to read the whole chunk (64 KB) to verify its integrity. Thus, each chunk is split into blocks of 512 bytes. CDFS stores the CRC of each block of data and performs checksum validation at the block level. When a read operation is performed on a chunk, the chunk is read block by block, and each block is verified for data integrity by comparing the stored checksum with a newly computed checksum. This way, if one of the blocks is found to be corrupt, only that block is marked bad, and it can be read from another healthy replica of the chunk. Employing data integrity checks at the block level ensures that partial IO or downlinking done before the corruption is detected does not go to waste. Performing data integrity checks at the block level also increases the availability of the data.
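A minimal sketch of this block-level checksumming, using Python's zlib.crc32 and the 512-byte block size mentioned above, is shown below; the function names are illustrative.

# Hypothetical sketch of block-level CRC validation for a chunk; names are illustrative.
import zlib

BLOCK_SIZE = 512   # bytes per block, as described above

def block_checksums(chunk):
    """Compute one CRC32 per 512-byte block of a chunk."""
    blocks = [chunk[i:i + BLOCK_SIZE] for i in range(0, len(chunk), BLOCK_SIZE)]
    return [zlib.crc32(b) for b in blocks]

def verify_chunk(chunk, stored_crcs):
    """Return the indices of blocks whose recomputed CRC does not match."""
    bad_blocks = []
    for i, crc in enumerate(block_checksums(chunk)):
        if crc != stored_crcs[i]:
            bad_blocks.append(i)   # only this block needs to be re-read from a replica
    return bad_blocks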

4.5.8 Inter CubeSat Link Failure

Owing to the harsh space environment, communication links fail often. If a CubeSat to CubeSat link fails, the child node in the routing tree retries the connection to its parent. If the link re-establishment is not successful or the link quality is bad, the child node pings its neighbours, searches for a new parent node and rejoins the cluster.

4.5.9 Network Partitioning

Sometimes a single CubeSat or several CubeSats may get separated from the CubeSat cluster. This phenomenon is called network partitioning. In either case, the data stored on the separated nodes is retained and remains available for downlinking to the ground stations. Using the downlinked metadata, the separated CubeSats can be contacted by the Server via ground stations for downlinking the data.

4.6 Simulation Results

We simulated the CubeSat Distributed File System with a CubeSat cluster consisting of one master node and 5 - 25 worker nodes. Each CubeSat has a processor clocked at 1 GHz, 1 GB of RAM, 32 GB of flash storage, a 1 Mbps inter-cluster communication link and a 9.6 kbps CubeSat to ground station data rate. Our simulation results indicate that the file storing time for a 100 MB file on a cluster of size 10 is about 12.96 minutes. Since the file storing time is only a few minutes, it is negligible compared to the file processing and file downlinking times, which are in hours.

4.7 Summary of CubeSat Distributed File System

We built the CubeSat Distributed File System to store large files in a distributed fashion and thus enable distributed applications like CubeSat MapMerge [4] and CubeSat Torrent [5] on CubeSat clusters. It treats component and system failures as the norm rather than the exception and is optimized for satellite images and remote sensing data, which are huge by nature. CDFS provides fault tolerance through constant monitoring, replication of crucial data, and automatic recovery. In CubeSat clusters, network bandwidth and power are scarce resources, so a number of optimizations in our system are targeted at reducing the amount of data and control messages sent across the network. Copy-on-transmit enables making replicas with little or no additional bandwidth or energy consumption. Failures are detected using the Heartbeat mechanism. CDFS has built-in load balancers for several use cases like CubeSat MapMerge and CubeSat Torrent, and allows the use of user-defined custom load balancers.

CHAPTER 5
DISTRIBUTED PROCESSING OF REMOTE SENSING IMAGES ON CUBESAT CLUSTERS

The processing power of a CubeSat is about 1 GHz. The lack of available power and of active cooling for microprocessors further restricts the available processing power. As a result, processing intensive remote sensing applications cannot be performed on individual CubeSats in a meaningful amount of time. Distributed computing offers a solution to this problem: by pooling the processing power of the individual CubeSats in a cluster, the processing of large remote sensing files can be sped up. CubeSat Cloud uses CubeSat MapMerge to process remote sensing data on CubeSat clusters.

5.1 CubeSat MapMerge

CubeSat MapMerge is inspired by MapReduce and is tailored for CubeSat clusters. The Master node orchestrates CubeSat MapMerge: it commands the worker nodes to process the chunks stored with them. Worker nodes process the chunks and produce intermediate results. As soon as the workers process their chunks, they downlink the partial solutions to the Server. Once the Server has all the results, it stitches the intermediate solutions together to obtain the full solution. The Master node takes care of scheduling map tasks, monitoring them and re-executing failed tasks. The worker nodes execute the subtasks as directed by the Master. Figure 5-1 shows an overview of how an image can be processed using CubeSat MapMerge; the steps are briefly described below, followed by a small sketch.

1. Master node splits the image into chunks and distributes them to the worker nodes in the cluster using CDFS.

2. Worker nodes process the splits given to them to produce partial solutions and downlink the solutions to the Server.

3. Server stitches the downlinked partial solutions into the full solution.
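The following minimal sketch captures the map and merge phases above for an image processing job; the chunk processing function is a stand-in, and all names are illustrative.

# Hypothetical sketch of CubeSat MapMerge for an image processing job; names are illustrative.

def map_phase(chunk):
    """Run on a Worker CubeSat: process one chunk into a partial result."""
    # A stand-in for a real operation such as Sobel edge detection on the chunk.
    return bytes(255 - b for b in chunk)

def merge_phase(partial_results):
    """Run on the Server: stitch the downlinked partial results into the full solution."""
    return b"".join(partial_results)

# Example: chunks are processed independently (in parallel on the cluster).
chunks = [b"\x01\x02", b"\x03\x04"]
partials = [map_phase(c) for c in chunks]   # downlinked via CubeSat Torrent
full_solution = merge_phase(partials)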

Figure 5-1. Example of CubeSat MapMerge

5.2 Command and Data Flow during a CubeSat MapMerge Job

Figure 5-2 shows the flow of data and commands during a CubeSat MapMerge operation. When the Administrator issues a process command to the Server (e.g., process image.jpg), the following actions occur in the sequence noted.

1. Uplinking the command: The Administrator issues a command to the Server (e.g., take an image of a particular area and process it). The Server forwards the command to a ground station, which uplinks the command to the Master CubeSat.

2. Work assignment: Master node commands the worker nodes to process the chunks stored with them and downlink the results.


Figure 5-2. Overview of execution of CubeSat MapMerge on a CubeSat cluster

3. Map phase: Worker nodes process the chunks stored with them and store the results locally.

4. Downlinking the results: As and when a worker node processes a chunk, it downlinks the solution to a ground station. The ground station forwards the solution to the Server. Downlinking of the results is achieved through CubeSat Torrent.

5. Reduce phase: Once the Server receives all the partial solutions, it stitches them into the full solution.

More details about CubeSat MapMerge are presented in the paper CubeSat MapMerge [4].

5.3 Fault Tolerance, Failures, Granularity and Load Balancing

CubeSat MapMerge is tolerant to temporary and permanent CubeSat failures. Its performance degrades gracefully with component, machine or link failures. Metadata is replicated to avoid mission failure in case of failure of the master node. Worker failures are detected using the Heartbeat mechanism. If a worker node fails, the tasks assigned to the worker node are rescheduled on other worker nodes. Data chunks are split into a large number of pieces to improve granularity and load balancing. The chunk size is selected to be about 64 KB in order to balance the advantages of granularity with control traffic overhead.

5.3.1 Fault Tolerance

CubeSat MapMerge is designed to be tolerant of temporary and permanent CubeSat failures, and its performance degrades gracefully with component, machine or link failures. A CubeSat cluster can contain up to about a hundred CubeSats, interconnected with roughly the same number of high speed wireless links. Because of the large number of components and the harsh space environment, some CubeSats or wireless links may face intermittent problems, and some may face fatal errors from which they cannot recover unless hard reset by a ground station. So failures should be treated as the norm rather than the exception. In order to avoid system downtime or corruption of data, the system should be designed to handle failures, and its performance should degrade gracefully with failures. Below we discuss how we handle these errors when they come up.

5.3.2 Master Failure

The Master node stores metadata, which consists of the mapping of map jobs to worker nodes and the state of the map jobs. In order to avoid mission failure in case of failure of the master node, the metadata is periodically written to the master's non-volatile memory, like flash, and the same is communicated to the Server. If the master reboots because of a temporary failure, a new copy is started from the last known state stored in the master's non-volatile memory. If the master cannot recover from the error, the MapMerge mission is aborted and the raw data can be downlinked to the Server.

5.3.3 Worker Failure

Worker nodes periodically send Heartbeat messages containing their status and problems, if any. If a worker reports a fatal error, the master marks the worker node as failed. The processing task assigned to the worker node is reset back to its initial idle state and is scheduled on another worker node containing a replica of the chunk.

5.3.4 Task Granularity and Load Balancing

By splitting the data into a large number of pieces, task granularity is improved. CubeSats with a faster processor or special hardware like a GPU, DSP or FPGA can process an order of magnitude more map tasks than a standard CubeSat. Fine task granularity ensures better load balancing. However, as the number of chunks increases, so do the metadata operations and control messages, leading to a decrease in system performance. To balance the advantages of granularity with the control traffic overhead, the chunk size is selected to be about 64 KB.

5.4 Simulation Results

We simulated CubeSat MapMerge with a CubeSat cluster consisting of one master node and 5 - 25 worker nodes. Each CubeSat has a processor clocked at 1 GHz, 1 GB of RAM, 32 GB of flash storage, a 1 Mbps inter-cluster communication link and a 9.6 kbps CubeSat to ground station data rate. We processed images using de-noise, entropy, peak detection, segmentation and Sobel edge detection algorithms, using the scikit-image Python image processing library. Our simulations indicate that CubeSat MapMerge, with cluster sizes in the range of 5 - 25 CubeSats, can process images about 4.8 - 23.4 times faster than an individual CubeSat. These results indicate that CubeSat MapMerge can speed up processing intensive remote sensing missions by a factor of roughly the size of the cluster. More detailed results are presented in Chapter 8.

5.5 Summary of CubeSat MapMerge

CubeSat MapMerge is a simple yet efficient distributed processing framework for remote sensing images on CubeSat clusters. It treats node and link failures as the norm rather than the exception and is optimized for processing remote sensing images. It provides fault tolerance through constant monitoring, replication of crucial data, and fast, automatic recovery. With the Heartbeat mechanism to detect failures and redundant execution to recover from them, the design is fault tolerant. An optimal chunk size balances the advantages of granularity with control traffic overhead. Load balancing takes nodes with multi-core processors, graphics processing units, digital signal processors and FPGAs into account and distributes the data accordingly. CubeSat MapMerge can speed up processing intensive remote sensing missions by a factor of roughly the size of the cluster.

CHAPTER 6
DISTRIBUTED COMMUNICATION OF REMOTE SENSING IMAGES FROM CUBESAT CLUSTERS

Due to stringent space constraints, CubeSats typically use monopole, dipole and turnstile antennas. As a result, a typical CubeSat to ground station link has a data rate of 9.6 kbps. Low speed data communication is one of the major bottlenecks for remote sensing missions that require downlinking of large amounts of data. For emerging remote sensing missions, the communication bottleneck poses a severe threat, as connectivity with ground stations is very limited, intermittent and comes at a very high price. As a result, data intensive remote sensing applications cannot be performed using individual CubeSats in a meaningful amount of time. Distributed communication offers a solution to this problem: by pooling the communication resources of the individual CubeSats in a cluster, downlinking of large remote sensing images can be sped up.

We studied CubeSat communication protocols including AX.25 [29] and the CubeSat Space Protocol (CSP) [30]. All these protocols are point-to-point and do not support any form of distributed communication for faster downloading of large data files like images or videos. Currently there are no protocols for downloading data from CubeSat clusters in a distributed fashion. So we designed CubeSat Torrent, based on the Torrent communication protocol, to speed up remote sensing missions requiring downlinking of large amounts of data. CubeSat Cloud uses CubeSat Torrent for distributed downlinking of remote sensing data from CubeSat clusters.

6.1 CubeSat Torrent

CubeSat Torrent [5] is a distributed communications framework inspired by Torrent [31]. CubeSat Torrent works in the following way. The Master node plays the role of the tracker: it keeps track of all the worker nodes in the cluster and their available downlink capacity. When the Server requests a file to be downlinked, the Master node commands the worker nodes in the cluster to downlink the chunks or partial solutions stored with them. Worker nodes simultaneously downlink chunks to various ground stations. Ground stations forward the chunks to the Server. Once the Server receives all the chunks, it stitches them together to regenerate the original file. Figure 6-1 shows an overview of how CubeSat Torrent works.

6.2 Command and Data Flow During a Torrent Session

1. Uplinking the command: Server sends a file downlink command to the ground station, which uplinks it to the Master.

2. Distributing the subcommands: Master issues subcommands to the worker nodes storing the chunks of the file to downlink them.

3. Downlinking the chunks: When a worker gets a chunk downlink command, it reads the chunk from its local file system and starts downlinking it to the connected ground station.

4. Notification: Upon successful downlinking of a chunk, the worker notifies the master and continues to the next chunk. This process repeats until all chunks are downlinked.

5. Forwarding of chunks: Once a ground station receives a chunk, it forwards the chunk to the Server.

6. Reconstructing the original file: Once all the chunks are downlinked to the Server, the Server stitches the chunks into the original image.

6.3 Enhancements and Optimizations

We made several enhancements and optimizations to CubeSat Cloud to improve performance. Below we present the enhancements, particularly for remote sensing missions that require only downlinking of remote sensing data.

6.3.1 Improve Storage Reliability and Decrease Storage Overhead

CubeSat Cloud uses redundancy to provide reliability. Each chunk is replicated 3 times, so that even if a CubeSat fails or loses a chunk, the chunk is still available on two other CubeSats. Replication provides access to the raw data at each worker node, so that the data can be processed before it is downlinked to the ground station. However, it also leads to a lot of communication and storage overhead: replicating each chunk 3 times leads to 200% storage overhead and 10% - 25% communication and energy consumption overhead. More details about the overhead resulting from replication are discussed in the paper Distributed Data Storage for CubeSat Clusters [32]. For remote sensing missions which only need to downlink the data, there is no advantage in having access to the raw data, as the worker nodes do not process it. This can be used as an opportunity to reduce the storage and communication overhead. Once the Master performs sensing, it creates chunks C1, C2, C3, ...Cn of raw data. Then, based on the required redundancy, it creates coded chunks C1', C2', C3', ...Cm', where m > n. The Master node then distributes these chunks to the worker nodes. As long as n out of m chunks are downlinked, the original image can be recovered. More details about the performance analysis of source coding are discussed in Chapter 8.

Figure 6-1. Overview of CubeSat Torrent

6.3.2 Using Source Coding to Improve Downlink Time

Some worker nodes take an unusually long time to downlink a chunk. These nodes are called straggler nodes. There can be several reasons for this, like a bad antenna, cache failures, scheduling of intensive background tasks, a very low speed link, etc. If the raw data is downlinked directly, the downlink is not complete until the last chunk reaches the Server. If a straggler node takes a very long time to downlink a chunk, no matter how fast the other nodes downlink the rest of the chunks, the file downlink is still slowed down by that chunk. To mitigate the risk of a file downlink being slowed down by stragglers, CubeSat Cloud uses duplicate downlinking of the last few chunks, as explained in CubeSat Torrent. However, for missions that require only downlinking of remote sensing data, the efficiency of this mitigation mechanism can be improved further by using source coding. After the Master performs sensing, it creates chunks C1, C2, C3, ...Cn of raw data. Then, based on the required redundancy, it creates coded chunks C1', C2', C3', ...Cm', where m > n. The Master node then distributes these chunks to the worker nodes. When the Master receives the DOWNLINK command from the Server, it starts downlinking the chunks C1', C2', ... in the usual way until N - W chunks have been downlinked to the Server, where N is the number of chunks and W is the number of Worker nodes in the cluster. At that point, only W chunks need to be downlinked to the Server to complete the file download. If any of these W chunk downloads takes unusually long, the whole file download takes more time. In order to prevent the slowdown of file downlinking due to stragglers, the Master schedules more than W chunk downloads. As a result, even if straggler nodes slow down the downlinking of some chunks, the required number of chunks (N) will be downlinked to the Server at the highest possible speed. Once the Server receives N chunks, it undoes the source coding to recreate the original file from the downlinked chunks. More details about the performance analysis of source coding are discussed in Chapter 8. A small sketch of this scheduling policy follows.
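A minimal sketch of this tail-mitigation policy is shown below; the scheduling interface and the over-provisioning factor are illustrative assumptions.

# Hypothetical sketch of straggler mitigation near the end of a file downlink; names are illustrative.

def schedule_downlinks(pending_chunks, downlinked_count, total_needed, num_workers,
                       overprovision=2):
    """Return the chunk downloads to schedule in this round.

    Until N - W chunks are down, chunks are scheduled one per worker as usual.
    For the final W chunks, more downloads than strictly needed are scheduled,
    so a straggler cannot hold up completion of the file.
    """
    remaining = total_needed - downlinked_count
    if remaining > num_workers:
        # Normal phase: one outstanding chunk per worker.
        return pending_chunks[:num_workers]
    # Tail phase: schedule extra (duplicate) downloads of the remaining chunks.
    return (pending_chunks * overprovision)[:num_workers]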

6.3.3 Improving the Quality of Service for Real-time Traffic Applications Like VoIP

Real-time traffic applications like VoIP need a high quality of service. Traditional methods provide better quality of service through the use of forward error correction and/or retransmission. Given that bandwidth is at a premium for CubeSat communications, large amounts of forward error correction data mean high overhead and thus less bandwidth for actual data, while retransmissions lead to increased downlink times. Other methods for providing quality of service include the use of multiple channels to send copies of packets, creating redundant transmissions. Although these methods are computationally less intensive, they do not ensure resilience to losses and they reduce the overall throughput of the system.

Consider a scenario where CubeSat Torrent is used for streaming data from the Master (Sensor) node. As explained before, the Master node splits the raw data into chunks. Suppose that the data frame (D) for time ti is split into chunks C1, C2, C3, ...Cn. The Master uses these chunks to create linearly coded chunks C1', C2', C3', ...Cm', where m > n. The Master forwards the coded packets to the worker nodes, which downlink them to the ground stations. In the process of downlinking, some packets are lost. The rest of the packets reach the Server, which then stitches them back into D, the original data frame. The Server can recover D from the coded packets as long as it receives at least n of them, that is, if at most m - n packets are lost in transmission. If the Master node notices that more than r packets are being lost on their way to the Server, it increases the redundancy by increasing m and thus increasing r. More details and results about our source coding technique are presented in the paper Robust Communications for CubeSat Cluster using Network Coding [33].

6.4 Fault Tolerance, Failures, Granularity and Load Balancing

CubeSat Torrent incorporates several mechanisms to make itself tolerant to temporary and permanent CubeSat failures. Its performance degrades gracefully with communication link failures. Worker node failures are detected using the Heartbeat mechanism. If a worker node fails, the downlink tasks assigned to that worker node are rescheduled on other worker nodes. The chunk size is selected to be about 64 KB in order to balance the advantages of granularity and load balancing against metadata and control traffic overhead.

6.4.1 Fault Tolerance

CubeSat Torrent is designed to be tolerant of temporary and permanent CubeSat failures, and its performance degrades gracefully with machine or link failures. Failures are the norm rather than the exception. A cluster can contain up to a hundred nodes and is connected, to roughly the same number of ground stations, through long distance wireless links. The quantity and quality of the links virtually guarantee that some links break intermittently and are not functional at any given time, and that some will not recover from their failures. Problems can be caused by human errors, CubeSat mobility, bad antennas, communication system bugs, memory failures, connectors and other networking hardware. Such failures can result in unavailable communication links or can lead to data corruption. Therefore, constant monitoring, error detection, fault tolerance, and automatic recovery must be part of the system.

Below we discuss how we meet these challenges and how we resolve the problems when they occur.

6.4.2 Master Failure

The Master writes periodic checkpoints of all the master data structures. If the master task dies, a new copy is started from the last checkpointed state. The Master node represents the single point of failure for CubeSat Torrent. In order to avoid mission failure in case of failure of the master node, the metadata is periodically written to the master's non-volatile memory, like flash, and the same is communicated to the Server. If the master reboots because of a temporary failure, a new copy is started from the last known state stored in the master's non-volatile memory. In case of failure of the master, the data can still be downlinked from the worker nodes to the Server.

6.4.3 Worker Failure

Workers periodically send a Heartbeat message to the Master node. The Heartbeat message contains the status of the worker and problems, if any. If the Master does not receive a Heartbeat message from a worker within 30 minutes, the master marks the worker as failed. The downlink task assigned to the worker is reset back to its initial idle state and scheduled on another worker node. If a worker loses its connection with a ground station, it retries with the same or a different ground station. If it cannot connect to any ground station within a certain amount of time, it signals failure to the master, which marks the worker as a temporarily failed node. If the worker still cannot connect to a ground station, the Master marks the worker as a failed node and reschedules the downlinking job assigned to the failed worker to another worker.

6.4.4 Task Granularity

The Master divides the file to be downloaded into C chunks. Ideally, C should be much larger than the number of worker machines. Having each worker download many different chunks improves dynamic load balancing. However, as C increases, so do the amount of control traffic and the delays resulting from the exchange of control information. In order to balance the advantages of granularity with the overhead incurred due to control traffic, the chunk size is chosen to be about 64 KB.

6.4.5 Tail Effect and Backup Downloads

Some nodes take an unusually long time to downlink a chunk. These nodes are called stragglers. The reasons behind them could be a bad antenna or a very low speed link. To mitigate the risk of a file downlink or uplink being slowed down by stragglers, CubeSat Torrent uses backup downloads. When a file downlink or uplink operation is close to completion, the master schedules backup downlinking tasks for the remaining in-progress chunks. A chunk is marked as downlinked whenever either the primary or the backup worker finishes downlinking it. This is only a design feature and is not implemented.

6.5 Simulation Results and Summary of CubeSat Torrent

We simulated CubeSat Torrent on a CubeSat cluster consisting of one master, 5 - 25 workers and 5 - 25 ground stations. Each CubeSat has a processing speed of 1 GHz, 1 GB of RAM, 32 GB of flash storage and a ground communication data rate of 9.6 kbps. CubeSats in the cluster are connected to each other through 1 Mbps high speed inter-cluster communication links. Our simulation results indicate that CubeSat Torrent, with cluster sizes in the range of 5 - 25 CubeSats, enables 4.71 - 22.93 times faster downlinking of remote sensing data (compared to a single CubeSat). CubeSat Torrent can therefore potentially speed up CubeSat missions requiring remote sensing data downlinking by a factor of roughly the size of the cluster.

CubeSat Torrent demonstrates the essential qualities for downlinking large remote sensing data sets from CubeSat clusters. It is fault tolerant and scalable. It provides fault tolerance through constant monitoring, replication of crucial data, and fast, automatic recovery. An optimal chunk size balances the overhead from control message traffic against the advantages of granularity. Checksumming is used to detect data corruption. The proposed design delivers the high aggregate throughput required for a variety of missions. We achieve this by splitting the file into chunks and downlinking them in parallel from the workers to the ground stations. The simplified design and minimal metadata operations result in very low overhead.

CHAPTER 7
SIMULATOR, EMULATOR AND PERFORMANCE ANALYSIS

For simulating and measuring the performance of CubeSat Cloud, we created a CubeSat Cloud simulator. For verifying the simulation results, we also created a CubeSat Cloud testbed consisting of 5 CubeSats. We used the Raspberry Pi mini single-board computer to emulate a CubeSat and desktop computers to emulate the Server and ground stations. Below is a detailed description of the CubeSat Cloud simulator and emulator.

7.1 Hardware and Software of Master and Worker CubeSats for Emulator

The Master and Workers are emulated using the Raspberry Pi. The Raspberry Pi is a mini single-board computer developed by the Raspberry Pi Foundation. Figure 7-1 shows the various components of the Raspberry Pi. It has a Broadcom BCM2835 system on a chip (SoC), 512 MB of RAM, and uses an SD card for booting and long-term storage. Debian and Arch Linux ARM distributions are available for the Raspberry Pi. Python is the primary advocated programming language for the platform, although support for BBC BASIC, C, and Perl is also available. Below are more detailed specifications of the Raspberry Pi Model B single-board computer.

• Processor: The Raspberry Pi runs on the Broadcom BCM2835 SoC. The Broadcom chip includes an ARM1176JZFS processor clocked at 700 MHz, a floating point unit, and a VideoCore 4 GPU.

• Graphics: With the VideoCore GPU, the Raspberry Pi provides hardware-accelerated graphics capable of rendering 1 Gpixel/s.

• SDRAM: Model B comes with 512 MB of RAM, which is shared with the GPU. The RAM is generally clocked from 400 MHz to 500 MHz.

• Storage: There is no on-board bootable flash disk; the board boots from a pluggable SD card. A minimum of 2 GB is required, but more than 4 GB is suggested.

• Power ratings: Raspberry Pi draws about 300 mA (1 W) in idle power mode and about 700 mA (2.2 W) when all peripherals are active.

• Ports: The Raspberry Pi comes with 10/100 BaseT Ethernet, HDMI and 2 USB ports. It is powered through a microUSB interface. Its size is roughly 9 x 6 x 2 cm.

• Low-level peripherals: It has 8 General Purpose IO (GPIO) pins, a UART, an I2C bus, a SPI bus with two chip selects and I2S audio.

Figure 7-1. Raspberry Pi mini computer

Image courtesy of Matthew Murray

The specifications in terms of processing power and memory are very similar to those of a CubeSat. So we used the Raspberry Pi to emulate a CubeSat in the CubeSat Cloud testbed.

7.2 Hardware and Software of Server and Ground Station for Emulator

The Server and ground station hardware are implemented using Dell Optiplex 755 desktop computers. Below are the specifications of these machines:

• Processor: It comes with a dual-core Core 2 Duo E8400 CPU clocked at 3.00 GHz.

• Memory: It comes with 4 GiB of RAM.

• Graphics: It is powered by a VESA RV610 graphics card.

• OS Type: It is configured to run Ubuntu 12.04.3 LTS (Precise Pangolin), 32-bit version.

• Disk: It has a 240 GB disk for the OS and permanent storage.

We used the open source Ubuntu 12.04.3 Long Term Support (LTS) release as the base operating system for the Server and ground stations. For running Twisted applications, we used Python 2.7.3, as Python versions above 3.0 did not have full support for the Twisted framework. We used Twisted 11.1.0 built for Ubuntu, installed via the python-twisted package.

7.3 Network Programming Frameworks

In order to develop the CubeSat Cloud framework, we surveyed the available network programming frameworks in Python, including Twisted, Eventlet, PyEv, asyncore and Tornado. Below is a brief description of each of these frameworks.

7.3.1 Twisted

Twisted is considered one of the best reactor frameworks available in Python. It is somewhat complex and has a steep learning curve, but it is elegant and provides all the features required for developing asynchronous applications.

7.3.2 Eventlet

Eventlet was developed by Linden Lab. It is based on the Greenlet framework and is geared towards asynchronous network applications. However, it is not PEP 8 compliant, its logging mechanism is not fully implemented, and its API is somewhat inconsistent.

77 7.3.3 PyEv

PyEv is based on the libevent framework. It needs to be developed a lot more to be considered a serious competitor to the other network programming frameworks. As of now, there do not seem to be any big companies using this framework.

7.3.4 Asyncore

Asyncore is part of the standard library and is a very low-level framework. There is not much support for high-level network operations, so a lot of boilerplate code needs to be written just to get started with network applications.

7.3.5 Tornado

Tornado is a very simple Python server meant for developing dynamic websites. It features an async HTTP client and a simple ioloop. It is simple, but it does not provide the callback features required to be considered a candidate for implementing CubeSat Cloud.

7.3.6 Concurrence

Concurrence is a networking framework for creating massively concurrent network applications in Python. It exposes a high-level synchronous API to low-level asynchronous IO using libevent. It runs using either Stackless Python or Greenlets. All blocking network I/O is transparently made asynchronous through a single libevent loop, so it is nearly as efficient as a real asynchronous server. It is similar to Eventlet in this way. The downside is that its API is quite different from Python’s sockets/threading modules.

7.4 Twisted Framework

Of the frameworks we researched, Twisted was best suited for our job, since it provides handy features like callbacks and deferreds, along with strong community support. It is an asynchronous, event-based network programming framework. It is implemented in the Python programming language and licensed under the open source MIT license. Callbacks are the core of the Twisted framework: users write callbacks and register them to be called when events happen (a connection is made, a message is received, or a connection is lost).
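The minimal Twisted sketch below illustrates this callback style for a hypothetical chunk-receiving service; the protocol class and the port number are illustrative only.

# Minimal Twisted callback sketch for a hypothetical chunk-receiving service.
from twisted.internet import protocol, reactor

class ChunkReceiver(protocol.Protocol):
    def connectionMade(self):
        # Called by Twisted when a worker or ground station connects.
        self.buffer = b""

    def dataReceived(self, data):
        # Called whenever a piece of a chunk arrives on the connection.
        self.buffer += data

    def connectionLost(self, reason):
        # Called when the link drops; the received bytes could be handed to CDFS here.
        print("received %d bytes" % len(self.buffer))

class ChunkReceiverFactory(protocol.Factory):
    def buildProtocol(self, addr):
        return ChunkReceiver()

if __name__ == "__main__":
    reactor.listenTCP(9000, ChunkReceiverFactory())  # port 9000 is an arbitrary choice
    reactor.run()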

7.5 Network Configuration

The CubeSat to ground station communication link is modelled with a data rate of 9600 bps and a delay of 2 ms, with a jitter of 200 us following a normal distribution. We modelled the CubeSat cluster communication links using the specifications of RelNAV: a data rate of 1 Mbps and a link communication delay of 0.1 ms. The packet loss rate was set at 0.3%, with a 25% loss correlation in order to simulate packet burst losses. We used the Hierarchical Token Bucket (HTB) queueing discipline and the tc networking tool on Linux to shape the network traffic to our requirements.

The CubeSat Cloud emulator consists of one Server, one Master, 5 Worker nodes and 5 ground stations. The CubeSat Cloud emulator is shown in Figure 7-2. The Master and worker CubeSats are emulated using Raspberry Pis, since a CubeSat's footprint (processing power and RAM) matches that of the Raspberry Pi. The Server and ground stations are emulated using Dell Optiplex computers. All the components are connected using a Gigabit Ethernet switch. The CubeSat to CubeSat and CubeSat to ground station communication links are configured as described in Section 7.5.

7.7 CubeSat Cloud Simulator Setup

The CubeSat Cloud simulator consists of one Server, one Master, and 5 - 25 Worker nodes and ground stations. The system architecture of the CubeSat Cloud simulator is shown in Figure 7-3. The simulation is run on the Dell Optiplex computer described in Section 7.2. The Master and Worker CubeSats are simulated using the profiling results obtained from the emulator. Components communicate with each other using TCP/IP sockets on the localhost interface. The CubeSat to CubeSat and CubeSat to ground station communication links are configured as described in Section 7.5. Simulation results are presented below.


Figure 7-2. CubeSat Cloud emulator

Figure 7-3. CubeSat Cloud simulator

7.8 CubeSat Reliability Model

Data reliability is achieved through replication. Each remote sensing image is split into chunks and distributed to the worker nodes. Each chunk is replicated on multiple CubeSats, so that if some CubeSats fail, the data is still available on other CubeSats. The number of replicas per chunk is primarily governed by the required availability of the data and the node failure rate. The availability of an image (A) is given by

A = (1 − f^R)^C × 100

where f is the probability of failure of a node, R is the number of replicas of each chunk and C is the number of chunks of the file. To find the CubeSat failure probability,

we collected data about lifetimes of the CubeSats that are launched so far. Figure 7-4 shows a summary of the lifetime of launched CubeSats. More details about CubeSats

launched so far can be obtained from ”A Survey of Communication Sub-systems for Inter-satellite Linked Systems and CubeSat Missions” [34]. Using the above data, we calculated that the mean lifetime of a CubeSat is about 1204 days. And depending on

the downlink speeds and the mission data size (about 100 MB), a remote sensing mission can take about 1 day. So the probability of failure of a CubeSat during a mission (f) is about 10^-3. The typical number of chunks per file (C) is about 1000. With a redundancy of 1 (2 replicas of each chunk) CDFS provides an availability of 99.98%, and with a redundancy of 2 (3 replicas) it provides an availability of 99.9999%. We targeted an availability of 99.9999%, so each chunk needs to be replicated 3 times.
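A small sketch of this calculation, using the failure probability and chunk count given above, is shown below.

# A minimal sketch of the availability formula A = (1 - f^R)^C x 100.
def availability(f, R, C):
    # f: per-node failure probability during a mission
    # R: replicas per chunk, C: chunks per file
    return (1.0 - f ** R) ** C * 100.0

f, C = 1e-3, 1000                    # values used in this section
for R in (2, 3):
    print("R = %d: %.6f%% available" % (R, availability(f, R, C)))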

7.9 Simulation and Emulation Results

7.9.1 Profiling Reading and Writing of Remote Sensing Data Chunks on Raspberry Pi

In order to build the simulation framework, we profiled the reading and writing of remote sensing data chunks to and from the flash storage of the Raspberry Pi single-board computer. Profiling results are reported in Figure 7-5.

The average reading and writing times for a chunk of size 64 KB are 4.91 ms and 15.66 ms, respectively.
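A minimal sketch of how such chunk read and write times can be measured is shown below; the temporary file location, iteration count and use of fsync are illustrative assumptions rather than the exact profiling script.

# Profile writing and reading 64 KB chunks on local flash storage.
import os
import time

CHUNK_SIZE = 64 * 1024      # 64 KB chunks, as in the profiling above
ITERATIONS = 100
chunk = os.urandom(CHUNK_SIZE)
write_times, read_times = [], []

for i in range(ITERATIONS):
    path = "/tmp/chunk_%d.bin" % i
    start = time.time()
    with open(path, "wb") as fh:
        fh.write(chunk)
        fh.flush()
        os.fsync(fh.fileno())        # push the write to flash, not just the page cache
    write_times.append(time.time() - start)

    start = time.time()
    with open(path, "rb") as fh:
        fh.read()                    # note: reads may be served from the page cache
    read_times.append(time.time() - start)
    os.remove(path)

print("average write: %.2f ms" % (1000.0 * sum(write_times) / ITERATIONS))
print("average read:  %.2f ms" % (1000.0 * sum(read_times) / ITERATIONS))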

Figure 7-4. Lifetimes of CubeSats

This shows that reading and writing a file of 100 MB will take about 8 and 25 seconds, respectively. By comparison, processing and downlinking an image take on the order of hours, so reading and writing times are negligible compared to the time taken to process and downlink a remote sensing image.

7.9.2 Processing, CubeSat to CubeSat and CubeSat to Ground Station Chunk Communication Time

We profiled the processing time, the CubeSat to CubeSat communication time and the CubeSat to ground station communication time. We processed the images using de-noise, entropy, peak detection, segmentation and Sobel edge detection algorithms.

We used the scikit-image Python library for processing the images. The processing time is the average time taken by the Raspberry Pi to process a chunk using the above-mentioned image processing algorithms. Communication links are simulated using the parameters specified in Section 7.5. Profiling results are reported in Figure 7-6.
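A minimal sketch of this kind of per-chunk processing is shown below; the specific scikit-image calls, parameter values and the 256 x 256 8-bit chunk layout are illustrative assumptions rather than the exact profiling pipeline.

# Apply de-noise, entropy, peak detection, segmentation and Sobel edge
# detection to one image chunk.
import numpy as np
from skimage import filters, restoration, feature, img_as_float, img_as_ubyte
from skimage.filters.rank import entropy
from skimage.morphology import disk

def process_chunk(chunk):
    as_float = img_as_float(chunk)
    return {
        "denoised":  restoration.denoise_tv_chambolle(as_float, weight=0.1),
        "entropy":   entropy(img_as_ubyte(chunk), disk(5)),
        "peaks":     feature.peak_local_max(as_float, min_distance=10),
        "segmented": as_float > filters.threshold_otsu(as_float),  # simple Otsu segmentation
        "edges":     filters.sobel(as_float),
    }

if __name__ == "__main__":
    # Treat a 64 KB chunk as a 256 x 256 single-band 8-bit tile (an assumption).
    chunk = np.random.randint(0, 256, (256, 256), dtype=np.uint8)
    results = process_chunk(chunk)
    print({name: getattr(out, "shape", None) for name, out in results.items()})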

Figure 7-5. Read and write times of a chunk

The average CubeSat to CubeSat chunk communication time is about 1.19 seconds, the processing time is 15.62 seconds, and the CubeSat to ground station communication time is about 68.29 seconds. This result shows that the chunk processing time and the chunk communication time from CubeSat to ground station are more than an order of magnitude larger than the chunk communication time from CubeSat to

CubeSat. As a result, distributing a file across the cluster is much faster than processing and downlinking the file. These results also indicate that processing a 100 MB image on a single CubeSat will take about 7 hours and downlinking it will take about 30 hours. Hence we need to parallelize the processing and downlinking of remote sensing images.

7.9.3 Storing Remote Sensing Images using CubeSat Cloud

Figure 7-7 shows the time taken for storing an image (splitting the image into chunks and distributing the chunks to the worker nodes in the cluster) on the CubeSat

Figure 7-6. CubeSat to CubeSat and CubeSat to ground station chunk communication profiling

cluster for various cluster and image sizes. For a cluster size of 1 (a single CubeSat), the file storing time is almost zero (11 seconds for a file of size 100 MB), since the file only needs to be split into chunks and does not need to be distributed over the network. The average file storing time for a 100 MB file on a cluster of size 10 is about 12.96 minutes.

Since the file storing time is only a few minutes, it is negligible compared to the file processing and downlinking times, which are measured in hours.

7.9.4 Processing Remote Sensing Images using CubeSat Cloud

Figure 7-8 shows the image processing times for various cluster and image sizes.

We processed the images using de-noise, entropy, peak detection, segmentation and Sobel edge detection algorithms, implemented with the scikit-image Python library. The processing time is the average time taken by CubeSat Cloud to process the remote sensing images using the above-mentioned algorithms. For a cluster size of 1 (a single CubeSat), the file processing time is 448 minutes. The average file processing time for a 100 MB file on clusters of size 10 and 25 is about 47 and 19 minutes, respectively. This results in a savings of 401 minutes for processing a

file on a cluster of size 10 and 429 minutes on a cluster of size 25. CubeSat MapMerge reduces the processing time from about 8 hours to less than an hour and is thus attractive for processing large remote sensing images.

Figure 7-7. File distribution time for various file sizes and cluster sizes

Figure 7-8. File processing time for various file sizes and cluster sizes

7.9.5 Speedup and Efficiency of CubeSat MapMerge

We studied the variation of the speedup and efficiency of CubeSat MapMerge with cluster size. Speedup is defined as the ratio of the time taken by a single CubeSat to process an image to the time taken by the cluster to process the same image. Efficiency is defined as the ratio of the speedup of the cluster to the cluster size, expressed as a percentage. Figure 7-9 shows the variation of processing speedup with cluster size

for large files (>10 MB). For cluster sizes of 10 and 25, the speedup is 9.54 and 23.40, respectively. Figure 7-10 shows the variation of processing efficiency with cluster size for large files (>10 MB). For cluster sizes of 10 and 25, the efficiency is 95.38% and 93.61%, respectively.
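A small sketch of these two definitions, using the single-CubeSat and 10-node processing times reported in Section 7.9.4, is shown below.

# Speedup and efficiency as defined above.
def speedup(t_single, t_cluster):
    # Time on a single CubeSat divided by time on the cluster.
    return t_single / t_cluster

def efficiency(t_single, t_cluster, cluster_size):
    # Speedup divided by the cluster size, expressed as a percentage.
    return speedup(t_single, t_cluster) / cluster_size * 100.0

print("speedup:    %.2f" % speedup(448.0, 47.0))           # about 9.5x
print("efficiency: %.1f%%" % efficiency(448.0, 47.0, 10))  # about 95%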

Figure 7-9. Speedup of CubeSat MapMerge

7.9.6 Downlinking Remote Sensing Images Using CubeSat Cloud

Figure 7-11 shows the image downlinking time for various cluster and file sizes. Downlinking time is the time taken by a CubeSat or a CubeSat cluster to downlink a remote sensing image to the Server. A single CubeSat takes 1 day and 6 hours of connectivity time to downlink a file of size 100 MB. In comparison, the average file

Figure 7-10. Efficiency of CubeSat MapMerge

downlinking time for a 100 MB file on a cluster of size 10 is only about 3 hours and 13 minutes of connectivity. This results in a savings of about 27 hours for downlinking a 100 MB file. CubeSat Torrent reduces the image downlinking time approximately by a factor of the cluster size.

7.9.7 Speedup and Efficiency of CubeSat Torrent

We studied the variation of the speedup and efficiency of CubeSat Torrent with cluster size. Speedup is defined as the ratio of the time taken by a single CubeSat to downlink an image to the time taken by the cluster to downlink the same image. Efficiency is defined as the ratio of the total effective data rate of the cluster to the total raw data rate of the cluster, expressed as a percentage. Figure 7-12 shows the variation of downlinking

speedup with cluster size for large files (>10 MB). For cluster sizes of 10 and 25, the speedup is 9.35 and 22.93, respectively. Figure 7-13 shows the variation of downlinking

efficiency with cluster size for large files (>10 MB). For cluster sizes of 10 and 25, the efficiency is 71.95% and 70.59%, respectively.

Figure 7-11. File downlinking time for various file sizes and cluster sizes

7.9.8 Copy On Transmit Overhead

Figure 7-14 shows the bandwidth overhead due to replication using Copy-On-Transmit for various cluster sizes. The energy overhead is the same as the bandwidth overhead. The bandwidth overhead of Copy-On-Transmit for cluster sizes of 10 and 25 is 35.71% and 9.61%, respectively. Copy-On-Transmit leads to a 200% storage overhead, as it creates two explicit replicas.

7.9.9 Source Coding Overhead

Figure 7-15 shows the bandwidth overhead for single and double redundancy due to source coding for various cluster sizes. With single redundancy, data can be recovered if one CubeSat fails; with double redundancy, data can be recovered even if two CubeSats fail. The bandwidth overhead of source coding for cluster sizes of 10 and 25 varies from about 5 - 25%, depending on the number of redundant chunks and the cluster size. The energy overhead is the same as the bandwidth overhead.
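One simple instance of single-redundancy linear coding is an XOR parity chunk, sketched below; it illustrates how any one lost chunk can be rebuilt from the rest, though the actual linear block code used by CDFS may differ.

# Build an XOR parity chunk and use it to recover one lost chunk.
def xor_chunks(chunks):
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return bytes(parity)

chunks = [bytes([value]) * 4 for value in (1, 2, 3, 4)]   # four toy 4-byte chunks
parity = xor_chunks(chunks)

# Pretend chunk 2 was lost; rebuild it from the surviving chunks and the parity.
recovered = xor_chunks([chunks[0], chunks[1], chunks[3], parity])
assert recovered == chunks[2]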

Figure 7-12. Speedup of CubeSat Torrent

7.9.10 Metadata and Control Traffic Overhead

Figure 7-16 shows the bandwidth overhead due to metadata and other control information for various cluster and file sizes. Bandwidth overhead is about 0.4 - 1%.

The bandwidth overhead percentage is mostly independent of the file size and varies primarily with the cluster size.

7.9.11 Comparison of CDFS with GFS and HDFS

Bandwidth and energy are very limited on a CubeSat cluster. CDFS uses several enhancements, such as using the Master node as a super replica node, Copy-On-Transmit and linear block source coding, to reduce energy and bandwidth consumption. Figure 7-17 shows the bandwidth required by CDFS and GFS (as well as HDFS) for writing a 100 MB file to the cluster. CDFS consumes about 35 - 40% less bandwidth compared to GFS and HDFS. Figure 7-18 shows the time taken by CDFS and GFS (as well as HDFS)

Figure 7-13. Efficiency of CubeSat Torrent

for writing a 100 MB file to the cluster. CDFS writes are about 50% faster than GFS and HDFS because of the super replica node and the reduced bandwidth requirements. Figure 7-19 shows the energy required by CDFS and GFS (as well as HDFS) for writing a 100 MB file to the cluster. CDFS consumes about 40% less energy compared to GFS and

HDFS.

7.9.12 Simulator vs Emulator

Figure 7-20 shows the time required for writing, processing and downlinking a remote sensing image of size 100 MB. Simulator results are about 5-12% higher than emulator results. This discrepancy might be attributed to delays in the simulation framework caused by the large number of threads running simultaneously.

Figure 7-14. Bandwidth overhead due to replication

Figure 7-15. Bandwidth overhead due to source coding

Figure 7-16. Bandwidth and energy overhead

Figure 7-17. Bandwidth consumption of CDFS vs GFS and HDFS

Figure 7-18. Write time of CDFS vs GFS and HDFS

Figure 7-19. Energy consumption of CDFS vs GFS and HDFS

Figure 7-20. Simulator vs emulator

7.10 Summary of Simulation Results

We simulated the CubeSat Cloud framework on the CubeSat Cloud testbed. The CubeSat Cloud framework was developed using the Python programming language. We simulated CubeSat Torrent on a CubeSat cluster consisting of one master, 5 - 25 workers and 5 - 25 ground stations. Each CubeSat has a processor running at 1 GHz, 1 GB of RAM, 32 GB of non-volatile memory, a 1 Mbps inter-cluster communication link and a 9.6 kbps ground station data rate. The Server and ground stations are connected to each other via the Internet through 10 Mbps communication links. We simulated CubeSat Cloud with various cluster sizes. Our simulation results indicate that for cluster sizes in the range of 5 to 25 CubeSats, processing and downlinking of remote sensing images can be made 4.75 - 23.15 times faster than on a single CubeSat. The simulation results closely match the results from the testbed.

CHAPTER 8
SUMMARY AND FUTURE WORK

Weight, power and geometry constraints severely limit the processing and communication capabilities of a CubeSat. A CubeSat has about 1 GHz of processing capability, 1 GB of RAM, 32 - 64 GB of flash memory and a CubeSat to ground station communication data rate of 9.6 kbps. As a result, processing- and communication-intensive remote sensing missions, which generate about 100 MB per sensing operation, cannot be completed in a meaningful amount of time. Processing a remote sensing image of size 100 MB takes about 8 hours and downlinking it takes about a day and a quarter with the current infrastructure. We consider the possibility of using distributed storage, processing and communication for faster execution of remote sensing missions.

We propose CubeSat Cloud, a framework for distributed storage, processing and communication of remote sensing data on CubeSat clusters. CubeSat Cloud is optimized for storing, processing and downlinking large remote sensing data sets on the order of hundreds of megabytes. CubeSat Cloud uses the CubeSat Distributed File System (CDFS) for storing remote sensing data in a distributed fashion on the cluster. CDFS splits the large remote sensing data into chunks and distributes them to the worker nodes in the cluster. Metadata consisting of the file to chunk mapping and the chunk to worker node mapping is stored on the Master node. For processing the distributed data, CubeSat Cloud uses CubeSat MapMerge. Worker nodes process the chunks stored with them and store the results on their local file systems. Once the chunks are processed, they are downlinked to the Server using CubeSat Torrent. The Server stitches the partial solutions into the full solution. Component and link failures are treated as the norm instead of as exceptions. Failures are detected using a heartbeat mechanism, and the system is tolerant to component and link failures. CubeSat

Cloud implements several enhancements including copy-on-transmit and linear block source coding to reduce consumption of scarce resources like power and bandwidth.

For simulating CubeSat Cloud, we created the CubeSat Cloud testbed. We simulated CubeSats using Raspberry Pis, and the testbed is written using Python Twisted, an event-based asynchronous network programming framework. Simulation results indicate that CubeSat MapMerge and CubeSat Torrent, with cluster sizes in the range of 5 - 25 CubeSats, enable 4.75 - 23.15 times faster processing and downlinking of large remote sensing data compared to a single CubeSat. This speedup is achieved with almost negligible bandwidth and memory overhead (about 1%). These results indicate that CubeSat Cloud can speed up remote sensing missions by a factor of roughly the cluster size.

8.1 Future Work

Below is an overview of future work as an extension to CubeSat Cloud. Launching and deploying the CubeSats into a CubeSat cluster and maintaining the cluster over long time periods need to be investigated. CubeSat Cloud was designed using the Python programming language in order to support rapid prototyping. A flight-ready system can be built using C++, and the network stack can be optimized for CubeSat communication channel characteristics. A link layer communication protocol can be integrated into CubeSat Torrent to improve the efficiency of the downloads. From a CubeSat subsystems perspective, a lightweight CubeSat to CubeSat short-distance high-speed laser communication module would significantly enhance the efficiency of the system and lead to reduced energy consumption.

REFERENCES

[1] H. Heidt, J. Puig-Suari, A. Moore and R. Twiggs, "Cubesat: A new Generation of Picosatellite for Education and Industry Low-Cost Space Experimentation," Proceedings of the Utah State University Small Satellite Conference, Logan, UT, Citeseer, p. 12, 2001.

[2] Andrew E. Kalman (2010, Jan 15), "CubeSat Kit: Commercial Off the Shelf Components for CubeSats," Retrieved July 16, 2012, from http://www.cubesatkit.com/docs/datasheet/.

[3] J. Gozalvez, "Smartphones Sent Into Space [Mobile Radio]," Vehicular Technology Magazine, IEEE, vol. 8, no. 3, pp. 13–18, 2013.

[4] Obulapathi N. Challa and Janise Y. McNair, "Distributed Computing on Cubesat Clusters using Mapreduce," iCubeSat, The Interplanetary CubeSat Workshop, 2012.

[5] Obulapathi N. Challa and Janise Y. McNair, "CubeSat Torrent: Torrent like Distributed Communications for CubeSat Satellite Clusters," Military Communications Conference, pp. 1–6, 2012.

[6] D.E. Koelle and R. Janovsky, "Development and Transportation costs of Space Launch Systems," DGLR/CEAS European Air and Space Conference, 2007.

[7] Kirk Woellert, Pascale Ehrenfreund, Antonio J. Ricco and Henry Hertzfeld, "Cubesats: Cost-effective Science and Technology Platforms for Emerging and Developing Nations," Advances in Space Research, vol. 47, no. 4, pp. 663–684, 2011.

[8] Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.

[9] S. Lee and J. Puig-Suari, Coordination of Multiple CubeSats on the Dnepr Launch Vehicle, M.S. Thesis, California Polytechnic State University, December 2006.

[10] B. Klofas, J. Anderson and K. Leveque, "A Survey of CubeSat Communication Systems," 5th Annual CubeSat Workshop, Cal Poly, 2008.

[11] MoreDBs team at Cal Poly, "Massive Operations, Recording, and Experimentation Database System (2011, April 15)," Retrieved July 16, 2012, from http://moredbs.atl.calpoly.edu/, 2008.

[12] Norman G. Fitz-Coy, "Space Systems Group (SSG) (2008, Aug 26)," Retrieved July 16, 2012, from http://www2.mae.ufl.edu/ssg/.

[13] Janise Y. McNair, "Wireless and Mobile Systems Laboratory (WAM) (2008, Aug 26)," Retrieved July 16, 2012, from http://www.wam.ece.ufl.edu/.

[14] Tzu Yu Lin, Takashi Hiramatsu, Narendran Sivasubramanian and Norman G. Fitz-Coy, "T-C3: A Cloud Computing Architecture for Spacecraft Telemetry Collection," Retrieved July 16, 2012, from http://www.swampsat.com/tc3, 2011.

[15] GENSO Consortium, "Global Educational Network for Satellite Operations (2009, Jun 20)," Retrieved July 16, 2012, from http://www.genso.org/, 2009.

[16] R. Scrofano, P.R. Anderson, J.P. Seidel, J.D. Train, G.H. Wang, L.R. Abramowitz, J.A. Bannister and D. Borgeson, "Space-based Local Area Network," Military Communications Conference, pp. 1–7, 2009.

[17] Nestor Voronka, Tyrel Newton, Alan Chandler and Peter Gagnon, "Improving CubeSat Communications," CubeSat Developers Workshop, Cal Poly, 2013.

[18] Sanjay Ghemawat, Howard Gobioff and Shun-Tak Leung, "The Google File System," SIGOPS Operating Systems Review, vol. 37, no. 5, pp. 29–43, 2003.

[19] K. Shvachko, Hairong Kuang, S. Radia and R. Chansler, "The Hadoop Distributed File System," IEEE Symposium on Mass Storage Systems and Technologies (MSST), pp. 1–10, 2010.

[20] J.J. Kistler, P. Kumar, M.E. Okasaki, E.H. Siegel and D.C. Steere, "Coda: A Highly Available File System for a Distributed Workstation Environment," IEEE Transactions on Computers, pp. 447–459, 1990.

[21] Sun Jian, Li Zhan-huai and Zhang Xiao, "The Performance Optimization of Lustre File System," 7th International Conference on Computer Science Education (ICCSE), pp. 214–217, 2012.

[22] Apache Software Foundation, "Apache Thrift," Retrieved July 16, 2012, from http://thrift.apache.org/, January 2012.

[23] Apache Software Foundation (2012, Feb 6), "HDFS: Hadoop Distributed File System," Retrieved July 16, 2012, from http://hadoop.apache.org/, June 2012.

[24] "Florida University SATellite V (FUNSAT V) Competition," Retrieved July 16, 2012, from http://vivo.ufl.edu/display/n958538186, 2009.

[25] "Tethers SDR (2012): Software Defined Radio (SWIFT SDR) Based Communication Downlinks for CubeSats," Retrieved July 16, 2012, from http://goo.gl/Q5fut, 2012.

[26] "RelNav: Relative Navigation, Timing and Data Communications for CubeSat Clusters," Retrieved July 16, 2012, from http://www.tethers.com/SpecSheets/RelNavSheet.pdf.

[27] Paul Muri, Obulapathi N. Challa and Janise Y. McNair, "Enhancing Small Satellite Communication Through Effective Antenna System Design," Military Communications Conference, 2010, pp. 347–352, 2010.

[28] R. Russell, D. Quinlan and C. Yeoh, "Filesystem Hierarchy Standard," Retrieved July 16, 2012, from http://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.pdf, January 2003.

[29] W. A. Beech, D. E. Nielsen and J. Taylor, "AX.25 Link Access Protocol for Amateur Packet Radio," Retrieved July 16, 2012, from http://www.tapr.org/pdf/AX25.2.2.pdf, 1998.

[30] "CubeSat Space Protocol: A Small Network-layer Delivery Protocol Designed for CubeSats," Retrieved July 16, 2012, from https://github.com/GomSpace/libcsp, April 2010.

[31] B. Cohen, "The BitTorrent Protocol Specification Standard," Retrieved July 16, 2012, from http://www.bittorrent.org/beps/bep_0003.html, January 2008.

[32] Obulapathi N. Challa and Janise Y. McNair, "Distributed Data Storage on CubeSat Clusters," Advances in Computing, pp. 36–49, 2013.

[33] Gokul Bhat, Obulapathi Challa, Paul Muri and Janise McNair, "Robust Communications for CubeSat Cluster using Network Coding," 3rd Interplanetary CubeSat Workshop, 2013.

[34] Paul Muri and Janise McNair, “A Survey of Communication Sub-systems for Intersatellite Linked Systems and CubeSat Missions,” JCM, vol. 7, no. 4, pp. 290–308, 2012.

BIOGRAPHICAL SKETCH

Dr. Obulapathi N. Challa was born and brought up in India. He received a B.S. in Information and Communication Technology from DA-IICT in India, and an M.S. in Computer Engineering and a Ph.D. in Cloud Computing from the University of Florida. He worked as a Research Assistant with Dr. Janise McNair and was a part of the Wireless and Mobile Laboratory and the Small Satellite Group at the University of Florida. His interests include Cloud Computing, Big Data, Small Satellites, Open Source and Distributed Systems.
