HydraSpace: Computational Data Storage for Autonomous Vehicles

Ruijun Wang, Liangkai Liu and Weisong Shi Wayne State University {ruijun, liangkai, weisong}@wayne.edu

Abstract— To ensure the safety and reliability of an The current vehicle data logging system is designed autonomous driving system, multiple sensors have been for capturing a wide range of signals of traditional CAN installed in various positions around the vehicle to elim- bus data, including temperature, brakes, throttle settings, inate any blind point which could bring potential risks. Although the sensor data is quite useful for localization engine, speed, etc. [4]. However, it cannot handle sensor and perception, the high volume of these data becomes a data because both the data type and the amount far burden for on-board computing systems. More importantly, exceed its processing capability. Thus, it is urgent to the situation will worsen with the demand for increased propose efficient data computing and storage methods for precision and reduced response time of self-driving appli- both CAN bus and sensed data to assist the development cations. Therefore, how to manage this massive amount of sensed data has become a big challenge. The existing of self-driving techniques. As it refers to the installed vehicle data logging system cannot handle sensor data sensors, camera and LiDAR are the most commonly because both the data type and the amount far exceed its used because the camera can show a realistic view processing capability. In this paper, we propose a computa- of the surrounding environment while LiDAR is able tional storage system called HydraSpace with multi-layered to measure distances with laser lights quickly [5], [6]. storage architecture and practical compression algorithms to manage the sensor pipe data, and we discuss five Both produce a massive amount of data that would be open questions related to the challenge of storage design multiplied with the increased resolution and number of for autonomous vehicles. According to the experimental channels. results, the total reduction of storage space is achieved by Many researchers have investigated image compres- 88.6% while maintaining the comparable performance of sion to save storage spaces [7]–[11]. The authors in the self-driving applications. [12], [13] proposed an efficient image compression al- I.INTRODUCTION gorithm for gray-scale images based on the quadtree The data captured by an autonomous vehicle is grow- decomposition method. Other researchers have focused ing quickly, typically generating between 20TB and on bit-error aware lossless compression algorithms for 40TB per day, per vehicle [1]. This includes cam- color image compression subject to the bit-error rate eras, which tend to generate 20 to 60Mbps, depending during transmission [14]. In order to largely reduce the on the quality of the image, as well as sonar (10 compressed size, lossy compression has been proposed; to100kbps), radar (10kbps), LiDAR (10 to 70Mbps), it does not restore the original data entirely, but the infor- and GPS (50kbps) [2], [3]. Storing data securely and mation loss has little impact on the understanding of the efficiently can accelerate overall system performance. original image, resulting in a much larger compression Take object detection as an example; the variation of ratio [15]–[17]. However, none of these methods have historical data could contribute to the improvement of been utilized on the vehicular dataset to reduce the size detection precision using machine learning algorithms. and save storage space. With more data produced by Map generation can also benefit from the stored data connected and autonomous vehicles, there is a need to in updating traffic and road conditions appropriately. In create an effective data store and management plan to addition, the sensor data can be utilized for ensuring facilitate the development of autonomous vehicles and public security as well as predicting and preventing their applications. crime. The biggest challenge is to ensure that sensors are In this paper, we propose a novel computational data collecting the right data, and it is processed immediately, storage solution for autonomous vehicles by adopting stored securely, and transferred to other technologies in effective compression algorithms based on different in- the chain, such as infrastructure, Road-Side Unit (RSU), coming sources. Comprehensive and intensive experi- and cloud data center. More importantly, how to create ments were conducted to verify the effectiveness of our hierarchical storage and workflow that enables smooth proposed method. The major contributions of this paper data accessing and computing is still an open question are as follows: for the future development of autonomous vehicles. • Propose a computational storage architecture named

1 Cloud server multiple sensors, various data access frequency, infor-

Cellular tower Road-Side Unit mation , as well as data retrieving and analysis. As shown in Figure 1, there are three types of on-

Sensed & CAN Bus data Computation Access frequency Applications board computing applications that need the support of

Sensor pipe sensed data. To satisfy the requirements of hard real-time Ÿultrasonic Hard real-time requirement Cache ŸCamera Data Object Cruise Collision Detection Control ŸŸŸ Avoidance processing applications, such as adaptive cruise control ŸLiDAR Reduce Transfer ŸGPS noise Soft real-time requirement

Ÿ SSD

Ÿ and object detection, the data will be placed directly into

Ÿ Map Ÿ Ÿ Oriented Infrastructure V2X ŸŸŸ generation Ÿ CAN bus a high-speed cache to accelerate the response time. For Detect Non time-critical applications ŸEngine abnormal HDD ŸSpeed Space Health Traffic Social features Assistant information ŸŸŸ Media ŸBrake less time-critical applications that are defined as a soft Ÿ Ÿ Ÿ real-time requirement in Figure 1, the sensor data will Data collector HydraSpace Vehicle Computing Unit adopt a fast lossy compression algorithm and be stored Fig. 1. An overview of HydraSpace. in lower latency SSD to save storage space and meet the quick response requirement. For those non time-critical applications, the data will first be compressed using a HydraSpace to efficiently support a variety of ap- lossless algorithm and stored in HDD to satisfy the needs plications leveraging diverse data sources for au- of large capacity while reducing the total cost. tonomous vehicles. B. Access Frequency • Apply various compression algorithms to investi- gate the compression performance to find an opti- The sensed data should be placed in different levels mal solution for HydraSpace. of storage architecture based on its access frequencies. • Conduct intensive experiments on an indoor Mobile The highly frequently utilized data can be arranged into platform HydraOne [18] to collect a real vehicular a high-speed cache with limited space; others could dataset, and test the system performance as well as be stored in SSDs or HDDs based on their volume the power consumption of multiple sensors. and the application demands. For those hard real-time • Discuss five open questions to envision the future applications, the tasks will be running periodically to storage design challenge for autonomous vehicles. generate the results for the vehicle control system to The rest of the paper is organized in the following take corresponding action. In other words, these data structure. Section II focuses on the system design and are considered as “hot” data, which will be accessed implementation of our proposed data storage HydraS- frequently by time-critical applications, such as object pace. The experimental setup and observation results are detection, collision avoidance, adaptive cruise control, presented in Section III. Section IV presents the open etc. As we mentioned in the II-A, the less utilized data questions and challenge regarding storage system design, can be put into SSD or HDD for backup and later Section V describes the related works and our work is analysis. concluded in Section VI. C. Data Amount and Type

II.HYDRASPACE DESIGN There are two major types of data flowing into Hy- draSpace in autonomous vehicles, which are sensor pipe In this section, we present the system design of our data and CAN bus data. The sensor pipe data contains the proposed HydraSpace. It is a multilevel computational camera, LiDAR, millimeter radar, ultrasonic, and GPS. storage system that is designed for autonomous vehicles The CAN bus data consists of traditional vehicle data to compute and manage vehicular data based on access such as speed, engine information, brake, temperature, frequency, data type and volume, as well as real time ap- etc. Compared to the can bus data, sensor pipe data will plication requirements. More importantly, how to support generate more data due to its complicated data structure applying multiple machine learning models to the same and number of different sensors. In this paper, the data data set concurrently also poses a challenge in the design produced by the camera and LiDAR are the major con- of HydraSpace. This calls for creating an intermediate cerns we considered in our design. The detailed analysis results layer in storage to avoid redundant computations. of the vehicular data-set structure is presented below Figure 1 shows the overall architecture of HydraSpace. to demonstrate the in-depth analysis of our proposed solution. A. Real-time Application Requirement 1) Camera Image dataset: The image data that we The storage architecture of HydraSpace includes a used in our paper are RGB images, which are collected cache, solid-state drive (SSD), and traditional hard disk using the camera installed on our mobile platform. The drives (HDD). This multilevel storage scheme is de- RGB image is a three-dimensional array: two of the signed to cater to the vast amount of data generated by dimensions specify the location of a pixel within an

2 image, and the other dimension specifies the color of each pixel. The color dimension always has a size of 3 and is composed of the red, green, and blue color channels of the image. We formulate the RGB image using the following equations:

 X = varR ∗ 0.4124 + varG ∗ 0.3576 + varB ∗ 0.1805  Y = varR ∗ 0.2126 + varG ∗ 0.7152 + varB ∗ 0.0722  Fig. 2. The experiment setup for HydraSpace.  Z = varR ∗ 0.0193 + varG ∗ 0.1192 + varB ∗ 0.9505 (1) Where X, Y stands for the pixel position within an im- X, Y , Z stands for the position of each point in 3D age, Z represents the specific color of the corresponding space. The object can be described in the following pixel, and the parameters represent the specific color map equation: of our selected RGB images. (1) (2) (N) 2) LiDAR Point Cloud dataset: LiDAR is one of Os = [Os ,Os , ..., Os ] (3) the most popular sensors installed on connected and D. System Design autonomous vehicles to collect data and support the Based on the above requirements, we propose a self-driving applications, such as 3D map generation, computational storage HydraSpace with a multi-layered and collision avoidance. It will generate a set of points storage architecture and adopt effective compression embedded in the 3D space and carry both geometry algorithms to manage the sensor pipe data. The term and attribute information. First, LiDAR acts as an eye ”computational storage” means raw data will be stored of self-driving vehicles. It is continuously rotating and after applying certain computation processes, such as sending thousands of laser pulses to collide with the compression, noise filtering, and abnormal detection. surrounding objects. An on-board computing system Compared with the previous method of storing raw data, records the resulting light reflections and translates the processed data can effectively improve application this rapidly updating point cloud into an animated 3D performance. Take data compression as an example; representation [6]. Secondly, LiDAR can be used for the compressed data can largely improve the running state localization. It can obtain precise and real-time efficiency in streaming processing and face model recog- locating information in the global coordinate system by nition as well as deep model generation [19]–[21]. Ad- matching the online frame with the prior map, which ditionally, the capacity of the cache, SSD, and HDD can is needed in many modules, such as behavior decision, be adjusted based on the application requirements and motion planning, and feedback control. With the help real budget. In our current design, the storage capacity is of LiDAR, autonomous vehicles travel smoothly and gradually increased to meet the needs of a large capacity avoid collisions by detecting obstructions in advance. of sensor pipe data and maintain a reasonable cost. To This improves the safety of commuters and makes au- assist in the development of connected and autonomous tonomous vehicles less prone to accidents due to the risk vehicles and ensure the smooth communication between of human negligence and reckless driving. The downside CAVs, edge servers, and remote clouds, HydraSpace will of LiDAR is that it will produce a huge amount of data allocate a ”transfer oriented space” as shown in Figure 1 up to 30MB per second (30MB/s) based on its specific to store the temperate data that is ready to be collected configurations. In this case, managing this data in a by the applications running on edge servers, road-side fast and effective way could largely contribute to the units, or clouds. development of connected and autonomous vehicles. In our proposed HydraSpace, we present a sensor pipe data III.IMPLEMENTATION AND EVALUATION management scheme to abide by the following rules. For In this section, we first present the initial implemen- hard real-time processing applications, the sensor data tation of HydraSpace and the devices we used in the can be quickly accessed directly for computing purposes. evaluation. Then we show the results related to system For applications with soft real-time or fast processing performance, compression effects, and self-driving appli- needs, the data can be first stored into the HydraSpace cation performance as well as storage space reduction. and then retrieved by the applications when needed. The point cloud data in 3D space can be translated using the A. Initial Implementation following equation: HydraSpace was implemented in the indoor mobile platform HydraOne [18] with the installation of six (X,Y,Z) = H−1 ∗ (r, α, ) (2) cameras and one 3D LiDAR for collecting sensor pipe

3 Device No. Summary C. Compression Effects Computing 1 32GB memory,Intel R Xeon(R) CPU System E3-1275 v5 @ 3.60GHz × 8 Camera 6 SONY IMX322 CMOS LIDAR 1 Velodyne Puck LITE, the default rota- The compression effect is evaluated by using compres- tion speed is 600RPM and work fre- quency is 10 Hz sion ratio, compression time, and compression error. The TABLE I compression ratio is defined as the ratio of the original EXPERIMENT PARAMETERS data size to the compressed data size. The compression time refers to the total time spent on compressing a specific amount of the data. The compression error is data. The specific parameters are presented in Table I. defined as the deviation of the original data value and The camera model we used is based on SONY IMX322 the decompressed data value. We run four different com- CMOS with a frequency of 33Hz. It is connected via pression algorithms on image and point cloud data col- the USB port to the computing system. The LiDAR we lected by the camera and LiDAR installed in the indoor used in our design is Velodyne Puck LITE, which is con- platform to check the compression effects [23], [24]. We nected through the Ethernet with a default rotation speed apply four different compression algorithms, including of 600RPM and working frequency of 10 Hz. These , lz4, sz, and zfp. The former two compression devices were launched on top of the ROS [22]. The approaches are based on lossless algorithms while the capacity of the cache, SSD and HDD for HydraSpace latter two are based on lossy compression algorithms. is set up as 16GB,250GB and 1TB, respectively. For the Specifically, bzip2 is based on the Burrows Wheeler computation processes that we mentioned in Figure 1, we algorithm, which is one of the best overall compression implemented adaptive compression algorithms for sensor algorithms. This method is capable of compressing files pipe data, and we plan to add more computing progress down to 15% or 10%. Lz4 provides extremely fast in our future works. compression speed that is greater than 500MB/s per The experiment is categorized into three steps as core. It encodes and decodes from a sliding window shown in Figure 2. First, we evaluate the system impact over previously seen characters, which results in both by utilizing six cameras and one 3D LiDAR. Second, a smaller output and faster decompression times. As it we run different compression algorithms on image and refers to the lossy compression, sz is a fast error-bounded point cloud to observe the compression effect. Third, we lossy compression solution that works effectively on run various self-driving applications on the compressed large-scale HPC data sets [15]. It starts by linearizing data and compare their performance. multi-dimensional snapshot data and then predicts the successive data points with the best fit selection of curve fitting models. zfp is a lossy compressor for the integer B. System Performance Evaluation and floating-point data stored in multidimensional arrays. The impact on system performance is evaluated by By applying these compression algorithms to a vehic- capturing the memory percentage, CPU utilization, and ular data-set, we will have a complete understanding power consumption with various combinations of the of the compression effects and find the best solution number of cameras and LiDAR. This is aimed at discov- for on-board storage reduction. Figure 4 shows the ering the influence on the on-board computing system by compression outcomes on both point cloud and image varying the settings of cameras and LiDAR to find the datasets regarding compression ratio, compression time, optimal solution for the sensors. The experiment was and compression error among the algorithms mentioned carried out with four different settings: LiDAR only, above. We can see from the figure that the lossy com- LiDAR with two cameras, LiDAR with four cameras, pression method has better compressed size compared and LiDAR with six cameras. The results are presented with the lossless algorithms for both point cloud and in Figure 3. We can see that the 6 cameras with Li- image datasets. Specifically, sz outperforms zfp due DAR have the most impact on the system performance to its error-bounded selection and effective prediction compared to other settings with an average of 35% CPU of the data point with curve fitting models. Figures 5 utilization and 0.3% memory usage as well as 12.8 Watts and 6 illustrate the points and pixel variations among the power consumption. We can also conclude that all four four compression methods and their standard deviation, settings have a minimal influence on memory but a indicating the corresponding compression errors. After greater impact on CPU utilization with the increasing compression of the point cloud dataset using two lossless number of cameras. As it refers to power consumption, and two lossy algorithms, the results are presented in the LiDAR consumes 7.7 watts and the total power Figure 8. The results generated by the two lossless consumption would not be changed while the number algorithms, bzip2 and lz4, have an apparent visual effect of camera increases. compared to the results produced by sz and zfp.

4 Fig. 3. System performance evaluation by running multiple sensors.

tion performances with a precision of 98%, 95%, and 97%, respectively.

E. Storage Space Reduction Based on the experiment results and observations, we propose an initial data management scheme for (a) (b) HydraSpace following the rules presented below. If the Fig. 4. Compression performance among four methods on (a) point incoming source is a camera, the data will be streamed cloud (b) image. into the HydraSpace and compressed using the lossy compression algorithm sz, which has a faster compres- sion time, accepted deviation, and comparative detection performance on a self-driving application. If it is from LiDAR, the point cloud data will be compressed by the bzip2 lossless algorithm due to the careful consideration of compression time and reconstruct performance as well as the scalability on the current system. Thus, we can Fig. 5. Point cloud compression error among the 4 methods. save 88.6% of the space.

IV. OPEN QUESTIONSAND CHALLENGES D. Self-driving Application Performance Autonomous driving is a complex task that relies on After comparing the compression results using loss- precise localization, navigation, planning, and system less and lossy algorithms, we need to conduct further control. To maximize the safety and reliability of the investigation into the impacts on self-driving applica- vehicle, multiple sensors have been installed for collect- tions. Thus, we run an open-source TensorFlow Object ing data and performing computing tasks. However, the Detection API built on top of TensorFlow to investigate amount of the sensed data is increasing rapidly, which the detection performance. We use images of varying far exceeds the capability and capacity of the existing quality as the input to construct, train, and deploy object on-board computing and storage systems. detection models. The output pictures are presented in The foremost challenge when developing a CAV stor- Figure 7. Figure 7(a) shows the object detection result age system is that the architecture should cater to the dif- of the original picture and its detection precision. Fig- ferent real-time application requirements, satisfy several ures 7(b) and (c) show the results of the compressed computation needs, consider effective task offloading image using sz and zfp lossy compression algorithms. and data backup as well as provide fast communication We found that all these images have comparative detec- between edge servers and remote clouds. Moreover, developers need to access and process sensed data, which is usually behind the CAN bus in a vehicle, as well as create and manage their applications by themselves. How to manage and share the data between the developers is still an open question. Many open source projects for data analytics are not reliable enough to meet the safety and reliability requirements; also, the lack of programming frameworks for accessing, processing and Fig. 6. Image compression error among the 4 methods. sharing CAV data among the VCU, road side servers,and

5 (a) (b) (c)

Fig. 7. Object detection results on (a) original image (b) compressed image using sz (c) compressed image using zfp.

different sensors fall into different granularity. For example, considering the vehicles that use network time protocol (NTP) for time synchronization, the timestamp difference can be as long as 100ms. For some sensors with a built-in GPS antenna the time accuracy goes to the nanosecond level, while other sensors get a timestamp from the host machine system time where the time accuracy is at the millisecond level. How to handle the sensor data with different frequency and timestamp accuracy is still an open question. • Privacy: The sensors on a vehicle can capture much sensitive information, like people’s faces, license plates, etc. In order to share the data without disclosure of personal privacy, how to mask privacy information from the raw data and maintain the whitelist/blacklist of access control has become a big challenge. For example, without proper access control methods, malicious applications might tam- per this life-critical data, resulting in erroneous driv- ing decisions and threatening the safety of passen- gers. Thus, how to securely access the data without Fig. 8. Reconstruction of point cloud data among four methods. any violations has become an urgent requirement for CAVs and their associated infrastructure. How- ever, this is not an easy task due to the complexity the cloud [25] should be taken into consideration in the of vehicular data, the high volume of access patterns future development of CAVs. as well as the lack of efficient control framework. Here we summarize five open questions and chal- • Security: With rich data and a powerful compu- lenges for autonomous driving storage systems: tation platform, autonomous vehicles are supposed to provide a service that allows outside developers • Synchronization: Data on the autonomous driving to run applications on it [25]. The storage system vehicle has a variety of sources: its own sensors, provides not only raw data but also computing other vehicles’ sensors, RSU, and even social me- resources for developers to access. How to leverage dia. One big challenge to handle a variety of data Trusted Execution Environment (TEE) to run appli- sources is how to synchronize them. For example, cations remains an open issue. Specifically, this is a camera usually produces 30-60 frames per second a two-way security challenge. On the one hand, a while LiDAR’s point cloud data frequency is 10HZ. question is how to prevent the host (vehicle) from For applications like 3D object detection which knowing the function executed by third parties, such requires camera frames and point cloud at the same as searching for a person wanted by law enforce- time, should the storage system do synchronization ment. On the other hand, another question is how to beforehand or let the application developer to do prevent third party applications from attacking the it? This issue becomes more challenging consid- host (vehicle) and other applications running on the ering that the accuracy of the timestamps from

6 vehicle. Both require further investigation. named Diamond. The system optimizes the evaluation • Data movement: Data movement happens every- order of the filters based on the run-time measurements where since the whole autonomous driving pipeline of each filter’s selectivity and computational cost. Di- is built on sensing and perception. Several running amond can also dynamically partition the computation applications/processes may require the same data, between the storage devices and the host computer to and several pieces of data may be needed for the adjust for changes in hardware and network conditions. execution of one application. Usually a middle- Performance numbers show that Diamond dynamically ware system is used as the middle-level between adapts to a query and run-time system state. applications and hardware/software infrastructure For compression techniques, image compression al- for efficient data movement, like Robot Operating gorithms can be divided into two categories: lossless System (ROS) [22] and Cyber [26]. However, how compression and lossy compression. Lossless compres- to share data among processes or vehicles with low sion ensures that the reconstructed image is exactly latency and less overhead is still an open issue. the same as the original image, but its compression • Reliability and redundancy: To make data - ratio and efficiency are inferior to lossy compression. ing a service, guaranteeing reliability is an impor- Lossy compression does not restore the original data tant issue. This problem becomes even more severe completely, but the information loss has little impact on when considering the limited storage space and the understanding of the original image, resulting in a the huge amount of data sensors can generate. In much larger compression ratio. Currently, there are two general, for a distributed system, several replicas main kinds of lossy compression methods to maintain are maintained on several nodes to ensure reliability. the color image. First are direct methods. This way is to The first question is how many replicas are neces- compress the image directly in the spatial domain, using sary and where to store these replicas. For the ve- techniques that mainly consist of block truncation [32] hicle scenario, the storage on an RSU is a potential and vector quantization [33]. Second are transform meth- option for the vehicle to store replicas of their data. ods. This way uses discrete cosine transform (DCT) [34], The next question is how to ensure the consistency discrete wavelet transform (DWT) [35], principal com- considering the unreliable communications between ponent analysis (PCA) [36], and other transformations to the vehicle and RSU [27]. process the image in a transform domain. The purpose of these transformations is to concentrate the energy of V. RELATED WORK the original image into less transform coefficients. Ratnasamy et al. developed a Geographic Hash Table system to make effective use of the vast amounts of VI.SUMMARY data gathered by large-scale sensor networks [28]. The Geographic Hash Table system (GHT) hashes keys into With the explosive growth of data collected by mul- geographic coordinates and stores a key-value pair at tiple sensors, data storage is the critical factor for the the sensor node geographically nearest to the hash of its future of Autonomous Vehicles. Unfortunately, there has key. The system replicates stored data locally to ensure been no solid and comprehensive research conducted in persistence when nodes fail. It also distributes the load this field. Therefore, we propose a computational storage throughout the network using a geographic hierarchy. HydraSpace with multi-layered storage architecture that The results demonstrate that GHT is the preferable adopts effective compression algorithms to manage the approach as it offers high data availability and scales sensor pipe data. We also conduct a comprehensive to large sensor-net deployments even when nodes fail or experiment on the indoor experimental platform to study are mobile. Zeng et al. proposed a layered architecture of the system performance regarding CPU and memory cloud storage and discussed the key technologies involv- utilization, power consumption for different sensors, as ing deployment, storage virtualization, data organization, well as the data-set compression with the consideration migration, security, etc. [29]. The operation mechanism of different data types. This would be extremely helpful includes an ecology chain, game theory, ant colony opti- for the application development and system optimization mization, data life cycle management, maintenance and for future autonomous vehicles. For future work, we will update, convergence and evolution mechanisms. Gibson focus on the evaluation of various computing platforms, et al. [30] proposed the Network-Attached Secure Disk reduce the data stored in HydraSpace by incorporating (NASD) storage architecture, which provides scalable the feature extraction of the collected data, and imple- storage bandwidth without the cost of servers used ment more computation processes. In addition, five open primarily for transferring data from peripheral networks. questions and challenges have been discussed to envision Huston et al. [31] proposed a storage architecture for the future development of storage for autonomous vehi- early discard for an interactive search of unindexed data cles.

7 REFERENCES Conference on Usenix Annual Technical Conference, USENIX ATC ’18, page 307–320, USA, 2018. USENIX Association. [1] Flood of data will get generated in autonomous cars. [20] Ping Luo, Zhenyao Zhu, Ziwei Liu, Xiaogang Wang, and Xiaoou https://autotechreview.com/features/flood-of-data-will-get- Tang. Face model compression by distilling knowledge from generated-in-autonomous-cars. Accessed: 2020-2-18. neurons. In Proceedings of the Thirtieth AAAI Conference on [2] Data storage is the key to autonomous vehicles’ fu- Artificial Intelligence, AAAI’16, page 3560–3566. AAAI Press, ture. https://iotnowtransport.com/2019/02/12/71015-data-storage- 2016. key-autonomous-vehicles-future/. Accessed: 2019-12-30. [21] Bharat Bhusan Sau and Vineeth N. Balasubramanian. Deep [3] The basics of LiDAR - light detection and ranging - remote model compression: Distilling knowledge from noisy teachers. sensing. https://www.neonscience.org/lidar-basics. Accessed: CoRR, abs/1610.09650, 2016. 2020-2-18. [22] Robotic . https://www.ros.org/. Accessed: 2020- [4] S. Ilic, J. Katupitiya, and M. Tordon. In-vehicle data logging 2-18. system for fatigue analysis of drive shaft. In International [23] Junguo Zhang, Qiumin Xiang, Yaguang Yin, Chen Chen, and Workshop on Robot Sensing, 2004. ROSE 2004., pages 30–34, Xin Luo. Adaptive compressed sensing for wireless image sensor 2004. networks. Multimedia Tools and Applications, 76(3):4227–4242, [5] Yunsheng Wang, Holger Weinacker, and Barbara Koch. A LiDAR Feb 2017. point cloud based procedure for vertical canopy structure analysis [24] D. Minnen, G. Toderici, M. Covell, T. Chinen, N. Johnston, and 3d single tree modelling in forest. Sensors, 8(6):3938–3951, J. Shor, S. J. Hwang, D. Vincent, and S. Singh. Spatially adaptive Jun 2008. image compression using a tiled deep network. In 2017 IEEE [6] D. Steinhauser, O. Ruepp, and D. Burschka. Motion segmentation International Conference on Image Processing (ICIP), pages and scene classification from 3d LiDAR data. In 2008 IEEE 2796–2800, Sep. 2017. Intelligent Vehicles Symposium, pages 398–403, June 2008. [25] Q. Zhang, Y. Wang, X. Zhang, L. Liu, X. Wu, W. Shi, and [7] Oren Rippel and Lubomir Bourdev. Real-time adaptive image H. Zhong. Openvdap: An open vehicular data analytics platform compression. In Proceedings of the 34th International Confer- for cavs. In 2018 IEEE 38th International Conference on Dis- ence on Machine Learning - Volume 70, ICML’17, pages 2922– tributed Computing Systems (ICDCS), pages 1310–1320, 2018. 2930. JMLR.org, 2017. [26] Baidu. Apollo Cyber. [Online]. [8] George Toderici, Damien Vincent, Nick Johnston, Sung https://github.com/ApolloAuto/apollo/tree/master/cyber. Jin Hwang, David Minnen, Joel Shor, and Michele Covell. Full [27] Liangkai Liu, Baofu Wu, and Weisong Shi. A comparison of resolution image compression with recurrent neural networks. In communication mechanisms in vehicular edge computing. In 3rd The IEEE Conference on Computer Vision and Pattern Recogni- USENIX Workshop on Hot Topics in Edge Computing (HotEdge tion (CVPR), July 2017. 20). USENIX Association, June 2020. [9] Nanrun Zhou, Haolin Li, Di Wang, Shumin Pan, and Zhihong [28] Sylvia Ratnasamy, Brad Karp, Li Yin, Fang Yu, Deborah Estrin, Zhou. Image compression and encryption scheme based on 2d Ramesh Govindan, and Scott Shenker. Ght: A geographic compressive sensing and fractional mellin transform. Optics hash table for data-centric storage. In Proceedings of the 1st Communications, 343:10 – 21, 2015. ACM International Workshop on Wireless Sensor Networks and [10] Y. Zhou, C. Wang, and X. Zhou. DCT-based color image Applications, WSNA ’02, page 78–87, New York, NY, USA, compression algorithm using an efficient lossless encoder. In 2002. Association for Computing Machinery. 2018 14th IEEE International Conference on Signal Processing [29] Wenying Zeng, Yuelong Zhao, Kairi Ou, and Wei Song. Re- (ICSP), pages 450–454, Aug 2018. search on cloud storage architecture and key technologies. In [11] Eli Shusterman, Meir Feder, and Senior Member. Image com- Proceedings of the 2nd International Conference on Interaction pression via improved quadtree decomposition algorithms, 1994. Sciences: Information Technology, Culture and Human, ICIS ’09, [12] Muhammad Ali Qureshi and M. Deriche. A new wavelet page 1044–1048, New York, NY, USA, 2009. Association for based efficient image compression algorithm using compressive Computing Machinery. sensing. Multimedia Tools and Applications, 75(12):6737–6754, [30] Garth A. Gibson, David F. Nagle, Khalil Amiri, Jeff Butler, Jun 2016. Fay W. Chang, Howard Gobioff, Charles Hardin, Erik Riedel, [13] Hanaa ZainEldin, Mostafa A. Elhosseini, and Hesham A. Ali. David Rochberg, and Jim Zelenka. A cost-effective, high- Image compression algorithms in wireless multimedia sensor bandwidth storage architecture. 32(5), 1998. networks: A survey. Ain Shams Engineering Journal, 6(2):481 – [31] Larry Huston, Rahul Sukthankar, Rajiv Wickremesinghe, Ma- 490, 2015. hadev Satyanarayanan, Gregory R Ganger, Erik Riedel, and [14] X. Peng, J. Hou, L. Tan, J. Chen, J. Jiang, and X. Guo. Bit-Error Anastassia Ailamaki. Diamond: A storage architecture for early aware lossless color image compression. In 2019 IEEE Inter- discard in interactive search. In FAST, volume 4, pages 73–86, national Conference on Electro Information Technology (EIT), 2004. pages 126–131, May 2019. [32] E. Delp and O. Mitchell. Image compression using block trunca- [15] S. Di and F. Cappello. Fast error-bounded lossy HPC data tion coding. IEEE Transactions on Communications, 27(9):1335– compression with SZ. In 2016 IEEE International Parallel 1342, 1979. and Distributed Processing Symposium (IPDPS), pages 730–739, [33] N. M. Nasrabadi and R. A. King. Image coding using vector May 2016. quantization: a review. IEEE Transactions on Communications, [16] James Diffenderfer, Alyson Fox, Jeffrey Hittinger, Geoffrey 36(8):957–971, 1988. Sanders, and Peter Lindstrom. Error analysis of ZFP compression [34] Wen-Hsiung Chen, C. Smith, and S. Fralick. A fast computational for floating-point data. SIAM Journal on Scientific Computing, algorithm for the discrete cosine transform. IEEE Transactions 41:A1867–A1898, 02 2019. on Communications, 25(9):1004–1009, 1977. [17] Chiranjeevi Karri and Umaranjan Jena. Fast vector quantization [35] M. J. Shensa. The discrete wavelet transform: wedding the a trous using a bat algorithm for image compression. Engineering and mallat algorithms. IEEE Transactions on Signal Processing, Science and Technology, an International Journal, 19(2):769 – 40(10):2464–2482, 1992. 781, 2016. [36] Svante Wold, Kim Esbensen, and Paul Geladi. Principal compo- [18] Yifan Wang, Liangkai Liu, Xingzhou Zhang, and Weisong Shi. nent analysis. Chemometrics and Intelligent Laboratory Systems, HydraOne: An indoor experimental research and education plat- 2(1):37 – 52, 1987. Proceedings of the Multivariate Statistical form for CAVs. In 2nd {USENIX} Workshop on Hot Topics in Workshop for Geologists and Geochemists. Edge Computing (HotEdge 19), 2019. [19] Gennady Pekhimenko, Chuanxiong Guo, Myeongjae Jeon, Peng Huang, and Lidong Zhou. Tersecades: Efficient data compression in stream processing. In Proceedings of the 2018 USENIX

8