Gonzalez Vijay Chidambaram Jivko Sinapov Peter Stone University of Texas at Austin

Total Page:16

File Type:pdf, Size:1020Kb

Gonzalez Vijay Chidambaram Jivko Sinapov Peter Stone University of Texas at Austin CC-Log: Drastically Reducing Storage Requirements for Robots Using Classification and Compression Santiago Gonzalez Vijay Chidambaram Jivko Sinapov Peter Stone University of Texas at Austin Abstract cal storage or streaming data as it is produced to a remote server are not feasible solutions. Sub-sampling the sen- Modern robots collect a wealth of rich sensor data during sor data at a fixed reduced rate runs the risk of missing their operation. While such data allows interesting anal- data essential for debugging or analysis. ysis and sophisticated algorithms, it is simply infeasible to store all the data that is generated. However, collecting We tackle this problem in the context of the Building- only samples of the data greatly minimizes the usefulness Wide Intelligence (BWI) project at the University of of the data. We present CC-LOG, a new logging system Texas at Austin [4]. The robot associated with the built on top of the widely-used Robot Operating Sys- project, the BWIBot, is used for research for human- tem that uses a combination of classification and com- robot interaction, multi-robot interaction, and planning pression techniques to reduce storage requirements. Ex- in cohabited environments. The BWIBot has sensors periments using the Building-Wide Intelligence Robot, a such as the Kinect and the odometer which produce ten mobile autonomous mobile platform capable of operat- to hundreds of MB of data per minute. ing for long periods of time in human-inhabited environ- We present a system, CC-LOG, that combines event ments, showed that our proposed system can reduce stor- classification and compression to drastically reduce the age requirements by more than an order of magnitude. storage used for logging (x3). First, instead of logging Our results indicate that there is significant unrealized all events, we use a Support Vector Machine (SVM) to potential in optimizing infrastructure commonly used in classify events as anomalous or not, and only log anoma- robotics applications and research. lous events (along with a few events preceding and suc- ceeding the anomalous event). Second, we leverage the 1 Introduction high inherent redundancy of the JSON format of the logs, using well-known loseless compression schemes such as In recent times, we have witnessed an unprecedented LZMA [7] to reduce the size of the logs significantly. surge in sensors and IoT devices. This has been accom- We use the Gazebo robot simulation framework [8] to panied by an exponential increase in the amount of data collect data from several runs of the robot (as it is lo- being generated. Sensors often generate small amounts gistically difficult to collect enough data on the shared of data, but at high frequency (e.g., at 100 Hz). Prior re- physical robot to evaluate our system). Compression re- search has looked at how to efficiently store and process duces the logged data to 10% of its original size, and these streams of data in sensor networks [1–3]. our classification scheme our event classification scheme However, one such source of sensor data has been further reduces the amount of data logged (x4). In our largely unexplored: robotics. Modern robotic platforms evaluation, the classifier does not produce any false neg- seek to understand natural language commands and per- atives (i.e., missing data we want to log). CC-LOG was form a wide variety of tasks. To achieve this, robots in- developed as a package in the widely-used Robot Op- clude a large number of sensors. For example, structured erating System [5] to facilitate integration into different infrared depth sensors (such as the Microsoft Kinect) and robot platforms. LIDAR scanners are used for localization and naviga- We believe our work on CC-LOG is a strong indica- tion in a number of robots [4]. Other sensors used in- tor of the untapped opportunities in the field of robotics. clude rotary encoders, inertial measurement sensors, and Analyzing robotics infrastructure from a systems view- standard cameras. Such sensors produce data frequently point can yield significant improvements in efficiency (e.g., depth sensors produce 30 GB in 15 seconds), and (x6). Sadly, the different areas of research in computer logging the sensor data allows post-hoc analysis and de- science have become silos, with very little cross-over of bugging [5, 6]. ideas. By presenting this work in HotStorage, we hope to However, the dizzying rate of data production poses alleviate this problem a little by showing there are inter- a problem for limited storage and short battery life of esting systems challenges and opportunities for fruitful robots. Due to these restrictions, logging all data on lo- collaboration in robotics. 2 Background LOG accomplishes this through the use of a unique slid- ing window structure and a classifier. We first provide some background on the BWIBot and Anomalous events typically include the sudden and the Robot Operating System. The BWIBot version extreme acceleration or sudden twisting to a large degree. 2 (also called the Segbot) contains a desktop com- Note that both acceleration and twisting are not anoma- puter powered by an Intel i7-4790T/i7-6700T processor, lous in the general case; only when they deviate from placed in HD-Plex H1.S fanless case, with 6 Gigabit normal behavior. Since anomalous events typically are Ethernet Network Interfaces, along with 4 USB3 and 2 not instantaneous, CC-LOG takes recent history into ac- USB2 interfaces. A 20 inch touchscreen is mounted at count when classifying data. Additionally, while anoma- a human-operable height to serve as the primary user in- lous events are interesting in and of themselves, it may terface with the robot. The BWIBot is equipped with be desirable to save data for a period of time following several sensors including a PointGrey BlackFly GigE the event. Such data can help roboticists see how, and if camera and the Kinect. The BWIBot is built on top of the robot recovers from anomalies. the Segway RMP 110 mobile base which, coupled with We first describe how anomalies are detected, how Simultaneous Localization and Mapping (SLAM), pro- we use windows to log anomalous events, and how log vides sufficiently accurate odometry estimates for robust events are compressed. navigation. The robot is controlled and programmed using the Anomaly Detection. To classify feature vectors as Robot Operating System (ROS) [5]. ROS is structured as anomalous or nominal, we elected to use a one-class a group of communicating processes called nodes. While Support Vector Machine [9] (SVM) with a nonlinear ra- nodes need to communicate with each other using a com- dial basis function (RBF) kernel [10]. SVMs function by mon format, each node can be written in a different lan- attempting to find the maximally-separating hyperplane guage. This allows developers to easily extend ROS in between two sets of data using linear programming tech- the language of their choice. Nodes and sensors can send niques. Since most datasets are not linearly separable, messages to a topic that other nodes can subscribe to. kernel functions can be used to map data into a high- Typically, ROS processes the raw data from sensors and dimensional feature space. In our case, we have chosen publishes a topic with formatted sensor data, which other to use a Gaussian RBF kernel. One-class SVMs, specif- nodes consume. For example, the odometry sensor topic ically, are a popular distribution density estimation tech- publishes several KB of data every 10 milliseconds (100 nique; largely due to their simplicity and wide applica- Hz). The raw odometry data is processed by ROS and bility to different domains. Within the CC-LOG system, published in JSON format. The JSON format has a lot of we use the Scikit-learn library [11] (built atop libsvm) to redundancy that increases the storage requirements. provide SVM functionality. Machine learning classifiers require a feature vector: a vector of values that have been a meaningful dependence 3 Design on the set to be classified. For example, for classification based on odometry data, the x-component of the robot’s velocity is a useful feature. Feature selection is impor- Goal. In this work, we focus on efficiently logging the tant in many types of machine learning. A sub-optimal BWIBot odometry data (but our techniques are general selection of features can result in an inaccurate classifier. and will work for other sensor data as well). We con- Feature vectors are constructed as a concatenated array sider that the logging data is used mainly for debugging of recently seen features. and analysis that considers anomalous events to be of in- terest. We note that sometimes the log data is used to CC-LOG uses pose position, pose orientation, linear obtain statistics about the robot (e.g., how many kilome- twist, and angular twist data for our feature vectors. We ters has the robot covered over the past year?). For such deliberately limited the size of our feature set to reduce cases, other techniques (such as statistical sampling) can the size of the training and testing datasets. Selecting be used so that samples are logged to allow statistics ag- feature vectors happens at setup time, even before the gregation with enough accuracy; we do not handle it in first event has been logged. this work. Logging Windows. To help in debugging and analysis, We present CC-LOG, a new logging module in the CC-LOG logs the events around each detected anomaly Robot Operating System [5] that drastically reduces the (i.e., what happened just before and after the anomaly storage requirements for logging sensor data. At a high was detected). To implement this, CC-LOG uses a log- level, CC-LOG processes data from subscribed topics, ging window.
Recommended publications
  • Data Compression: Dictionary-Based Coding 2 / 37 Dictionary-Based Coding Dictionary-Based Coding
    Dictionary-based Coding already coded not yet coded search buffer look-ahead buffer cursor (N symbols) (L symbols) We know the past but cannot control it. We control the future but... Last Lecture Last Lecture: Predictive Lossless Coding Predictive Lossless Coding Simple and effective way to exploit dependencies between neighboring symbols / samples Optimal predictor: Conditional mean (requires storage of large tables) Affine and Linear Prediction Simple structure, low-complex implementation possible Optimal prediction parameters are given by solution of Yule-Walker equations Works very well for real signals (e.g., audio, images, ...) Efficient Lossless Coding for Real-World Signals Affine/linear prediction (often: block-adaptive choice of prediction parameters) Entropy coding of prediction errors (e.g., arithmetic coding) Using marginal pmf often already yields good results Can be improved by using conditional pmfs (with simple conditions) Heiko Schwarz (Freie Universität Berlin) — Data Compression: Dictionary-based Coding 2 / 37 Dictionary-based Coding Dictionary-Based Coding Coding of Text Files Very high amount of dependencies Affine prediction does not work (requires linear dependencies) Higher-order conditional coding should work well, but is way to complex (memory) Alternative: Do not code single characters, but words or phrases Example: English Texts Oxford English Dictionary lists less than 230 000 words (including obsolete words) On average, a word contains about 6 characters Average codeword length per character would be limited by 1
    [Show full text]
  • Package 'Brotli'
    Package ‘brotli’ May 13, 2018 Type Package Title A Compression Format Optimized for the Web Version 1.2 Description A lossless compressed data format that uses a combination of the LZ77 algorithm and Huffman coding. Brotli is similar in speed to deflate (gzip) but offers more dense compression. License MIT + file LICENSE URL https://tools.ietf.org/html/rfc7932 (spec) https://github.com/google/brotli#readme (upstream) http://github.com/jeroen/brotli#read (devel) BugReports http://github.com/jeroen/brotli/issues VignetteBuilder knitr, R.rsp Suggests spelling, knitr, R.rsp, microbenchmark, rmarkdown, ggplot2 RoxygenNote 6.0.1 Language en-US NeedsCompilation yes Author Jeroen Ooms [aut, cre] (<https://orcid.org/0000-0002-4035-0289>), Google, Inc [aut, cph] (Brotli C++ library) Maintainer Jeroen Ooms <[email protected]> Repository CRAN Date/Publication 2018-05-13 20:31:43 UTC R topics documented: brotli . .2 Index 4 1 2 brotli brotli Brotli Compression Description Brotli is a compression algorithm optimized for the web, in particular small text documents. Usage brotli_compress(buf, quality = 11, window = 22) brotli_decompress(buf) Arguments buf raw vector with data to compress/decompress quality value between 0 and 11 window log of window size Details Brotli decompression is at least as fast as for gzip while significantly improving the compression ratio. The price we pay is that compression is much slower than gzip. Brotli is therefore most effective for serving static content such as fonts and html pages. For binary (non-text) data, the compression ratio of Brotli usually does not beat bz2 or xz (lzma), however decompression for these algorithms is too slow for browsers in e.g.
    [Show full text]
  • Third Party Software Component List: Targeted Use: Briefcam® Fulfillment of License Obligation for All Open Sources: Yes
    Third Party Software Component List: Targeted use: BriefCam® Fulfillment of license obligation for all open sources: Yes Name Link and Copyright Notices Where Available License Type OpenCV https://opencv.org/license.html 3-Clause Copyright (C) 2000-2019, Intel Corporation, all BSD rights reserved. Copyright (C) 2009-2011, Willow Garage Inc., all rights reserved. Copyright (C) 2009-2016, NVIDIA Corporation, all rights reserved. Copyright (C) 2010-2013, Advanced Micro Devices, Inc., all rights reserved. Copyright (C) 2015-2016, OpenCV Foundation, all rights reserved. Copyright (C) 2015-2016, Itseez Inc., all rights reserved. Apache Logging http://logging.apache.org/log4cxx/license.html Apache Copyright © 1999-2012 Apache Software Foundation License V2 Google Test https://github.com/abseil/googletest/blob/master/google BSD* test/LICENSE Copyright 2008, Google Inc. SAML 2.0 component for https://github.com/jitbit/AspNetSaml/blob/master/LICEN MIT ASP.NET SE Copyright 2018 Jitbit LP Nvidia Video Codec https://github.com/lu-zero/nvidia-video- MIT codec/blob/master/LICENSE Copyright (c) 2016 NVIDIA Corporation FFMpeg 4 https://www.ffmpeg.org/legal.html LesserGPL FFmpeg is a trademark of Fabrice Bellard, originator v2.1 of the FFmpeg project 7zip.exe https://www.7-zip.org/license.txt LesserGPL 7-Zip Copyright (C) 1999-2019 Igor Pavlov v2.1/3- Clause BSD Infralution.Localization.Wp http://www.codeproject.com/info/cpol10.aspx CPOL f Copyright (C) 2018 Infralution Pty Ltd directShowlib .net https://github.com/pauldotknopf/DirectShow.NET/blob/ LesserGPL
    [Show full text]
  • The Basic Principles of Data Compression
    The Basic Principles of Data Compression Author: Conrad Chung, 2BrightSparks Introduction Internet users who download or upload files from/to the web, or use email to send or receive attachments will most likely have encountered files in compressed format. In this topic we will cover how compression works, the advantages and disadvantages of compression, as well as types of compression. What is Compression? Compression is the process of encoding data more efficiently to achieve a reduction in file size. One type of compression available is referred to as lossless compression. This means the compressed file will be restored exactly to its original state with no loss of data during the decompression process. This is essential to data compression as the file would be corrupted and unusable should data be lost. Another compression category which will not be covered in this article is “lossy” compression often used in multimedia files for music and images and where data is discarded. Lossless compression algorithms use statistic modeling techniques to reduce repetitive information in a file. Some of the methods may include removal of spacing characters, representing a string of repeated characters with a single character or replacing recurring characters with smaller bit sequences. Advantages/Disadvantages of Compression Compression of files offer many advantages. When compressed, the quantity of bits used to store the information is reduced. Files that are smaller in size will result in shorter transmission times when they are transferred on the Internet. Compressed files also take up less storage space. File compression can zip up several small files into a single file for more convenient email transmission.
    [Show full text]
  • The Deep Learning Solutions on Lossless Compression Methods for Alleviating Data Load on Iot Nodes in Smart Cities
    sensors Article The Deep Learning Solutions on Lossless Compression Methods for Alleviating Data Load on IoT Nodes in Smart Cities Ammar Nasif *, Zulaiha Ali Othman and Nor Samsiah Sani Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science & Technology, University Kebangsaan Malaysia, Bangi 43600, Malaysia; [email protected] (Z.A.O.); [email protected] (N.S.S.) * Correspondence: [email protected] Abstract: Networking is crucial for smart city projects nowadays, as it offers an environment where people and things are connected. This paper presents a chronology of factors on the development of smart cities, including IoT technologies as network infrastructure. Increasing IoT nodes leads to increasing data flow, which is a potential source of failure for IoT networks. The biggest challenge of IoT networks is that the IoT may have insufficient memory to handle all transaction data within the IoT network. We aim in this paper to propose a potential compression method for reducing IoT network data traffic. Therefore, we investigate various lossless compression algorithms, such as entropy or dictionary-based algorithms, and general compression methods to determine which algorithm or method adheres to the IoT specifications. Furthermore, this study conducts compression experiments using entropy (Huffman, Adaptive Huffman) and Dictionary (LZ77, LZ78) as well as five different types of datasets of the IoT data traffic. Though the above algorithms can alleviate the IoT data traffic, adaptive Huffman gave the best compression algorithm. Therefore, in this paper, Citation: Nasif, A.; Othman, Z.A.; we aim to propose a conceptual compression method for IoT data traffic by improving an adaptive Sani, N.S.
    [Show full text]
  • Forcepoint DLP Supported File Formats and Size Limits
    Forcepoint DLP Supported File Formats and Size Limits Supported File Formats and Size Limits | Forcepoint DLP | v8.8.1 This article provides a list of the file formats that can be analyzed by Forcepoint DLP, file formats from which content and meta data can be extracted, and the file size limits for network, endpoint, and discovery functions. See: ● Supported File Formats ● File Size Limits © 2021 Forcepoint LLC Supported File Formats Supported File Formats and Size Limits | Forcepoint DLP | v8.8.1 The following tables lists the file formats supported by Forcepoint DLP. File formats are in alphabetical order by format group. ● Archive For mats, page 3 ● Backup Formats, page 7 ● Business Intelligence (BI) and Analysis Formats, page 8 ● Computer-Aided Design Formats, page 9 ● Cryptography Formats, page 12 ● Database Formats, page 14 ● Desktop publishing formats, page 16 ● eBook/Audio book formats, page 17 ● Executable formats, page 18 ● Font formats, page 20 ● Graphics formats - general, page 21 ● Graphics formats - vector graphics, page 26 ● Library formats, page 29 ● Log formats, page 30 ● Mail formats, page 31 ● Multimedia formats, page 32 ● Object formats, page 37 ● Presentation formats, page 38 ● Project management formats, page 40 ● Spreadsheet formats, page 41 ● Text and markup formats, page 43 ● Word processing formats, page 45 ● Miscellaneous formats, page 53 Supported file formats are added and updated frequently. Key to support tables Symbol Description Y The format is supported N The format is not supported P Partial metadata
    [Show full text]
  • Context-Aware Encoding & Delivery in The
    Context-Aware Encoding & Delivery in the Web ICWE 2020 Benjamin Wollmer, Wolfram Wingerath, Norbert Ritter Universität Hamburg 9 - 12 June, 2020 Business Impact of Page Speed Business Uplift Speed Speed Downlift Uplift Business Downlift Felix Gessert: Mobile Site Speed and the Impact on E-Commerce, CodeTalks 2019 So Far On Compression… GZip SDCH Deflate Delta Brotli Encoding GZIP/Deflate – The De Facto Standard in the Web Encoding Size None 200 kB Gzip ~36 kB This example text is used to show how LZ77 finds repeating elements in the example[70;14] text ~81.9% saved data J. Alakuijala, E. Kliuchnikov, Z. Szabadka, L. Vandevenne: Comparison of Brotli, Deflate, Zopfli, LZMA, LZHAM and Bzip2 Compression Algorithms, 2015 Delta Encoding – Updating Stale Content Encoding Size None 200 kB Gzip ~36 kB Delta Encoding ~34 kB 83% saved data J. C. Mogul, F. Douglis, A. Feldmann, B. Krishnamurthy: Potential Benefits of Delta Encoding and Data Compression for HTTP, 1997 SDCH – Reusing Dictionaries Encoding Size This is an example None 200 kB Gzip ~36 kB Another example Delta Encoding ~34 kB SDCH ~7 kB Up to 81% better results (compared to gzip) O. Shapira: Shared Dictionary Compression for HTTP at LinkedIn, 2015 Brotli – SDCH for Everyone Encoding Size None 200 kB Gzip ~36 kB Delta Encoding ~34 kB SDCH ~7 kB Brotli ~29 kB ~85.6% saved data J. Alakuijala, E. Kliuchnikov, Z. Szabadka, L. Vandevenne: Comparison of Brotli, Deflate, Zopfli, LZMA, LZHAM and Bzip2 Compression Algorithms, 2015 So Far On Compression… Theory vs. Reality GZip (~80%) SDCH Deflate
    [Show full text]
  • Lempel-Ziv Sliding Window Update with Suffix Arrays
    i-ETC: ISEL Academic Journal of Electronics, Telecommunications and Computers CETC2011 Issue, Vol. 2, n. 1 (2013) ID-4 LEMPEL-ZIV SLIDING WINDOW UPDATE WITH SUFFIX ARRAYS Artur Ferreira1,3,4 Arlindo Oliveira2,4 Mario´ Figueiredo3,4 1Instituto Superior de Engenharia de Lisboa (ISEL) 2Instituto de Engenharia de Sistemas e Computadores – Investigac¸ao˜ e Desenvolvimento (INESC-ID) 3Instituto de Telecomunicac¸oes˜ (IT) 4Instituto Superior Tecnico´ (IST), Lisboa, PORTUGAL [email protected] [email protected] [email protected] Keywords: Lempel-Ziv compression, suffix arrays, sliding window update, substring search. Abstract: The sliding window dictionary-based algorithms of the Lempel-Ziv (LZ) 77 family are widely used for uni- versal lossless data compression. The encoding component of these algorithms performs repeated substring search. Data structures, such as hash tables, binary search trees, and suffix trees have been used to speedup these searches, at the expense of memory usage. Previous work has shown how suffix arrays (SA) can be used for dictionary representation and LZ77 decomposition. In this paper, we improve over that work by proposing a new efficient algorithm to update the sliding window each time a token is produced at the output. The pro- posed algorithm toggles between two SA on consecutive tokens. The resulting SA-based encoder requires less memory than the conventional tree-based encoders. In comparing our SA-based technique against tree-based encoders, on a large set of benchmark files, we find that, in some compression settings, our encoder is also faster than tree-based encoders. 1 INTRODUCTION dictionaries [4] and to find repeating sub-sequences for data deduplication [1], among other applications.
    [Show full text]
  • Answers to Exercises
    Answers to Exercises A bird does not sing because he has an answer, he sings because he has a song. —Chinese Proverb Intro.1: abstemious, abstentious, adventitious, annelidous, arsenious, arterious, face- tious, sacrilegious. Intro.2: When a software house has a popular product they tend to come up with new versions. A user can update an old version to a new one, and the update usually comes as a compressed file on a floppy disk. Over time the updates get bigger and, at a certain point, an update may not fit on a single floppy. This is why good compression is important in the case of software updates. The time it takes to compress and decompress the update is unimportant since these operations are typically done just once. Recently, software makers have taken to providing updates over the Internet, but even in such cases it is important to have small files because of the download times involved. 1.1: (1) ask a question, (2) absolutely necessary, (3) advance warning, (4) boiling hot, (5) climb up, (6) close scrutiny, (7) exactly the same, (8) free gift, (9) hot water heater, (10) my personal opinion, (11) newborn baby, (12) postponed until later, (13) unexpected surprise, (14) unsolved mysteries. 1.2: A reasonable way to use them is to code the five most-common strings in the text. Because irreversible text compression is a special-purpose method, the user may know what strings are common in any particular text to be compressed. The user may specify five such strings to the encoder, and they should also be written at the start of the output stream, for the decoder’s use.
    [Show full text]
  • Data Compression in Solid State Storage
    Data Compression in Solid State Storage John Fryar [email protected] Santa Clara, CA August 2013 1 Acknowledgements This presentation would not have been possible without the counsel, hard work and graciousness of the following individuals and/or organizations: Raymond Savarda Sandgate Technologies Santa Clara, CA August 2013 2 Disclaimers The opinions expressed herein are those of the author and do not necessarily represent those of any other organization or individual unless specifically cited. A thorough attempt to acknowledge all sources has been made. That said, we’re all human… Santa Clara, CA August 2013 3 Learning Objectives At the conclusion of this tutorial the audience will have been exposed to: • The different types of Data Compression • Common Data Compression Algorithms • The Deflate/Inflate (GZIP/GUNZIP) algorithms in detail • Implementation Options (Software/Hardware) • Impacts of design parameters in Performance • SSD benefits and challenges • Resources for Further Study Santa Clara, CA August 2013 4 Agenda • Background, Definitions, & Context • Data Compression Overview • Data Compression Algorithm Survey • Deflate/Inflate (GZIP/GUNZIP) in depth • Software Implementations • HW Implementations • Tradeoffs & Advanced Topics • SSD Benefits and Challenges • Conclusions Santa Clara, CA August 2013 5 Definitions Item Description Comments Open A system which will compress Must strictly adhere to standards System data for use by other entities. on compress / decompress I.E. the compressed data will algorithms exit the system Interoperability among vendors mandated for Open Systems Closed A system which utilizes Can support a limited, optimized System compressed data internally but subset of standard. does not expose compressed Also allows custom algorithms data to the outside world No Interoperability req’d.
    [Show full text]
  • Video/Audio Compression
    5.9. Video Compression (1) Basics: video := time sequence of single images frequent point of view: video compression = image compression with a temporal component assumption: successive images of a video sequence are similar, e.g. directly adjacent images contain almost the same information that has to be carried only once wanted: strategies to exploit temporal redundancy and irrelevance! motion prediction/estimation, motion compensation, block matching intraframe and interframe coding video compression algorithms and standards are distinguished according to the peculiar conditions, e.g. videoconferencing, applications such as broadcast video Prof. Dr. Paul Müller, University of Kaiserslautern 1 Video Compression (2) Simple approach: M-JPEG compression of a time sequence of images based on the JPEG standard unfortunately, not standardized! makes use of the baseline system of JPEG, intraframe coding, color subsampling 4:1:1, 6 bit quantizer temporal redundancy is not used! applicable for compression ratios from 5:1 to 20:1, higher rates call for interframe coding possibility to synchronize audio data is not provided direct access to full images at every time position application in proprietary consumer video cutting software and hardware solutions Prof. Dr. Paul Müller, University of Kaiserslautern 2 Video Compression (3) Motion prediction and compensation: kinds of motion: • change of color values / change of position of picture elements • translation, rotation, scaling, deformation of objects • change of lights and shadows • translation, rotation, zoom of camera kinds of motion prediction techniques: • prediction of pixels or ranges of pixels neighbouring but no semantic relations • model based prediction grid model with model parameters describing the motion, e.g. head- shoulder-arrangements • object or region based prediction extraction of (video) objects, processing of geometric and texture information, e.g.
    [Show full text]
  • In-Core Compression: How to Shrink Your Database Size in Several Times
    In-core compression: how to shrink your database size in several times Aleksander Alekseev Anastasia Lubennikova www.postgrespro.ru Agenda ● What does Postgres store? • A couple of words about storage internals ● Check list for your schema • A set of tricks to optimize database size ● In-core block level compression • Out-of-box feature of Postgres Pro EE ● ZSON • Extension for transparent JSONB compression What this talk doesn’t cover ● MVCC bloat • Tune autovacuum properly • Drop unused indexes • Use pg_repack • Try pg_squeeze ● Catalog bloat • Create less temporary tables ● WAL-log size • Enable wal_compression ● FS level compression • ZFS, btrfs, etc Data layout Empty tables are not that empty ● Imagine we have no data create table tbl(); insert into tbl select from generate_series(0,1e07); select pg_size_pretty(pg_relation_size('tbl')); pg_size_pretty --------------- ??? Empty tables are not that empty ● Imagine we have no data create table tbl(); insert into tbl select from generate_series(0,1e07); select pg_size_pretty(pg_relation_size('tbl')); pg_size_pretty --------------- 268 MB Meta information db=# select * from heap_page_items(get_raw_page('tbl',0)); -[ RECORD 1 ]------------------- lp | 1 lp_off | 8160 lp_flags | 1 lp_len | 32 t_xmin | 720 t_xmax | 0 t_field3 | 0 t_ctid | (0,1) t_infomask2 | 2 t_infomask | 2048 t_hoff | 24 t_bits | t_oid | t_data | Order matters ● Attributes must be aligned inside the row create table bad (i1 int, b1 bigint, i1 int); create table good (i1 int, i1 int, b1 bigint); Safe up to 20% of space.
    [Show full text]