Compression Algorithm in Mobile Packet Core
Master of Science in Computer Science
September 2020

Compression Algorithm in Mobile Packet Core

Lakshmi Nishita Poranki

Faculty of Computing, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden

This thesis is submitted to the Faculty of Computing at Blekinge Institute of Technology in partial fulfilment of the requirements for the degree of Master of Science in Computer Science. The thesis is equivalent to 20 weeks of full time studies.

The authors declare that they are the sole authors of this thesis and that they have not used any sources other than those listed in the bibliography and identified as references. They further declare that they have not submitted this thesis at any other institution to obtain a degree.

Contact Information:
Author(s): Lakshmi Nishita Poranki
E-mail: [email protected]

University advisor: Siamak Khatibi, Department of Telecommunications
Industrial advisor: Nils Ljungberg, Ericsson, Gothenburg

Faculty of Computing, Blekinge Institute of Technology, SE–371 79 Karlskrona, Sweden
Internet: www.bth.se
Phone: +46 455 38 50 00
Fax: +46 455 38 50 57

Abstract

Context: Data compression is a technique used to speed up data transmission and to reduce the storage size of the transmitted data. It is a ubiquitous technology that almost every communication company makes use of. Data compression is categorized mainly into lossy and lossless compression. Ericsson is a telecommunication company that handles the data of millions of users, and all of this data is compressed using the deflate compression algorithm. Because of its compression ratio and speed, the deflate algorithm is not optimal for Ericsson's present use case (compress twice and decompress once). This research aims to find the best alternative algorithm for the current use case, so that it can replace the deflate algorithm.
Objectives: The objective of the research is to replace the deflate algorithm with an algorithm that compresses the Serving GPRS Support Node-Mobility Management Entity (SGSN-MME) user data effectively. The main objectives to achieve this goal are: to investigate which algorithm best fits the SGSN-MME compression patterns; to investigate a few alternative algorithms to deflate; to run an experiment with all selected algorithms on the SGSN-MME dataset; to compare the results of the experiment based on the compression factors; and, based on the performance of the algorithms, to replace the deflate algorithm with the most suitable one.

Methods: In this research, a literature review was performed to investigate alternative algorithms to the deflate algorithm. After selecting the algorithms, an experiment was conducted on data provided by Ericsson AB, Gothenburg, and the performance of each algorithm was evaluated based on compression factors such as compression ratio and compression speed.

Results: Analysis of the experimental results shows that Z-standard is the best-performing algorithm, with optimal compression sizes, compression ratio, and compression speed.

Conclusions: This research concludes by identifying an alternative algorithm that can replace the deflate algorithm and that is suitable for the present use case.

Keywords: Compression Algorithms, Lossless Compression Algorithm, SGSN-MME node, Compression factors, Performance of compression algorithm.

Acknowledgments

Firstly, I would like to express my prodigious gratitude to my university supervisor Prof. Siamak Khatibi, Department of Telecommunications, for his worthwhile supervision, patience, suggestions and incredible guidance throughout the entire period of the thesis.
I would also like to express my warmest gratitude to my Industrial Supervisors Erik Vargas and Nils Ljungberg for their guidance, support and insightful comments throughout my journey at Ericsson AB, Gothenburg.

I would like to express my profound gratitude to my parents Siva Rama Prasad Poranki and Vijaya Lakshmi Poranki, my sister Snehitha Poranki and my colleague Lakshmi Venkata Sai Sri Tikkireddy for their persistent and unparalleled love and continuous support.

Last but not least, I would like to thank all of my friends who stood beside me during my good and bad times, helped me a great deal to complete my thesis, and made my thesis journey so successful.

Thank you very much, everyone!!

Contents

Abstract
Acknowledgments
1 Introduction
  1.1 Problem Statement:
  1.2 Outline:
2 Background
  2.1 SGSN:
  2.2 MME:
  2.3 Ericsson SGSN components:
  2.4 Performance factors
  2.5 Types of Lossless Compression Algorithms:
    2.5.1 Huffman Algorithm
    2.5.2 Arithmetic Coding
    2.5.3 Shannon-Fano
    2.5.4 Lempel-Ziv
    2.5.5 LZ77-LZ78
    2.5.6 LZ4
    2.5.7 Brotli
    2.5.8 Zlib
    2.5.9 Z-standard
3 Methodology
  3.1 Literature review
  3.2 Experiment
    3.2.1 Hypothesis
    3.2.2 Experiment workspace
    3.2.3 Dataset Creation
    3.2.4 Dataset Pre-Processing
    3.2.5 Experiment
    3.2.6 Statistical Tests
4 Results and Analysis
  4.1 Results for Z-standard:
  4.2 Results for LZ4:
  4.3 Results for Brotli:
  4.4 Results for Zlib (deflate):
  4.5 Comparison of Compression Ratio
  4.6 Comparison of Compression speed
  4.7 Comparison of Space-saving
  4.8 Result Analysis
  4.9 Performing Statistical Tests
5 Discussion
  5.1 Answering Research Questions
  5.2 Validity Threats
6 Conclusion and Future Work
References

List of Figures

1.1 Lossy compression
1.2 Lossy vs Lossless compression
2.1 Ericsson SGSN components
2.2 Types of Compression
2.3 Flowchart of LZ4
3.1 Dataset Creation
4.1 Z-standard compression
4.2 LZ4 compression
4.3 Brotli compression
4.4 Zlib compression
4.5 Comparison of compression ratios
4.6 Comparison of compression speed
4.7 Comparison of space saving

List of Tables

2.1 The data format of the LZ4 sequence
4.1 Ranks for comparison of compression ratios of algorithms for each test case.
4.2 Average ranks of algorithms.

List of Abbreviations

ANS    Asymmetric Numeral System
AP     Application Procedures
ASCII  American Standard Code for Information Interchange
BSD    Berkeley Software Distribution
DP     Device Procedures
ETS    Erlang Term Storage
FPGA   Field Programmable Gate Array
FSE    Finite State Entropy
GGSN   Gateway GPRS Support Node
GIF    Graphics Interchange Format
GNU    GNU's Not Unix
GPRS   General Packet Radio Service
GSM    Global System for Mobile Communications (2G)
HTML   Hyper Text Markup Language
HTTP   Hyper Text Transfer Protocol
IP     Internet Protocol
IT     Information Technology
JPEG   Joint Photographic Experts Group
LTE    Long Term Evolution (4G)
LZW    Lempel-Ziv-Welch
MME    Mobility Management Entity
MP3    MPEG Audio Layer 3
MPEG   Moving Picture Experts Group
OTP    Open Telecom Platform
PIUs   Plug-In Units
SGSN   Serving GPRS Support Node
SLR    Systematic Literature Review
tANS   tabled variant of ANS
TIFF   Tagged Image File Format
UE     User Equipment
WCDMA  Wideband Code Division Multiple Access (3G)
WEM    Workspace Environment Management
Zstd   Z-standard

Chapter 1
Introduction

Digital communication is a field that deals with transmitting and receiving digital data. The digital data to be transmitted may come in different formats, such as text, audio, video, and images. Before transmission, the complete data needs to be converted into a binary format, in which each bit is either 0 or 1. A data bitstream usually consists of many bits, reaching millions of bits in most cases. Large files can therefore take minutes to transmit [33]. When a massive amount of data has to be transmitted, it may be delayed in reaching its destination. The data should be compressed to avoid such situations [44]. The vast amount of data that is received, processed and transmitted affects the data transmission speed and also leads to a shortage of storage [57][66].
For a long time, compression was a domain reserved for a small group of engineers and scientists. But now, the data