Investigation of the Error Performance of Tunstall Coding
University of Malta
Faculty of Engineering
Department of Communications and Computer Engineering

Final Year Project
B.Eng. (Hons.)

Investigation of the Error Performance of Tunstall Coding
by Johann Briffa

A dissertation submitted in partial fulfilment of the requirements for the award of Bachelor of Engineering (Hons.) of the University of Malta

June

Abstract

A lossless data compression algorithm takes a string of symbols and encodes it as a string of bits such that the average number of bits required is less than that in the uncoded case, where all source symbols are represented by equal-length codewords. Compression algorithms are only possible when some strings, or some symbols in the input stream, are more probable than others; these would be encoded in fewer bits than less probable strings or symbols, resulting in a net average gain.

Most compression algorithms may be split into two major types: fixed-to-variable length schemes, such as Huffman coding and Arithmetic coding, encode equal-length source symbols or strings with codewords of a variable length. In variable-to-fixed length schemes, such as the Ziv-Lempel code, the input is divided into units of variable length which are then transmitted with a fixed-length output code.

The Tunstall source coding algorithm is a variable-to-fixed length encoding scheme. It has been frequently hypothesised that in such schemes the error propagation and re-synchronisation difficulties encountered in fixed-to-variable length schemes would be less significant, leading to a better performance in noise. The error performance of various Tunstall codes is analysed, and a theoretical model proposed. Several parameters which can be varied in the encoding process are considered, with the objective of minimising the effect of errors without sacrificing compression. The possibility of choosing an error correction scheme optimised for the Tunstall algorithm is also considered, including the use of Unequal Error Protection techniques. Finally, the performance in noise
of Tunstall coding is compared with that of Huffman coding.

Acknowledgements

I would like to take this opportunity to thank my supervisor, Dr Victor Buttigieg, Ph.D. (Manch.), M.Sc. (Manch.), B.Elec.Eng. (Hons.), MIEEE, for suggesting this field of study for my final year project and for all his invaluable assistance. Many thanks also go to Mr Edward Gatt for explaining various details of UNIX systems and for a lot of patience during too-long simulations.

This project finds me once again indebted to my family, particularly my parents, for their patience and support throughout my studies. Their encouragement in the pursuit of knowledge is invaluable and deeply appreciated.

Finally, I would like to express my gratitude to all my friends, particularly those who have been with me through these last four years. They have helped, often without knowing, to make my life what it is today.

Johann Briffa
June

"My dear Watson," said he, "I cannot agree with those who rank modesty among the virtues. To the logician all things should be seen exactly as they are, and to underestimate oneself is as much a departure from the truth as to exaggerate one's own powers."
Sherlock Holmes, The Adventure of the Greek Interpreter
Sir Arthur Conan Doyle

Contents

Introduction
Data Compression
Unequal Symbol Probability
Statistically Dependent Symbols
Lossless Reduction of Statistical Dependence
Lossy Reduction of Statistical Dependence
Effect of Errors in Compressed Data
Error Control Coding
Document Structure
The Tunstall Coding Algorithm
Tunstall Codec
Tunstall Algorithm
Illustration
Coding Efficiency
Codeword Assignment
Code Rate
Complications
Source Message too Short
Number of Source Symbols not an Integral Power of Two
Sources with Memory
Modifying the Tunstall Algorithm
Modifying the Source Message
Shortcomings of the Tunstall Coding System
Error Performance of Tunstall Code
Measuring the Error Performance
Error Span and Error Increase
Levenshtein Distance
Standard Algorithm
Simplified Algorithm
Improved Algorithm
Validity of Simplified Algorithms
Error Span
Validity of Error Span
Algorithms to Minimise Error Span
Introduction
Random Assignment
Sequential Assignment
Gray Code Assignment
Simulated Annealing
Comments
Reactive Tabu Search
Comments
Greedy Algorithm
Illustration
Implementation Details
Improvements to the Basic Algorithm
Comments
Semi-Exhaustive Search
Illustration
Comments
Comparison with Huffman Coding
Conclusions
Performance of Tunstall Codes in a BSC
Coding Gain for Tunstall Codes
Mathematical Model for Calculating Coding Gain
Comparison with Huffman Coding
Use of Error Correction
Introduction
Independent Channel Coding
Comparison with Huffman Coding
Optimised Channel Coding
Unequal Protection of Bits Within a Codeword
Unequal Protection of Codewords
Conclusions
Tunstall Codec
Difficulties of Having an Incomplete Tunstall Code
Sources with Memory
Sources which Cannot be Fully Encoded
Adaptive Compression
Minimising Error Span
Comparison with Huffman Coding
Use of Error Control Coding
Optimised Error Protection for Tunstall Codes
A Source Statistics
B Program Documentation
End-User Programs
Library Functions
Simulation Controllers
Utility Functions

List of Figures

Model of a communication system
Tunstall tree at one level of expansion
Message transmission using Tunstall coding
Matrix used to calculate Levenshtein distance
Simulation to test validity and accuracy of Levenshtein distance algorithms
Difference in the Error Increase calculated by the approximate algorithms as compared to the standard algorithm
Comparison of Error Span and Error Increase
Error Span values for source toy
Error Span values for source pic
Error Span distribution for source pic with random codewords
Error Span distribution for source eng with random codewords
Simulated Annealing: Basic algorithm
Typical simulated annealing profile
Reactive Tabu Search: Basic algorithm
Evolution of the tabu search showing the escape mechanism in action
List size dynamics for the reactive tabu search
Greedy Algorithm
Codeword allocations considered by the semi-exhaustive algorithm
Source pic with no error correction
Detail from Fig
Source eng with no error correction
Comparison between mathematical model and simulation results
Coding gain between Tunstall codes with different codeword assignments
Coding gain between a Tunstall code and an uncompressed code
Comparison with Huffman coding for source pic with no error correction
Comparison with Huffman coding for source eng with no error correction
Effect of BER on SER for source pic with no error correction
Effect of BER on SER for source eng with no error correction
Source pic protected with BCH single-error-correcting code
Source pic protected with BCH dual-error-correcting code
Source eng protected with BCH single-error-correcting code
Comparison with Huffman coding for source pic protected with BCH single-error-correcting code
Comparison with Huffman coding for source eng protected with BCH single-error-correcting code
Effect of BER on SER for source pic protected with BCH single-error-correcting code
Contribution of different bits in the codeword to the Error Span (pic)
Contribution of different bits in the codeword to the Error Span (eng)
Contribution of different codewords to the Error Span (pic)
Contribution of different codewords to the Error Span (eng)

List of Tables

Tunstall code for source toy with sequentially assigned codewords
Timings for different Levenshtein distance algorithms