AZIP, Audio Compression System: Research on Audio Compression, Comparison of Psychoacoustic Principles and Genetic Algorithms
Total Page:16
File Type:pdf, Size:1020Kb
California State University, San Bernardino CSUSB ScholarWorks Theses Digitization Project John M. Pfau Library 2005 AZIP, audio compression system: Research on audio compression, comparison of psychoacoustic principles and genetic algorithms Howard Chen Follow this and additional works at: https://scholarworks.lib.csusb.edu/etd-project Part of the Computer Sciences Commons Recommended Citation Chen, Howard, "AZIP, audio compression system: Research on audio compression, comparison of psychoacoustic principles and genetic algorithms" (2005). Theses Digitization Project. 2617. https://scholarworks.lib.csusb.edu/etd-project/2617 This Project is brought to you for free and open access by the John M. Pfau Library at CSUSB ScholarWorks. It has been accepted for inclusion in Theses Digitization Project by an authorized administrator of CSUSB ScholarWorks. For more information, please contact [email protected]. AZIP, AUDIO COMPRESSION SYSTEM: RESEARCH ON AUDIO COMPRESSION, COMPARISON OF PSYCHOACOUSTIC PRINCIPLES AND GENETIC ALGORITHMS A Project Presented to the Faculty of California State University, San Bernardino In Partial Fulfillment of the Requirements for the Degree Master of Science in Computer Science by Howard Chen March 2005 AZIP, AUDIO COMPRESSION SYSTEM: RESEARCH ON AUDIO COMPRESSION, COMPARISON OF PSYCHOACOUSTIC PRINCIPLES AND GENETIC ALGORITHMS A Project Presented to the Faculty of California State University, San Bernardino by Howard Chen March 2005 Approved by: Dr. Keith Schubert ABSTRACT The purpose of this project is to investigate the differences between psychoacoustic principles and genetic algorithms (GA). These two theories will be discussed separately. The review will also compare the compression ratio and the quality of the decompressed files decoded by these two methods. The report will thoroughly discuss the reason that psychoacoustic principles produce small compressed files without affecting the quality of the sound. Two important topics, absolute threshold of hearing and masking effect, will be discussed in the report. Unlike psychoacoustic principles, GA uses a different approach to achieve high compression ratio. In the review, topics related to GA will be discussed. These topics will include the concept, training procedure, and the performance of GA. A computer program called Azip has been developed and is included for the use of researchers who are interested in this topic. In the review, the design and implementation of Azip will be discussed. Several experiments have been done using Azip. The result of experiments is discussed in the report as well. iii The review also compares the performance of Azip with that of Ogg and MPEG Layer3 (MP3), so the readers can have an understanding on their differences. iv ACKNOWLEDGMENTS Firstly, I would like to thank my supervisor, Professor Tong Lai Yu for his supervision and his support. He provided lots of valuable information that extended my understanding in audio compression and eventually improved the quality of the project. I would also like to thank my two-committee members, Dr. Kay Zemoudeh and Dr. Keith Schubert, for their suggestions and help. I would like to thank Dr. Jenq-Shyang Pan and Yi-Chih Hsin for introducing and discussing genetic algorithms. They also gave me lots of help in solving many problems and debugging many programs. I would like to extend my thanks to many other colleagues and friends helped me during my master study. In no particular orders, thanks to: Chun-Hsien Wang, John Ho, Jacob Lee, Chao-Kang Lee, Chuan-Ting Chou, Tang-Li Chen, Steven Chih, and Dennis Hsieh. Finally, I wish to thank to my parent and my sister for their encouragement and support. The support of the National Science Foundation under award 9810708 is gratefully acknowledged. v TABLE OF CONTENTS ABSTRACT.................................................. iii ACKNOWLEDGMENTS .......................................... v LIST OF TABLES........................................... ix LIST OF FIGURES.................... x CHAPTER ONE: INTRODUCTION 1.1 Purpose of the Project........................ 1 1.2 The Scope of the project...................... 2 1.3 Organization of Chapters ...................... 5 CHAPTER TWO: PSYCHOACOUSTIC PRINCIPLES 2.1 Introduction ................................... 7 2.2 Absolute Threshold of Hearing ................ 8 2.3 Masking Effect............ '.................... 10 2.3.1 Simultaneous Masking ................... 10 2.3.2 Non-Simultaneous Masking .............. 11 CHAPTER THREE: GENETIC ALGORITHMS 3.1 Introduction................................... 13 3.2 Training Processes ............................ 13 3.2.1 Initialization......................... 15 3.2.2 Evaluation.............................. 16 3.2.3 Selection............................... 16 3.2.4 Crossover............................... 16 3.2.5 Mutation................................ 18 3.3 Applying Genetic Algorithms on Audio Compression.................................... 19 vi 3.4 Problem of Genetic Algorithm and Conclusion................... ......... 22 CHAPTER FOUR: PROJECT DESIGN AND IMPLEMENTATION 4.1 Overall Description ........................... 24 4.1.1 Project .Functions............ ......... 24 4.1.2 Hardware Interface ..................... 24 4.1.3 User Interface......................... 24 4.2 Project Functions Design and Implementation.............................. 26 4.2.1 Filter Implementation...... 27 4.2.2 Implementation of Genetic Algorithms Training Processes ......... 34 4.2.3 Implementation of Psychoacoustic Principles............................. 3 8 CHAPTER FIVE: PERFORMANCE ANALYSIS 5.1 Filter Analysis................................ 41 5.1.1 Discrete Cosine Transform and Modified Discrete Cosine Transform . .............................. 42 5.1.2 Complexity of Discrete Cosine Transform and Modified Discrete Cosine Transform ....................... 44 5.1.3 Performance of Discrete Cosine Transform and Modified Discrete Cosine Transform ....................... 44 5.2 Comparing Psychoacoustic Principles with Genetic Algorithm ............................. 46 5.2.1 Signal-to-Noise Ratio ................. 46 5.2.2 Subjective QualityMeasure ............. 47 vii 5.3 Comparing Azip with Moving Picture Experts Group-Layer3 and Ogg Vorbis .......... 50 5.3.1 Background.............................. 50 5.3.2 Comparing with Ogg and Moving Picture Experts Group-Layer3 .......... 50 CHAPTER SIX: FUTURE ENHANCEMENT AND CONCLUSION 6.1 Vector Quantization ........................... 52 6.2 Conclusion..................................... 53 APPENDIX A: ABBREVIATION ................................ 56 APPENDIX B: COMPARING PSYCHOACOUSTIC PRINCIPLES WITH GENETIC ALGORITHMS .................... 58 APPENDIX C: COMPARING AZIP WITH MOVING PICTURE EXPERTS GROUP-LAYER3A ND OGG............... 60 REFERENCES............................................... 62 viii LIST OF TABLES Table 1. Legal Commands in Azip Audio Compression System ............................ 25 Table 2. Files in Azip Audio Compression System ...... 27 Table 3. Critical Functions Used in Discrete Cosine Transform and Modified Discrete Cosine Transform ............................... 28 Table 4. Functions Required for Genetic Training Processes . .................................... 34 ix LIST OF FIGURES Figure 1. Basic Audio Compression and Decompression Process ....................... 3 Figure 2. The Absolute Threshold of Hearing Under Quiet Condition ....................... 8 Figure 3. Non-Simultaneous Masking .................... 11 Figure 4. The Training Procedures of Genetic Algorithms................................... 15 Figure 5. Music Handling for Genetic Algorithm ....... 20 Figure 6. Compressing Process of Genetic Algorithm.................................... 21 Figure 7. Compression Ratio versus Signal-to-Noise Ratio in Genetic Algorithms................................... 22 Figure 8. File Codec_config.h.......................... 25 Figure 9. Implementation of Function Init_dct ........ 28 Figure 10. Implementation of Function Discrete Cosine Transform.... ........................ 2 9 Figure 11. Implementation of Function Inverse Discrete Cosine Transform .................. 30 Figure 12. Implementation of Function Init_mdct ...... 31 Figure 13. Implementation of Function Modified Discrete Cosine Transform ................... 32 Figure 14. Implementation of Function Inverse Modified Discrete Cosine Transform ......... 33 Figure 15. Implementation of Function Mutation ....... 35 Figure 16. Implementation of Function Crossover ...... 36 Figure 17. Implementation of Function Crossover (Continue) ........ .......................... 3 7 Figure 18. Implementation of Function Evaluate ....... 38 x Figure 19. Implementation of Function Freqmask_append ............................. 39 Figure 20. Implementation of Function Freqmask_append (Continue) ................. 40 Figure 21. Modified Discrete Cosine Transform ........ 43 Figure 22. Discrete Cosine Transform versus Modified Discrete Cosine Transform ......... 45 Figure 23. Comparing Psychoacoustic Principle with Genetic Algorithm Using Signal-to-Noise Ratio ....................... 47 Figure 24. Subjective Quality Scales .................. 48 Figure 25. Subjective Testing Result of Psychoacoustic Principle and Genetic Algorithms................................... 49 Figure 26. Subjective Testing Result of Moving Picture Experts Group-Layer3 and Ogg ...... 51 Figure 27. Codewords in 2-Dimensional Space ........... 53 xi CHAPTER ONE INTRODUCTION