Audio Fingerprinting

Review of audio fingerprinting algorithms and looking into a multi-faceted approach to fingerprint generation

Nick Palmer A763896

Supervisor: Professor James L Alty

May 11

This project involves researching, developing and implementing an algorithm to generate, store and search for audio fingerprints. Initial research showed that almost all current audio fingerprinting methods base the fingerprint purely on frequency/time data, so I decided to develop a method that takes other musical features, such as key, tempo and time signature, into account, with a view to increasing search speed and decreasing the number of false positive matches. Testing how this compared to a standard fingerprint revealed only a minor improvement in search times, and while results contained fewer false positives with the augmented fingerprint, they also returned no matches more frequently. The reasons for this are explained, however, and future improvement should be possible.

Contents

1 Introduction ...... 3

1.1 What I’m trying to do ...... 3

1.2 Needs & Uses of Audio Fingerprinting ...... 3

1.3 Current Methods ...... 4

1.4 How I plan to improve ...... 4

2 Literature Review ...... 5

3 Design ...... 10

3.1 What I’m trying to do ...... 10

3.2 Criteria for success ...... 11

3.3 Environment ...... 11

4 Experimentation...... 12

4.1 Reading Audio ...... 12

4.2 Spectrograms ...... 13

4.3 Reading in wave files ...... 16

4.4 Graphical User Interface ...... 17

4.5 Databases ...... 17

4.6 Fingerprinting ...... 17

4.7 Other Factors ...... 18

5 Detailed Design ...... 20

5.1 Development Choices ...... 20

5.2 UML ...... 21

5.3 HCI Considerations ...... 25

5.4 Design ...... 27

5.5 Third Party Libraries ...... 28

5.6 Development Process ...... 29

6 Implementation ...... 30

6.1 Audio Readers ...... 30


6.2 Waveform Window ...... 30

6.3 Spectrogram Window ...... 31

6.4 Main window ...... 33

6.5 Basic Fingerprinting ...... 34

6.6 Basic Fingerprint Storing & Searching ...... 34

6.7 Augmented Fingerprint Storing & Searching ...... 35

6.8 Database Handling ...... 36

7 Testing ...... 37

7.1 Results ...... 37

8 Conclusions ...... 39

8.1 Looking to the future ...... 40

Bibliography ...... 41

Appendices ...... 43

Appendix A - Database SQL ...... 43

Appendix B – User Manual ...... 44

Appendix C – Gantt Chart ...... 46

Appendix D – Source Code Listing ...... 47


1 Introduction

Audio fingerprinting is the term used to describe methodologies used to process audio data to generate a representational hash, known as the audio fingerprint, based on the perceived sound rather than the data the audio is stored as. Doing so requires reading of audio files, taking them from their often compressed formats, transforming them back into a stream of time-based waveform amplitude data, transforming that into a series of chunks containing frequency data, finding defining features and generating the fingerprint based on their values in frequency, time and relation to each other.

There are currently a handful of well-known and robust audio fingerprinting algorithms developed by companies for a variety of uses. Most current algorithms are built for speed and only tend to use features found through spectral analysis of the audio to generate their fingerprints.

1.1 What I’m trying to do

For this project I aim to implement an audio fingerprinting algorithm in the style of many current methods and then develop an alternative version that takes into account other aspects of the music, which I can use to add to, or “augment”, the fingerprint, and which might aid recognition accuracy and speed.

The resulting software I produce should be able to process an audio file and, using either a basic or augmented method, generate a fingerprint that can either be stored in a database for later retrieval or searched for in a database attempting to find matched values.

1.2 Needs & Uses of Audio Fingerprinting

Audio fingerprinting has many applications in both the private business and public domains. The most popular use is through software developed by companies like Shazam, http://www.shazam.com/, and SoundHound, http://www.soundhound.com/, who both offer mobile solutions for audio fingerprinting that can be used through a mobile phone to identify music overheard in a public place such as a car or a café.

Gracenote, http://www.gracenote.com/, is another company that offers mobile software for audio identification, but it also has software available for digital media players, like iTunes and Winamp, so that music can be properly named, searched, grouped and identified. Gracenote can also be used for voice recognition and is currently in use with Ford’s SYNC in-car communications system (GRACENOTE, 2011).


Last.fm, http://www.last.fm/, uses audio fingerprinting to try and correct inappropriate music metadata when it “scrobbles” (the term last.fm has for when it makes a record of a user listening to a track) (LAST.FM, 2007).

Uses for audio fingerprinting extend beyond the general public managing music libraries and wanting to identify songs, however. Audio fingerprinting can be used by composers and artists to check that parts of songs and riffs they wish to use are not already copyrighted. Police and forensic investigators can, and do, use audio fingerprinting to identify stolen music on the computers of those suspected of copyright infringement. If a track of unreleased music is leaked to the public, music producers can use audio fingerprinting to identify the exact revision of the track, helping to establish when and by whom it was leaked. Similar to the police identifying files on a suspect’s computer, YouTube, http://www.youtube.com/, operates a system called Content ID (YOUTUBE, 2008) which in turn implements an audio fingerprinting system developed by Audible Magic, http://www.audiblemagic.com/. Content ID helps copyright owners identify videos on YouTube that might use either video or audio which they own. Content owners can then either monetise the uses through advertising, track the viewing metrics or remove offending videos.

You can also fingerprint various phonetic sounds, then by fingerprinting a stream of voice audio, do rudimentary voice recognition. The same can be done for individual voice samples to identify the speaker.

1.3 Current Methods

Most current methods tend to be either patented, with few technical details published, if developed in a country that allows software patents, or kept as company-specific trade secrets. However, many academic papers exist on the topic of identifying audio through audio fingerprinting, and some companies, like Shazam, have also produced white papers giving an overview of the algorithms they have created.

1.4 How I plan to improve

I plan to implement a fairly rudimentary fingerprinting algorithm. Once this algorithm is working I intend to experiment by adding in other music feature analysis, such as tempo or musical key, which I hope will narrow down the search field and reduce false positive matches.


2 Literature Review

In the past few years there has been significant interest in new ways to index and search the forms of media that many people now have on their personal computers. People often have hundreds to thousands of individual multimedia files, and indexing and searching through them to identify them all is a challenge. This is why multimedia fingerprinting was created: to give users more humanistic search controls and to improve indexing.

Common software-based methods to identify files focus on the raw data of the file, often hashing the bytes with an algorithm such as MD5. While this creates a mostly unique hash for each individual file, it doesn’t take into account the human perception of multimedia. Two photos might be visually almost indistinguishable, but if they differ in the hue of a single pixel, or are stored in different formats, then the MD5 hashes generated will reflect only the data, not the visual similarity. The data would differ in part or wholly, producing two non-comparable MD5 hashes.
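To make this concrete, the sketch below computes such a byte-level MD5 hash in Java; the file names are hypothetical, and the point is only that two files which sound or look identical will still produce unrelated hashes if any byte differs.

import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;

public class Md5Example {
    // Computes the MD5 hash of a file's raw bytes and returns it as hex.
    static String md5Of(String path) throws Exception {
        byte[] data = Files.readAllBytes(Paths.get(path));
        byte[] digest = MessageDigest.getInstance("MD5").digest(data);
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Two files that sound identical but differ in format, or by a single
        // byte, will produce completely unrelated hashes.
        System.out.println(md5Of("song.wav"));
        System.out.println(md5Of("song.mp3"));
    }
}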

Instead, using multimedia fingerprinting, the algorithm is based upon the perception of the data by the user. In the case of audio fingerprinting this is often accomplished by analysing the frequencies present in the audio and when they appear in the file, or when they appear in relation to each other. This is similar to a hashing function, but it means that though the physical data for two files might be wholly different, the fingerprint should be identical, or at least comparable by some method, if both files are perceived by the user as being the same or similar.

In a paper by Jaap Haitsma and Ton Kalker entitled “A Highly Robust Audio Fingerprinting System” (HAITSMA, Jaap and Kalker, Ton, 2002) they list the main parameters for an audio fingerprinting system as: robustness, how well the algorithm can identify audio with severe degradation and distortion; reliability, how often a song is incorrectly identified, known as a false positive; fingerprint size, how much storage is required for the fingerprint, since fast searching and storage rely on a small fingerprint; granularity, how long a piece of audio must be before it can be correctly identified; and finally search speed and scalability, how long it takes to find matches and how this is affected by the addition of more fingerprints to the database.


To produce a good audio fingerprint, then, the algorithm strives to be robust and reliable, to have small output, to require only a small sample of the audio, and to be, and remain, fast to search.

There are challenges to generating a fingerprint, however. In the lecture slides from Columbia University’s Electrical Engineering course written by Dan Ellis (ELLIS, Dan, 2011) he identifies channel noise (noise added by speakers and microphones), the goal of recognising short fragments, having a large (data-wise) reference item to search through, and false positives as problems to overcome when designing an audio fingerprinting solution.

The most common, and evidently the most robust, current algorithms for generating a fingerprint from audio rely on the idea of landmarks in the audio. The algorithm used by Shazam, http://www.shazam.com/, a widely used audio fingerprinting solution for mobile devices, is described in a paper by Avery Li-Chun Wang, who works for Shazam. The paper, entitled “An Industrial Strength Audio Search Algorithm” (WANG, Avery Li-Chun, 2003), details how they transform the audio data from the time domain, where the stream of bytes stores the amplitude of the audio wave, to the frequency domain, where the stream is compartmentalised into chunks of data and converted so that the individual frequencies present in each chunk are identified, and then use the frequency magnitudes and time data to compute the hash, as seen in Figure 1 overleaf.


Figure 1 - Image taken from (WANG, Avery Li-Chun, 2003)

In Fig. 1A they have converted the data into the frequency domain, where high magnitudes of certain frequencies stand out as darker areas on the graph. In Fig. 1B the algorithm has chosen the most important features from the graph, where importance is defined as a time-frequency point which has a higher magnitude than all points around it in a given area. While you could store the data from Fig. 1B in a database and search based on that, searching for matches would not be as efficient as it could be. Instead Shazam takes anchor points, as shown in Fig. 1C, and generates hashes based on the relation of each anchor point to other points in the “constellation map”, as shown in Fig. 1D.
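To make the idea concrete, the hash described in Wang’s paper combines an anchor point’s frequency, a target point’s frequency and the time offset between them into a single value. The Java sketch below packs these into one 32-bit integer; the field widths are my own illustrative assumption, not Shazam’s actual format.

public final class LandmarkHash {
    // Packs an anchor frequency bin, a target frequency bin and the time
    // offset between them into one 32-bit hash. The bit widths (10/10/12)
    // are illustrative assumptions, not the format Shazam uses.
    public static int pack(int anchorFreqBin, int targetFreqBin, int deltaTimeChunks) {
        return ((anchorFreqBin & 0x3FF) << 22)
             | ((targetFreqBin & 0x3FF) << 12)
             |  (deltaTimeChunks & 0xFFF);
    }
}

Storing each hash together with the anchor’s absolute time lets the search step check that a run of matching hashes also lines up consistently in time.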

Once Shazam has its database of hashes, a search can be conducted by finding correlations between the hash values of a sample and those stored in the database.

By doing this Shazam has created an algorithm that is fast enough to be processed on a mobile phone, only sending the hash data to Shazam’s servers for identification. Their method of finding important time/frequency features also takes into account the distortion and low-quality recording that mobile devices will add to the sample.


Shazam is not the only documented audio fingerprinting algorithm. Music Brainz, http://musicbrainz.org/, is a community service that has publicly available data on music files and services to manage and update metadata for music based on fingerprinting. By fingerprinting unknown music files on their computer with the software Music Brainz supplies, a user can find out the details of each piece of music.

The algorithm Music Brainz currently uses is called PUID which is unfortunately closed source and not well documented. However, their previous fingerprinting solution is open source and well documented.

The Future Proof Fingerprint, as it was called (evidently not all that future proof, as it has since been succeeded by the PUID technology), is proposed as a solution on a page available on the Music Brainz website (MUSIC BRAINZ, 2006), and a functional description giving details of a proposed implementation is also listed (SCHMIDT, Geoff, 2005).

The Future Proof Fingerprint operates in much the same way as Shazam, only it defines a “codebook” beforehand to which feature points from frequency domain data can be mapped, instead of Shazam’s method of finding relations between feature points.

By doing this the Future Proof Fingerprint algorithm generates what is essentially a highly compressed representation of the audio. Doing this for several overlapping samples of the audio, and “jittering” the sample window around when searching, means you should be able to reliably find matches despite timing distortions: a 10 second sample should simply be found as a match to 10 seconds in the middle of a 2 hour recording. Also, the data format for the fingerprints is kept open, allowing future modification without needing to reconstruct the entire database.

Every fingerprinting algorithm I came across during my research was based purely on frequency domain data being hashed using various algorithms and searched for. To get the data from the time domain to the frequency domain a transform is performed on the data called the Discrete Fourier Transform (DFT). The most widely used way to perform the DFT is a Fast Fourier Transform (FFT), an algorithm for computing the DFT and its inverse. The FFT computes the same result as evaluating the DFT directly, only in far fewer operations.

The FFT is not the only method that can be used to convert time domain data into the frequency domain. You can also use wavelets, “mathematical functions that cut up data into different frequency components, and then study each component with a resolution matched to its scale” (IEEE, 1995). While wavelet transforms might have offered better results for searching the spectrogram images for feature points, as wavelet transforms are often used in computer vision fields, I could only find one major publication that used wavelet transforms for audio fingerprinting.

“Content Fingerprinting Using Wavelets”, a paper by Shumeet Baluja and Michele Covell from Google (BALUJA, Shumeet and Covell, Michele, 2006), outlines how they use computer vision techniques to create compact fingerprints that can be matched efficiently. While the paper is fairly detailed, I had never come across wavelet transforms before this, and the lack of other fingerprinting methods using them meant that I ruled out wavelet transforms for my project.


3 Design

To increase the accuracy of audio fingerprinting there are two core ideas. The first is to take a bigger sample, or to extract more “unique” perceptual feature data from each sample. However, a bigger sample size, or attempting to extract more information from a normal-sized sample, might add distortion which overall pulls accuracy down, the opposite of the intended effect. Taking more data from a sample also increases the likelihood of false positive matches: while each larger sample has more capacity for variation, and therefore uniqueness, it also has more chance of matching another sample. With a larger sample size you are increasing the length of audio that you are attempting to recognise, but only generating one small hash per sample, so any variation is averaged out and lessened to the extent that quick variations can be lost where previously they would have been hashed and matched.

The second idea to increase fingerprinting accuracy is to take into account more than just pitch and time based audio perception when computing the hash and searching for matches. Instead you can add more unique factors into the search which, hopefully, cut down on search times while also reducing the chances of false positives. While two pieces of audio might have similar fingerprints due to noise interference, they might still be distinguishable by analysis to find the key the music was written in or the tempo. At the very least, by restricting fingerprint searches to within a set distance from another factor, such as tempo, you cut down the list of searchable items and reduce search times.

3.1 What I’m trying to do

I am going to attempt to implement a fairly rudimentary audio fingerprinting system and run it through a series of tests, described in further detail in the testing section of this report, to get a baseline set of results to compare to. Then, by adding other factors into my fingerprint definition and search functions, tempo, loudness or key for example, I can run the tests again to see if there is any statistically significant improvement in search time, accuracy or reduction of false positives.

I am not attempting to develop, or even implement, a fully robust fingerprinting solution that could compete with the commercial solutions currently available, as it would take far longer to research, implement and test a system comparable to Shazam or Gracenote.


3.2 Criteria for success

To test my software and determine whether I have been successful in attaining my goal of reducing false positive matches and increasing search speed, I will store fingerprints of music of various genres taken from lossless copies, so that I have a perfect master copy of the music to compare against. I will then degrade the music files, for example with lossy compression (such as converting the file to MP3), which cuts the frequency range, or by adding noise at various volumes, which makes musical features less reliable to find programmatically and therefore provides a good test of matching. With the degraded files I can fingerprint them using both methods and compare the search times and results, hopefully showing a statistically significant improvement for the augmented fingerprint in terms of reduced search time and fewer mismatches and false positives.

3.3 Environment

To develop the software for this project I decided to use Java as the development language, with detailed reasons outlined in the detailed design section later in this report. I developed primarily on my standard PC; this does, however, have a solid state drive, which helped with database insert/search times, and a four-core processor, which helps improve the speed of the multithreaded aspects of the system. While testing the system on Apple’s Mac OS X I did run into problems with larger audio samples, as Mac OS X limits Java instances to only a small amount of the available memory. This error could be averted by increasing the memory limit with the Java parameter -Xmx1024m.


4 Experimentation

Once I had the overall goal for my project I came up with some feasibility tests and benchmarks to check that the implementation was possible in the given timeframe, that it could be developed by me, and that my development methods were suitable.

Having never personally developed audio software this complex and low level before, I decided the first few benchmarks should find out how Java, my chosen language, handles audio data and how that data could be manipulated and processed.

4.1 Reading Audio

The first benchmark I did was to read in audio data from the microphone. While it could be argued that reading audio from a file would have been a better test, I was aiming to see how Java handled audio, not how file formats are handled, so I went with the simplest method to start with. It didn’t take me long to program a simple Java class to set up and read from the microphone, requiring only minimal searching and reading of the appropriate Java documentation. To test that this class worked correctly I made it record 10 seconds of audio from the default line-in device, in raw format at 44,100 samples per second. I wrote the resulting 441,000 samples to standard out, copied them manually and pasted the raw audio data into Microsoft Excel, using the line graph functionality to generate a waveform diagram.
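A minimal sketch of this kind of microphone capture using the Java Sound API is shown below, assuming 16-bit signed mono PCM at 44,100 Hz; the class name and buffer handling are illustrative rather than the exact code I wrote.

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

public class MicrophoneCapture {
    public static void main(String[] args) throws Exception {
        // 16-bit signed mono PCM at 44,100 samples per second, little-endian.
        AudioFormat format = new AudioFormat(44100f, 16, 1, true, false);
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
        line.open(format);
        line.start();

        // 10 seconds of audio = 441,000 samples = 882,000 bytes at 16 bits.
        byte[] buffer = new byte[441000 * 2];
        int read = 0;
        while (read < buffer.length) {
            int n = line.read(buffer, read, buffer.length - read);
            if (n < 0) break;
            read += n;
        }
        line.stop();
        line.close();

        // Convert byte pairs back into signed sample values and print them,
        // one per line, ready to paste into a spreadsheet for plotting.
        for (int i = 0; i + 1 < read; i += 2) {
            int sample = (buffer[i + 1] << 8) | (buffer[i] & 0xFF);
            System.out.println(sample);
        }
    }
}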

Figure 2 – Line graph of output drawn using Microsoft Excel

From the graph that Excel produced, shown in Figure 2, I determined that my code to read in the raw audio data was working correctly.


4.2 Spectrograms

Once I saw that I was correctly reading audio data from the microphone, the next experiment was to work through the basis of all signal processing: the FFT and spectrograms. Data read in from the microphone isn’t all that useful in its raw format, in the time domain, storing just the amplitude of a signal against time. Instead, for pretty much any signal processing based system, you need to find out what frequencies are present in the signal that you’re processing. This is vital to audio fingerprinting and so was a very important part of feasibility testing.

While I could have written my own objects and methods to convert time domain data into frequency domain data using the most common method, the Discrete Fourier Transform, I decided it would be far simpler, more reliable and a more effective use of my time to use an existing class. The best solution I found written for Java was from Princeton University’s Introduction to Computer Science resources on data analysis (SEDGEWICK, Robert and Wayne, Kevin, 2011).

Once data has been passed through the FFT function it can be drawn in a similar way to a waveform, only instead of showing the amplitude of a wave over time, it shows the magnitudes of the various frequencies present in each chunk of the time domain data.
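The sketch below shows the general shape of this step, assuming the Princeton Complex and FFT classes referenced above are on the classpath (a Complex constructor taking real and imaginary parts, Complex.abs() for magnitude, and FFT.fft() operating on power-of-two-length arrays); the structure is illustrative rather than my actual rendering code.

// Converts time-domain samples into per-chunk frequency magnitudes,
// ready to be drawn as a spectrogram or scanned for feature points.
public class SpectrogramData {
    public static double[][] magnitudes(double[] samples, int chunkSize) {
        int chunks = samples.length / chunkSize;
        double[][] result = new double[chunks][];
        for (int c = 0; c < chunks; c++) {
            Complex[] chunk = new Complex[chunkSize];
            for (int i = 0; i < chunkSize; i++) {
                chunk[i] = new Complex(samples[c * chunkSize + i], 0);
            }
            Complex[] spectrum = FFT.fft(chunk);
            // Only the first half of the bins is kept; for real input the
            // upper half mirrors the lower half.
            result[c] = new double[chunkSize / 2];
            for (int i = 0; i < chunkSize / 2; i++) {
                result[c][i] = spectrum[i].abs();
            }
        }
        return result;
    }
}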

Figure 3 – Spectrogram of harmonica tones

The early example seen in Figure 3 above (flipped 90° clockwise for space considerations) shows, on a linear (non-logarithmic) scale, the spectrogram being drawn as I played one note on a harmonica, then a higher-pitched note, and finally both notes together.

This spectrogram was the first indication I had that my code was working as expected, and it looked similar to the spectrograms presented in (WANG, Avery Li-Chun, 2003) which I included in the literature review as Figure 1.


To check the spectrograms further I first tried some real music with distinctive tones, Rusko – Woo Boost! from the album O.M.G!, to see if actual audio showed up as a recognisable image.

Figure 4 – Rusko – Woo Boost! Spectrogram

As you can see in Figure 4 above, this track has several, sometimes overlapping, frequencies that change pitch and peak in intensity at times. This showed me that the fingerprinting solution whereby I calculate key points from the spectrogram would indeed be possible as it’s plainly visible how the music is made up of the frequencies and how they alter over time.

To test the accuracy of the spectrogram code I developed, I decided to run through it some music where the artist has purposefully included frequencies that, when drawn in a spectrogram, show up as images. This is an aspect of steganography and not really within the scope of this project, but it would reassure me that the correct magnitudes for frequencies were being rendered.

Aphex Twin, an electronic musician and composer, released an album titled WindowLicker which includes, on its second track, a spectrogram image of his own face. Similarly, the track Look by Venetian Snares, from the album titled Songs about my cats, includes images of the musician’s cats.


Figure 5 – Aphex Twin Spectrogram

While you can make out the face of Aphex Twin in Figure 5 above, it appears distorted. The distortion arises because my spectrogram rendering algorithm uses a linear frequency scale, whereas the software Aphex Twin used to encode the image of his face into audio used a logarithmic frequency scale.

Figure 6 - Venetian Snares Spectrogram

There is some slight distortion in Figure 6, above, for the track by Venetian Snares, but it’s much less noticeable. The cats are easily visible in the first two parts, and the last part shows up very well. The middle two parts are less obvious to make out because of my linear frequency scale.

While it seems the FFT operation and spectrogram rendering code all work as I expected, there are some inherent flaws. Due to harmonics in the audio you can get reflections being shown in the upper portions of the spectrogram. However, the reflections are only half as bright magnitude-wise, so they shouldn’t throw off my feature-finding algorithms. Also, due to MP3 compression, most music doesn’t extend into the upper half of the human hearing spectrum, so that region can be ignored by setting an upper frequency limit later on when finding feature points.

As the FFT is built around sine waves, and square waves can be expressed as a sum of sine waves via the Fourier series, rendering a spectrogram of a square wave that sweeps from 50 Hz to 25,000 Hz makes the sine-wave harmonics that represent the square wave very evident, as seen in Figure 7.

Figure 7 - Square Wave Sweep Spectrogram

4.3 Reading in wave files

Reading audio data from the microphone was a relatively simple task, but it would not suit the use of the final software that I hoped to develop. I wished instead to generate various versions of a file with varying levels of degradation to use as my test data. While I could have played these files through speakers and into the microphone, that would add an extra, uncontrollable, level of distortion to the test data, so I deemed it a better idea to write code that could read the audio data directly from audio files.

I originally intended to support various audio file formats, as then I could later easily compare fingerprints generated from an audio file saved in “lossy” formats, like MP3, where compression destroys some of the original signal, and lossless formats, like FLAC, which store the whole signal. Java makes this task fairly difficult, and MP3 being a proprietary format means that there is little support beyond playing the audio. Instead I tried the Java Sound API, which could supposedly handle files in raw wave format. However, after a few days of unrecognisable waveforms and errors I decided to pursue another avenue. A brief search for alternative ways to read in wave files turned up a class developed by Dr. Andrew Greensted as a simple, working alternative (GREENSTED, Dr. Andrew, 2010). This worked straight away, though it is heavily modified and cut down in my implementation, as many methods, such as those for writing, were not used. I also changed it to implement an audio reader interface so that, should I add support for other audio file formats, it will be simple to extend the rest of the software.

4.4 Graphical User Interface

My only previous use of Java was while on placement, where I worked on a large servlet system that had no user interface other than through a web connection. I also chose to use the Eclipse IDE, which doesn’t come with any inbuilt, standard GUI development tools. While I could have written the code to generate the GUI manually, as a first-time Java GUI developer I decided it would be easier to have instant visual feedback and a visual toolkit to help me design and manage the interface. A search on the internet returned Jigloo (CLOUDGARDEN, 2010) as the best plugin for Eclipse for Swing, the API Java provides for graphical user interfaces.

4.5 Databases

My only previous experience with Java and databases was between Java and Oracle while I was on placement. While Oracle would be a valid option for the database system used to store the fingerprints, I decided that the overhead required for an Oracle database system was too large. For this system I ended up using SQLite, with the detailed design considerations behind this being explained in the detailed design chapter later in this report.

SQLite doesn’t have an official binding for Java, but thankfully someone had implemented a cross platform java API which I was able to use (ALM WORKS, 2010).

To test the performance of the library, Java and SQLite together, I wrote a test class in Java to create a database with one table, insert rows containing words, and then select the rows back, printing the words to standard out. This worked first time and was relatively simple to accomplish with the library. Based on this test I found the library suited my needs and I could carry on with development.
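The smoke test looked roughly like the sketch below; I am assuming the sqlite4java classes SQLiteConnection and SQLiteStatement as documented by ALM Works, so the exact calls may differ slightly from the test class I actually wrote.

import java.io.File;
import com.almworks.sqlite4java.SQLiteConnection;
import com.almworks.sqlite4java.SQLiteStatement;

public class SqliteSmokeTest {
    public static void main(String[] args) throws Exception {
        SQLiteConnection db = new SQLiteConnection(new File("test.db"));
        db.open(true); // create the database file if it does not exist

        db.exec("CREATE TABLE IF NOT EXISTS words (word TEXT)");

        // Insert a few rows using a prepared statement.
        SQLiteStatement insert = db.prepare("INSERT INTO words (word) VALUES (?)");
        for (String w : new String[] {"audio", "fingerprint", "test"}) {
            insert.bind(1, w);
            insert.step();
            insert.reset();
        }
        insert.dispose();

        // Select them back and print each word to standard out.
        SQLiteStatement select = db.prepare("SELECT word FROM words");
        while (select.step()) {
            System.out.println(select.columnString(0));
        }
        select.dispose();
        db.dispose();
    }
}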

4.6 Fingerprinting

To check that I would be able to develop an algorithm in Java to generate a fingerprint, I decided to test how well I could find the feature points in the audio and how easy it was to compare these to a database of other points.


To do this experiment I duplicated the spectrogram rendering class and modified it to display the feature points for later comparison. To identify the feature points I went with a simple banded high-score system, where the frequency domain data is split into groups, or bands, of frequencies. Each band is looped through to find the frequency with the highest magnitude; this high-magnitude frequency is deemed notable and becomes my feature point.
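A sketch of this banded high-score idea is shown below, operating on the per-chunk magnitude array produced by the FFT step; the band boundaries and the minimum-magnitude threshold are illustrative values rather than the ones used in the final system.

public class BandedPeaks {
    // Upper bin index (exclusive) of each frequency band; illustrative values.
    private static final int[] BAND_LIMITS = {10, 20, 40, 80, 160, 512};
    private static final double MIN_MAGNITUDE = 5.0; // ignore near-silent bands

    // Returns the highest-magnitude frequency bin in each band for one chunk,
    // or -1 where the band never rises above the minimum magnitude.
    public static int[] featurePoints(double[] chunkMagnitudes) {
        int[] peaks = new int[BAND_LIMITS.length];
        int bandStart = 0;
        for (int b = 0; b < BAND_LIMITS.length; b++) {
            int peakBin = -1;
            double peakMag = MIN_MAGNITUDE;
            for (int bin = bandStart; bin < BAND_LIMITS[b] && bin < chunkMagnitudes.length; bin++) {
                if (chunkMagnitudes[bin] > peakMag) {
                    peakMag = chunkMagnitudes[bin];
                    peakBin = bin;
                }
            }
            peaks[b] = peakBin;
            bandStart = BAND_LIMITS[b];
        }
        return peaks;
    }
}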

While this is a simpler method than the more robust Shazam system, which checks around each feature point to help stop noise throwing off the recognition, the banded high-score method worked well enough and produced roughly similar results.

To test if the feature points were comparable after distortion I saved two images of the spectrograms showing the feature points. With post processing these images could be combined using an image editing tool to remove all but the feature points, and overlay them in such a way as to check accuracy. An example is given in Figure 8 where red pixels are the original file, blue are the degraded file and green is where they overlap exactly. Ideally it should be nothing but green pixels, but you can see how the noise has changed where feature points are found in bands where there was not a major feature before.

Figure 8 – Comparing feature points

4.7 Other Factors

As well as a method to fingerprint the audio, I also needed the optional fingerprinting method whereby I would discern other factors from the audio sample. Figuring out tempo, key and loudness is outside the scope of this report, so instead of developing it myself I used a well-known, reliable third-party API from The Echo Nest (http://the.echonest.com/), which I can access through a Java library (http://code.google.com/p/jen-api/).

“The Echo Nest is a music intelligence company that powers smarter music applications for a wide range of customers -- including MTV, The BBC, MOG, Thumbplay, Warner Music Group and a community of about 7,000 independent app developers.” (THE ECHO NEST)

I first heard of The Echo Nest when I attended a weekend developer event in London, called the London Music Hack day, http://london.musichackday.org/2010/, where developers get together and for 24 solid hours work on developing small projects that have something to do with music. When I attended in September 2010 The Echo Nest CEO and some developers had flown in especially for the event from their offices in Boston, USA. They gave a talk describing and demonstrating some of the features of their API, which I noted down at the time as possibly being of use for this project.

After registering for a developer account with The Echo Nest, and downloading the java library, I ran a few files through their system to check what extra data could be gathered and to see how accurate it was.

While the results take quite a while to return, they give me information on the time signature, tempo, key and loudness, as shown in Figure 9 below.

Figure 9 - Echo Nest Analysis

I can then use that extra metadata that The Echo Nest has computed to augment my search for matches, hopefully getting me the result I’m after.
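The calls through the jEN library looked roughly like the sketch below. The class and method names (EchoNestAPI, uploadTrack, waitForAnalysis and the TrackAnalysis getters) are my recollection of that library’s API and should be treated as assumptions, as should the placeholder API key and file name.

import java.io.File;
import com.echonest.api.v4.EchoNestAPI;
import com.echonest.api.v4.Track;
import com.echonest.api.v4.TrackAnalysis;

public class EchoNestCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder key; a real developer API key is required.
        EchoNestAPI en = new EchoNestAPI("YOUR_API_KEY");

        // Upload the file (skipped server-side if its MD5 is already known)
        // and wait for the analysis to finish.
        Track track = en.uploadTrack(new File("woo-boost.wav"));
        track.waitForAnalysis(30000);

        if (track.getStatus() == Track.AnalysisStatus.COMPLETE) {
            TrackAnalysis analysis = track.getAnalysis();
            System.out.println("Tempo:          " + analysis.getTempo());
            System.out.println("Key:            " + analysis.getKey());
            System.out.println("Mode:           " + analysis.getMode());
            System.out.println("Time signature: " + analysis.getTimeSignature());
            System.out.println("Loudness:       " + analysis.getLoudness());
        }
    }
}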


5 Detailed Design

5.1 Development Choices

To develop all of the software for this project I decided to use the Java programming language coupled with the Eclipse IDE.

When considering what language to develop in for this project I came up with a list of languages that I already had some knowledge of and previous experience with, and ranked them according to my familiarity, suitability for digital signal processing, and availability of third-party libraries and general support. From this the top five languages were:

1. Python
2. Java
3. C++
4. Ruby
5. C

From this list I disregarded Python because, while there are many libraries to facilitate audio processing, it is cross-platform, and I have developed MIDI software in Python before at the London Music Hackday 2010, it is not fast enough to process large raw audio files and compute fingerprints in a reasonable time. While Python can be compiled down to native code, this isn’t totally reliable and isn’t really a solution to the problem. Also, a lot of the Python libraries dealing with audio data are written in C, so using Python would be largely pointless aside from some of the high-level support it offers.

C++ has a lot of support and libraries available, and can be used for cross-platform development. However, I am not experienced enough in it to commit to developing this project in C++.

Ruby is one of my favourite languages, and does have support for audio processing, but it falls into the same pitfalls as Python with regard to speed and unneeded high-level features.

C seemed like it could be a better choice due to its low-level support, which would help make for optimised audio data handling and processing. However, some of the higher-level work to be done, such as image rendering and graphical user interfaces, is not so simple in C. While I already know C fairly well, with the time constraints of the project and the scope of development I wanted to undertake, it did not seem like a better choice than Java.


Java has a lot of support and libraries available, runs around the same speed as C, gives you easy access to low level constructs where needed and also handles high level code, like image rendering and Graphical User Interfaces, with ease. While Java isn’t my strongest language with barely a year of use and almost entirely self-taught, it came out as the best choice to develop this project in.

Once I had decided to use Java I looked at the IDEs available to support my development. I had only used JDeveloper (http://www.oracle.com/technetwork/developer-tools/jdev/overview/index.html), Eclipse (http://www.eclipse.org/) and NetBeans (http://www.netbeans.org/) before. While I could have tried out a new IDE, or gone low level and just used a text editor and the command line to compile and debug my code, I decided to go with Eclipse, as I find it has all the tools I needed for the project while not being slowed down by other features, like servlet containers and Oracle database support, which weren’t needed for this project.

Speaking of databases, my project would also require a database of some form so that I could store and search for fingerprints and track details. The options that sprang to mind were MySQL (http://www.mysql.com/), Oracle (http://www.oracle.com/us/products/database/index.html), PostgreSQL (http://www.postgresql.org/) and SQLite (http://www.sqlite.org/). MySQL and PostgreSQL have similar features and are both suitable. Oracle might be free for this use, but requires a fairly complex setup; while it has the most features, it offers nothing this project requires that another database system cannot fulfil, and it requires the overhead of running the Oracle server. SQLite has enough features for use in this system, and among MySQL, PostgreSQL and SQLite it stood out as the best choice due to its small size, efficiency, available Java libraries and documentation. As this system doesn’t require a server/client setup, a capacity for high volumes of connections and queries, a large dataset or high concurrency, SQLite is fine. MySQL and PostgreSQL would also suit the job, but require a lot more processing power and are made for purposes more complex than my uses.

5.2 UML

A common method of modelling software during the design stage is to draw UML diagrams. For my system there is a relatively simple use case diagram, as the software is fairly compartmentalised and doesn’t link to anything external besides the user and The Echo Nest. Figure 10 is a use case diagram showing an overview of the system’s functionality.

Figure 10 – UML – Use Case

UML also has a type of diagram to help developers design how processes interact with each other, a sequence diagram. I designed three of these to help me refine the sequence of operations that occur when a user fingerprints a file, searches for it in the database or stores it in the database.

Figure 11 on the next page is the sequence diagram outlining the fingerprinting process for both normal and augmented fingerprints.


Figure 11 - UML - Fingerprinting

The next diagram, Figure 12, outlines the sequence of operations to store a fingerprint. This is common for both basic and augmented fingerprints as the extra data for the augmented fingerprint is stored along with the regular track information.


Figure 12 - UML - Store Fingerprint

The next sequence diagram, Figure 13, shows an overview of searching for matches, assuming the fingerprint object was already created by following through the fingerprint process.

Figure 13 - Searching Fingerprints


I also made a class diagram, Figure 14, outlining the classes, their methods and members, and how they relate to each other. The relationships mostly revolve around the WindowMain class, which creates instances of many of the other classes, but the main focus is the Fingerprint interface, which the fingerprint classes implement, the fingerprinter classes generate instances of, and the main window stores.

Figure 14 - Class Diagram

5.3 HCI Considerations

Part of the software I plan to develop is an interface through which people can see the fingerprint and results simply, and which also has options for viewing waveforms and spectrograms. The interface wouldn’t have to display much data, but I still had to think about the form and control design from an HCI standpoint.

With HCI there are eight “Golden Rules” of interface design (SHNEIDERMAN, Ben, 1998):

1. Consistency
2. Shortcuts
3. Informative feedback
4. Design dialogs to give closure
5. Error prevention and simple error handling
6. Permit easy reversal of actions
7. Support internal locus of control
8. Reduce short term memory load

For my simple interface, consistency was simple to achieve. Ensuring that loading a new file wipes any actions performed on previously loaded files, together with static menus, labels and theme, meant that the user interface always stayed consistent with what the user would expect.

Due to the few controls on offer, shortcuts were simple to implement. Adding keyboard shortcuts, like Ctrl+O to load a file, makes it easier and faster for frequent users to accomplish the task at hand.

Informative feedback will be displayed wherever needed. Some tasks, like rendering a spectrogram or adding fingerprint data to the database, can take longer than the user may expect. By displaying the current status of the underlying process through the window title bar or a label on a form, the user will have feedback that their action is being worked on.

Dialogs should yield closure by either alerting the user via alert box when an action is complete or by updating the interface as it works. In the case of the visual audio representation windows, waveform and spectrogram drawing, I can update the display as it renders.

Error prevention and handling will be done throughout. Making use of Java’s try/catch structure I can catch any errors that do arise, attempt to salvage the current operation and alert the user to the problem.

As the only thing to be changed by the user is the database when a fingerprint is stored in it, there are no features that might require reversal. While it could be argued that I should make it easy for a user to remove the fingerprint data from the database, I didn’t feel it was necessary for this project.

Locus of control remains solely in the user’s hands for this system. The only time control is taken away is when the interface is busy using all processing power to compute operations on the audio data; otherwise the user has full control.


All information that might be required by a user for this system will be displayed on the main window. That way items like the file location that is currently being fingerprinted or the fingerprint for that file are displayed and not required to be remembered.

Based on these decisions, Figures 15 and 16 show a prototype user interface design for the main window. I decided the waveform and spectrogram windows didn’t need any design from an HCI standpoint, as there are no controls, just a display.

Figure 15 - HCI - Main Window

Figure 16 - HCI - Menus

5.4 Database Design

The database design for this project was relatively simple. All I needed to store was a set of data for the fingerprint itself, associated with a track so I could get details about the original piece of music. I also decided to add a table to give information about the fingerprint run, such as the file path it was originally made from and the time it was created.

For the fingerprinting method I decided to implement, the fingerprint is stored as the maximum-magnitude frequency in each of a range of frequency bands. The print_data table therefore stores the value for each band, the chunk number it was taken from (so I can search for successive parts of the fingerprint), the track id (so I can relate the row to the tracks table) and the print_run_id (so I can relate it to the fingerprint run).

Figure 17 - Entity Relationship Diagram

Figure 17 is an Entity Relationship Diagram I designed. The SQL source to create the database based on this diagram is listed in Appendix A at the end of this report.

5.5 Third Party Libraries

To analyse the audio samples and compute the other factors needed for my augmented fingerprint I had two options. One was to develop the analysis code myself, which would alone have been enough to warrant one or several final year project reports. This left me with the other option of using a third party library.

There aren’t many libraries publicly available for this kind of audio analysis, and the choices that suited this project were The Echo Nest (THE ECHO NEST) and VAMP (QUEEN MARY, UNIVERSITY OF LONDON, 2007). While VAMP is a little more technical and has specific audio analysis plugins, it only has wrappers for C++ and Python, which ruled it out for my use in Java unless I created a Java library to wrap it. The Echo Nest, while having less of a focus on audio analysis, does offer some analysis features through the use of their servers, and there was also a Java library that let me use their API. Therefore, The Echo Nest was the third party audio analysis service I went with.


5.6 Development Process

To develop the software for this project I looked at various development processes to find out if any of them would be suitable. Almost all of the standard development methodologies, such as waterfall or the v-process, are designed with large teams of developers, user-based requirements and deployment procedures in mind. My project had no user requirements to capture, however, aside from my own objectives. I was also the sole developer, so I didn’t need to plan ahead to make sure people weren’t waiting on my code to be finished or checked in to a source code repository. Therefore most of the generic model of the Software Development Life Cycle (requirements capture, design, build, test and implementation) did not apply to me.

Instead of going for the simple, but often error-prone, route of the build-and-fix model I opted to develop using the Extreme Programming method. Extreme Programming was, as its creator Kent Beck said in his book “Extreme Programming Explained: Embrace Change”, “developed to address the specific needs of software development conducted by small teams in the face of vague and changing requirements” (BECK, Kent, 1995). This suited my needs: while the design is done and the requirements known, I anticipated many changes to the requirements as I progressed through the project and needed a development method that lets me stay agile.

In an article written by Microsoft they highlight that “In Extreme Programming, rather than designing whole of the system at the start of the project, the preliminary design work is reduced to solving the simple tasks that have already been identified” (MICROSOFT CORPORATION, 2005). They also note that Extreme Programming should be used when the developers are making small, incremental releases with a very test-driven development process, are working with new technologies and have a small team size, all of which suit me and this project.


6 Implementation

To develop the software I split the work into several key tasks, basing my decisions on the results of my earlier experimentation and following the development principles laid out by the Extreme Programming model, which calls for small, incremental releases that start by building the core of the system and expanding it to meet requirements as it is developed.

6.1 Audio Readers

The first task was to develop a robust system for reading audio files in. While I only wanted to implement the reading of wav files for this project, I knew that I might have time to expand the system later to support other formats. With expansion and future work in mind I decided to create an interface called AudioReader. This interface defines the common methods that would be required to read any audio format, such as reading into a buffer, getting information about the file and closing it. As developing a class to read in wav file data wasn’t in the scope of the project, and the code Java provided didn’t work reliably, I used a third-party class (GREENSTED, Dr. Andrew, 2010), which I modified to implement the interface I defined. I cut out a lot of the overloaded methods in the third-party class which I didn’t require, and modified those that were left to fit my interface’s requirements. I also took the exception class that the wav reader class used and made it generic, so that any future audio reader classes can throw exceptions in the same way and be caught by the code that uses them.
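A minimal sketch of what such an interface might look like is shown below; the method names and the generic exception type are illustrative, based on the description above rather than the real source.

// Common contract for reading audio data from any supported file format.
// Method names here are illustrative, based on the description above.
public interface AudioReader {
    /** Opens the underlying file and reads its header. */
    void open(java.io.File file) throws AudioReaderException;

    /** Reads up to buffer.length samples, returning the count read or -1 at end of stream. */
    int readSamples(double[] buffer) throws AudioReaderException;

    /** Sample rate of the audio in Hz, e.g. 44100. */
    long getSampleRate();

    /** Number of channels, e.g. 1 for mono, 2 for stereo. */
    int getNumChannels();

    /** Releases the underlying file handle. */
    void close();
}

// Generic exception so every reader implementation can fail in the same way.
class AudioReaderException extends Exception {
    public AudioReaderException(String message, Throwable cause) {
        super(message, cause);
    }
}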

Once the audio reading was implemented and tested by making it return the amplitude data to standard out, the next task was to re-implement the waveform and spectrogram display. While these displays are not required for the operation of the system they provide very useful information about the workings of the system and are a great view when debugging the fingerprinting methods later.

6.2 Waveform Window

The waveform window is called from the main window, being passed, by reference, the buffer holding the audio data and the maximum amplitude of all the samples. While I should, arguably, be passing just the file reference and doing a buffered read inside the waveform drawing routine to cut down on long term memory use, it was simpler to do it this way for the small audio samples I am dealing with.


The waveform window then constructs a new window and calls the draw routine. The draw routine sets up the Graphics2D object which is used to draw to a buffered image, which is put inside a JLabel in the JFrame window. After working out how many samples need to be consumed per vertical line of window space, the routine starts drawing: for every vertical one-pixel line in the image, a window of amplitude samples is taken and lines are drawn between them. While the loops are running to draw the data onto the buffered image object, a new thread is instantiated which calls the update method every 50 ms. The update method puts the buffered image in the JLabel, clears the JFrame, puts the JLabel back in and forces a redraw of the JFrame. This makes for a real-time display as the waveform is drawn. Figure 18 is an example of the waveform window displaying the waveform for the first 30 seconds of Rusko – Woo Boost!.

Figure 18 - Waveform Window
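A simplified sketch of the per-column drawing is shown below. It condenses each column’s window of samples into a single vertical line spanning the minimum and maximum amplitude, which is a simplification of the routine described above, and it omits the 50 ms repaint thread.

import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class WaveformRenderer {
    // Draws one vertical line per pixel column, spanning the minimum and
    // maximum amplitude found in that column's window of samples.
    public static BufferedImage render(double[] samples, double maxAmplitude,
                                       int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.GREEN);

        int samplesPerColumn = Math.max(1, samples.length / width);
        for (int x = 0; x < width; x++) {
            double min = 0, max = 0;
            int start = x * samplesPerColumn;
            for (int i = start; i < start + samplesPerColumn && i < samples.length; i++) {
                min = Math.min(min, samples[i]);
                max = Math.max(max, samples[i]);
            }
            // Scale amplitudes into pixel coordinates around the vertical centre.
            int yTop = (int) (height / 2 - (max / maxAmplitude) * (height / 2.0));
            int yBottom = (int) (height / 2 - (min / maxAmplitude) * (height / 2.0));
            g.drawLine(x, yTop, x, yBottom);
        }
        g.dispose();
        return image;
    }
}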

6.3 Spectrogram Window

The spectrogram window operates in much the same way, only before calling the draw function it converts the time domain amplitude data into frequency domain data. To do this it fills a buffer of complex number objects, the Complex class coming from (SEDGEWICK, Robert and Wayne, Kevin, 2011), which can then be run through the FFT function, the FFT class also coming from (SEDGEWICK, Robert and Wayne, Kevin, 2011). With the chunks of data now in the frequency domain, the two-dimensional array of frequency data is passed to the draw function. The draw method sets up the Graphics2D object again, but this time loops through each chunk of processed data. For each chunk it runs through each frequency to find the feature points which the fingerprinting classes will later be based around. Then it draws a 5x1 pixel rectangle for each frequency group in the chunk onto the buffered image, using full red as the colour for a feature point, dark green for a frequency band boundary, or a blue saturation relative to the magnitude of the frequency. Figure 19 is the spectrogram window for the first 30 seconds of Rusko – Woo Boost!.


Figure 19 - Spectrogram Window

A zoomed in section from the above spectrogram, showing the frequencies from the audio in detail as well as highlighting the feature points in red is shown below in Figure 20.

Figure 20 - Zoomed In Spectrogram


6.4 Main window

Now that I had some visual feedback of the audio data I created the main window. Based on the diagrams I designed after the HCI considerations laid out in the detailed design chapter earlier in this report, I developed a simple window with a menu layout for controls, shortcut keys and plenty of area for feedback to be displayed. I designed the interface using a GUI builder plugin for the Eclipse IDE (CLOUDGARDEN, 2010). I also used and modified icons for the interface (JAMES, Mark, 2005) (TANGO DESKTOP PROJECT, 2009). A screenshot of the main window is shown in Figure 21 below, along with examples of the menu content.

Figure 21 - Main window and menus


6.5 Basic Fingerprinting

With the interface work done, the next task was the initial fingerprinting algorithm. I wasn’t striving for the perfect fingerprinting algorithm, just a fairly robust and simple method that could identify audio. I referenced my research from the literature review and saw that all the methods found feature points and based their fingerprints around those. The simplest method to find feature points was, after splitting the audio buffer into chunks and running them through the FFT code, to split the chunks up into bands of frequencies. By having several bands of frequency I could get standard, and hopefully repeatable, features found in samples. For each band I find the frequency with the highest magnitude and take that as my feature point. While this can be susceptible to error when noise occurs in a previously empty band, I added a lower magnitude limit to try to reduce this problem.

With the feature points stored as frequency values per band for each chunk, I could have used the Shazam method of treating a feature point as an anchor and calculating vectors between it and other feature points, but instead I opted to just store the frequency values per band for each chunk. This is inefficient database-wise, as it means storing multiple values per chunk, but it is simpler and let me see exactly what was happening as the fingerprint was created. It also makes the search algorithms slightly simpler, though it means slower searches as there is more data to search. To implement this in code I created a class, FingerprinterBasic.java, with a static method to generate the fingerprint, as there was no need to consider concurrency: the system can only fingerprint one file at a time. The fingerprint method splits the audio buffer into chunks, computes the FFT and finds the feature points for the bands, which are added to an array of results. This array is then passed to the constructor of the BasicFingerprint.java class. The BasicFingerprint class implements a common interface for fingerprint methods which the basic and augmented fingerprint classes share. It stores the results for a fingerprint and contains methods to display the fingerprint as a string, search for the fingerprint in the database or store the fingerprint in the database.
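The structure described above can be summarised in the sketch below; the interface’s method signatures are inferred from the behaviour described (display as a string, store, search), so the real code will differ in detail.

import java.util.List;
import java.util.Map;

// Shared contract for the basic and augmented fingerprint classes. The
// exact signatures are inferred from the behaviour described above.
interface Fingerprint {
    String asString();               // display the fingerprint as text
    boolean store();                 // write the fingerprint to the database
    Map<String, Integer> search();   // track details mapped to a match score
}

// Minimal skeleton of the basic fingerprint: one array of per-band peak
// frequency bins for every chunk of audio.
class BasicFingerprint implements Fingerprint {
    private final List<int[]> bandPeaksPerChunk;

    BasicFingerprint(List<int[]> bandPeaksPerChunk) {
        this.bandPeaksPerChunk = bandPeaksPerChunk;
    }

    @Override
    public String asString() {
        StringBuilder sb = new StringBuilder();
        for (int[] peaks : bandPeaksPerChunk) {
            sb.append(java.util.Arrays.toString(peaks)).append('\n');
        }
        return sb.toString();
    }

    @Override
    public boolean store() {
        // Database insert logic, described in section 6.6.
        throw new UnsupportedOperationException("outlined in section 6.6");
    }

    @Override
    public Map<String, Integer> search() {
        // Database search logic, described in section 6.6.
        throw new UnsupportedOperationException("outlined in section 6.6");
    }
}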

6.6 Basic Fingerprint Storing & Searching

The store function in the BasicFingerprint class can then be called on the fingerprint instance, which is kept by the main window after fingerprinting. To store a fingerprint the user is prompted for the track details; provided the user enters the details and doesn’t click cancel, the function creates a new row in the print_runs table to store the file location being fingerprinted along with the time. It then stores a new row in the tracks table holding the track artist, title and album name. Once the track is set up, it loops through every chunk of result data and stores the values for each band in a column to be searched for later. Provided no SQL errors are thrown and the user provided the track details, the function returns true to let the main window know the store was successful.

The search function in the BasicFingerprint class can also be called once a fingerprint has been generated. The search method implemented is currently very crude, but works surprisingly well all the same. It loops through each chunk of fingerprint data, looking for rows in the print_data table whose values are within certain limits for each band. If any chunk matches one from the database it retrieves the track_id of the matched row and increments a value in a score map. The score map is later iterated through to get the track details and populate a hash map, which is returned to the main window for display.
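The chunk-matching query looks something like the sketch below; the band column names, the number of bands and the tolerance are assumptions for illustration, with the real schema given in Appendix A.

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import com.almworks.sqlite4java.SQLiteConnection;
import com.almworks.sqlite4java.SQLiteStatement;

public class ChunkSearch {
    private static final int TOLERANCE = 2; // allowed drift per band, in bins

    // Scores tracks by how many chunks of the sample fingerprint fall within
    // the tolerance of a stored chunk. Four bands are shown for brevity and
    // the column names are illustrative; the real schema is in Appendix A.
    public static Map<Integer, Integer> scoreTracks(SQLiteConnection db,
                                                    List<int[]> chunks) throws Exception {
        Map<Integer, Integer> scores = new HashMap<Integer, Integer>();
        SQLiteStatement st = db.prepare(
                "SELECT track_id FROM print_data"
                + " WHERE band0 BETWEEN ? AND ? AND band1 BETWEEN ? AND ?"
                + " AND band2 BETWEEN ? AND ? AND band3 BETWEEN ? AND ?");
        for (int[] bands : chunks) {
            for (int b = 0; b < 4; b++) {
                st.bind(b * 2 + 1, bands[b] - TOLERANCE);
                st.bind(b * 2 + 2, bands[b] + TOLERANCE);
            }
            while (st.step()) {
                int trackId = st.columnInt(0);
                Integer current = scores.get(trackId);
                scores.put(trackId, current == null ? 1 : current + 1);
            }
            st.reset();
        }
        st.dispose();
        return scores;
    }
}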

6.7 Augmented Fingerprint Storing & Searching

Now that I had it fingerprinting files, I duplicated the FingerprinterBasic and BasicFingerprint classes, renamed basic to augmented and added the code to augment the fingerprint. To do this I added a block of code to the end of the fingerprint method in FingerprinterAugmented.java which creates a new Echo Nest API instance and uploads the track to The Echo Nest servers for analysis. The Echo Nest API also hashes the file with a normal MD5 hash to check whether the file needs uploading; if the file has already been uploaded, it uses the previous analysis results to save upload time. Once the track has been uploaded, or identified as having been uploaded previously, a new TrackAnalysis instance is created which holds all the values I use for augmenting the audio fingerprint.

The TrackAnalysis instance is then used to create a HashMap which is passed into the constructor of AugmentedFingerprint alongside the normal fingerprint results, so that AugmentedFingerprint can later store and search using the values The Echo Nest has derived. I stored the values in a HashMap here, instead of passing a reference to the TrackAnalysis object through, because calls to get analysis values on the TrackAnalysis object go to The Echo Nest servers each time. By caching the values during the fingerprint generation stage, where time is not critical, I save time later when searching for the fingerprint.
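A sketch of this caching step is shown below. The TrackAnalysis type is stood in for by a local interface and the getter names are assumptions about the Echo Nest Java client; the point is simply that each value is fetched once at fingerprint time and reused from the map afterwards.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of caching the analysis values once at fingerprint time. The TrackAnalysis
// type is stood in for by a local interface and the getter names are assumptions about
// the Echo Nest Java client; the point is only that each value is fetched a single
// time and then reused from the map during searches.
public class AnalysisCacheSketch {

    // Stand-in for the Echo Nest TrackAnalysis object used in the project.
    public interface TrackAnalysis {
        double getTempo();
        int getKey();
        int getMode();
        int getTimeSignature();
        double getLoudness();
        double getDuration();
    }

    public static Map<String, Double> cache(TrackAnalysis analysis) {
        Map<String, Double> values = new HashMap<>();
        values.put("tempo", analysis.getTempo());
        values.put("key", (double) analysis.getKey());
        values.put("mode", (double) analysis.getMode());
        values.put("time_signature", (double) analysis.getTimeSignature());
        values.put("loudness", analysis.getLoudness());
        values.put("duration", analysis.getDuration());
        return values;   // passed to the AugmentedFingerprint constructor with the band results
    }
}
```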

AugmentedFingerprint.java operates in much the same way as BasicFingerprint.java, as both implement the same Fingerprint interface. The augmented fingerprint class, however, now inserts all the data it has about the track when it stores the row in the tracks table.

When searching for a fingerprint it also checks that the augmented values in the tracks table associated with the current print_data row are within range of the current fingerprint's values, provided The Echo Nest returned a high enough confidence value in its analysis. If they are, it operates a scoring system in the same way as the basic fingerprint search method, and the result is a hash map of track information to score, returned to the main window for display.
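The sketch below shows one way the extra conditions could be built up; the 50% confidence threshold mirrors the description above and the column names follow Appendix A, but the exact SQL used in the project may differ.

```java
// Sketch of how the extra augmented conditions could be built up before scoring. The
// 50% confidence threshold mirrors the description above and the column names follow
// Appendix A, but the exact SQL used in the project may differ.
public class AugmentedSearchSketch {

    // Conditions appended to the per-band BETWEEN clauses of the basic search when
    // The Echo Nest reports enough confidence in each analysed feature.
    public static String augmentedConditions(double keyConfidence,
                                             double timeSigConfidence,
                                             double tempoConfidence) {
        StringBuilder where = new StringBuilder();
        if (keyConfidence >= 0.5) {
            where.append(" AND tracks.key = ?");
        }
        if (timeSigConfidence >= 0.5) {
            where.append(" AND tracks.time_signature = ?");
        }
        if (tempoConfidence >= 0.5) {
            where.append(" AND tracks.tempo BETWEEN ? AND ?");   // small tolerance around the BPM
        }
        return where.toString();
    }
}
```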

6.8 Database Handling

There is also a DBHandler.java file included in the project. It handles the opening, creation and referencing of the SQLite database. The class is entirely static and its open function is called when the main window is instantiated. This serves as a common place for database access, which the fingerprint classes call into when preparing statements while storing and searching.
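A minimal sketch of such a static handler is shown below; it uses the JDBC SQLite driver so the example is self-contained, whereas the project links against sqlite4java, so the connection calls here are assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Minimal sketch of a static handler in the spirit of DBHandler.java. It uses the
// JDBC SQLite driver so the example is self-contained; the project itself links
// against sqlite4java, so these calls are assumptions rather than the real code.
public final class DBHandlerSketch {

    private static Connection connection;

    private DBHandlerSketch() { }

    // Called once when the main window is created; creates the database file if needed.
    public static void open(String path) throws SQLException {
        connection = DriverManager.getConnection("jdbc:sqlite:" + path);
    }

    // Fingerprint classes call this when preparing their store/search statements.
    public static Connection get() {
        return connection;
    }
}
```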


7 Testing

To test that the software I have developed meets my goals I will conduct two sets of tests: one to confirm that my basic audio fingerprinter can reliably identify audio based upon musical perception features, and a second to see whether using an augmented fingerprint to search has any effect, hopefully a positive one.

To test that the software can reliably identify audio based upon musical perception features I have created a series of files. I have taken audio of various types: Classical music, Dubstep, Metal, Pop, Chiptune and Jazz. Each piece of audio was obtained in the highest quality I could find, often FLAC (Free Lossless Audio Codec), and converted to wav format. As the wav format is lossless, no data is lost to compression and the test remains valid. After fingerprinting each high quality version of the files and storing them in my database I can set about finding the limits of recognition. I will degrade each file by adding various volumes of static noise or by filtering out certain frequencies, to simulate the differing versions of the same file that many people may have. By comparing the fingerprint of the degraded file to that of the master copy I can check how robust and reliable my fingerprinting software is. As well as recording the number of matching chunks I will also record any false positive matches, and any files that come close to being a false positive.
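However the degraded copies are actually produced, a simple in-code equivalent of the static-noise step might look like the sketch below; the noise levels correspond to the 0.2-0.8 values in the result tables, and the clipping assumes samples normalised to the [-1, 1] range.

```java
import java.util.Random;

// However the degraded test copies were actually produced, a simple in-code equivalent
// of the static-noise step might look like this. The levels correspond to the 0.2-0.8
// values in the result tables and the clipping assumes samples normalised to [-1, 1].
public class NoiseDegradationSketch {

    public static void addStatic(double[] samples, double level, long seed) {
        Random random = new Random(seed);
        for (int i = 0; i < samples.length; i++) {
            double noise = (random.nextDouble() * 2.0 - 1.0) * level;   // white noise in [-level, level]
            samples[i] = Math.max(-1.0, Math.min(1.0, samples[i] + noise));
        }
    }
}
```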

To test the effect of adding other musical factors, such as key, loudness or tempo, into my fingerprint I will compare the same files and check whether there is any change in the time spent searching for matches and in the number of false positive results and non-matches.

7.1 Results

First, to set a baseline for the project and have something to compare my test results against, I ran all of my test files through the basic fingerprinter and searched for matches after inserting only the original, high quality fingerprints into the database. The results are shown in Figure 22.

Basic Fingerprint (Time in ms / Top Match in chunks)

              Chiptune        Classical        Dubstep         Jazz            Metal           Rock
              Time  Match     Time  Match      Time  Match     Time  Match     Time  Match     Time  Match
Clean         2794  7261      2616  32934      2432  2234      2491  12235     2925  3949      2776  4637
Static 0.2    2461  2152      2642  MISSING    2248  -179      2503  MISSING   2425  2135      2456  MISSING
Static 0.4    3064  1407      2499  MISSING    2267  -109      2626  MISSING   2466  1618      2504  MISSING
Static 0.6    2848  869       2415  MISSING    2398  -292      2444  MISSING   2476  1222      2327  MISSING
Static 0.8    2583  759       2396  MISSING    2397  -250      2327  MISSING   2433  955       2316  MISSING
HPF 200Hz     2498  7308      3139  31305      2256  2124      2557  10977     2362  3357      2416  4259
HPF 400Hz     2419  6760      2715  28947      2224  2071      2413  8666      2362  2718      2435  3933
HPF 600Hz     2476  6114      2578  26146      2267  2007      2381  6791      2417  2395      2466  3566
HPF 800Hz     2245  5715      2616  23865      2283  1916      2576  5839      2399  2201      2344  3215
HPF 1000Hz    2513  5327      3295  22391      2192  1790      2349  5316      2395  2076      2463  3163
Avg. time     2590            2691             2296            2467            2466            2450      Overall avg. 2493

Figure 22 - Basic Fingerprint Results


The times given are in milliseconds, measured from the user clicking "Search Database" until a result was returned. For testing purposes each search was performed three times and the times averaged. The Top Match column shows how many chunks matched when doing a search; this number can be bigger than the original file's own match count if certain chunks end up matching more than one chunk of the track in the database.

Results marked MISSING are those where the fingerprinter did return matches but none of them was the correct track. Negative results are those where matches were found but the top match was incorrect; the value shown is the difference between the intended track's score and the top result's score.

From the above table you can see that times were roughly consistent across all searches, and that only the quieter samples, such as the jazz, classical and rock samples, were completely unable to match after noise had been added. The dubstep-with-noise tests came out negative because each time the fingerprinting system incorrectly identified the sample as the chiptune sample, though only by a small margin. This is not a great surprise, as both tracks used in this test have similar frequencies at the start, but the chiptune sample uses an 8-bit music effect which sounds like a rough, distorted, low bass and shows on the spectrogram as bands of white noise following the soprano notes.

To see whether augmenting the fingerprint with the values of key (e.g. C, G, F-sharp), tempo and time signature improved the accuracy or the search speed, I re-ran the tests with the same samples and the same original sample fingerprints in the database. The results are shown in Figure 23.

Augmented Fingerprint (Time in ms / Top Match in chunks)

              Chiptune        Classical           Dubstep             Jazz                Metal           Rock
              Time  Match     Time     Match      Time     Match      Time     Match      Time  Match     Time  Match
Clean         2568  7261      1709     32934      2496     2234       2239     12235      2241  3949      2257  4637
Static 0.2    2636  2152      2568     MISSING    MISSING  MISSING    MISSING  MISSING    2363  2135      2434  MISSING
Static 0.4    2208  1407      MISSING  MISSING    2087     -109       2166     MISSING    2332  1618      2291  MISSING
Static 0.6    2304  869       2213     MISSING    2174     -292       2056     MISSING    2305  1222      2318  MISSING
Static 0.8    2287  -115      2174     MISSING    2225     MISSING    2157     MISSING    2354  955       2257  MISSING
HPF 200Hz     2156  7308      MISSING  MISSING    2061     2124       2188     10977      2140  3357      2564  4259
HPF 400Hz     2090  6760      MISSING  MISSING    2050     2071       2173     8666       2157  2718      2386  MISSING
HPF 600Hz     2151  6114      2221     26146      2028     2007       2111     6791       2941  2395      2310  MISSING
HPF 800Hz     2179  5715      2196     23865      2029     1916       2050     5839       3031  2201      2430  MISSING
HPF 1000Hz    2097  5327      2211     22391      2083     1790       2159     5316       2071  2076      2215  MISSING
Avg. time     2268            2185               2137                2144                2394            2346      Overall avg. 2246

Figure 23 - Augmented Fingerprint Results

At first glance over the table there are a lot more missing values, which seems to contradict my idea that the augmented values would improve matching accuracy. There are reasons for this, however.


The large number of missing values is a result of the inaccuracy of The Echo Nest's analysis coupled with the strictness of my search algorithm. While The Echo Nest does return a confidence value stating how accurate a result is likely to be, once noise is introduced or a high pass filter is applied the calculation of important values, such as key, can be wildly misinterpreted. This misinterpretation of the values I am augmenting the fingerprint with then eliminates the correct tracks from being matched, leading to missing match values, because my search algorithm accepts augmenting values at a confidence of 50% or above. It does mean, however, that although the correct tracks are missing from the results list, the false positives are cut down too, though this is not really measurable with such a small sample size.

However, it does seem that my hope that the augmented fingerprint would be faster to search for was correct. The overall average time is only about 0.25 seconds faster (2246 ms against 2493 ms), but the improvement was consistent across searches. The small size of the database also does not reflect how well the augmented fingerprint method would scale in comparison to the basic fingerprint in terms of search times; however, that hypothesis is difficult to test accurately with this system, as SQLite is not designed to handle large data sets well.

8 Conclusions

From the results I cannot conclusively determine whether my augmented fingerprint is better or worse than a standard fingerprinting method. The difference between search times is negligible and, at such small values, can easily be affected by other system processes running while testing. While the augmented fingerprint method did not return as many false positive results, it often returned no matches at all, which makes it difficult to assess whether it is an improvement or worse.

If, with a database designed for larger data sets, the search time still decreased as more fingerprints were added to the database, I would conclude that the augmented fingerprint was indeed an improvement. But currently, for this implementation at least, that is not feasible to test.

Personally, I would say that the reduction in false positives, even at the expense of extra no-matches, is an improvement. A report of no match is, to me, more accurate than misrepresenting one fingerprint as another, as it errs on the side of caution with regard to fingerprinting accuracy. However, for an unequivocal improvement in accuracy I was looking for a reduction in false positive matches without a reduction in matches against the actual file.

8.1 Looking to the future

If I were to restart the project I would make several changes to my design and to my test plan.

For the design, I would move away from using The Echo Nest for my audio analysis and either learn how to implement the analysis features I need and build them into the project, or create a Java library that wraps the VAMP plugin system, so that I could get more accurate analysis and hopefully reduce the occurrence of no-matches in the testing phase.

I’d also try to make time to add the features I felt could have improved the usability of the system, such as support for other file formats and other fingerprinting methods. It would have been nice to support more file formats because it would have saved time creating test data and would have let me fingerprint all my music much faster.

If I added other fingerprinting methods to the system, such as a proper Shazam-style fingerprinter or a third party library like the one Gracenote provides, I could test my system against actual commercial systems. If I added my augmented-style searching as a filter over the results from a commercial system, it would also be interesting to test whether there was any reduction in false positives.

Looking to the future, I would dedicate more time to how I store and search for matches with my fingerprint implementation. While it did fingerprint, store and search, it was inefficient to store all the data for each band, and the search algorithm could have been more relaxed so as not to exclude some matches, while being stricter in what it defined as a match so as not to match the same chunk many times.


Bibliography

ALM WORKS. 2010. sqlite4java - Minimalistic high-performance Java wrapper for SQLite. [online]. [Accessed Feb 2011]. Available from World Wide Web:

BALUJA, Shumeet and Michele COVELL. 2006. Content Fingerprinting Using Wavelets. In: Visual Media Production, 2006. CVMP 2006. 3rd European Conference on. London, pp.198 - 207.

BECK, Kent. 1995. Extreme Programming Explained: Embrace Change. Addison-Wesley Professional.

CLOUDGARDEN. 2010. Jigloo SWT/Swing GUI Builder for Eclipse and WebSphere. [online]. [Accessed January 2011]. Available from World Wide Web:

ELLIS, Dan. 2011. E4896 - outline. [online]. [Accessed 2011]. Available from World Wide Web:

GRACENOTE. 2011. Gracenote | Ford SYNC. [online]. [Accessed 2011]. Available from World Wide Web:

GREENSTED, Dr. Andrew. 2010. Java Wav File IO. [online]. [Accessed December 2010]. Available from World Wide Web:

HAITSMA, Jaap and Ton KALKER. 2002. A Highly Robust Audio Fingerprinting System. In: Int Symp Music Info Retrieval, Vol. 32. Ircam - Centre Pompidou, pp.107-115.

IEEE. 1995. IEEE Computational Science and Engineering. IEEE Computer Society.

JAMES, Mark. 2005. Silk Icons. [online]. [Accessed January 2011]. Available from World Wide Web:

LAST.FM. 2007. Audio Fingerprinting for Clean Metadata. [online]. [Accessed 2011]. Available from World Wide Web:

MICROSOFT CORPORATION. 2005. Testing Methodologies. [online]. [Accessed 2011]. Available from World Wide Web:


MUSIC BRAINZ. 2006. Future Proof Fingerprint. [online]. [Accessed December 2010]. Available from World Wide Web:

QUEEN MARY, UNIVERSITY OF LONDON. 2007. Vamp Plugins. [online]. [Accessed 2010]. Available from World Wide Web:

SCHMIDT, Geoff. 2005. Future Proof Fingerprint Function. [online]. [Accessed December 2010]. Available from World Wide Web:

SEDGEWICK, Robert and Kevin WAYNE. 2011. Complex.java. [online]. [Accessed December 2010]. Available from World Wide Web:

SEDGEWICK, Robert and Kevin WAYNE. 2011. Data Analysis. [online]. [Accessed December 2010]. Available from World Wide Web:

SEDGEWICK, Robert and Kevin WAYNE. 2011. FFT.java. [online]. [Accessed March 2011]. Available from World Wide Web:

SHNEIDERMAN, Ben. 1998. Designing the user interface: Strategies for effective human- computer interaction. Reading, MA: Addison-Wesley.

TANGO DESKTOP PROJECT. 2009. Tango Desktop Project. [online]. [Accessed January 2011]. Available from World Wide Web:

THE ECHO NEST. About Us. [online]. [Accessed 2011]. Available from World Wide Web:

WANG, Avery Li-Chun. 2003. An Industrial-Strength Audio Search Algorithm. [online]. [Accessed 2010]. Available from World Wide Web:

YOUTUBE. 2008. Copyright Overview. [online]. [Accessed 2011]. Available from World Wide Web:


Appendices

Appendix A - Database SQL

CREATE TABLE IF NOT EXISTS tracks (
    id             INTEGER PRIMARY KEY,
    artist         VARCHAR(128),
    title          VARCHAR(128),
    album          VARCHAR(128),
    duration       NUMERIC,
    tempo          NUMERIC,
    key            NUMERIC,
    mode           NUMERIC,
    time_signature NUMERIC,
    loudness       NUMERIC
);

CREATE TABLE IF NOT EXISTS print_runs (
    id        INTEGER PRIMARY KEY,
    augmented NUMERIC,
    timestamp VARCHAR(128),
    filename  VARCHAR(128)
);

CREATE TABLE IF NOT EXISTS print_data (
    chunk        INTEGER,
    band1        INTEGER,
    band2        INTEGER,
    band3        INTEGER,
    band4        INTEGER,
    band5        INTEGER,
    band6        INTEGER,
    band7        INTEGER,
    band8        INTEGER,
    band9        INTEGER,
    track_id     INTEGER,
    print_run_id INTEGER
);


Appendix B – User Manual

To fingerprint an audio file, either basic or augmented, click File -> Load File from the menu, or press Ctrl+O to open the file chooser. Choose the file you wish to fingerprint and click Open.

Once the file is loaded you can view the waveform or spectrogram of the audio by selecting the relevant item from the Window menu, or by pressing Ctrl+W or Ctrl+G respectively, or generate a basic or augmented fingerprint using the options in the Fingerprint menu.

Once a fingerprint has been generated a text representation of it will show up in the box labelled Fingerprint on the main form. You can then either add the fingerprint to the database or search for it in the database using Ctrl+S or Ctrl+F respectively.


After a search has finished results are displayed in the list on the main window labelled Results.


Appendix C – Gantt Chart


Appendix D – Source Code Listing

Source code files included in the project are listed to the left.

Along with the source files are a set of libraries that must be linked in to the project for it to compile. For the origins of the libraries, look to the third party library section of the detailed design and the bibliography.

All source code and libraries are included on the CD provided with this report. A compiled version is also included in the form of a runnable jar. If running on Apple Mac OS X, see the end of the Environment section in the design chapter earlier in this report for a command line option to increase Java's default memory limits.
