Musictronics for Audio to Midi Conversion MUSICTRONICS for AUDIO to MIDI CONVERSION
Total Page:16
File Type:pdf, Size:1020Kb
Musictronics For Audio To Midi Conversion MUSICTRONICS FOR AUDIO TO MIDI CONVERSION 1L.R.KARL MARX, 2I.GOGUL, 3A.MAHADEVAN, 4S.SASIKUMAR 1-4Electronics and Communication Engineering, Thiagarajar College of Engineering Thiruparankundram, Madurai, Tamil Nadu Abstract - The purpose of this paper is to convert an audio signal into MIDI. The overall idea is to output an instrument sound that plays the recorded human voice with accurate pitch and duration. This is a task requiring wider range of theoretical knowledge and their utilization in the practical part of the process. As a part of study, an algorithm has been designed and developed for implementing a real-time guitar tuner using Matlab, which outputs accurate frequency of the input signal and corrects it, if it is not in a perfect intonation. This idea has been extended to convert voice data into MIDI, for which another algorithm has been developed, and also hardware has been implemented. This technology has several music-related applications and is meant for musicians of all skill levels. One could also easily perform pitch corrections/alterations on MIDI data. MIDI is also useful in many areas, where one can apply different timbres to pitches (ex. use your voice to produce a guitar sound). Keywords – Music Analysis, Signal Processing, Fast Fourier Transform, MIDI, Microcontroller. reproduction. Music and sound technology refer to I. INTRODUCTION the use of sound engineering in both a commercial or leisurely/experimental manner. Music technology and Music technology is any technology, such as a sound technology may sometimes be classed as the computer, an effects unit or a piece of software, that same thing, but they actually refer to different fields is used by a musician to help make music, especially of work, the names of which are to some extent self- the use of electronic devices and computer software explanatory, but where sound engineering may refer to facilitate playback, recording, composition, storage, primarily to the use of sound technology for media- mixing, analysis, editing, and performance. Music logical purposes. The Fast Fourier Transform (FFT) technology is connected to both artistic and is a powerful general-purpose algorithm widely used technological creativity. Musicians are constantly in signal analysis. FFTs are useful when the spectral striving to devise new forms of expression through information of a signal is needed, such as in pitch music, and physically creating new devices to enable tracking or vocoding algorithms. The FFT can be them to do so. Although the term is now most combined with the Inverse Fast Fourier Transform commonly used in reference to modern electronic (IFFT) in order to resynthesize signals based on its devices such as a monome, the piano and guitar may analyses. This application of the FFT/IFFT is of also be said to be early examples of music technology. particular interest in electro-acoustic music because it In the computer age however, the ontological range allows for a high degree of control of a given signal's of music technology has greatly increased, and it may spectral information (an important aspect of timbre) now be mechanical, electronic, software-based or allowing for flexible, and efficient implementation of indeed even purely conceptual. Contemporary signal processing algorithms. Our first goal is to classical music sometimes uses computer generated develop a precision instrument tuning algorithm that sounds, either pre-recorded or generated/manipulated would be able to determine within a finite range live, in conjunction with classical acoustic perfect intonation. After implementing it, we extend instruments like the cello or violin. Music sequencer our algorithm for voice to MIDI conversion. Before software, such as Pro Tools, Logic Audio and many the invention of digital based music notation systems, others, are perhaps the most widely used form of paper-based musical notations and scores have been contemporary music technology. Such programs used for communicating musical ideas and allow the user to record acoustic sounds or MIDI compositions. Digital based music encoding is a musical sequences, which may then be organized simple editing, processing, and communication of along a time line. Musical segments can be copied musical scores. Music data are multi-dimensional; and duplicated, as well as edited and processed using musical sounds are commonly described by their a multitude of audio effects. Many musicians and pitch, duration, dynamics and timbre. Most music artists use 'patcher' type programmes, such as Pd, database and data management systems use one or Bidule, Max/MSP and Audiomulch as well as (or two dimensions and these vary based on types of instead of) digital audio workstations or sequencers users and queries. There are many formats in which and there are still a significant number of people music data can be digitally encoded. These formats using more "traditional" software only approaches are generally categorized into a) highly structured such as CSound or the Composers Desktop Project. formats such as Humdrum, where every piece of Music technology includes many forms of music musical information on a piece of musical score is Proceedings of 15th IRF International Conference, Chennai, India, 19th October. 2014, ISBN: 978-93-84209-59-9 54 Musictronics For Audio To Midi Conversion encoded, b) semi-structured formats such as MIDI in signal processing. Section IV describes the hardware which sound event information is encoded and c) architecture that is proposed for implementing the highly unstructured raw audio which encodes only algorithm. Section V describes the interface hardware the sound energy level over time. Most current digital modules. Experimental results are presented in musical data management systems adopt a particular section VI. format and therefore queries and indexing techniques are based upon the dimensions of music information II. GUITAR TUNING ALGORITHM that can be extracted or inferred from that particular encoding method. Systems such as voice to MIDI, whereby format conversion and transcription of real- time audio signals to MIDI data stream are required, are currently been researched and developed widely [3-5]. MIDI is a popular format in computer based music production such as composition, transcription, and so on. Size of MIDI files is very smaller than other music formats, because MIDI data consists of text messages that are defined instructions only and not sounds signal representation. It is suitable data format to be utilized in software application development. With Voice to MIDI system, we will be able to generate MIDI data of analog input rather easily. This system enables to convert a melody with a microphone to digital scores like MIDI. It extracts acoustical characteristics such as pitch, volume, and duration by intelligent algorithms and converts them into a sequence of notes for producing music scores. Thus, melodies will be translated into chromatic pitches without human intervention. Challenges faced Fig. 1 Guitar Tuning Algorithm using FFT in Matlab commonly in voice to MIDI systems include the clarity of input data for acquiring suitable results. Step 1: In the command window of Matlab, our real time tuner asks the user “Press enter to start program In some systems such as , we must sing music simply or press zero to quit.’’ with “ta ta ta …” expressively singing to prevent Step 2: It then asks about which string the user is many inaccurate outputs. So, in those methods we are going to adjust. (‘Which string are you adjusting? 1=e being forced to sing unnaturally. The quality of MDI 2=B 3=G 4=D 5=A 6=E’) transcribed would also depend on the hardware Step 3: Take the guitar and pluck the strings which capabilities. Also, some systems are based on you need to tune. (‘Press enter to record input signal software that needs intelligent algorithms for or 0 to start over’) providing better quality music transcriptions. Step 4: To hear the signal which is recorded, the tuner However, most of these systems use Digital Signal plays the signal. (“This is how the input signal Processor (DSP) to process audio signals. A sounds.”) microcontroller as the main processor to process real- Step 5: After computing FFT and plotting the input time audio signals is investigated in this study. The and frequency spectrum, the real time tuner tells proposed technique is implemented completely with whether the input signal’s frequency matches with the this microcontroller without a need for much complex original frequency or not. calculations. Step 6: Based on the calculations, there are three In this paper, an audio to MIDI transcription module types of outputs. encompassing a microcontroller and a pitch tracking If input frequency = original frequency, algorithm has been introduced. A hardware based “Perfect Intonation” real-time converter is offered which uses a real-time If input frequency is greater than original algorithm for implementing medium quality MIDI frequency, “Input frequency must be generator. The main aim is to extract music decreased” information from the voice signals to convert to MIDI representation. This hardware must be able to If input frequency is lesser than original “ estimate some parameters such as pitch, note onset frequency, Input frequency must be ” time, and duration, from the audio signals and increased generates MIDI messages. The rest part of this paper III. MIDI ENCODING ALGORITHM is organized as follows. Section II describes the Guitar tuning algorithm using Matlab. Section III The encoding algorithm implemented with the describes the MIDI encoding algorithm for real-time hardware circuit converts real-time sampled audio Proceedings of 15th IRF International Conference, Chennai, India, 19th October. 2014, ISBN: 978-93-84209-59-9 55 Musictronics For Audio To Midi Conversion signals to standard MIDI data. This allows real-time Harmonics, which occur at integral multiples of voice processing without using complex calculations frequency, often confuse pitch extracting and make it such as FFT.