The Aspirations and Limitations of Regional Funding Allocations
Evaluating the quality and validity of voice recognition software: Lessons from the field
Sarah Ayres and Ian Stafford, University of Bristol
KEY POINTS
o Voice recognition (VR) software has been used to transcribe dictated digital recordings and qualitative interviews.
o Equipment and training costs for two researchers were £1900.
o A recognised voice recognition expert, Neil Winton (www.pc-voice.co.uk), was used to select equipment, install software and supply training.
o Installation and training would be difficult without the assistance of an expert.
o Project members have been impressed by the accuracy of the software. It picks up complex words, and errors tend to be the result of human error (e.g. mumbling or mispronunciation).
o Project members perceive transcribing qualitative interviews using voice recognition software as faster than manual typing.
INTRODUCTION
This briefing paper describes the use of the voice recognition software Dragon V9 as part of an ongoing Economic and Social Research Council (ESRC) project. It outlines (i) the rationale behind using Dragon V9, (ii) equipment and costs, (iii) installation and training, (iv) software functions and (v) the effectiveness of the software. We hope that sharing this information will help to promote the use of cutting-edge technologies and methodological developments in qualitative research.
RATIONALE FOR USING VOICE RECOGNITION
The VR technology (Dragon V9) is utilised in two distinct ways.
Transcribing dictated digital recordings (i.e. the voice of one researcher). The VR software can automatically transcribe a recording of one researcher's voice with no additional manual effort. This enables researchers to make brief dictated notes after interviews, which can then be automatically transcribed, saved as Word files and shared with other team members.
Transcribing interview digital recordings. The VR software potentially provides a faster alternative to transcribing interviews by typing and avoids the drawbacks of using professional transcription, e.g. transcription errors.
EQUIPMENT AND COSTS
A leading UK voice recognition expert was contacted to help with selecting the equipment, installation and training. The following equipment was deemed suitable for the tasks above. Total equipment and training costs were £1900 including VAT.
Software:
o Dragon NaturallySpeaking 9
o Olympus DSS Player (plus upgrade)
o Olympus DSS Player Pro

Hardware:
o Olympus DS-2300 Digital Voice Recorder
o Olympus 512MB xD Picture Card
o Olympus AS-2300 PC Transcription Kit
o Upgraded VRS Headset
o Olympus ME12 Noise-Cancelling Microphone
o Olympus TP7 Telephone Pickup
INSTALLATION AND TRAINING
The installation process highlighted several factors:
o Those installing the software need ‘administrator rights’ for the computers or laptops.
o Installation proved to be quite a lengthy process and one that would have been hugely complex without the assistance of an expert.
o It was important that ‘settings’ were tailored to the project’s needs.
o Issues emerged in terms of the compatibility between the Dragon and Olympus hardware/software. Resolving these issues would have been near impossible without the assistance of an expert.
The training focussed on two key elements (i) setting-up and using the Dragon V9 software and (ii) using the Olympus transcription software/hardware. A number of issues were highlighted during these sessions.
o The expert instructed project members through the process of setting up user accounts, opening and closing the applications and familiarising themselves with the basic functions of Dragon V9. This was invaluable in terms of getting the most from the software and learning at a rapid pace.
o When training the Dragon V9 software, the researchers worked through the software’s own training procedure to allow it to recognise a single person’s voice. The process may take longer or need to be repeated if one has a particularly strong regional accent.
o The expert emphasised that this process was also about training the user to speak in a way that the software can recognise. Project members were told to speak like a ‘news reader’ with clear tone, expression and measured speed.
SOFTWARE FUNCTIONS
As part of our ongoing ESRC project, Dragon V9 and the Olympus digital recorder and transcription kit have been used in four distinct ways.
Recording and transcribing the researcher’s own voice. Provided the user speaks clearly and consistently, the quality of the transcription using the recorder’s ‘dictation mode’ has proved to be very good.
Recording interviews. Using the Olympus DS-2300 is relatively straightforward. Nonetheless, it is important to familiarise oneself with the technology before entering the field. Even within noisy spaces (such as cafés) the recorder has been largely successful in picking up voices. To get the best results we suggest:
o Investing in an external microphone.
o Using the ‘conference’ function on the digital recorder.
o Placing the recorder as close as possible to the interviewee, with the microphone facing towards them.
o Placing a book or some papers under the recorder when in a noisy space, to reduce vibration and improve sound quality.
Transcribing interviews. The VR software can only identify the ‘trained’ voice of the researcher, so face-to-face interviews required the researcher to listen to the recordings through a headset and, at the same time, dictate the transcripts. It is possible to avoid the keypad altogether and issue all commands verbally. Both researchers found that they tended to use the VR element for transcribing the text but did the formatting (i.e. new line, paragraph) with the keypad. We suggest the following:
o Use the ‘noise cancellation’ option under the ‘Tools’ menu, which cuts out background noise when listening back to recordings.
o Set an automatic backspace of around 1.5-2.5 seconds and a playback rate of 75-100%; these proved optimum for transcription.
o Invest in a higher quality headset (do not use the one supplied with the transcription kit).
o To avoid repeating errors, continue to ‘train’ the software whilst dictating a recorded interview rather than simply correcting mistakes using the keypad.
Both researchers found using a headset with a single earpiece a little distracting, but initial impressions suggest that transcribing using the voice recognition software is faster than manual typing and potentially provides more detailed transcription than manual transcribing.
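As a rough illustration of how the suggested playback rate and automatic backspace affect listening time, the Python sketch below estimates how long it takes simply to hear a recording through. The function name, the number of pauses and the assumption that every pause triggers exactly one automatic backspace are illustrative assumptions and are not taken from the Olympus software. Dictation and correction time are excluded.

```python
def listening_minutes(recording_minutes, playback_rate, pauses, backspace_seconds):
    """Estimate the minutes needed to listen through a recording.

    recording_minutes: length of the interview recording
    playback_rate:     fraction of normal speed, e.g. 0.75 for 75%
    pauses:            hypothetical number of times playback is paused
    backspace_seconds: audio replayed after each pause (automatic backspace)
    """
    if not 0 < playback_rate <= 1:
        raise ValueError("playback_rate must be in the range (0, 1]")
    if pauses < 0 or backspace_seconds < 0:
        raise ValueError("pauses and backspace_seconds must not be negative")

    # Slower playback stretches the recording itself.
    base = recording_minutes / playback_rate
    # Each pause replays a few seconds of audio, also at the slower rate.
    replay = pauses * backspace_seconds / playback_rate / 60
    return base + replay


# Example with assumed figures: a 60-minute interview, 75% playback,
# 240 pauses and a 2-second backspace gives roughly 90.7 minutes.
print(round(listening_minutes(60, 0.75, 240, 2.0), 1))
```

Under these assumed figures, moving from 75% to 100% playback saves far more listening time than shortening the backspace, although slower playback may still be preferable if it reduces dictation errors.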
Conducting telephone interviews. The process of recording telephone interviews is the same as for face-to-face interviews, but the Olympus TP7 Telephone Pickup replaces the microphone in the digital recorder. The earphone is able to pick up both voices very clearly and provides a good quality audio file for transcription.
ASSESSING THE EFFECTIVENESS OF VOICE RECOGNITION SOFTWARE
In a relatively short period of time both researchers noted a vast improvement in the quality of the drafts and the time it took to complete transcription using the software. This was primarily due to the training of the software and increased familiarity with the software and the transcription process. In general the accuracy of interviews transcribed with the package was around 95% but this can be variable depending upon the quality of microphone, clarity of dictation and the training of the software. The correction of errors slows down transcription and the process remained fairly arduous. Overall the VR software has successfully managed the intended tasks and its performance has exceeded initial expectations. We would recommend the use of this software to other qualitative researchers with the caveat that equipment and training must be tailored to the demands and needs of the individual researcher and research project.
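The paper does not say how the accuracy figure of around 95% was measured. One common way to put a number on transcription accuracy is word accuracy: one minus the word-level edit distance between a hand-corrected reference transcript and the software's draft, divided by the length of the reference. The Python sketch below shows that calculation as an assumption about how such a figure could be checked, not as the method used in the project. The function names and the sample sentences are invented for illustration.

```python
def word_edit_distance(reference, hypothesis):
    """Count the word substitutions, insertions and deletions needed
    to turn the hypothesis transcript into the reference transcript."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # previous[j]: distance between the reference words processed so far
    # and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # word missing from the draft
                current[j - 1] + 1,      # extra word in the draft
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_accuracy(reference, hypothesis):
    """Return 1 minus the word error rate, measured against the reference."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference transcript is empty")
    return 1.0 - word_edit_distance(reference, hypothesis) / ref_len


# Invented example: a 20-word reference with one misrecognised word.
corrected = ("the regional assembly agreed its funding priorities after consulting "
             "local partners and the advice was sent to ministers in january")
draft = corrected.replace("assembly", "assembled")
print(f"{word_accuracy(corrected, draft):.0%}")  # prints 95%
```

Scoring a short sample of each transcript in this way would let researchers track whether continued training of the software is actually improving the drafts.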
ACKNOWLEDGEMENTS
This work is funded through the Economic and Social Research Council (ESRC). Project title: English Regionalism: Rhetoric or Substance? Evaluating Decision Making Procedures for Regional Funding Allocations. Award number RES-061-23-0033.
For more information see the project website at www.bristol.ac.uk/sps/regionalism/default.shtml