William B. Wilhelm, Jr. Christian E. Hoefly Jr. +1.202.739.3000 [email protected] [email protected]

January 16, 2020

Via ECFS

Marlene H. Dortch, Secretary
Federal Communications Commission
445 12th Street, SW
Room TW-A325
Washington, DC 20554

Re: Notice of Ex Parte Communication, RM-11848; CG Docket No. 05-231
Telecommunications for the Deaf and Hard of Hearing, Inc. et al. Petition for Declaratory Ruling and/or Rulemaking on Live Closed Captioning Quality Metrics and the Use of Automatic Speech Recognition Technologies; Closed Captioning of Video Programming

Dear Ms. Dortch:

On January 14, 2020, Mudar Yaghi, Mike Veronis, and the undersigned counsel met with Diane Burstein,1 Suzy Rosen Singleton, Eliot Greenwald and Debra Patkin of the Federal Communications Commission’s (“Commission”) Consumer and Governmental Affairs Bureau (“Bureau”) to discuss AppTek’s comments2 and reply comments3 to the Telecommunications for the Deaf and Hard of Hearing, Inc., et al., Petition for Declaratory Ruling and/or Rulemaking on Live Closed Captioning Quality Metrics and the Use of Automatic Speech Recognition Technologies.4

1 Ms. Burstein participated in the first part of the meeting and recused herself during the discussion regarding quality metrics.
2 AppTek Comments, https://www.fcc.gov/ecfs/filing/101585639283 (filed October 15, 2019).
3 AppTek Reply Comments, https://www.fcc.gov/ecfs/filing/1030761604294 (filed October 30, 2019).
4 Telecommunications for the Deaf and Hard of Hearing, Inc. (TDI), et al., Petition for Declaratory Ruling and/or Rulemaking on Live Closed Captioning Quality Metrics and the Use of Automatic Speech Recognition Technologies, CG Docket No. 05-231 (filed July 31, 2019), https://www.fcc.gov/ecfs/filing/10801131063733 (“Petition”).

Morgan, Lewis & Bockius LLP

1111 Pennsylvania Avenue, NW
Washington, DC 20004
United States
+1.202.739.3000
+1.202.739.3001


During the meeting, AppTek discussed the presentation included in Appendix A.5 Founded in 1990, AppTek is a leader in automatic speech recognition (“ASR”) and other language technologies. AppTek’s advanced language technology platform, based on artificial intelligence, machine learning and deep neural network technologies, covers the entire spectrum of language technologies, including ASR, neural machine translation and natural language understanding. AppTek is at the forefront of research and development for next-generation language solutions, including text-to-speech, speech-to-speech AI for dubbing, accessibility, sign-language recognition and more. Using this experience and cutting-edge technology, AppTek provides an ASR appliance and cloud-based solutions for a variety of business and government applications.

AppTek supports the Commission promoting forward-looking and technology-neutral captioning policies and quality metrics that will foster continued improvements to captioning, especially for live programming. AppTek explained the capabilities of its current ASR solutions to improve caption quality, including accurate punctuation (periods, commas and question marks), capitalization, speaker diarization (change detection and formatting), custom glossaries (custom lexicons of proper names, characters and dialects for improved accuracy), intelligent word replacement (replacing words by specific regional dialect to match the appropriate spelling), smart formatting (converting dates, times, numbers, currency values, phone numbers and more into more readable conventional forms in final transcripts) and other capabilities. AppTek stressed the importance of these captioning techniques for conveying meaning and improving recognition for the viewer. AppTek’s ASR captioning solution has a latency of as little as 1.7 seconds.6 Further, AppTek provides individualized support, training, and machine learning tailored to each customer’s needs.7

AppTek continues to work with the deaf and hard of hearing community to identify improvements to captioning that best meet their needs. This includes strong working relationships with Gallaudet University (a world-leading educator and research institution for the deaf and hard of hearing) and current captioning solution providers such as TransPerfect, Red Bee, GrayMeta, and YellaUmbrella. Further, AppTek has participated in a captioning discussion with the Disability Advisory Committee (“DAC”) Working Group. These partnerships have led to focused improvements in AppTek’s ASR captioning solutions. To further this type of collaboration, AppTek strongly encourages the Commission to appoint ASR providers to membership on the DAC, as the interaction between providers and the deaf and hard of hearing community will help foster dialogue and improvements in the technology.

AppTek discussed quality metrics, including the Number, Edition and Recognition Errors (“NER”) model and Word Error Rate (“WER”). AppTek’s ASR technology has received official NER scores ranging from 97.5% to 97.9%, among the highest of any ASR captioning technology. AppTek explained that NER measures accuracy by evaluating whether the meaning or information was lost or received by the consumer, rather than by the number of words omitted, added or mistranslated in the captions.8

5 The presentation included a video demonstrating AppTek’s captioning. The video is available at: https://www.youtube.com/watch?v=QcCJYGLlPWg.
6 AppTek's appliance outputs raw ASR in 1.7 seconds. In cases of live-to-air broadcast, post-processing steps result in a total latency of up to 4 seconds.
7 See Appendix B (the manual provided with AppTek’s ASR appliance detailing the installation and training process).

While AppTek uses and is evaluated by the NER model, AppTek supports any objective, technology-neutral metric that would allow the Commission and the DAC to review the overall quality of captioning and identify where improvements can be made.

As identified in the record, ASR has improved the quality of captioning and can be put in place immediately where captions are not provided (such as sports and weather programming).9 The Commission can instruct the DAC to investigate captioning and issue recommendations on where ASR can best be put to use immediately and where further investigation is needed. Further, with ASR providers on the DAC, the conversation can be truly informed as to the capabilities of current ASR technologies and the improvements to these technologies that would be most beneficial.

Sincerely,

/s/ William B. Wilhelm

William B. Wilhelm, Jr. Christian E. Hoefly Jr.

Counsel to AppTek

cc: Suzy Rosen Singleton
    Eliot Greenwald
    Debra Patkin

8 See AppTek Comments at 9; see also David Keeble, The Canadian NER Trial, www.nertrial.com (last visited Jan. 16, 2020); English Broadcasters Group, Caption Test, www.captiontest.com (last visited Jan. 16, 2020); Pablo Romero-Fresco & Juan Martinez, Accuracy Rate in Live Subtitling – NER Model, http://www.captiontest.com/roehampton%20NER-English.pdf (last visited Jan. 16, 2020).
9 AppTek Reply Comments at 3.

Appendix A

Automatic Captioning State of the Art

1356 Beverly Road, Suite 300 McLean, VA 22101

World-Leading Advanced Language Technology Platform

AppTek’s advanced language technology platform, based on artificial intelligence, machine learning and deep neural network technologies, covers the entire spectrum of language technologies, including:

Automatic Speech Recognition - for use in automated captioning, real-time telephony transcription, media asset management and more.

Neural Machine Translation - for use in localization of video, web and print content, improvement of subtitling workflows, advanced dubbing and more.

Natural Language Understanding - for use in sentiment analysis, named entity recognition, intelligent bots, content clustering for topics and trends and more.

AppTek is at the forefront of research and development for next-generation language solutions including text-to-speech, speech-to-speech AI for dubbing, accessibility, sign-language recognition and more.

Products and Applications

CLOUD SERVICES - ASR & MT APIs: Developer-friendly cloud-based API access to AppTek Automatic Speech Recognition (ASR) and Machine Translation (MT) technologies for a wide variety of use cases including telephony, archiving, IoT devices and more.

REAL-TIME CAPTIONING - CC APPLIANCE: Fully automated, same-language captions for live content, adaptable to broadcaster's programs, talents and voice for higher accuracy. Suitable for most domains with average latency of 4 seconds.

CAPTIONING AND SUBTITLING - CC WORKBENCH: Cloud-based automated closed captioning, subtitling and alignment of audio, video and text, in multiple languages, with integrated distributed workforce post-editing via cloud-based multilingual platform.

MEDIA MONITORING - OMNI-MONITOR: Turn-key media monitoring solution utilizing AppTek's neural MT and ASR engines to improve speed and accuracy of results. Capabilities include web crawling, social media monitoring and named entity detection.

SPEECH-2-SPEECH MT - TALK2ME (APP STORE): Real-time speech-2-speech machine translation app available on iOS and Android devices. Serves as a travel companion for instant translations.

IVR - VOXSPHERE: Cloud-based IVR featuring continuous speech with natural language understanding for superior customer experience.

CX SENTIMENT - LUCIDVUECX: Customer interaction analytics to passively uncover consumer feedback including buying experience, loyalty and more across multiple channels and transform conversations into valuable and actionable insights.

E-COMMERCE - DIVA: DIVA (Digital Intelligent Voice Assistant) creates a frictionless speech-enabled buying experience with AI including extraction of user profile, willingness to purchase, relationship and more.

Automatic Speech Recognition

• Automatic Speech Recognition (ASR) is the science of recognizing and transcribing audio content into text using AI, machine learning, and neural networks (speech-to-text)

• AppTek has cloud-based APIs and on-premise ASR servers for both 16kHz broadcast & entertainment and 8kHz telephone speech

• Customers include a major FAANG company, AAA, Ford, NBC, Televisa, Telemundo and the US Government

• Our ASR covers more than 30 languages and dialects, including:
  - English (US, UK, AUS, CAN, IN) [BTM]
  - Chinese (Simplified/Traditional) [BTM]
  - Russian [BTM]
  - French (CAN, EUR) [BM]
  - Korean [BM]
  - Turkish [BTM]
  - German [BM]
  - [BM]
  - [BM]
  - Italian [BM]
  - Persian/Farsi/Dari [BTM]
  - Tagalog [BM]
  - Spanish (US, EUR) [BTM]
  - [BM]
  - Malay [BM]
  - Portuguese (BR, EUR) [BM]
  - Afrikaans [BTM]
  - Indonesian Bahasa [BM]
  - Dutch [BM]
  - Tamil [BM]
  - Hebrew [BM]
  - (5 Dialects) [BTM]
  - Japanese [BM]
  Legend: [B] Broadcast & Entertainment, [T] Telephony, [M] Mobile

Sample ASR Business Applications:
- Live Captioning: Generate offline and streaming automated captioning for content accessibility and compliance, significantly reducing costs of human captioning.
- Media Asset Management: Create rich metadata for topic extraction, indexing and future discoverability of media assets across a wide array of languages.
- Subtitling and Editing: Automate subtitling and editing processes by implementing ASR for later translation inside media workflows; opening new markets.
- Contact Center Transcriptions: Transcribe 8kHz telephone conversations for valuable business insights and improved compliance.

Neural Machine Translation

• Neural Machine Translation (NMT) is a field of computational linguistics that translates text into different languages using state-of-the-art neural architectures.
• In 2017, AppTek launched Neural MT using cutting-edge deep neural network (DNN) technology.
• More than 40 direct language pairs are supported, and 400+ using English as a pivot:
  Arabic, Bulgarian, Chinese, Czech, Danish, Dari, Dutch, English, Estonian, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Japanese, Korean, Latvian, Lithuanian, Pashto, Persian/Farsi, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Turkish, Ukrainian, Urdu
• In addition to 7+ cross-European languages.

Sample MT Business Applications:
- Content Localization: Adapt and localize media content to expand audience reach across the globe.
- Subtitling and Editing: Automate subtitling and editing processes by implementing NMT for later translation inside media workflows; opening new markets.
- E-Commerce: Expand Total Addressable Market by deploying NMT across high-volume global inventories.
- E-Discovery: Analyze and localize large volumes of critical content in a secure environment for fast discovery, search and retrieval.

Natural Language Understanding

• Natural Language Understanding (NLU) is the post-processing of text after the application of natural language processing algorithms that utilize context from ASR-generated transcripts, as well as text, chat and other forms, to discern the meaning of sentences, understand intent and execute a series of actions.
• There is considerable market interest in NLU due to its application in automated reasoning, MT, question answering, news gathering, text categorization, voice activation, archiving and large-scale content analysis.
• AppTek is currently focusing its internal R&D on the following:
  - Named Entity Extraction - Identify entities within written or spoken text, including proper names, brands, cities, currency amounts, etc., and label them for appropriate recall.
  - Dialog Management - AI-supported intelligent “bots” for speech and text.
  - Sentiment Analysis - Extract the overall opinion, attitude or feeling over a specific topic or product for deeper analysis of brand performance.
  - Content Classification - Classify content into pre-existing categories by function, intention or purpose.
  - Content Clustering - Identify main topics of discourse to discover new topics pertinent to an organization or identify customer trends.
  - Conversational Interfaces - Build fully functional virtual assistants/chatbots to enable customer communication.

Sample NLU Business Applications:
- Sentiment Analysis: Mine opinions and trends inside conversations to help brands shift marketing, sales and operational strategies.
- Chatbots: Create personalized one-on-one interactions for experiences that engage customers without human resources overhead.
- Advertising: Identify new audiences by analyzing customer conversations aggregated across channels and improve media spend and audience targeting.
- Market Intelligence: Stay informed across a broad array of content with advanced tools to filter critical information and topics.

Industry Experience

• Gallaudet
• Broadcast/Media Captioning & Subtitling
• TransPerfect
• RedBeeMedia
• YellaUmbrella
• Televisa
• Azteca/KJLA
• News10/WHEC
• CKSA
• Chesapeake City Government
• Juan Pablo Romero, Developer of captioning NER standards evaluation (http://captiontest.com/roehampton%20NER-English.pdf)
• Yota Georgakopoulou, Industry Expert on Audiovisual Localization

Artificial Intelligence & Accessibility Developments

Assistive Technology Deployment
• AppTek has deployed real-time streaming speech-to-text conversion technology for deaf and hard-of-hearing individuals utilizing embedded ASR in a laptop environment with an advanced customizable user interface.
• The platform utilizes speaker diarization to help users identify a change in speaker and highlight the names of individuals talking inside multi-participant conversations.
• AppTek’s text-to-speech services allow users who have difficulty speaking to type via keyboard and have that input instantly converted to audible speech.

Accuracy and Syntax
• Continuous ASR Training - Broadcast media and entertainment content, plus millions of subtitle data points across a wide array of languages, feed machine learning models. RESULT: AppTek ASR delivers more accurate and syntactically pleasing automated subtitles and closed captions.

Formatting
• AppTek’s proprietary Logical Line Segmentation (LLS) - Ensures speech and translated output can be exported in an appropriate subtitle format.
• The AI platform uses our novel subtitle segmentation algorithm and predicts the end of a subtitle line given the previous word-level context using a recurrent neural network learned from human segmentation decisions.
• Text is laid out in subtitle lines segmented according to syntax and semantics, instead of speaker pauses, which has been the predominant method employed by mainstream ASR and MT providers to date. RESULT: The subtitle output is much closer to what a professional would produce.

Evaluating ASR performance readiness with NER

• NER (Number, Edition Error, Recognition Error) - The NER Model measures accuracy by evaluating whether the meaning or information was lost or received by the consumer, rather than by the number of words omitted, added or mistranslated in the captions.
• Measurement Criteria
  - Correct Edition (CE) (-0.0): CE is scored when captions are different from the verbatim audio but retain its full meaning, without interrupting words/phrases.
  - Omission of Main Meaning (OMM) (-0.5): The captions have lost the main idea presented in the audio’s independent idea unit (see page 2).
  - Omission of Detail (OD) (-0.25): Here, the captions have lost one or more modifying meanings, affecting a dependent idea unit (see page 2) but not the main idea.
  - Benign Error (BE) (-0.25): A captioned word or phrase is incorrect, causing an interruption in the reading. However, the viewer can readily figure out the original meaning, from context (in video) or similarity to the real word.
  - Nonsense Error (NE) (-0.5): A captioned word or phrase is incorrect and the viewer can’t figure out the original meaning. If the impact of the word/phrase is to alter or omit the meaning of the idea unit, OD, OMM or FIE will be scored instead.
  - False Information Error (FIE) (-1.0): The captions make sense, but the information they present is different from the verbatim. The caption viewer cannot tell that the meaning is false.
• AppTek’s Official Registered Scores Range: 97.5-97.9% (Source: CKSA-TV, cksatv.ca)
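To make the arithmetic behind such scores concrete: in the published NER model, accuracy is computed as (N - E - R) / N x 100, where N is the caption word count and E and R are the weighted edition and recognition error deductions. The sketch below applies the per-error deductions listed above to hypothetical tallies; it is an illustration only, not the evaluation tooling behind AppTek's registered scores.

```python
# Illustrative NER-style calculation using the per-error deductions listed above.
# Hypothetical tallies only; not the tooling behind AppTek's registered scores.
WEIGHTS = {"OMM": 0.5, "OD": 0.25, "BE": 0.25, "NE": 0.5, "FIE": 1.0}  # CE deducts 0.0

def ner_accuracy(word_count, error_counts):
    """Accuracy = (N - E - R) / N * 100, with the deductions summed over all errors."""
    deductions = sum(WEIGHTS[kind] * count for kind, count in error_counts.items())
    return (word_count - deductions) / word_count * 100

# Example: 2,000 caption words with hypothetical error counts by category.
score = ner_accuracy(2000, {"OMM": 20, "OD": 24, "BE": 60, "NE": 20, "FIE": 7})
print(f"{score:.1f}%")  # 97.6%
```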

ASR - Comparative Benchmarking (English)

[Chart comparing AppTek ASR against two cloud providers on 8 kHz telephony, 16 kHz entertainment and 16 kHz broadcast news English audio.]

ASR Comparative Test

https://www.youtube.com/watch?v=QcCJYGLlPWg


Appendix B

Closed Captioning Appliance CCAPP200

Contents
Connecting the appliance ...... 1
Starting the appliance ...... 1
Connecting the Ethernet port ...... 2
Appliance display output ...... 2
Accessing the appliance Web UI ...... 3
Home page ...... 4
Checking audio input levels ...... 5
Setting optimum audio level ...... 5
Configuration page ...... 6
Appliance registration ...... 7
Modifying Date & Time settings ...... 8
Modifying network settings ...... 9
Module Configuration ...... 10
Configuring the appliance SDI/HDMI input ...... 10
Configuring the connection and settings for the Caption Encoder ...... 11
Profanity word filtering ...... 13
Word substitution ...... 13
Configuring the appliance Titling Output ...... 14
Managing the Profanity Word List ...... 16
Adding profanity words ...... 16
Managing the Word Substitutions ...... 17
Adding word substitutions ...... 17
Predefined word substitutions ...... 18
Export ...... 19
Creating a new Edit Session ...... 19
Editing transcripts within an Edit Session ...... 20
Exporting Edit Session transcripts ...... 21
Exporting Edit Session audio ...... 21

i

Figure 1 - appliance front view

Connecting the appliance
1. Connect the audio input. Depending on the appliance configuration, audio input is received via SDI, HDMI or unbalanced audio in via a 3.5mm line-in jack.
2. Connect the output serial port. The caption text is output on a standard RS-232 serial port. The port is configured for 9600 baud, 8 bits, no parity and 1 stop bit. Depending on the device connecting to the appliance, a null-modem adapter may be required to connect the devices.
3. Optionally connect a computer monitor to the monitor output.
4. Optionally connect the Ethernet port. The Ethernet port connection is required to access the appliance web portal.
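For bench testing, the serial caption output can also be observed from a PC through a USB-to-RS-232 adapter. The sketch below uses the third-party pyserial package and the port settings stated above (9600 baud, 8 data bits, no parity, 1 stop bit); the device name is an assumption that will differ per machine, and the raw stream may include encoder control codes.

```python
# Sketch: monitor the appliance's RS-232 caption output from a PC.
# Requires the third-party "pyserial" package (pip install pyserial).
# The device name is an assumption; the stream may include encoder control codes.
import serial

PORT = "/dev/ttyUSB0"  # e.g. "COM3" on Windows

with serial.Serial(PORT, baudrate=9600, bytesize=serial.EIGHTBITS,
                   parity=serial.PARITY_NONE, stopbits=serial.STOPBITS_ONE,
                   timeout=1) as port:
    while True:
        data = port.read(256)  # whatever caption bytes arrived within the timeout
        if data:
            print(data.decode("ascii", errors="replace"), end="", flush=True)
```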

Figure 2 - appliance rear view (digital embedded audio model shown; other models contain analog audio input)

Starting the appliance
Once all the connections are made, simply turn on the appliance by pressing the red power button on the front of the device.

CCAPP200 May 2019 1

Connecting the Ethernet port The Ethernet port is required to access the appliance web portal.

Ethernet port Eth0 is configured to obtain an IP address automatically via DHCP. This can be modified via the system Web User Interface.

Ethernet port Eth1 is set to a fixed IP address of 192.168.168.168.

Appliance display output The appliance information screen can be seen by connecting a PC monitor to the appliance VGA output port. The screen displays information about the appliance state and can be used to monitor the application operation.


Figure 3 - appliance status screen

1. Displays the current appliance status.
2. Displays the audio levels. ‘#’ indicates the average level and ‘=’ is the peak audio level. See the section titled “Setting optimum audio level” for checking the correct audio levels.
   a. The incoming audio level.
   b. The audio level after processing by software DSP.
3. The appliance license information. In this example, the license expiration date is displayed. If the license has expired, it will display “Expired”. If there is no expiration, it will display “Permanent”.
4. Displays the appliance ID.
5. Displays the IP address of the Ethernet port Eth0.
6. Displays text recognized and output by the appliance.
7. Displays system messages. In the event of an error, this information will be helpful for AppTek technical support.

CCAPP200 May 2019 2

Accessing the appliance Web UI
Once the appliance is started with the Ethernet port connected, the portal is accessed via a standard web browser (Firefox or Google Chrome is recommended), e.g. http://192.168.168.168.

The Web UI is used to monitor the status of the appliance, modify module configuration options, and edit and export stored transcripts.

The initial page will prompt for a Username and Password.

Figure 4 - appliance web user interface, login

The appliance has only one portal access account:

Username: ccadmin Password: Admin6867

This account name and password cannot be changed.

CCAPP200 May 2019 3

Home page
Once signed in, the home page with the current status will be displayed.

Figure 5 – home page

The main menu is displayed at the top of the page.

Home - This is the main status page

Configuration - This page contains configuration options and allows modification of some values. This page can also be used to reset certain modules.

Profanity Words - This page is used to manage the Profanity Word List.

Word Substitutions - This page is used to manage the Word Substitution List.

Export - If the export functionality is enabled, this page allows the user to edit and export stored transcripts and recorded audio.

Below the main menu toolbar, the page content will display the Status of the appliance, the current Audio level, and the most recent transcript output in the Output pane.

CCAPP200 May 2019 4

Checking audio input levels
You can use the caption server’s display or web portal to verify the device is receiving an audio signal.

The audio level is displayed graphically in a horizontal graph with the average audio level and the peak audio level.

Figure 6 - audio level meter

Setting optimum audio level
For optimum recognition, it’s important that the audio level is not too low or too high. Use the audio level graph to ensure the peak input level is above 40% and below 80%. If viewing audio levels on the appliance display, see item 2 in the Appliance display output section.
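As a rough, offline illustration of the 40-80% guideline, the sketch below computes the peak level of a 16-bit PCM WAV capture of the input feed as a percentage of full scale; the file name is a placeholder, and the appliance's own meter remains the authoritative reading.

```python
# Illustrative only: check whether the peak level of a 16-bit PCM WAV capture of
# the input audio falls inside the 40-80% range recommended above.
import array
import wave

def peak_percentage(path):
    with wave.open(path, "rb") as wav:
        assert wav.getsampwidth() == 2, "sketch assumes 16-bit PCM"
        samples = array.array("h", wav.readframes(wav.getnframes()))
    peak = max(abs(s) for s in samples)      # largest absolute sample value
    return 100.0 * peak / 32767.0            # percent of 16-bit full scale

level = peak_percentage("input_capture.wav")  # placeholder file name
print(f"peak level: {level:.1f}% ->", "OK" if 40 <= level <= 80 else "adjust gain")
```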

CCAPP200 May 2019 5

Configuration page
The configuration page is used to view and modify system settings like Date and Time and modify the network settings for Eth0.

In addition to system settings, various module configuration values can be modified. The modules vary depending on the system configuration.

Figure 7 – configuration page

The current appliance license information is displayed below the page title to the left next to the License label. If the license key has an expiration date, it will be displayed to the right of the License label.

Click the New License Key button to enter a new license key. Enter the new license key in the field provided and click “OK”. If the previous license key had expired, the appliance will need to be restarted for the license key to take effect.

The module configuration pane is displayed in the middle of the page. See the Configuration page for details on modifying various module settings.

Below the module configuration pane, the System Log panel contains the system log information. Click the ‘+’ button in the panel to expand the System Log panel. When expanded, you can modify what is visible by changing the Start Time, entered as a date and time in the format “YYYY/MM/DD HH:MM”. This displays all messages after the date entered. Click the reload button at the bottom right to refresh the log text displayed.

CCAPP200 May 2019 6

Appliance registration
If the appliance is connected to the internet, the appliance can be registered through the AppTek Appliance Management system. This allows the appliance to be remotely managed. To register the appliance, click the link to open the appliance registration web site. If you have already set up an account with the Appliance Management system, enter the existing email address and password. If you don’t already have an account, click the “Sign up!” link.

Once logged in to the Appliance Management site, the window below will be displayed allowing you to set a friendly name of the appliance. This window also displays the Appliance Registration ID.

Appliance Name - Enter a friendly name to associate with this appliance. Be sure to click the Save button to update the appliance database.

Appliance ID - This is the appliance registration id. Click the Copy button to copy the registration id so that you can enter this in the field provided in the Appliance Web UI.

You can click the Goto Dashboard button to open the Appliance Management site to view and manage registered appliances.

IMPORTANT NOTE

Be sure to return to the Appliance Web UI and enter the registration ID. Otherwise, the appliance will remain unregistered. You can return to the Appliance Management site later to retrieve the registration ID.

The Appliance Management site is located at https://reg.svc.apptek.com

CCAPP200 May 2019 7

Modifying Date & Time settings
Open the Date & Time settings window by clicking the Modify button to the right of the “Current Date and Time”.

Figure 8 - date and time settings

Time zone - Select the system time zone.

Time Setting - Select either “Manual” or “Time Server” option.

Manual - Enter the current date and time in the field provided.

Time Server - Enter the hostname or IP address of the Network Time Server host.

Click the “Apply” button to save the changes or click the window Close button to cancel and close the window.

CCAPP200 May 2019 8

Modifying network settings
Click the Modify button to the right of the Network Settings label. The appliance has two network interfaces. The settings for Eth0 can be modified, while Eth1 remains fixed at 192.168.168.168, subnet 255.255.255.0.

Figure 9 - network settings

Automatic - When selected, the appliance will use DHCP to automatically assign an IP address.

Manual - Select this option to manually set the network settings.

IP address - Enter the static IP address to use

Subnet mask - Enter the network subnet mask

Gateway address - Enter the IP address of the default gateway.

DNS address - Enter the IP address of the default DNS server.

Click the “Apply” button to save the changes or click the window Close button to cancel and close the window.

CCAPP200 May 2019 9

Module Configuration
Various settings for different software modules can be viewed and modified within the module configuration panel.

To access a specific module configuration, click the tab containing the module name.

After making any changes to the module configuration, click the “Update” button to save the changes. If a “Reset” button is visible in the module configuration, the saved changes will not take effect until the “Reset” button is clicked or the appliance is restarted.

Configuring the appliance SDI/HDMI input
If the appliance is equipped with a digital embedded audio interface, the “Capture” module settings are displayed on the “Configuration” page of the appliance web UI.

Figure 10 - Digital audio capture configuration

Input - Select either SDI or HDMI audio input.

Channels - Specifies which audio channel(s) to process. If you need to process more than one channel, enter each channel number separated by a comma. Audio from multiple channels is mixed into a single channel for processing.

SoftGain - Setting this value will boost the input audio level by the specified number of dB.
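For context on what a given SoftGain value means (standard decibel arithmetic, not an appliance-specific formula): a gain of G dB scales amplitude by 10^(G/20), so +6 dB roughly doubles the input level.

```python
# Standard decibel arithmetic (not appliance-specific): a SoftGain of G dB
# corresponds to an amplitude scale factor of 10 ** (G / 20).
def db_to_amplitude_factor(gain_db):
    return 10 ** (gain_db / 20)

print(db_to_amplitude_factor(6))    # ~1.995 (roughly doubles the level)
print(db_to_amplitude_factor(-6))   # ~0.501 (roughly halves the level)
```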

CCAPP200 May 2019 10

Configuring the connection and settings for the Caption Encoder
By default, the AppTek Captioning Appliance outputs transcript text via the serial port to the external Caption Encoder. To switch to output on the network, log in to the Appliance Web UI and go to the “Configuration” page. Select the “Caption Encoder” tab.

Figure 11 - caption encoder settings

Manufacturer - Select the supported Caption Encoder manufacturer. The output is compatible with most encoders that support the Control-A communications protocol.

Port - Select either serial or network output. If the serial port is used, see the “Serial Port” tab to change the serial port baud rate (Figure 12 - Serial port configuration). If the network output is selected, a new “Network Encoder” tab will be visible. Use the “Network Encoder” tab to specify the caption encoder’s IP address and port (Figure 13 - caption encoder network settings).

Init_Command - If using a caption encoder that does not support the Control-A protocol, enter a custom encoder initialization command. This command is sent to the encoder to switch it to live captioning mode.

Baseline - Specify the first screen text line where the captions will be displayed. Line 1 is the first line at the top of the screen and 15 is the last line.

Lines - Specify the number of roll-up caption lines, usually between 2 and 4.

Stream - Specify the caption text field (e.g. C1 is Caption field 1).

CCAPP200 May 2019 11

Setting the serial connection to the Caption Encoder
Click the “Serial Port” tab to display the serial port settings. Only the serial port baud rate can be modified. The other serial port settings are fixed - 8 data bits, 1 stop bit and no parity.

Figure 12 - Serial port configuration

Setting the network connection to the Caption Encoder
Click on the “Network Encoder” tab within the “Module Configuration” pane.

Figure 13 - caption encoder network settings

Enter the Caption Encoder’s hostname or IP address and port number (i.e. hostname:port). Click “Update” and then “Reset”.

If you don’t see data being received by the Caption Encoder, check the AppTek Appliance display for any “NetEncoder” error messages. Also, verify that the Caption Encoder’s network settings are valid and the network port is enabled.
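When the encoder is not receiving data, a quick check from another machine on the same network can confirm that the encoder's TCP port is reachable at all. A minimal sketch; the hostname and port below are placeholders for the values configured on the "Network Encoder" tab.

```python
# Quick reachability check for the caption encoder's network port.
# Hostname and port are placeholders for the values configured above.
import socket

ENCODER_HOST = "caption-encoder.example.local"
ENCODER_PORT = 23000  # placeholder

try:
    with socket.create_connection((ENCODER_HOST, ENCODER_PORT), timeout=5):
        print("TCP connection succeeded; the encoder port is reachable")
except OSError as exc:
    print(f"TCP connection failed: {exc}")
```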

CCAPP200 May 2019 12

Profanity word filtering
The “Profanity Filter” module is used to mask or remove any undesirable words from the appliance text output. Use the “Profanity Filter” module configuration to control the function of the filter. The list of words to filter is managed on the “Profanity Words” page from the main menu bar.

Figure 14 - profanity filter module settings

Enabled - If this is checked, the Profanity Filter is enabled, and undesirable words will be filtered using the Mode selected.

ReplaceWith - If the selected filter Mode is set to “Replace”, all matching words in the Profanity Word List will be replaced with this text.

Mode - Select what to do with words matched in the Profanity Word List.

Remove - All matching words are removed from the output.
Replace - All matching words are replaced with the ReplaceWith text.
Mask - Matching words will be masked by replacing all letters after the first with asterisks (e.g. “xyz” will be replaced with “x**”).

Word substitution
The “Word Substitution” module is used to replace specific words recognized by the speech recognition engine with an alternate word. This is helpful for substituting different spellings of recognized words, like changing the American spelling “color” to the British spelling “colour”. Use the “Word Substitution” module configuration to control the function of the filter. The list of words and their substitute form is managed on the “Word Substitution” page from the main menu bar.

Figure 15 - word substitution module settings

Enabled - If this is checked, the Word Substitution module is enabled, and any output word matching an entry in the “Word Substitution” list will be replaced with the specified text.
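The behavior described in these two modules can be summarized with a small sketch (an illustration of the documented Remove/Replace/Mask modes and whole-phrase substitution, not the appliance's internal implementation):

```python
# Illustration of the documented filter modes and whole-phrase substitution;
# not the appliance's internal implementation.
import re

def _whole_phrase(phrase):
    # Compound entries match only as the full phrase, never as part of another word.
    return re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)

def filter_profanity(text, words, mode, replace_with="[censored]"):
    for word in words:
        pattern = _whole_phrase(word)
        if mode == "Remove":
            text = pattern.sub("", text)
        elif mode == "Replace":
            text = pattern.sub(replace_with, text)
        elif mode == "Mask":  # keep the first letter, mask the rest with asterisks
            text = pattern.sub(lambda m: m.group(0)[0] + "*" * (len(m.group(0)) - 1), text)
    return text

def substitute_words(text, substitutions):
    for word, replacement in substitutions.items():
        text = _whole_phrase(word).sub(replacement, text)
    return text

print(filter_profanity("xyz happened", ["xyz"], mode="Mask"))       # x** happened
print(substitute_words("the color is nice", {"color": "colour"}))   # the colour is nice
```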

CCAPP200 May 2019 13

Configuring the appliance Titling Output
The AppTek Captioning Appliance Titling Output allows the transcript text to be sent over the network to any simple TCP/IP socket receiver.

To configure the Titling Output, log in to the Appliance Web UI and go to the “Configuration” page. Select the “Titling Output” tab.

Figure 16 - titling output module settings

Configuration options:

Port - The TCP/IP server port the Titling Output module will listen on for new incoming client connections.

Format - The format the transcript will be transmitted in:

TxtRaw - The text is transmitted in a raw text stream (i.e. each word recognized is sent directly to the network).
Txt - Also a simple text stream, but unlike TxtRaw, individual words are formatted into ‘lines’ of text before transmitting. In this mode, Lines, LineLen and ReadingRate are used to determine the exact format of the lines.
Xml - An XML formatted stream where individual titles are encapsulated in full XML markup. See “Sample Title Output XML format” for a detailed explanation. In this mode, Lines, LineLen and ReadingRate are used to determine the exact format of the lines.

Language - The two-letter language identifier of the transcript. The language information is used to analyze the text to ensure correct title formatting.

CCAPP200 May 2019 14

(The following options are only used for the Txt and Xml formats)

Lines - The maximum number of lines in a single title output.

LineLen - The maximum number of characters per line.

ReadingRate - The reading rate (characters per second) used to determine how long a title is valid for. In the Xml format, an empty title is transmitted when a preceding title has expired.

After modifying any of the values, click the “Update” button to save the changes.
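As an illustration of how ReadingRate relates to the display duration reported with each title (a sketch of the relationship described above; the appliance's exact rounding is not documented here):

```python
# Sketch: a title's validity period follows from its character count and the
# configured ReadingRate (characters per second); exact rounding may differ.
def title_duration_seconds(title_text, reading_rate_cps):
    return len(title_text) / reading_rate_cps

print(round(title_duration_seconds("HELLO AND WELCOME TO THE SIX O'CLOCK NEWS", 15), 1))  # ~2.7
```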

Sample Title Output XML format

[Sample XML title document; the markup was lost in conversion. The sample title text read: “is not yet but she was, actually really it's like ali,”]

Each title transmitted contains a complete XML document like the above. The title text is within the “tt:div” tag. Currently only one paragraph (“tt:p”) is output per title. The following attributes are included within the paragraph tag:
count - This is a sequential title number.
dt - The absolute time the title should be displayed. Format = YYYYMMDDTHHMMSSmmm, e.g. “20150929T161820497” - September 29, 2015 16:18:20 and 497 milliseconds.
d - The calculated display duration, in seconds, for this title. In the above example, the title should be displayed for 3.1 seconds.
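A minimal sketch of a Titling Output client for the Xml format follows. The appliance's exact markup is only partly described above (and the embedded sample did not survive document conversion), so the tt:p element with count/dt/d attributes is taken from that description; the host and port are placeholders for the values configured on the "Titling Output" tab.

```python
# Sketch: receive Xml-format titles from the Titling Output port and print the
# count/dt/d attributes and title text. Element names follow the description
# above; host and port are placeholders.
import re
import socket

APPLIANCE_HOST = "192.168.168.168"
TITLING_PORT = 12000  # placeholder for the configured Port value

TITLE_RE = re.compile(r"<tt:p\b([^>]*)>(.*?)</tt:p>", re.DOTALL)
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')

with socket.create_connection((APPLIANCE_HOST, TITLING_PORT)) as sock:
    buffer = ""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        buffer += chunk.decode("utf-8", errors="replace")
        matches = list(TITLE_RE.finditer(buffer))
        for match in matches:
            attrs = dict(ATTR_RE.findall(match.group(1)))
            print(attrs.get("count"), attrs.get("dt"), attrs.get("d"), match.group(2).strip())
        if matches:
            buffer = buffer[matches[-1].end():]  # keep any partial document for the next read
```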

CCAPP200 May 2019 15

Managing the Profanity Word List
The Profanity Word List is displayed by clicking the “Profanity Words” item from the main menu.

Figure 17 - profanity words page

Any words you want to filter from the system output should be entered here. The table shows the list of words entered into the system.

Click the Download Words List button to download a text file containing the words entered into the system.

Adding profanity words
Click the Add Profanity Words button to add new words to the list.

Figure 18 - adding a profanity word

When the Add Profanity Words button is pressed, the “New Profanity Word” window is displayed allowing you to enter a new word. Enter the new word in the “Word” input and click the “Add” button. The new word is added to the list and the window remains visible, allowing you to add additional words. You can enter single or compound words for matching, e.g. “xyz” or “xyz abc def”. If entering a compound word, only the full word will be matched and not individual parts of the word.

Click the close button in the “New Profanity Word” window to close the window.

CCAPP200 May 2019 16

To remove a word from the list, click the “Delete” button in the row next to the word to delete.

Managing the Word Substitutions
The Word Substitutions are displayed by clicking the “Word Substitution” item from the main menu.

Figure 19 - word substitution page

Word substitutions allow the user to replace a word, or sequence of words, with a predefined replacement. E.g. “miles per gallon” to “MPG”. Words added to the Word Substitution list will be displayed in the table.

Click the Download Words List button to download a text file containing the Word Substitutions. The “Word” and “Substitute” are separated by a comma.

Adding word substitutions
Click the Add Substitute button to add new substitution words.

Figure 20 - word substitutions

CCAPP200 May 2019 17

When the Add Substitute button is pressed, the “New Substitute Word” window is displayed allowing you to enter a new word and its replacement. Enter the word to match in the “Word” input field and the substitute word in the “Substitution” field. Click the “Add” button. The new word is added to the list and the window remains visible, allowing you to add additional words.

You can enter single or compound words for matching, e.g. “xyz” or “xyz abc def”. If entering a compound word, only the full word will be matched and not individual parts of the word. The substitute word can also be a single or compound word.

Click the close button in the “New Substitute Word” window to close the window.

To delete a word substitution, click the “Delete” button in the row next to the word to delete.

Figure 21 - word substitution table

Predefined word substitutions
The appliance may contain predefined word lists. If predefined word lists are available, they will be displayed in the “Import preset word list” dropdown list. To import a predefined word list, select the word list from the available presets and click the Import button. Predefined word substitutions will then be added to the list.

CCAPP200 May 2019 18

Export
The export page is used to create export sessions. An export session contains transcripts and matching audio for the specified date and time range. The export function may not be enabled; if so, an updated license key is required to enable this functionality.

Figure 22 - web UI, Export (Edit Sessions)

The appliance will record all audio and transcripts generated for 30 days. In order to view, edit and export the data, an “Edit Session” must first be created.

Creating a new Edit Session
To create a new “Edit Session”, click the “Add” button at the bottom of the page.

A dialog is displayed requesting a date and time range for the data of interest.

Figure 23 - Create Edit Session

CCAPP200 May 2019 19

Enter a name for this Edit Session in the “Name” field then define the time period to retrieve transcripts for. Define a range by selecting or entering a start date and time in the “From Date” field and the end date and time in the “To Date” field.

Click the “Create” button to create the new Edit Session. The system will retrieve all transcripts and audio within the period defined. If there is no transcript text for the period, an Edit Session cannot be created.

Editing transcripts within an Edit Session
Select the edit session from the list and click on the “Edit” button. This will display a table containing the transcript segments.

Figure 24 - web UI, Edit Session Editing

Each row of the table contains a segment which includes:

- The start and end time of the segment - The transcript created during this time period

To edit a segment, click on the ‘edit’ icon ( ) within the row. To remove a segment, click on the ‘delete’ icon ( ).

When the ‘edit’ icon is clicked, the “Edit Segment Text” dialog will be displayed.

CCAPP200 May 2019 20

Figure 25 - web UI Edit Session, Edit Segment Text dialog

The “Edit Segment Text” dialog displays the full text transcript of the segment and plays the audio associated with the segment. The audio playback can be paused and resumed by clicking on the player control or using the keyboard shortcuts listed. Only the audio for this segment will be played and there may be some overlap of the audio from the prior and subsequent segments.

Use the “Transcript” box to make any changes to the transcript text. Click the “Save” button to save the modified transcript text or click “Cancel” to discard any changes.

Exporting Edit Session transcripts
Select an edit session from the “Edit Sessions” list and click on the “Export SRT” button. This will export all the transcript segments within the edit session. The times stored in the SRT are relative to the start time of the edit session.
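For reference, SRT timestamps use the form HH:MM:SS,mmm, so an offset of 83.5 seconds into the edit session appears as 00:01:23,500. A short sketch of that conversion (generic SRT arithmetic, not the appliance's export code):

```python
# Generic SRT timestamp arithmetic (not the appliance's export code): convert an
# offset in seconds, relative to the start of the edit session, to HH:MM:SS,mmm.
def srt_timestamp(offset_seconds):
    total_ms = round(offset_seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"

print(srt_timestamp(83.5))  # 00:01:23,500
```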

Exporting Edit Session audio
Select an edit session from the “Edit Sessions” list and click on the “Export Audio” button. This will create and download a single MP3 file that encompasses all audio recorded within the edit session’s Date/Time range.

CCAPP200 May 2019 21