William B. Wilhelm, Jr. Christian E. Hoefly Jr. +1.202.739.3000 [email protected] [email protected]

January 16, 2020

Via ECFS

Marlene H. Dortch, Secretary
Federal Communications Commission
445 12th Street, SW
Room TW-A325
Washington, DC 20554

Re: Notice of Ex Parte Communication, RM-11848; CG Docket No. 05-231
Telecommunications for the Deaf and Hard of Hearing, Inc. et al. Petition for Declaratory Ruling and/or Rulemaking on Live Closed Captioning Quality Metrics and the Use of Automatic Speech Recognition Technologies; Closed Captioning of Video Programming

Dear Ms. Dortch:

On January 14, 2020, Mudar Yaghi, Mike Veronis, and the undersigned counsel met with Diane Burstein,1 Suzy Rosen Singleton, Eliot Greenwald and Debra Patkin of the Federal Communications Commission’s (“Commission”) Consumer and Governmental Affairs Bureau (“Bureau”) to discuss AppTek’s comments2 and reply comments3 to the Telecommunications for the Deaf and Hard of Hearing, Inc., et al., Petition for Declaratory Ruling and/or Rulemaking on Live Closed Captioning Quality Metrics and the Use of Automatic Speech Recognition Technologies.4

1 Ms. Burstein participated in the first part of the meeting and recused herself during the discussion regarding quality metrics.
2 AppTek Comments, https://www.fcc.gov/ecfs/filing/101585639283 (filed October 15, 2019).
3 AppTek Reply Comments, https://www.fcc.gov/ecfs/filing/1030761604294 (filed October 30, 2019).
4 Telecommunications for the Deaf and Hard of Hearing, Inc. (TDI), et al., Petition for Declaratory Ruling and/or Rulemaking on Live Closed Captioning Quality Metrics and the Use of Automatic Speech Recognition Technologies, CG Docket No. 05-231 (filed July 31, 2019), https://www.fcc.gov/ecfs/filing/10801131063733 (“Petition”).

Morgan, Lewis & Bockius LLP

1111 Pennsylvania Avenue, NW
Washington, DC 20004
United States
+1.202.739.3000
+1.202.739.3001


During the meeting, AppTek discussed the presentation included in Appendix A.5 Founded in 1990, AppTek is a leader in automatic speech recognition (“ASR”) and other language technologies. AppTek’s advanced language technology platform, based on artificial intelligence, machine learning and deep neural network technologies, covers the entire spectrum of language technologies, including ASR, neural machine translation and natural language understanding. AppTek is at the forefront of research and development for next-generation language solutions, including text-to-speech, speech-to-speech AI for dubbing, accessibility, sign-language recognition and more. Using this experience and cutting-edge technology, AppTek provides an ASR appliance and cloud-based solutions for a variety of business and government applications.

AppTek supports the Commission promoting forward-looking and technology-neutral captioning policies and quality metrics that will foster continued improvements to captioning, especially for live programming. AppTek explained the capabilities of its current ASR solutions to improve caption quality, including accurate punctuation (periods, commas and question marks), capitalization, speaker diarization (change detection and formatting), custom glossaries (custom lexicons of proper names, characters and dialects for improved accuracy), intelligent word replacement (replacing words by specific regional dialect to match the appropriate spelling), smart formatting (converting dates, times, numbers, currency values, phone numbers and more into more readable conventional forms in final transcripts) and other capabilities. AppTek stressed the importance of these captioning techniques for conveying meaning and improving recognition for the viewer. AppTek’s ASR captioning solution has a latency of as little as 1.7 seconds.6 Further, AppTek provides individualized support, training, and machine learning tailored to each customer’s needs.7

AppTek continues to work with the deaf and hard of hearing community to identify improvements to captioning that best meet their needs. This includes strong working relationships with Gallaudet University (a world-leading educator and research institution for the deaf and hard of hearing) and current captioning solution providers such as TransPerfect, Red Bee, GrayMeta, and YellaUmbrella. Further, AppTek has participated in a captioning discussion with the Disability Advisory Committee (“DAC”) Working Group. These partnerships have led to focused improvements in AppTek’s ASR captioning solutions. To further this type of collaboration, AppTek strongly encourages the Commission to appoint ASR providers to membership on the DAC, as the interaction between providers and the deaf and hard of hearing community will help foster dialogue and improvements in the technology.

AppTek discussed quality metrics, including the Number, Edition and Recognition Errors (“NER”) model and Word Error Rate (“WER”). AppTek’s ASR technology has received official NER scores ranging from 97.5% to 97.9%, among the highest of any ASR captioning technology. AppTek explained that NER measures accuracy by evaluating whether the meaning or information was lost or received by the consumer, rather than by the number of words omitted, added or mistranslated in the captions.8

5 The presentation included a video demonstrating AppTek’s captioning. The video is available at: https://www.youtube.com/watch?v=QcCJYGLlPWg.
6 AppTek's appliance outputs raw ASR in 1.7 seconds. In cases of live-to-air broadcast, post-processing steps result in a total latency of up to 4 seconds.
7 See Appendix B (the manual provided with AppTek’s ASR appliance detailing the installation and training process).

While AppTek uses and is evaluated by the NER model, AppTek supports any objective, technology-neutral metric that would allow the Commission and the DAC to review the overall quality of captioning and identify where improvements can be made.

As identified in the record, ASR has improved the quality of captioning and can be put in place immediately where captions are not provided (such as sports and weather programming).9 The Commission can instruct the DAC to investigate captioning and issue recommendations on where ASR can best be put to use immediately and where further investigation is needed. Further, with ASR providers on the DAC, the conversation can be truly informed as to the capabilities of current ASR technologies and the improvements to these technologies that would be most beneficial.

Sincerely,

/s/ William B. Wilhelm

William B. Wilhelm, Jr. Christian E. Hoefly Jr.

Counsel to AppTek

cc: Suzy Rosen Singleton
    Eliot Greenwald
    Debra Patkin

8 See AppTek Comments at 9; see also David Keeble, The Canadian NER Trial, www.nertrial.com (last visited Jan. 16, 2020); English Broadcasters Group, Caption Test, www.captiontest.com (last visited Jan. 16, 2020); Pablo Romero-Fresco & Juan Martinez, Accuracy Rate in Live Subtitling – NER Model, http://www.captiontest.com/roehampton%20NER-English.pdf (last visited Jan. 16, 2020).
9 AppTek Reply Comments at 3.

Appendix A

Automatic Captioning State of the Art

1356 Beverly Road, Suite 300 McLean, VA 22101

World-Leading Advanced Language Technology Platform

AppTek’s advanced language technology platform, based on artificial intelligence, machine learning and deep neural network technologies, covers the entire spectrum of language technologies, including:

Automatic Speech Recognition - for use in automated captioning, real-time telephony transcription, media asset management and more.

Neural Machine Translation - for use in localization of video, web and print content, improvement of subtitling workflows, advanced dubbing and more.

Natural Language Understanding - for use in sentiment analysis, named entity recognition, intelligent bots, content clustering for topics and trends and more.

AppTek is at the forefront of research and development for next-generation language solutions including text-to-speech, speech-to-speech AI for dubbing, accessibility, sign-language recognition and more.

Products and Applications

CLOUD SERVICES - ASR & MT APIs: Developer-friendly cloud-based API access to AppTek Automatic Speech Recognition (ASR) and Machine Translation (MT) technologies for a wide variety of use cases including telephony, archiving, IoT devices and more.

REAL-TIME CAPTIONING - CC APPLIANCE: Fully automated, same-language captions for live content, adaptable to broadcaster's programs, talents and voice for higher accuracy. Suitable for most domains with average latency of 4 seconds.

CAPTIONING AND SUBTITLING - CC WORKBENCH: Cloud-based automated closed captioning, subtitling and alignment of audio, video and text, in multiple languages, with integrated distributed workforce post-editing via cloud-based multilingual platform.

MEDIA MONITORING - OMNI-MONITOR: Turn-key media monitoring solution utilizing AppTek's neural MT and ASR engines to improve speed and accuracy of results. Capabilities include web crawling, social media monitoring and named entity detection.

SPEECH-2-SPEECH MT - TALK2ME (APP STORE): Real-time speech-2-speech machine translation app available on iOS and Android devices. Serves as a travel companion for instant translations.

IVR - VOXSPHERE: Cloud-based IVR featuring continuous speech with natural language understanding for superior customer experience.

CX SENTIMENT - LUCIDVUECX: Customer interaction analytics to passively uncover consumer feedback including buying experience, loyalty and more across multiple channels and transform conversations into valuable and actionable insights.

E-COMMERCE - DIVA: DIVA (Digital Intelligent Voice Assistant) creates a frictionless speech-enabled buying experience with AI including extraction of user profile, willingness to purchase, relationship and more.

Automatic Speech Recognition

• Automatic Speech Recognition (ASR) is the science of recognizing and transcribing audio content into text using AI, machine learning, and neural networks (speech-to-text)

• AppTek has cloud-based APIs and on-premise ASR servers for both 16kHz broadcast & entertainment and 8kHz telephone speech

• Customers include a major FAANG company, AAA, Ford, NBC, Televisa, Telemundo and the US Government

• Our ASR covers more than 30 languages and dialects, including:
  - English (US, UK, AUS, CAN, IN) [BTM]
  - Chinese (Simplified/Traditional) [BTM]
  - Russian [BTM]
  - French (CAN, EUR) [BM]
  - Korean [BM]
  - Turkish [BTM]
  - German [BM]
  - [BM]
  - [BM]
  - Italian [BM]
  - Persian/Farsi/Dari [BTM]
  - Tagalog [BM]
  - Spanish (US, EUR) [BTM]
  - [BM]
  - Malay [BM]
  - Portuguese (BR, EUR) [BM]
  - Afrikaans [BTM]
  - Indonesian Bahasa [BM]
  - Dutch [BM]
  - Tamil [BM]
  - Hebrew [BM]
  - (5 Dialects) [BTM]
  - Japanese [BM]
  Legend: [B] Broadcast & Entertainment, [T] Telephony, [M] Mobile

Sample ASR Business Applications:
- Live Captioning: Generate offline and streaming automated captioning for content accessibility and compliance, significantly reducing costs of human captioning.
- Media Asset Management: Create rich metadata for topic extraction, indexing and future discoverability of media assets across a wide array of languages.
- Subtitling and Editing: Automate subtitling and editing processes by implementing ASR for later translation inside media workflows; opening new markets.
- Contact Center Transcriptions: Transcribe 8kHz telephone conversations for valuable business insights and improved compliance.

Neural Machine Translation

• Neural Machine Translation (NMT) is a field of computational linguistics that translates text into different languages using state-of-the-art neural architectures.
• In 2017, AppTek launched Neural MT using cutting-edge deep neural network (DNN) technology.
• More than 40 direct language pairs are supported, and 400+ using English as a pivot:
  Arabic, Bulgarian, Chinese, Czech, Danish, Dari, Dutch, English, Estonian, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Japanese, Korean, Latvian, Lithuanian, Pashto, Persian/Farsi, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Turkish, Ukrainian, Urdu
• In addition to 7+ cross-European languages.

Sample MT Business Applications:
- Content Localization: Adapt and localize media content to expand audience reach across the globe.
- Subtitling and Editing: Automate subtitling and editing processes by implementing NMT for later translation inside media workflows; opening new markets.
- E-Commerce: Expand Total Addressable Market by deploying NMT across high-volume global inventories.
- E-Discovery: Analyze and localize large volumes of critical content in a secure environment for fast discovery, search and retrieval.

Natural Language Understanding

• Natural Language Understanding (NLU) is the post-processing of text after the application of natural language processing algorithms that utilize context from ASR-generated transcripts, as well as text, chat and other forms, to discern the meaning of sentences, understand intent and execute a series of actions.
• There is considerable market interest in NLU due to its application in automated reasoning, MT, question answering, news gathering, text categorization, voice activation, archiving and large-scale content analysis.
• AppTek is currently focusing its internal R&D on the following:
  - Named Entity Extraction - Identify entities within written or spoken text, including proper names, brands, cities, currency amounts, etc., and label them for appropriate recall.
  - Dialog Management - AI-supported intelligent “bots” for speech and text.
  - Sentiment Analysis - Extract the overall opinion, attitude or feeling over a specific topic or product for deeper analysis of brand performance.
  - Content Classification - Classify content into pre-existing categories by function, intention or purpose.
  - Content Clustering - Identify main topics of discourse to discover new topics pertinent to an organization or identify customer trends.
  - Conversational Interfaces - Build fully functional virtual assistants/chatbots to enable customer communication.

Sample NLU Business Applications:
- Sentiment Analysis: Mine opinions and trends inside conversations to help brands shift marketing, sales and operational strategies.
- Chatbots: Create personalized one-on-one interactions for experiences that engage customers without human resources overhead.
- Advertising: Identify new audiences by analyzing customer conversations aggregated across channels and improve media spend and audience targeting.
- Market Intelligence: Stay informed across a broad array of content with advanced tools to filter critical information and topics.

Industry Experience

• Gallaudet
• Broadcast/Media Captioning & Subtitling
• TransPerfect
• RedBeeMedia
• YellaUmbrella
• Televisa
• Azteca/KJLA
• News10/WHEC
• CKSA
• Chesapeake City Government
• Juan Pablo Romero, Developer of captioning NER standards evaluation (http://captiontest.com/roehampton%20NER-English.pdf)
• Yota Georgakopoulou, Industry Expert on Audiovisual Localization

Artificial Intelligence & Accessibility Developments

Assistive Technology Deployment
• AppTek has deployed real-time streaming speech-to-text conversion technology for deaf and hard-of-hearing individuals utilizing embedded ASR in a laptop environment with an advanced customizable user interface.
• The platform utilizes speaker diarization to help users identify a change in speaker and highlight the names of individuals talking inside multi-participant conversations.
• AppTek’s text-to-speech services allow users who have difficulty speaking to type via keyboard and have that input instantly converted to audible speech.

Accuracy and Syntax
• Continuous ASR Training - Broadcast media and entertainment content, plus millions of subtitle data points across a wide array of languages, feed machine learning models. RESULT: AppTek ASR delivers more accurate and syntactically pleasing automated subtitles and closed captions.

Formatting
• AppTek’s proprietary Logical Line Segmentation (LLS) - Ensures speech and translated output can be exported in an appropriate subtitle format.
• The AI platform uses our novel subtitle segmentation algorithm and predicts the end of a subtitle line given the previous word-level context using a recurrent neural network learned from human segmentation decisions.
• Text is laid out in subtitle lines segmented according to syntax and semantics, instead of speaker pauses, which has been the predominant method employed by mainstream ASR and MT providers to date. RESULT: The subtitle output is much closer to what a professional would produce.

Evaluating ASR performance readiness with NER

• NER (Number, Edition Error, Recognition Error) - The NER Model measures accuracy by evaluating whether the meaning or information was lost or received by the consumer, rather than by the number of words omitted, added or mistranslated in the captions.
• Measurement Criteria
  - Correct Edition (CE) (-0.0): CE is scored when captions are different from the verbatim audio but retain its full meaning, without interrupting words/phrases.
  - Omission of Main Meaning (OMM) (-0.5): The captions have lost the main idea presented in the audio’s independent idea unit (see page 2).
  - Omission of Detail (OD) (-0.25): Here, the captions have lost one or more modifying meanings, affecting a dependent idea unit (see page 2) but not the main idea.
  - Benign Error (BE) (-0.25): A captioned word or phrase is incorrect, causing an interruption in the reading. However, the viewer can readily figure out the original meaning, from context (in video) or similarity to the real word.
  - Nonsense Error (NE) (-0.5): A captioned word or phrase is incorrect and the viewer can’t figure out the original meaning. If the impact of the word/phrase is to alter or omit the meaning of the idea unit, OD, OMM or FIE will be scored instead.
  - False Information Error (FIE) (-1.0): The captions make sense, but the information they present is different from the verbatim. The caption viewer cannot tell that the meaning is false.
• AppTek’s Official Registered Scores Range: 97.5-97.9% (Source: CKSA-TV, cksatv.ca)
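To make the arithmetic behind such scores concrete: in the published NER model, accuracy is computed as (N - E - R) / N x 100, where N is the caption word count and E and R are the weighted edition and recognition error deductions. The sketch below applies the per-error deductions listed above to hypothetical tallies; it is an illustration only, not the evaluation tooling behind AppTek's registered scores.

```python
# Illustrative NER-style calculation using the per-error deductions listed above.
# Hypothetical tallies only; not the tooling behind AppTek's registered scores.
WEIGHTS = {"OMM": 0.5, "OD": 0.25, "BE": 0.25, "NE": 0.5, "FIE": 1.0}  # CE deducts 0.0

def ner_accuracy(word_count, error_counts):
    """Accuracy = (N - E - R) / N * 100, with the deductions summed over all errors."""
    deductions = sum(WEIGHTS[kind] * count for kind, count in error_counts.items())
    return (word_count - deductions) / word_count * 100

# Example: 2,000 caption words with hypothetical error counts by category.
score = ner_accuracy(2000, {"OMM": 20, "OD": 24, "BE": 60, "NE": 20, "FIE": 7})
print(f"{score:.1f}%")  # 97.6%
```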

ASR - Comparative Benchmarking (English)

[Chart comparing AppTek ASR against two cloud providers on 8 kHz telephony, 16 kHz entertainment and 16 kHz broadcast news English audio.]

ASR Comparative Test

https://www.youtube.com/watch?v=QcCJYGLlPWg


Appendix B

Closed Captioning Appliance CCAPP200

Contents
Connecting the appliance ...... 1
Starting the appliance ...... 1
Connecting the Ethernet port ...... 2
Appliance display output ...... 2
Accessing the appliance Web UI ...... 3
Home page ...... 4
Checking audio input levels ...... 5
Setting optimum audio level ...... 5
Configuration page ...... 6
Appliance registration ...... 7
Modifying Date & Time settings ...... 8
Modifying network settings ...... 9
Module Configuration ...... 10
Configuring the appliance SDI/HDMI input ...... 10
Configuring the connection and settings for the Caption Encoder ...... 11
Profanity word filtering ...... 13
Word substitution ...... 13
Configuring the appliance Titling Output ...... 14
Managing the Profanity Word List ...... 16
Adding profanity words ...... 16
Managing the Word Substitutions ...... 17
Adding word substitutions ...... 17
Predefined word substitutions ...... 18
Export ...... 19
Creating a new Edit Session ...... 19
Editing transcripts within an Edit Session ...... 20
Exporting Edit Session transcripts ...... 21
Exporting Edit Session audio ...... 21

i

Figure 1 - appliance front view

Connecting the appliance
1. Connect the audio input. Depending on the appliance configuration, audio input is received via SDI, HDMI or unbalanced audio in via a 3.5mm line-in jack.
2. Connect the output serial port. The caption text is output on a standard RS-232 serial port. The port is configured for 9600 baud, 8 bits, no parity and 1 stop bit. Depending on the device connecting to the appliance, a null-modem adapter may be required to connect the devices.
3. Optionally connect a computer monitor to the monitor output.
4. Optionally connect the Ethernet port. The Ethernet port connection is required to access the appliance web portal.
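For bench testing, the serial caption output can also be observed from a PC through a USB-to-RS-232 adapter. The sketch below uses the third-party pyserial package and the port settings stated above (9600 baud, 8 data bits, no parity, 1 stop bit); the device name is an assumption that will differ per machine, and the raw stream may include encoder control codes.

```python
# Sketch: monitor the appliance's RS-232 caption output from a PC.
# Requires the third-party "pyserial" package (pip install pyserial).
# The device name is an assumption; the stream may include encoder control codes.
import serial

PORT = "/dev/ttyUSB0"  # e.g. "COM3" on Windows

with serial.Serial(PORT, baudrate=9600, bytesize=serial.EIGHTBITS,
                   parity=serial.PARITY_NONE, stopbits=serial.STOPBITS_ONE,
                   timeout=1) as port:
    while True:
        data = port.read(256)  # whatever caption bytes arrived within the timeout
        if data:
            print(data.decode("ascii", errors="replace"), end="", flush=True)
```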

Figure 2 - appliance rear view (digital embedded audio model shown; other models contain analog audio input)

Starting the appliance
Once all the connections are made, simply turn on the appliance by pressing the red power button on the front of the device.

CCAPP200 May 2019 1

Connecting the Ethernet port The Ethernet port is required to access the appliance web portal.

Ethernet port Eth0 is configured to obtain an IP address automatically via DHCP. This can be modified via the system Web User Interface.

Ethernet port Eth1 is set to a fixed IP address of 192.168.168.168.

Appliance display output The appliance information screen can be seen by connecting a PC monitor to the appliance VGA output port. The screen displays information about the appliance state and can be used to monitor the application operation.


Figure 3 - appliance status screen

1. Displays the current appliance status.
2. Displays the audio levels. ‘#’ indicates the average level and ‘=’ is the peak audio level. See the section titled “Setting optimum audio level” for checking the correct audio levels.
   a. The incoming audio level.
   b. The audio level after processing by software DSP.
3. The appliance license information. In this example, the license expiration date is displayed. If the license has expired, it will display “Expired”. If there is no expiration, it will display “Permanent”.
4. Displays the appliance ID.
5. Displays the IP address of the Ethernet port Eth0.
6. Displays text recognized and output by the appliance.
7. Displays system messages. In the event of an error, this information will be helpful for AppTek technical support.

CCAPP200 May 2019 2

Accessing the appliance Web UI
Once the appliance is started with the Ethernet port connected, the portal is accessed via a standard web browser (Firefox or Google Chrome is recommended), e.g. http://192.168.168.168.

The Web UI is used to monitor the status of the appliance, modify module configuration options, and edit and export stored transcripts.

The initial page will prompt for a Username and Password.

Figure 4 - appliance web user interface, login

The appliance has only one portal access account:

Username: ccadmin Password: Admin6867

This account name and password cannot be changed.

CCAPP200 May 2019 3

Home page
Once signed in, the home page with the current status will be displayed.

Figure 5 – home page

The main menu is displayed at the top of the page.

Home - This is the main status page

Configuration - This page contains configuration options and allows modification of some values. This page can also be used to reset certain modules.

Profanity Words - This page is used to manage the Profanity Word List.

Word Substitutions - This page is used to manage the Word Substitution List.

Export - If the export functionality is enabled, this page allows the user to edit and export stored transcripts and recorded audio.

Below the main menu toolbar, the page content will display the Status of the appliance, the current Audio level, and the most recent transcript output in the Output pane.

CCAPP200 May 2019 4

Checking audio input levels
You can use the caption server’s display or web portal to verify the device is receiving an audio signal.

The audio level is displayed graphically in a horizontal graph with the average audio level and the peak audio level.

Figure 6 - audio level meter

Setting optimum audio level
For optimum recognition, it’s important that the audio level is not too low or too high. Use the audio level graph to ensure the peak input level is above 40% and below 80%. If viewing audio levels on the appliance display, see item 2 in the Appliance display output section.
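As a rough, offline illustration of the 40-80% guideline, the sketch below computes the peak level of a 16-bit PCM WAV capture of the input feed as a percentage of full scale; the file name is a placeholder, and the appliance's own meter remains the authoritative reading.

```python
# Illustrative only: check whether the peak level of a 16-bit PCM WAV capture of
# the input audio falls inside the 40-80% range recommended above.
import array
import wave

def peak_percentage(path):
    with wave.open(path, "rb") as wav:
        assert wav.getsampwidth() == 2, "sketch assumes 16-bit PCM"
        samples = array.array("h", wav.readframes(wav.getnframes()))
    peak = max(abs(s) for s in samples)      # largest absolute sample value
    return 100.0 * peak / 32767.0            # percent of 16-bit full scale

level = peak_percentage("input_capture.wav")  # placeholder file name
print(f"peak level: {level:.1f}% ->", "OK" if 40 <= level <= 80 else "adjust gain")
```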

CCAPP200 May 2019 5

Configuration page
The configuration page is used to view and modify system settings like Date and Time and modify the network settings for Eth0.

In addition to system settings, various module configuration values can be modified. The modules vary depending on the system configuration.

Figure 7 – configuration page

The current appliance license information is displayed below the page title to the left next to the License label. If the license key has an expiration date, it will be displayed to the right of the License label.

Click the New License Key button to enter a new license key. Enter the new license key in the field provided and click “OK”. If the previous license key had expired, the appliance will need to be restarted for the license key to take effect.

The module configuration pane is displayed in the middle of the page. See the Configuration page for details on modifying various module settings.

Below the module configuration pane, the System Log panel contains the system log information. Click the ‘+’ button in the panel to expand the System Log panel. When expanded, you can modify what is visible by changing the Start Time, entered as a date and time in the format “YYYY/MM/DD HH:MM”. This displays all messages after the date entered. Click the reload button at the bottom right to refresh the log text displayed.

CCAPP200 May 2019 6

Appliance registration
If the appliance is connected to the internet, the appliance can be registered through the AppTek Appliance Management system. This allows the appliance to be remotely managed. To register the appliance, click the link to open the appliance registration web site. If you have already set up an account with the Appliance Management system, enter the existing email address and password. If you don’t already have an account, click the “Sign up!” link.

Once logged in to the Appliance Management site, the window below will be displayed allowing you to set a friendly name of the appliance. This window also displays the Appliance Registration ID.

Appliance Name - Enter a friendly name to associate with this appliance. Be sure to click the Save button to update the appliance database.

Appliance ID - This is the appliance registration id. Click the Copy button to copy the registration id so that you can enter this in the field provided in the Appliance Web UI.

You can click the Goto Dashboard button to open the Appliance Management site to view and manage registered appliances.

IMPORTANT NOTE

Be sure to return to the Appliance Web UI and enter the registration ID. Otherwise, the appliance will remain unregistered. You can return to the Appliance Management site later to retrieve the registration ID.

The Appliance Management site is located at https://reg.svc.apptek.com

CCAPP200 May 2019 7

Modifying Date & Time settings
Open the Date & Time settings window by clicking the Modify button to the right of the “Current Date and Time”.

Figure 8 - date and time settings

Time zone - Select the system time zone.

Time Setting - Select either “Manual” or “Time Server” option.

Manual - Enter the current date and time in the field provided.

Time Server - Enter the hostname or IP address of the Network Time Server host.

Click the “Apply” button to save the changes or click the window Close button to cancel and close the window.

CCAPP200 May 2019 8

Modifying network settings
Click the Modify button to the right of the Network Settings label. The appliance has two network interfaces. The settings for Eth0 can be modified, while Eth1 remains fixed at 192.168.168.168, subnet 255.255.255.0.

Figure 9 - network settings

Automatic - When selected, the appliance will use DHCP to automatically assign an IP address.

Manual - Select this option to manually set the network settings.

IP address - Enter the static IP address to use

Subnet mask - Enter the network subnet mask

Gateway address - Enter the IP address of the default gateway.

DNS address - Enter the IP address of the default DNS server.

Click the “Apply” button to save the changes or click the window Close button to cancel and close the window.

CCAPP200 May 2019 9

Module Configuration
Various settings for different software modules can be viewed and modified within the module configuration panel.

To access a specific module configuration, click the tab containing the module name.

After making any changes to the module configuration, click the “Update” button to save the changes. If a “Reset” button is visible in the module configuration, the saved changes will not take effect until the “Reset” button is clicked or the appliance is restarted.

Configuring the appliance SDI/HDMI input
If the appliance is equipped with a digital embedded audio interface, the “Capture” module settings are displayed on the “Configuration” page of the appliance web UI.

Figure 10 - Digital audio capture configuration

Input - Select either SDI or HDMI audio input.

Channels - Specifies which audio channel(s) to process. If you need to process more than one channel, enter each channel number separated by a comma. Audio from multiple channels is mixed into a single channel for processing.

SoftGain - Setting this value will boost the input audio level by the specified number of dB.
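For context on what a given SoftGain value means (standard decibel arithmetic, not an appliance-specific formula): a gain of G dB scales amplitude by 10^(G/20), so +6 dB roughly doubles the input level.

```python
# Standard decibel arithmetic (not appliance-specific): a SoftGain of G dB
# corresponds to an amplitude scale factor of 10 ** (G / 20).
def db_to_amplitude_factor(gain_db):
    return 10 ** (gain_db / 20)

print(db_to_amplitude_factor(6))    # ~1.995 (roughly doubles the level)
print(db_to_amplitude_factor(-6))   # ~0.501 (roughly halves the level)
```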

CCAPP200 May 2019 10

Configuring the connection and settings for the Caption Encoder
By default, the AppTek Captioning Appliance outputs transcript text via the serial port to the external Caption Encoder. To switch to output on the network, log in to the Appliance Web UI and go to the “Configuration” page. Select the “Caption Encoder” tab.

Figure 11 - caption encoder settings

Manufacturer - Select the supported Caption Encoder manufacturer. The output is compatible with most encoders that support the Control-A communications protocol.

Port - Select either serial or network output. If the serial port is used, see the “Serial Port” tab to change the serial port baud rate (Figure 12 - Serial port configuration). If the network output is selected, a new “Network Encoder” tab will be visible. Use the “Network Encoder” tab to specify the caption encoder’s IP address and port (Figure 13 - caption encoder network settings).

Init_Command - If using a caption encoder that does not support the Control-A protocol, enter a custom encoder initialization command. This command is sent to the encoder to switch it to live captioning mode.

Baseline - Specify the first screen text line where the captions will be displayed. Line 1 is the first line at the top of the screen and 15 is the last line.

Lines - Specify the number of roll-up caption lines, usually between 2 and 4.

Stream - Specify the caption text field (e.g. C1 is Caption field 1).

CCAPP200 May 2019 11

Setting the serial connection to the Caption Encoder
Click the “Serial Port” tab to display the serial port settings. Only the serial port baud rate can be modified. The other serial port settings are fixed - 8 data bits, 1 stop bit and no parity.

Figure 12 - Serial port configuration

Setting the network connection to the Caption Encoder
Click on the “Network Encoder” tab within the “Module Configuration” pane.

Figure 13 - caption encoder network settings

Enter the Caption Encoder’s hostname or IP address and port number (i.e. hostname:port). Click “Update” and then “Reset”.

If you don’t see data being received by the Caption Encoder, check the AppTek Appliance display for any “NetEncoder” error messages. Also, verify that the Caption Encoder’s network settings are valid and the network port is enabled.
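When the encoder is not receiving data, a quick check from another machine on the same network can confirm that the encoder's TCP port is reachable at all. A minimal sketch; the hostname and port below are placeholders for the values configured on the "Network Encoder" tab.

```python
# Quick reachability check for the caption encoder's network port.
# Hostname and port are placeholders for the values configured above.
import socket

ENCODER_HOST = "caption-encoder.example.local"
ENCODER_PORT = 23000  # placeholder

try:
    with socket.create_connection((ENCODER_HOST, ENCODER_PORT), timeout=5):
        print("TCP connection succeeded; the encoder port is reachable")
except OSError as exc:
    print(f"TCP connection failed: {exc}")
```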

CCAPP200 May 2019 12

Profanity word filtering
The “Profanity Filter” module is used to mask or remove any undesirable words from the appliance text output. Use the “Profanity Filter” module configuration to control the function of the filter. The list of words to filter is managed on the “Profanity Words” page from the main menu bar.

Figure 14 - profanity filter module settings

Enabled - If this is checked, the Profanity Filter is enabled, and undesirable words will be filtered using the Mode selected.

ReplaceWith - If the selected filter Mode is set to “Replace”, all matching words in the Profanity Word List will be replaced with this text.

Mode - Select what to do with words matched in the Profanity Word List.

Remove - All matching words are removed from the output.
Replace - All matching words are replaced with the ReplaceWith text.
Mask - Matching words will be masked by replacing all letters after the first with asterisks (e.g. “xyz” will be replaced with “x**”).

Word substitution
The “Word Substitution” module is used to replace specific words recognized by the speech recognition engine with an alternate word. This is helpful for substituting different spellings of recognized words, like changing the American spelling “color” to the British spelling “colour”. Use the “Word Substitution” module configuration to control the function of the filter. The list of words and their substitute form is managed on the “Word Substitution” page from the main menu bar.

Figure 15 - word substitution module settings

Enabled - If this is checked, the Word Substitution module is enabled, and any output word matching an entry in the “Word Substitution” list will be replaced with the specified text.
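The behavior described in these two modules can be summarized with a small sketch (an illustration of the documented Remove/Replace/Mask modes and whole-phrase substitution, not the appliance's internal implementation):

```python
# Illustration of the documented filter modes and whole-phrase substitution;
# not the appliance's internal implementation.
import re

def _whole_phrase(phrase):
    # Compound entries match only as the full phrase, never as part of another word.
    return re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)

def filter_profanity(text, words, mode, replace_with="[censored]"):
    for word in words:
        pattern = _whole_phrase(word)
        if mode == "Remove":
            text = pattern.sub("", text)
        elif mode == "Replace":
            text = pattern.sub(replace_with, text)
        elif mode == "Mask":  # keep the first letter, mask the rest with asterisks
            text = pattern.sub(lambda m: m.group(0)[0] + "*" * (len(m.group(0)) - 1), text)
    return text

def substitute_words(text, substitutions):
    for word, replacement in substitutions.items():
        text = _whole_phrase(word).sub(replacement, text)
    return text

print(filter_profanity("xyz happened", ["xyz"], mode="Mask"))       # x** happened
print(substitute_words("the color is nice", {"color": "colour"}))   # the colour is nice
```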

CCAPP200 May 2019 13

Configuring the appliance Titling Output
The AppTek Captioning Appliance Titling Output allows the transcript text to be sent over the network to any simple TCP/IP socket receiver.

To configure the Titling Output, log in to the Appliance Web UI and go to the “Configuration” page. Select the “Titling Output” tab.

Figure 16 - titling output module settings

Configuration options:

Port - The TCP/IP server port the Titling Output module will listen on for new incoming client connections.

Format - The format the transcript will be transmitted in:

TxtRaw - The text is transmitted in a raw text stream (i.e. each word recognized is sent directly to the network).
Txt - Also a simple text stream, but unlike TxtRaw, individual words are formatted into ‘lines’ of text before transmitting. In this mode, Lines, LineLen and ReadingRate are used to determine the exact format of the lines.
Xml - An XML formatted stream where individual titles are encapsulated in full XML markup. See “Sample Title Output XML format” for a detailed explanation. In this mode, Lines, LineLen and ReadingRate are used to determine the exact format of the lines.

Language - The two-letter language identifier of the transcript. The language information is used to analyze the text to ensure correct title formatting.

CCAPP200 May 2019 14

(The following options are only used for the Txt and Xml formats)

Lines - The maximum number of lines in a single title output.

LineLen - The maximum number of characters per line.

ReadingRate - The reading rate (characters per second) used to determine how long a title is valid for. In the Xml format, an empty title is transmitted when a preceding title has expired.

After modifying any of the values, click the “Update” button to save the changes.
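As an illustration of how ReadingRate relates to the display duration reported with each title (a sketch of the relationship described above; the appliance's exact rounding is not documented here):

```python
# Sketch: a title's validity period follows from its character count and the
# configured ReadingRate (characters per second); exact rounding may differ.
def title_duration_seconds(title_text, reading_rate_cps):
    return len(title_text) / reading_rate_cps

print(round(title_duration_seconds("HELLO AND WELCOME TO THE SIX O'CLOCK NEWS", 15), 1))  # ~2.7
```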

Sample Title Output XML format

[Sample XML title document; the markup was lost in conversion. The sample title text read: “is not yet but she was, actually really it's like ali,”]

Each title transmitted contains a complete XML document like the above. The title text is within the “tt:div” tag. Currently only one paragraph (“tt:p”) is output per title. The following attributes are included within the paragraph tag:
count - This is a sequential title number.
dt - The absolute time the title should be displayed. Format = YYYYMMDDTHHMMSSmmm, e.g. “20150929T161820497” - September 29, 2015 16:18:20 and 497 milliseconds.
d - The calculated display duration, in seconds, for this title. In the above example, the title should be displayed for 3.1 seconds.
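A minimal sketch of a Titling Output client for the Xml format follows. The appliance's exact markup is only partly described above (and the embedded sample did not survive document conversion), so the tt:p element with count/dt/d attributes is taken from that description; the host and port are placeholders for the values configured on the "Titling Output" tab.

```python
# Sketch: receive Xml-format titles from the Titling Output port and print the
# count/dt/d attributes and title text. Element names follow the description
# above; host and port are placeholders.
import re
import socket

APPLIANCE_HOST = "192.168.168.168"
TITLING_PORT = 12000  # placeholder for the configured Port value

TITLE_RE = re.compile(r"<tt:p\b([^>]*)>(.*?)</tt:p>", re.DOTALL)
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')

with socket.create_connection((APPLIANCE_HOST, TITLING_PORT)) as sock:
    buffer = ""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        buffer += chunk.decode("utf-8", errors="replace")
        matches = list(TITLE_RE.finditer(buffer))
        for match in matches:
            attrs = dict(ATTR_RE.findall(match.group(1)))
            print(attrs.get("count"), attrs.get("dt"), attrs.get("d"), match.group(2).strip())
        if matches:
            buffer = buffer[matches[-1].end():]  # keep any partial document for the next read
```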

CCAPP200 May 2019 15

Managing the Profanity Word List
The Profanity Word List is displayed by clicking the “Profanity Words” item from the main menu.

Figure 17 - profanity words page

Any words you want to filter from the system output should be entered here. The table shows the list of words entered into the system.

Click the Download Words List button to download a text file containing the words entered into the system.

Adding profanity words
Click the Add Profanity Words button to add new words to the list.

Figure 18 - adding a profanity word

When the Add Profanity Words button is pressed, the “New Profanity Word” window is displayed allowing you to enter a new word. Enter the new word in the “Word” input and click the “Add” button. The new word is added to the list and the window remains visible, allowing you to add additional words. You can enter single or compound words for matching, e.g. “xyz” or “xyz abc def”. If entering a compound word, only the full word will be matched and not individual parts of the word.

Click the close button in the “New Profanity Word” window to close the window.

CCAPP200 May 2019 16

To remove a word from the list, click the “Delete” button in the row next to the word to delete.

Managing the Word Substitutions
The Word Substitutions are displayed by clicking the “Word Substitution” item from the main menu.

Figure 19 - word substitution page

Word substitutions allow the user to replace a word, or sequence of words, with a predefined replacement. E.g. “miles per gallon” to “MPG”. Words added to the Word Substitution list will be displayed in the table.

Click the Download Words List button to download a text file containing the Word Substitutions. The “Word” and “Substitute” are separated by a comma.

Adding word substitutions
Click the Add Substitute button to add new substitution words.

Figure 20 - word substitutions

CCAPP200 May 2019 17

When the Add Substitute button is pressed, the “New Substitute Word” window is displayed allowing you to enter a new word and its replacement. Enter the word to match in the “Word” input field and the substitute word in the “Substitution” field. Click the “Add” button. The new word is added to the list and the window remains visible, allowing you to add additional words.

You can enter single or compound words for matching, e.g. “xyz” or “xyz abc def”. If entering a compound word, only the full word will be matched and not individual parts of the word. The substitute word can also be a single or compound word.

Click the close button in the “New Substitute Word” window to close the window.

To delete a word substitution, click the “Delete” button in the row next to the word to delete.

Figure 21 - word substitution table

Predefined word substitutions
The appliance may contain predefined word lists. If predefined word lists are available, they will be displayed in the “Import preset word list” dropdown list. To import a predefined word list, select the word list from the available presets and click the Import button. Predefined word substitutions will then be added to the list.

CCAPP200 May 2019 18

Export
The export page is used to create export sessions. An export session contains transcripts and matching audio for the specified date and time range. The export function may not be enabled; if so, an updated license key is required to enable this functionality.

Figure 22 - web UI, Export (Edit Sessions)

The appliance will record all audio and transcripts generated for 30 days. In order to view, edit and export the data, an “Edit Session” must first be created.

Creating a new Edit Session
To create a new “Edit Session”, click the “Add” button at the bottom of the page.

A dialog is displayed requesting a date and time range for the data of interest.

Figure 23 - Create Edit Session

CCAPP200 May 2019 19

Enter a name for this Edit Session in the “Name” field then define the time period to retrieve transcripts for. Define a range by selecting or entering a start date and time in the “From Date” field and the end date and time in the “To Date” field.

Click the “Create” button to create the new Edit Session. The system will retrieve all transcripts and audio within the period defined. If there is no transcript text for the period, an Edit Session cannot be created.

Editing transcripts within an Edit Session
Select the edit session from the list and click on the “Edit” button. This will display a table containing the transcript segments.

Figure 24 - web UI, Edit Session Editing

Each row of the table contains a segment which includes:

- The start and end time of the segment - The transcript created during this time period

To edit a segment, click on the ‘edit’ icon ( ) within the row. To remove a segment, click on the ‘delete’ icon ( ).

When the ‘edit’ icon is clicked, the “Edit Segment Text” dialog will be displayed.

CCAPP200 May 2019 20

Figure 25 - web UI Edit Session, Edit Segment Text dialog

The “Edit Segment Text” dialog displays the full text transcript of the segment and plays the audio associated with the segment. The audio playback can be paused and resumed by clicking on the player control or using the keyboard shortcuts listed. Only the audio for this segment will be played and there may be some overlap of the audio from the prior and subsequent segments.

Use the “Transcript” box to make any changes to the transcript text. Click the “Save” button to save the modified transcript text or click “Cancel” to discard any changes.

Exporting Edit Session transcripts
Select an edit session from the “Edit Sessions” list and click on the “Export SRT” button. This will export all the transcript segments within the edit session. The times stored in the SRT are relative to the start time of the edit session.
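For reference, SRT timestamps use the form HH:MM:SS,mmm, so an offset of 83.5 seconds into the edit session appears as 00:01:23,500. A short sketch of that conversion (generic SRT arithmetic, not the appliance's export code):

```python
# Generic SRT timestamp arithmetic (not the appliance's export code): convert an
# offset in seconds, relative to the start of the edit session, to HH:MM:SS,mmm.
def srt_timestamp(offset_seconds):
    total_ms = round(offset_seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"

print(srt_timestamp(83.5))  # 00:01:23,500
```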

Exporting Edit Session audio
Select an edit session from the “Edit Sessions” list and click on the “Export Audio” button. This will create and download a single MP3 file that encompasses all audio recorded within the edit session’s Date/Time range.

CCAPP200 May 2019 21