Transcription and Captioning

While captioning and transcription were originally designed for individuals who are deaf or hard of hearing, many people benefit from seeing the spoken word represented as text, including those who:

are listening to someone speaking in a language that is not their native language
are in situations where the sound quality is poor or there is a lot of background noise
benefit from seeing as well as hearing what is said
may benefit from having a lasting transcript.

Automatic Transcription and Captioning

Transcription is the process of converting the spoken word to text. Transcription can be done by computer-based speech recognition technology, usually referred to as automatic transcription. Similarly, when captions for a video are computer generated, they are referred to as automatic captions. Because automatic transcription and captioning use speech recognition technology to produce a transcript, accuracy can vary significantly depending on several factors including sound quality, background noise, technical terms used and the speaker’s accent.

Live Transcription

Alternatively, live transcription can be done by a professionally trained transcriber, a service often referred to as CART (Communication Access Realtime Translation). If a participant requests CART services as an accommodation for a live event, automatic transcription should not be substituted, since its accuracy cannot be guaranteed. If you are producing videos, it is important to check the captions for accuracy and edit them as needed, or to have the video professionally captioned.

To view a transcript for your personal use during a live or virtual event, you can use Otter.ai or Google Live Transcribe. These tools work best if the speaker has a microphone or if you are close to the speaker. Both tools are available on mobile devices.

Automatic transcription using Otter.ai
Otter.ai is a web-based tool that automatically transcribes speech, whether participants are meeting in person or virtually. Similar to Zoom Live Transcription, Otter.ai can provide a transcript in real time during virtual meetings or classes. All transcripts and meeting audio recordings are saved in Otter.ai, where they can be edited for accuracy and exported in multiple formats.

Contact ITSHelp@�Ĳʿ��.edu to request an Otter.ai account.

Google Live Transcribe
Live Transcribe converts spoken conversation into text and displays the transcript on an Android mobile device. Live Transcribe is free for Android devices from the Google Play Store.

iOS Live Captions
With iOS 16, Live Captions turns the spoken word into text and displays it in real time on your iPhone. Live Captions also works with FaceTime, podcasts and other apps, as well as with live conversations around you.

As a meeting host, if you would like to offer live transcription to your virtual or in-person audience, the following options are available:

Automatic Transcription in Zoom
Zoom Live Transcription provides a text transcript in real time within the Zoom meeting interface. The transcript can appear at the bottom of the Zoom window, like captions, or in a separate sidebar window. Each participant can choose whether or not to view the transcript.

See instructions for .

CART services
CART is a transcription service that provides a human-generated transcript during an in-person or virtual event. At �Ĳʿ��, CART services are provided remotely by third-party providers. Use the following links for more information about CART and ASL (American Sign Language) services at �Ĳʿ��.

Google Chrome Live Captions
If you encounter an online video that isn't captioned or a podcast without a transcript, Chrome Live Captions will provide a running transcript. Whenever anything with audio is playing in Chrome, you can toggle on Live Captions, which opens a small movable window displaying a transcript of the audio that is playing. See as well as other .

Automatic captions in Panopto
All audio and video content created in or uploaded to Panopto will have automatic captions added. Anyone viewing the video has the option to turn the captions on or off. As with all computer-generated speech recognition, accuracy will vary with the conditions at the time of recording. Captions can be edited for improved accuracy in the Panopto video editor. See .

Professional video captioning and transcription services
If highly accurate video captions or audio transcripts are needed, there are many online services available that can provide 99% accuracy. �Ĳʿ�� contracts with to provide professional captioning services. Contact ITSHelp@�Ĳʿ��.edu for more information.
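
When comparing captioning options, it can help to know how an accuracy percentage might be calculated. The sketch below assumes that accuracy means word accuracy, that is, 1 minus the word error rate (the number of word substitutions, insertions and deletions needed to turn the caption text into a trusted reference transcript, divided by the number of words in the reference). This is a common convention, but individual providers may measure accuracy differently, and the function name `word_error_rate` is only illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the reference length.

    Assumption: words are compared case-insensitively after splitting on
    whitespace; punctuation is not normalized in this sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from captions
            insertion = curr[j - 1] + 1  # extra word in captions
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong word out of five: "caption" instead of "captions".
    wer = word_error_rate(
        "please open the captions panel",
        "please open the caption panel",
    )
    print(f"Word accuracy: {1 - wer:.0%}")  # prints "Word accuracy: 80%"
```

By this measure, 99% accuracy would correspond to roughly one word error per 100 words of the reference transcript.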

Audio description for video
Audio description can be added to a video to make it accessible to people who are blind or have low vision, or who otherwise benefit from hearing a description of what is taking place in the video. Audio description is narration, added during pauses in the soundtrack, that describes the most important visual details that cannot be understood from the main soundtrack alone. It provides information about actions, characters, scene changes, on-screen text and other visual content. See an example of audio descriptions in this .

Audio descriptions can be added to videos in Panopto as time-stamped, text-based descriptions. Once they have been added, if a viewer has audio description turned on, playback is paused at the appropriate points in the video and the descriptions are read aloud using computer-generated speech. See .
