Live Captions

This page deals with live synchronized media, meaning live audio combined with video. Success Criterion 1.2.4 Captions (Live) states:

Captions are provided for all live audio content in synchronized media.

Video Calls

In a previous blog post about this success criterion, I focussed on video calling platforms such as Teams and Zoom. I wrote it right in the middle of the Covid-19 pandemic. Although some video conferencing tools already existed, there had been an explosion in the number of such tools and the features they contained.

However, on reading the Understanding Document for 1.2.4 Captions (Live), I realise that this criterion is not intended to cover two-way calls between two or more participants. In such situations, the responsibility for providing captions lies with the caller or the host (the person who sends the Teams invite, for example) rather than with the application itself.

That said, Teams and Zoom now have pretty good live captioning tools, so I think they are worth a mention.


Automated live captions are rarely completely accurate. They depend on the clarity of speech, background noise and interference, and even the tone and pitch of the speaker. If the speaker has a strong accent, this will usually affect the accuracy of captions.
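
To put a number on "rarely completely accurate", one common approach is to compare the automated captions against a human-checked transcript and calculate the word error rate (WER). The sketch below is only an illustration: the function name and the sample sentences are my own, and WER is not something the success criterion itself asks for.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the reference length.

    `reference` is the human-checked transcript; `hypothesis` is the
    automated caption text. Both names are illustrative assumptions.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j caption words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn the first i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from captions
                curr[j - 1] + 1,     # extra word inserted in captions
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


spoken = "captions are provided for all live audio"
captioned = "captions are provided for live radio"
print(f"WER: {word_error_rate(spoken, captioned):.2f}")
```

In this made-up example the captions drop one word and mishear another, so two of the seven spoken words are wrong. A lower figure means more accurate captions, but the score says nothing about timing or readability.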

Broadcast Media

This success criterion is intended to address broadcast media. That can include things like live television, live webinars and training events, and anything else that is broadcast or streamed live.

The live captions needed here should not rely on automated tools, which are often inaccurate. Best practice is to use Communication Access Realtime Translation (CART). CART captions are produced by a trained captioner, using special equipment that allows them to produce captions quickly. The result is still not perfect, as the captioner is working at the speed of the speaker, but human-produced captions are far more accurate, because a person can interpret speech in ways that automated tools cannot.