How To Add Closed Captioning To A Video – And What I Learned

Background: 2018, Glasgow

Back in September 2018 I attended a WordPress Contributor Day in Glasgow.

Contributor Days are events where WordPress users are invited to contribute to the project.

A common misconception is that giving something to WordPress means writing code. But there are many other ways to make your mark in the WordPress project.

For example, you could get involved by:

  • Translating WordPress into other languages.
  • Documenting the WordPress software.
  • Helping to market the WordPress CMS.

As I am interested in WordPress accessibility, I chose to contribute to that team. Joining me was Ahmed Khalifa, one of the WordPress Edinburgh meetup team and lead organiser of WordCamp Edinburgh 2018.

Being hard of hearing, Ahmed was interested in checking what proportion of videos produced by the WordPress project were captioned.

WordPress.tv and subtitles

Ahmed and I had a look at WordPress.tv where videos from WordPress events are hosted.

The first thing we discovered was that it wasn’t easy to find subtitled videos.

We had to go to the WordCamp TV category and then scroll down quite far and visit the Browse subtitled videos link, below all the languages.

The link to browse subtitled videos

We were disappointed to discover that only a tiny proportion of videos had closed captions.

Two of the captioned videos; there are 6 pages of them in all

(Open captions, which are more commonly used on social media videos, are burned into the video. Closed captions can be toggled on or off. Find out more about open captions vs closed captions.)

Checking now (December 2019) I count 57 subtitled videos on WordPress.tv. Here’s one example:

A still from a captioned video with Rian Rietveld

I don’t know how many videos have been uploaded to WordPress.tv in all. In 2018 there were something like 1,848 uploaded.

Videos have been added to the site for 10 years. So if I assume a similar number each year and multiply my 1,848 by 10, I get roughly 18,480.

57 out of 18,480 is about 0.3% of the total. An extremely low percentage, sadly!
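For anyone who wants to check my back-of-the-envelope sum, here it is as a few lines of Python. The numbers are the ones quoted above; treating every year as having about as many uploads as 2018 is my own rough assumption, so the result is only a ballpark figure.

```python
# Rough estimate only: assumes each year had about as many uploads as 2018.
captioned_videos = 57      # subtitled videos counted in December 2019
uploads_in_2018 = 1_848    # approximate uploads for a single year
years_online = 10

estimated_total = uploads_in_2018 * years_online
captioned_share = captioned_videos / estimated_total * 100

print(f"Estimated total videos: {estimated_total:,}")
print(f"Captioned share: {captioned_share:.1f}%")
```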

Ahmed picked a keynote talk to add captions to and got started on the job.

Later Ahmed wrote about his time at Contributor Day, his attempt at video captioning and the aftermath. He was somewhat dismayed by the whole experience:

Being Shot Down Because of Accessibility Needs is Not Something Anyone Should Expect

Writing about the captioning procedure, he said,

Picture the scene: the video is 48 minutes long so the process of:

  1. listening (which is hard enough for me as it is)
  2. pausing
  3. typing/correcting transcripts
  4. playing the video
  5. and repeat the process multiple times…

…is not exactly a quick job. And that’s without double-checking what you have written and re-listening the video.

Nevertheless, I also wanted to try and redress the balance a tiny bit by having a go at captioning a video myself.

How to subtitle a WordPress.tv video using Amara

On every WordPress.tv video there’s a Subtitle this video link, which takes you to a page with a video on how to get started.

WordPress.tv recommends using the free Amara editor to subtitle a video. The steps are:

  1. Copy the video URL
  2. Launch the video in Amara
  3. Download the translation file
  4. Upload the caption file

Sounds easy, right?

However, there are actually some further steps between 2 and 3 in Amara, i.e.:

  1. (Watch the video and) type what you hear
  2. Sync the timing
  3. Review your work

How to add closed captioning to a video – what I learned

I chose to transcribe a talk by Graham Armfield from WordCamp Manchester 2017.

The topic was Designing for Accessibility – quite appropriate!

I started off the captioning not long after the Contributor Day, then it went on to the back burner until I picked it up a few weeks ago.

Transcribing video takes a long time

According to this blog post on transcription time it can take between 4 and 9 hours to transcribe an hour of audio.

The time depends on factors such as the quality of audio, number of speakers involved and how technical the subject is.

Also, bear in mind that these figures are for experienced transcribers. Although the video I chose was shorter than an hour (just under 36 minutes) it took me much, much longer to type up all the speech. I had to replay sections of the video quite a few times to get all the words.
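To put those figures in context for my video, here is a small sketch that scales the 4 to 9 hour range to a 36-minute talk. The function name is my own invention, and the result only describes an experienced transcriber, not a beginner like me.

```python
def transcription_hours(video_minutes, hours_per_audio_hour):
    """Scale an 'hours of work per hour of audio' rate to a video's length."""
    return video_minutes / 60 * hours_per_audio_hour


video_minutes = 36  # the talk was just under 36 minutes
low = transcription_hours(video_minutes, 4)   # quickest quoted rate
high = transcription_hours(video_minutes, 9)  # slowest quoted rate

print(f"Experienced transcriber: roughly {low:.1f} to {high:.1f} hours")
```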

It obviously helps if you are a fast and accurate typist.

By the time I had finished my subtitles I was on my 75th revision!

Amara subtitles revision 75

Don’t make each caption too long

The next issue I encountered was that sentences in a talk can be quite long and run on. But there is a captioning rule that each line of text shouldn’t be more than 42 characters. You also shouldn’t have more than two lines of text visible on screen at one time.

I had to make decisions about where to break sentences so that the sense wouldn’t be lost. Unlike with a written transcript, a viewer can’t easily scan back in a video.
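A first pass at splitting text can be automated. The sketch below uses Python's standard textwrap module to wrap a sentence into short lines and group them two at a time. The function name is hypothetical, and the limit of 42 characters in the example is an assumption you should swap for whatever your own style guide specifies.

```python
import textwrap


def split_into_captions(text, max_chars, max_lines=2):
    """Wrap text into lines of at most max_chars, then group them into captions."""
    lines = textwrap.wrap(text, width=max_chars)
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


speech = ("I had to make decisions about where to break sentences "
          "so that the sense wouldn't be lost.")

# 42 is an assumed per-line limit; pass in the value your guidelines use.
captions = split_into_captions(speech, max_chars=42)
for caption in captions:
    print("\n".join(caption))
    print()
```

A mechanical wrap like this knows nothing about meaning, so it only gives a starting point. Deciding where a sentence can break without losing the sense still needs a human eye.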

Don’t make a caption too short, either

I was tempted once or twice to have single words as captions, and got this warning from Amara:

Briefly displayed subtitles are hard to read; the duration should be more than 700 milliseconds.
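If you have the caption timings to hand, a check like the one Amara performs is easy to sketch. This is my own illustration, not Amara's code: the cue tuples, the sample text and the function name are all made up, and only the 700 millisecond threshold comes from the warning above.

```python
MIN_DURATION_MS = 700  # threshold quoted in the Amara warning


def is_too_brief(start_ms, end_ms, min_duration_ms=MIN_DURATION_MS):
    """Return True unless the caption stays on screen for more than the minimum."""
    return (end_ms - start_ms) <= min_duration_ms


# Hypothetical cues as (start, end, text), with times in milliseconds.
cues = [
    (1000, 1400, "Right."),
    (1400, 3900, "Let's talk about colour contrast."),
]

flagged = [text for start, end, text in cues if is_too_brief(start, end)]
print(flagged)
```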

You need to make choices about capturing speech verbatim

I found there were quite a few things to think about regarding how I presented the captions.

Spelling – did I use British English or American English? Because the speaker was British, I went with the former.

Ums, ahs, and ers – these filler words are known as disfluencies in speech. I chose to cut them out as they didn’t add anything to the meaning.

Quotations – when a person was quoted in the talk, I used single quote marks around their speech.

Other speakers – the majority of the video was Graham presenting, but also speaking on the video were the MC for the track and an audience member asking a question. I used a couple of methods to indicate a different speaker:

– Preceding the text with a dash

(Another speaker) Using brackets before the text

Repeated text – if a few words were repeated, I only included one instance.

The BBC’s subtitle guidelines give some good advice.

Plus Amara has its own guidelines in a pop-up.

Amara style guidelines

Syncing captions to speech is an art

Once I’d finished my captions, I needed to synchronize them with the speech. This turned out to be more complicated than I thought!

It was quite tricky to get the timings right, so the captions matched up with the beginning and end of each instance of speech.

It would have been quite jarring for a viewer to see the text popping up too early or too late, so I spent a while getting it right. For sections of video with long pauses between sentences, I had to replay to get the timing correct.

Synchronization proved particularly challenging with fast talking. Sometimes there were just too many words to fit into a small space and the reading rate went over the magic 21 characters/second.

Amara warning that subtitle is too long to read

In this case I had a few choices to solve the problem:

  • Remove a word or two so that the caption fitted with the timing. This seemed heretical to me at first! But losing words such as “like”, “just”, “right” or “actually” usually wasn’t detrimental to the overall message.
  • Change the words to fit, but preserve the meaning.
  • Start the next caption a fraction later. This was less than ideal, but I did choose to do this on a few occasions when removing or changing words wasn’t possible.
I removed the word “really” from the end of a sentence
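The reading rate behind that warning is simply the number of characters divided by the time on screen. Here is a minimal sketch, assuming every character (spaces included) counts towards the total; I don't know exactly how Amara counts them. The sentence and timings are invented for illustration.

```python
MAX_CHARS_PER_SECOND = 21  # reading-rate limit mentioned above


def reading_rate(text, start_ms, end_ms):
    """Characters per second for a caption shown from start_ms to end_ms."""
    duration_s = (end_ms - start_ms) / 1000
    if duration_s <= 0:
        raise ValueError("caption must end after it starts")
    return len(text) / duration_s


# Invented example: dropping a trailing filler word from a 1.8 second caption.
before = "That doesn't make a lot of sense, really"
after = "That doesn't make a lot of sense"

for text in (before, after):
    rate = reading_rate(text, 0, 1800)
    verdict = "too fast" if rate > MAX_CHARS_PER_SECOND else "ok"
    print(f"{rate:.1f} chars/second ({verdict}): {text}")
```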

Captions can be in any number of languages

I captioned my video in English, its native language. However, the more languages a video is subtitled in, the more potential reach it has.

Popular talks on Ted.com, like the one below, are often subtitled in multiple languages.

Try playing the video and activate the speech button with 3 dots to see a selection of languages.

There are several different caption file formats

Amara lets you export captions in the following formats:

  • SBV
  • DFXP
  • VTT
  • SRT (the most common format)
  • TXT
  • SSA
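To give a feel for what one of these files contains, here is a sketch that builds a single cue in the SRT layout: a running number, a time range written as HH:MM:SS,mmm with an arrow between start and end, then the caption text. The helper names and the sample caption are my own, not anything Amara provides.

```python
def srt_timestamp(ms):
    """Format milliseconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{seconds:02},{millis:03}"


def srt_cue(index, start_ms, end_ms, text):
    """Build one numbered SRT cue: index, time range, then the caption text."""
    time_range = f"{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}"
    return f"{index}\n{time_range}\n{text}\n"


# Made-up caption shown from 1.0 to 3.5 seconds into the video.
print(srt_cue(1, 1000, 3500, "Good morning, everyone."))
```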

The slight awkwardness is that WordPress.tv uses the TTML format, which Amara does not generate. But you can get around that by downloading in DFXP format and then changing the file extension to TTML. Then you can upload your TTML caption file to WordPress.tv.
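Renaming the file by hand is quick enough, but if you have several caption files the step can be scripted. This is a minimal sketch using Python's pathlib. The function and file names are hypothetical, and the demo creates a throwaway file in a temporary folder rather than touching a real download.

```python
import tempfile
from pathlib import Path


def dfxp_to_ttml(path):
    """Rename a .dfxp file to .ttml; the file contents are left untouched."""
    source = Path(path)
    if source.suffix.lower() != ".dfxp":
        raise ValueError(f"expected a .dfxp file, got {source.name}")
    return source.rename(source.with_suffix(".ttml"))


# Demo with a stand-in file; point dfxp_to_ttml at your own download instead.
with tempfile.TemporaryDirectory() as folder:
    download = Path(folder) / "captions.en.dfxp"  # hypothetical file name
    download.write_text("<tt></tt>", encoding="utf-8")
    renamed = dfxp_to_ttml(download)
    print(renamed.name)
```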

Learn more about caption file formats and which platforms they are used on.

I have so much respect for professional captioners and transcribers!

During the process I got advice from Orla Pearson of MyClearText who specialises in this area.

A big thanks to her for answering all my questions!

For example:

Q: Is it ok to use “tosser” in subtitles?

A: It’s fine to use swear words if they are said. In fact, edit to get them in! Never censor!

Where’s the subtitled video?

This is the video. I’ve submitted the subtitles, but at this time they are in moderation. I hope they will be published soon!

When the subtitles pass moderation, there will be a CC button on the video controls. You can then click or tap on that and select the language to display them.

Making captioning easier

Obviously, I could have saved a lot of time by having the captions professionally done. But I wouldn’t have learned as much about how to add closed captioning to a video if I hadn’t tried doing it myself.

John Espirian has a helpful step-by-step guide on captioning that includes how to use a professional captioning service.

What about auto-captioning?

There are 2,844 videos on the WordPress TV YouTube channel. Since those ones are auto-captioned by YouTube, does that solve the problem?

Not really. While auto-captioning is a start, it isn’t as accurate as a manual version, and can sometimes lead to nonsensical or unintentionally amusing results.

Graham is saying “talking colloquially, colloquially perhaps.” The subtitles say “talking locally a prick locally”

I don’t know if it’s possible to download YouTube’s auto-captions for a video you haven’t uploaded and import them into Amara to edit. If it is, that would certainly make the process faster.

Final thoughts

Who uses captions?

The argument that only a small proportion of people need captions isn’t a valid one. Ofcom found that 80% of people who use captions are not deaf or hard of hearing. 80% sure isn’t a small number!

The same study reported that 55% of viewers were disappointed when captions were not available.

There’s also an oft-quoted statistic that says that 85% of Facebook video is watched without sound. Most people – including myself – loathe autoplaying videos with sound blasted at them. But if a video has captions, folks can still get the meaning from it if they happen to be in a noisy environment or don’t have headphones handy.

If you still need more convincing why captions and transcripts are beneficial, Ahmed has a brilliant post and video on this subject:

Top 15 Benefits of Subtitles/Captions & Transcriptions (They’re Not Just for Deaf People Y’know)

Who provides captions – and who should?

I’ve been to a few instances of WordCamp London and they have had live captioning on site.

It’s not just appreciated by people who are deaf or hard of hearing:

There are powerful arguments for including captioning in the budget of events like WordCamps.

Adrian Roselli reasons well for captioning conference talks:

You might be surprised how many attendees are willing to give up the free conference t-shirt or other swag in lieu of supporting captions if you give them the option.

He adds:

Do not make the mistake of trying to do the captions yourself. I assure you it will take you far longer than you expect. Even if you let YouTube take a pass with auto-captions, and then circle back to clean them up, it will take you a lot of time.

There was a call for volunteers to caption more WordPress.tv videos in November 2018, but it doesn’t seem to have been answered.

I feel that we can do better. We should aim to caption every WordCamp or WordPress event talk.

Rather than relying on unpaid volunteers, it does seem like the time to hand over the videos and let the caption professionals do their job. It just needs the money!

What do you think? Do you make videos and caption them? Have you ever gone the DIY route? Let me know in the comments.
