The number of YouTube videos with automatic captions now exceeds 1bn – and viewers watch videos with automatic captions more than 15m times per day, the platform said. Google first launched video captions in 2006 and added automated captions on YouTube in 2009. “This was a big leap forward to help us keep up with YouTube’s growing scale,” Product Manager Liat Kaver wrote in a blog post.
In addition, Kaver noted YouTube has made great strides in both the number of videos with captions and the accuracy of those captions. This benefits not only viewers watching videos without sound, but also those who are deaf or hard of hearing.
“One of the ways that we were able to scale the availability of captions was by combining Google's automatic speech recognition (ASR) technology with the YouTube caption system to offer automatic captions for videos,” Kaver wrote. “There were limitations with the technology that underscored the need to improve the captions themselves. Results were sometimes less than perfect, prompting some creators to have a little fun at our expense!”
As a result, a big goal has been improving the accuracy of automatic captions, which Kaver said is not easy for a platform of YouTube’s size – especially considering the diversity of content. “Key to the success of this endeavor was improving our speech recognition, machine learning algorithms and expanding our training data,” Kaver wrote. “All together, those technological efforts have resulted in a 50[%] leap in accuracy for automatic captions in English, which is getting us closer and closer to human transcription error rates.”
Improving caption accuracy remains an important goal, and YouTube wants to extend this work to all ten of its supported languages.
“But we can’t do it alone. We count on the amazing YouTube community of creators and viewers everywhere,” Kaver added. “Ideally, every video would have an automatic caption track generated by our system and then reviewed and edited by the creator. With the improvements we’ve made to the automated speech recognition, this is now easier than ever.”