Mozilla releases dataset and model to lower voice-recognition barriers
Mozilla has released its Common Voice collection, which contains almost 400,000 recordings from 20,000 people and is claimed to be the second-largest publicly available voice dataset. The samples were gathered through Mozilla's Common Voice project, which let users donate recordings of their speech through an iOS app or a website. The hope is that a large public dataset will make better voice-enabled applications possible.
Alongside the dataset, Mozilla released Project DeepSpeech, an open-source voice-recognition model based on work done by Chinese internet giant Baidu. With a 6.5 percent word error rate on the LibriSpeech dataset, DeepSpeech is claimed to be approaching human levels of recognition.
In August, Microsoft said it had reached a voice-recognition error rate of 5.1 percent on the Switchboard corpus, the same level as professional human transcribers. Earlier in the year, Google said it had a 4.9 percent error rate in its speech-recognition software.
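For readers unfamiliar with the metric, figures like these are conventionally reported as word error rate: the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below is only an illustration of that definition, not the scoring code used by Mozilla, Microsoft or Google; the function name `word_error_rate` and the sample sentences are made up for this example, and it assumes simple whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; assumes whitespace tokenization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # reference word was dropped (deletion)
                curr[j - 1] + 1,                  # extra hypothesis word (insertion)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong word out of five gives a 20 percent word error rate.
    print(word_error_rate("open the pod bay doors", "open the pod day doors"))  # 0.2
```

On this definition, a 5.1 percent rate means roughly one word in twenty is wrong relative to the reference. Because the rate depends on the test corpus, figures measured on different datasets, such as LibriSpeech and Switchboard, are not directly comparable.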