Whisper Large-v3 Release
Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
The large-v3 model shows improved performance across a wide variety of languages. The plot below includes all languages where Whisper large-v3 achieves an error rate below 60% on Common Voice 15 and Fleurs, and shows a 10% to 20% reduction in errors compared to large-v2:
![](https://kbin.cafe/media/cache/resolve/entry_thumb/e7/7b/e77bfb8f8ea2e5a2b64990a9e71efb35f37010731ede94193e6c708530390a23.png)
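
To make the "reduction of errors" figure concrete, here is a minimal sketch of how such a comparison could be computed. It assumes the error rate is a word error rate (WER) and reads the 10% to 20% figure as a relative reduction; both are assumptions, not something the post states. The function names `word_error_rate` and `relative_error_reduction` are hypothetical helpers, and the numbers in the usage lines are illustrative rather than measured results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_error_reduction(old_rate: float, new_rate: float) -> float:
    """Fraction by which new_rate improves on old_rate (assumed relative)."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return (old_rate - new_rate) / old_rate


# Illustrative values only, not results from the plot.
wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
reduction = relative_error_reduction(old_rate=0.10, new_rate=0.08)
```

Under this reading, a language whose error rate drops from 10% to 8% would count as a 20% reduction, even though the absolute change is only two percentage points.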