
Plugging a neural phoneme recognizer into a simple language model: a workflow for low-resource settings

Abstract: Recently, several works have shown that fine-tuning a multilingual speech representation model (typically XLS-R) with very small amounts of annotated data makes it possible to develop phonemic transcription systems of sufficient quality to help field linguists in their efforts to document the languages of the world. In this work, we explain how the quality of these systems can be improved by a very simple method, namely integrating them with a language model. Our experiments on an endangered language, Japhug (Trans-Himalayan/Tibeto-Burman), show that this approach can significantly reduce the word error rate (WER), reaching the stage of automatic recognition of entire words.
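The workflow sketched in the abstract — decoding the output of a fine-tuned XLS-R phoneme recognizer together with a simple language model — could, in outline, look like the sketch below. This is not the authors' code: the checkpoint name, audio file, and n-gram model path are placeholders, and it assumes a Hugging Face Wav2Vec2/XLS-R CTC model whose frame-level scores are combined with a KenLM n-gram model during beam search via pyctcdecode.

```python
import librosa
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
from pyctcdecode import build_ctcdecoder

MODEL_ID = "path/to/xlsr-phoneme-recognizer"  # placeholder: fine-tuned XLS-R CTC checkpoint
LM_PATH = "phoneme_lm.arpa"                   # placeholder: KenLM n-gram model over phoneme sequences

processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID).eval()

# pyctcdecode expects the CTC vocabulary as a list ordered by token index.
# Special tokens (<pad>, <unk>, ...) may need remapping depending on the tokenizer.
vocab = [tok for tok, _ in sorted(processor.tokenizer.get_vocab().items(),
                                  key=lambda kv: kv[1])]
decoder = build_ctcdecoder(vocab, kenlm_model_path=LM_PATH)

# Load a 16 kHz mono recording and compute frame-level log-probabilities.
speech, _ = librosa.load("recording.wav", sr=16_000)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits[0]
log_probs = torch.log_softmax(logits, dim=-1).numpy()

# Beam search combining the acoustic CTC scores with the n-gram language model.
print(decoder.decode(log_probs))
```

In this setup the language model simply reweights beam-search hypotheses toward phoneme (or word) sequences that are plausible in the target language, which is what allows the decoder to move from isolated phonemes toward entire words.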
Document type: Conference papers

https://halshs.archives-ouvertes.fr/halshs-03625581
Contributor: Alexis Michaud
Submitted on: Sunday, July 3, 2022 - 8:55:36 AM
Last modification on: Thursday, September 1, 2022 - 10:57:02 PM

File: Interspeech2022_ASR_Endangered... (produced by the author(s))

Licence

Distributed under a Creative Commons Attribution - NonCommercial - ShareAlike 4.0 International License

Citation

Séverine Guillaume, Guillaume Wisniewski, Benjamin Galliot, Minh-Châu Nguyễn, Maxime Fily, et al. Plugging a neural phoneme recognizer into a simple language model: a workflow for low-resource settings. Interspeech 2022 - 23rd Annual Conference of the International Speech Communication Association, Sep 2022, Incheon, South Korea. ⟨10.5281/zenodo.5521111⟩. ⟨halshs-03625581v2⟩
