A Discriminative Approach to Pronunciation Variation Modeling in Speech Recognition
Norwegian University of Science and Technology, Faculty of Information Technology, Mathematics and Electrical Engineering, Department of Electronics and Telecommunications.
2013 (English) Doctoral thesis, monograph (Other academic)
Abstract [en]

Put in the most general terms, this dissertation addresses the problem of automatic recognition of non-native proper names. Proper names in themselves tend to pose a severe challenge to speech recognition engines, as these names can typically be pronounced in a variety of ways, and do not necessarily follow generally governing pronunciation conventions. Non-native proper names add still further levels of complication, caused by such variables as the speaker’s familiarity with the foreign name, proficiency in the foreign language, and tendency to adapt pronunciation of the name to the native language or, conversely, to adopt foreign speech characteristics in order to pronounce the name as faithfully as possible. When confronted with non-native proper names, it is therefore particularly important for an automatic speech recognition system to be able to handle a considerable amount of pronunciation variety. Traditionally, the more or less self-evident approach to coping with this variety has been simply to add pronunciation variants to the recognition lexicon. However, introducing such variants typically entails the risk of increasing confusability between different lexicon entries, as new variants of previously more distinct units are likely to augment phonetic similarities within the lexicon. It would seem crucial for recognition success, then, to optimize the balance between lexical coverage and confusability. In this work, we strive to attain such a balance by submitting pronunciation variants to selection procedures rather than adding variants to the recognition lexicon indiscriminately. The selective addition of pronunciation variants to a recognition lexicon has a clear intuitive appeal. It is the objective of this dissertation to confirm that intuition experimentally by measuring the improvements in recognition accuracy yielded by various selection methods.
In particular, we propose a new pronunciation variant selection criterion that is directly related to the effective recognition error rate. To estimate the number of errors corrected by a particular variant, scores based on the Minimum Classification Error framework are calculated before and after the addition of the variant to the lexicon. Using this criterion, three different variant selection procedures are proposed in this work: a single-pass approach, an iterative approach and a tree-search approach. These selection methods aim to optimize the recognition lexicon in terms of size and recognition performance by adding to the lexicon only those pronunciation variants that effect an actual decrease in the error rate. We contrast these selection methods with more traditional approaches to populating the recognition lexicon, such as using all available variants indiscriminately, and selecting variants on the basis of the probabilities obtained during the generation of possible new pronunciation variants. Our experiments show that we can significantly reduce the error rate and the required number of variants per name by applying our proposed selection approaches.
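
The iterative selection idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the precomputed score table, the sigmoid-smoothed error count with slope `gamma`, the `min_gain` stopping threshold, and all function and variable names are assumptions made for illustration. A variant is added only if it lowers the smoothed error count measured before and after its addition.

```python
import math

def mce_loss(lexicon, utterances, scores, gamma=1.0):
    """Smoothed error count: sum of sigmoid(gamma * d) over utterances,
    where d = best rival name score - correct name score (assumed form)."""
    total = 0.0
    for utt_id, true_name in utterances:
        # A name's score is the score of its best-matching variant.
        name_scores = {
            name: max(scores[utt_id][v] for v in variants)
            for name, variants in lexicon.items()
        }
        correct = name_scores[true_name]
        rival = max(s for n, s in name_scores.items() if n != true_name)
        d = rival - correct  # d > 0 means the utterance is misrecognized
        total += 1.0 / (1.0 + math.exp(-gamma * d))
    return total

def select_iteratively(base_lexicon, candidates, utterances, scores, min_gain=1e-3):
    """Greedily add the candidate (name, variant) pair with the largest loss
    reduction; stop when no candidate reduces the loss by more than min_gain."""
    lexicon = {name: set(variants) for name, variants in base_lexicon.items()}
    remaining = set(candidates)
    selected = []
    while remaining:
        current = mce_loss(lexicon, utterances, scores)
        best, best_gain = None, min_gain
        for name, variant in sorted(remaining):
            trial = {n: set(v) for n, v in lexicon.items()}
            trial[name].add(variant)
            gain = current - mce_loss(trial, utterances, scores)
            if gain > best_gain:
                best, best_gain = (name, variant), gain
        if best is None:
            break
        lexicon[best[0]].add(best[1])
        remaining.remove(best)
        selected.append(best)
    return lexicon, selected

# Hypothetical toy data: log-likelihood-like scores per utterance and variant.
base_lexicon = {"A": {"a0"}, "B": {"b0"}}
candidates = [("A", "a1"), ("B", "b1")]
utterances = [("u1", "A"), ("u2", "B")]
scores = {
    "u1": {"a0": -10.0, "a1": -5.0, "b0": -8.0, "b1": -9.0},
    "u2": {"a0": -9.0, "a1": -8.0, "b0": -4.0, "b1": -4.5},
}

lexicon, selected = select_iteratively(base_lexicon, candidates, utterances, scores)
print(selected)
```

In this toy data, the variant `a1` corrects the misrecognized utterance `u1`, while `b1` changes no score that matters and is therefore left out, which mirrors the goal of keeping the lexicon small while reducing errors. A single-pass procedure would evaluate each candidate once against the base lexicon instead of re-evaluating after every addition.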

Place, publisher, year, edition, pages
NTNU, 2013.
Doctoral theses at NTNU, ISSN 1503-8181 ; 2013:15
National Category
Electronics Telecommunication
URN: urn:nbn:no:ntnu:diva-20191
ISBN: 978-82-471-4120-5 (electronic version)
ISBN: 978-82-471-4119-9 (printed version)
OAI: diva2:605767
Public defence
2013-01-14, 00:00
Available from: 2013-02-15 Created: 2013-02-15 Bibliographically approved

Open Access in DiVA

fulltext (2824 kB), 531 downloads
File information
File name: FULLTEXT01.pdf
File size: 2824 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf

Total: 531 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 186 hits