Evaluation of a speech recognition system: Pocketsphinx
Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Speech recognition is the process of translating an audio signal into text using a computer program. The technique is widely used today in a large variety of areas. Pocketsphinx, an open source speech recognition system, is one of the more promising systems available today. It is designed to be efficient and to translate speech in real time on low-performance platforms. In this thesis, Pocketsphinx is evaluated with respect to word error rate and translation time using data recorded by two Swedish speakers. As a proof of concept, Pocketsphinx was used to control a robot by voice. The system was also compared with a similar speech recognition system, Google speech, with respect to word error rate and translation time. The measurements suggest that Google speech is considerably more accurate on long, grammatically correct sentences, while Pocketsphinx is a less demanding and faster system. The word error rate and translation time of Pocketsphinx are more affected by noise than those of Google speech. The accuracy and recognition speed of Pocketsphinx improved greatly when the dictionary was stripped down, which makes it better suited to controlling a robot in real time with a limited number of commands.
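The abstract does not state how word error rate was computed. The sketch below assumes the common definition: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of words in the reference. The function name `word_error_rate` and the example sentences are illustrative and are not taken from the thesis.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: case-insensitive comparison and whitespace tokenization;
    the thesis may have normalized its transcripts differently.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical robot command with one word dropped by the recognizer.
    print(word_error_rate("move the robot forward", "move robot forward"))
```

Under this definition the value can exceed 1.0 when the recognizer output contains many inserted words, so it is an error ratio and not a percentage bounded at 100.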
Place, publisher, year, edition, pages
2015. 33 p.
UMNAD, 1023
Identifiers
URN: urn:nbn:se:umu:diva-108293
OAI: oai:DiVA.org:umu-108293
DiVA: diva2:852179
Bachelor of Science Programme in Computing Science