iVector Based Language Recognition
This thesis focuses on iVector subspace modeling, a fairly new approach to phonotactic language recognition, i.e. identifying a language from the sounds in a spoken utterance. The goal of the iVector is to compactly represent the discriminative information in an utterance so that further processing of the utterance is less computationally intensive. This might enable the system to be trained with more data, and thereby reach higher performance. We present both the theory behind iVectors and experiments aimed at better fitting the iVector space to our development data. The final system achieved results comparable to our baseline PRLM system on the NIST LRE03 30-second evaluation set.
Place, publisher, year, edition, pages
Institutt for elektronikk og telekommunikasjon, 2012. 83 p.
ntnudaim:8174, MTKOM kommunikasjonsteknologi, Lyd- og bildebehandling
Identifiers
URN: urn:nbn:no:ntnu:diva-19079
Local ID: ntnudaim:8174
OAI: oai:DiVA.org:ntnu-19079
DiVA: diva2:566457
Svendsen, Torbjørn, Professor
Soufifar, Mehdi