Synthesizing expressive speech from amateur audiobook recordings
2012 (English). In: Spoken Language Technology Workshop (SLT), 2012, 297-302 p. Conference paper (Refereed)
Abstract [en]

Freely available audiobooks are a rich resource of expressive speech recordings that can be used for the purposes of speech synthesis. Natural sounding, expressive synthetic voices have previously been built from audiobooks that contained large amounts of highly expressive speech recorded from a professionally trained speaker. The majority of freely available audiobooks, however, are read by amateur speakers, are shorter and contain less expressive (less emphatic, less emotional, etc.) speech both in terms of quality and quantity. Synthesizing expressive speech from a typical online audiobook therefore poses many challenges. In this work we address these challenges by applying a method consisting of minimally supervised techniques to align the text with the recorded speech, select groups of expressive speech segments and build expressive voices for hidden Markov-model based synthesis using speaker adaptation. Subjective listening tests have shown that the expressive synthetic speech generated with this method is often able to produce utterances suited to an emotional message. We used a restricted amount of speech data in our experiment, in order to show that the method is generally applicable to most typical audiobooks widely available online.

Place, publisher, year, edition, pages
2012. 297-302 p.
National Category
Engineering and Technology
URN: urn:nbn:se:kth:diva-185533
OAI: diva2:922169
Spoken Language Technology Workshop (SLT), 2-5 December.

QC 20160422

Available from: 2016-04-22 Created: 2016-04-21 Last updated: 2016-04-22 Bibliographically approved

Open Access in DiVA

fulltext (744 kB), 12 downloads
File information
File name FULLTEXT01.pdf
File size 744 kB
Checksum SHA-512
Type fulltext
Mimetype application/pdf

Other links

Spoken Language Technology Workshop (SLT)

By author/editor
Székely, Éva
