Complex-Valued Independent Component Analysis for Online Blind Speech Extraction
Blekinge Institute of Technology, School of Engineering, Department of Signal Processing
2008 (English)
In: IEEE Transactions on Audio, Speech, and Language Processing, ISSN 1558-7916, Vol. 16, no 8, 1624-1632 p.
Article in journal (Refereed) Published
This paper presents a theoretical analysis of a certain criterion for complex-valued independent component analysis (ICA) with a focus on blind speech extraction (BSE) of a spatio–temporally nonstationary speech source. In the paper, the proposed criterion, denoted KSICA, is related to the well-known FastICA method with the kurtosis contrast function. The proposed method is shown to share the important fixed-point feature with the FastICA method. One improvement is that it does not exhibit the divergent behavior that the FastICA method tends to exhibit for a mixture of Gaussian-only sources, and it shows better performance in online implementations. Compared to FastICA, the KSICA method provides a 10 dB higher source extraction performance and a 10 dB lower standard deviation in a data batch approach when the data batch size is less than 100 samples. For larger batch sizes, the KSICA method performs equally well. In an online application with spatially stationary sources, the KSICA method provides around 10 dB higher interference suppression and 1 MOS-unit lower speech distortion compared to FastICA for a 0.15 s time constant in the algorithm update parameter. Thus, the FastICA performance matches the KSICA performance for a time constant above 1 s. Finally, in an online application with a moving speech source, the KSICA method provides 10 dB higher interference suppression compared to FastICA for the same algorithm settings. All in all, the proposed KSICA method is shown to be a viable alternative for online BSE of complex-valued signal mixtures.
Place, publisher, year, edition, pages
IEEE, 2008. Vol. 16, no 8, 1624-1632 p.
Array signal processing, higher order statistics, speech enhancement
Identifiers
URN: urn:nbn:se:bth-8385
DOI: 10.1109/TASL.2008.2002058
ISI: 000260463800022
Local ID: oai:bth.se:forskinfo65AB698F1066AAF9C12574EC002AE0E4
OAI: oai:DiVA.org:bth-8385
DiVA: diva2:836100