Please use this identifier to cite or link to this item: http://hdl.handle.net/11452/32501
Title: Speaker identification from shouted speech: Analysis and compensation
Authors: Kinnunen, Tomi
Saeidi, Rahim
Pohjalainen, Jouni
Alku, Paavo
Uludağ University/Faculty of Engineering/Department of Electrical and Electronics Engineering.
Hanilçi, Cemal
Ertaş, Figen
AAH-4188-2021
S-4967-2016
35781455400
24724154500
Keywords: Acoustics
Engineering
Speaker identification
Shouted speech
Loudspeakers
Mapping
Signal processing
Speech
Emotional speech
Gaussian mixture model
Identification accuracy
Mapping techniques
Mel-frequency cepstral coefficients
Recognition accuracy
Text-independent speaker identification
Speech recognition
Publication Date: 2013
Publisher: IEEE
Citation: Hanilçi, C. et al. (2013). “Speaker identification from shouted speech: Analysis and compensation”. International Conference on Acoustics Speech and Signal Processing ICASSP, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 8027-8031.
Abstract: Text-independent speaker identification is studied using neutral and shouted speech in Finnish to analyze the effect of vocal mode mismatch between training and test utterances. Standard mel-frequency cepstral coefficient (MFCC) features with a Gaussian mixture model (GMM) recognizer are used for speaker identification. The results indicate that speaker identification accuracy drops from perfect (100 %) to 8.71 % under vocal mode mismatch. Because of this dramatic degradation in recognition accuracy, we propose to use a joint density GMM mapping technique for compensating the MFCC features. This mapping is trained on a disjoint emotional speech corpus to create a completely speaker- and speech-mode-independent emotion-neutralizing mapping. As a result of the compensation, the 8.71 % identification accuracy increases to 32.00 % without much degradation in the non-mismatched train-test conditions.
Description: This work was presented as a paper at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), held in Vancouver, Canada, on 26-31 May 2013.
URI: https://doi.org/10.1109/ICASSP.2013.6639228
http://hdl.handle.net/11452/32501
ISSN: 1520-6149
Appears in Collections: Scopus
Web of Science

Files in this item:
File: Hanilçi_vd_2013.pdf
Size: 563.22 kB
Format: Adobe PDF


This item is licensed under a Creative Commons License.