Speaker identification from shouted speech: Analysis and compensation

Kinnunen, Tomi; Saeidi, Rahim; Pohjalainen, Jouni; Alku, Paavo

Please use this identifier to cite or link to this item: http://hdl.handle.net/11452/32501

Full metadata record

Dublin Core Field	Value	Language
dc.contributor.author	Kinnunen, Tomi	-
dc.contributor.author	Saeidi, Rahim	-
dc.contributor.author	Pohjalainen, Jouni	-
dc.contributor.author	Alku, Paavo	-
dc.date.accessioned	2023-05-03T10:43:45Z	-
dc.date.available	2023-05-03T10:43:45Z	-
dc.date.issued	2013	-
dc.identifier.citation	Hanilçi, C. et al. (2013). “Speaker identification from shouted speech: Analysis and compensation”. International Conference on Acoustics Speech and Signal Processing ICASSP, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 8027-8031.	en_US
dc.identifier.issn	1520-6149	-
dc.identifier.uri	https://doi.org/10.1109/ICASSP.2013.6639228	-
dc.identifier.uri	http://hdl.handle.net/11452/32501	-
dc.description	This work was presented as a paper at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), held in Vancouver, Canada, on 26-31 May 2013.	tr_TR
dc.description.abstract	Text-independent speaker identification is studied using neutral and shouted speech in Finnish to analyze the effect of vocal mode mismatch between training and test utterances. Standard mel-frequency cepstral coefficient (MFCC) features with a Gaussian mixture model (GMM) recognizer are used for speaker identification. The results indicate that speaker identification accuracy drops from perfect (100%) to 8.71% under vocal mode mismatch. Because of this dramatic degradation in recognition accuracy, we propose a joint density GMM mapping technique for compensating the MFCC features. This mapping is trained on a disjoint emotional speech corpus to create a completely speaker- and speech-mode-independent emotion-neutralizing mapping. As a result of the compensation, the 8.71% identification accuracy increases to 32.00% without substantially degrading the non-mismatched train-test conditions.	en_US
dc.description.sponsorship	Inst Elect & Elect Engineers	en_US
dc.description.sponsorship	Inst Elect & Elect Engineers Signal Proc Soc	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	tr_TR
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	Acoustics	en_US
dc.subject	Engineering	en_US
dc.subject	Speaker identification	en_US
dc.subject	Shouted speech	en_US
dc.subject	Loudspeakers	en_US
dc.subject	Mapping	en_US
dc.subject	Signal processing	en_US
dc.subject	Speech	en_US
dc.subject	Emotional speech	en_US
dc.subject	Gaussian mixture model	en_US
dc.subject	Identification accuracy	en_US
dc.subject	Mapping techniques	en_US
dc.subject	Mel-frequency cepstral coefficients	en_US
dc.subject	Recognition accuracy	en_US
dc.subject	Text-independent speaker identification	en_US
dc.subject	Speech recognition	en_US
dc.title	Speaker identification from shouted speech: Analysis and compensation	en_US
dc.type	Proceedings Paper	en_US
dc.identifier.wos	000329611508038	tr_TR
dc.identifier.scopus	2-s2.0-84890452416	tr_TR
dc.relation.publicationcategory	Conference Item - International	tr_TR
dc.contributor.department	Uludağ University/Faculty of Engineering/Department of Electrical and Electronics Engineering.	tr_TR
dc.identifier.startpage	8027	tr_TR
dc.identifier.endpage	8031	tr_TR
dc.relation.journal	International Conference on Acoustics Speech and Signal Processing ICASSP, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing	en_US
dc.contributor.buuauthor	Hanilçi, Cemal	-
dc.contributor.buuauthor	Ertaş, Figen	-
dc.contributor.researcherid	AAH-4188-2021	tr_TR
dc.contributor.researcherid	S-4967-2016	tr_TR
dc.relation.collaboration	International	tr_TR
dc.subject.wos	Acoustics	en_US
dc.subject.wos	Engineering, electrical & electronic	en_US
dc.indexed.wos	CPCIS	en_US
dc.indexed.scopus	Scopus	en_US
dc.contributor.scopusid	35781455400	tr_TR
dc.contributor.scopusid	24724154500	tr_TR
dc.subject.scopus	Whispers; Speech Recognition; Public Speaking	en_US
Appears in Collections:	Scopus Web of Science
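
The abstract names a joint density GMM mapping as the compensation step but gives no formula. As a rough illustration only, the sketch below uses the common conditional-expectation form of such a mapping, F(x) = Σ_k p(k|x) · (μ_y,k + σ_yx,k / σ_xx,k · (x − μ_x,k)), reduced to scalar features so it runs with the standard library alone. The class and function names, the scalar simplification, and all parameter values are assumptions made for this example; they are not taken from the paper, which works with MFCC vectors.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class JointComponent:
    """One Gaussian of a joint GMM over (source, target) pairs (hypothetical values)."""
    weight: float   # mixture weight
    mean_x: float   # mean of the source feature (e.g. emotional speech)
    mean_y: float   # mean of the target feature (e.g. neutral speech)
    var_x: float    # variance of the source feature
    cov_yx: float   # covariance between target and source features


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a univariate Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posteriors(x: float, components: List[JointComponent]) -> List[float]:
    """Posterior probability of each component given the source feature x."""
    likes = [c.weight * gaussian_pdf(x, c.mean_x, c.var_x) for c in components]
    total = sum(likes)  # assumes x is not so far from every mean that this underflows
    return [like / total for like in likes]


def map_feature(x: float, components: List[JointComponent]) -> float:
    """Conditional expectation E[y | x] under the joint GMM."""
    post = posteriors(x, components)
    return sum(
        p * (c.mean_y + c.cov_yx / c.var_x * (x - c.mean_x))
        for p, c in zip(post, components)
    )
```

With real MFCC vectors the variances and covariances become matrices and the division becomes a matrix inverse, and the posteriors are usually computed in the log domain for numerical stability; the structure of the mapping stays the same.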

Files in this item:

File	Description	Size	Format
Hanilçi_vd_2013.pdf		563.22 kB	Adobe PDF

This item is licensed under a Creative Commons License

Bursa Uludağ University Open Access System

The open access system that hosts the research outputs of Bursa Uludağ University.