Please use this identifier to cite or link to this item:
http://hdl.handle.net/11452/32501
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kinnunen, Tomi | - |
dc.contributor.author | Saeidi, Rahim | - |
dc.contributor.author | Pohjalainen, Jouni | - |
dc.contributor.author | Alku, Paavo | - |
dc.date.accessioned | 2023-05-03T10:43:45Z | - |
dc.date.available | 2023-05-03T10:43:45Z | - |
dc.date.issued | 2013 | - |
dc.identifier.citation | Hanilçi, C. et al. (2013). “Speaker identification from shouted speech: Analysis and compensation”. International Conference on Acoustics Speech and Signal Processing ICASSP, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 8027-8031. | en_US |
dc.identifier.issn | 1520-6149 | - |
dc.identifier.uri | https://doi.org/10.1109/ICASSP.2013.6639228 | - |
dc.identifier.uri | http://hdl.handle.net/11452/32501 | - |
dc.description | This work was presented as a paper at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), held in Vancouver, Canada, on 26-31 May 2013. | en_US |
dc.description.abstract | Text-independent speaker identification is studied using neutral and shouted speech in Finnish to analyze the effect of vocal mode mismatch between training and test utterances. Standard mel-frequency cepstral coefficient (MFCC) features with a Gaussian mixture model (GMM) recognizer are used for speaker identification. The results indicate that speaker identification accuracy drops from perfect (100 %) to 8.71 % under vocal mode mismatch. Because of this dramatic degradation in recognition accuracy, we propose a joint-density GMM mapping technique to compensate the MFCC features. This mapping is trained on a disjoint emotional speech corpus to create a completely speaker- and speech-mode-independent emotion-neutralizing mapping. As a result of the compensation, the 8.71 % identification accuracy increases to 32.00 % without substantially degrading the matched train-test conditions. | en_US |
dc.description.sponsorship | Inst Elect & Elect Engineers | en_US |
dc.description.sponsorship | Inst Elect & Elect Engineers Signal Proc Soc | en_US |
dc.language.iso | en | en_US |
dc.publisher | IEEE | en_US |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | en_US |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | Acoustics | en_US |
dc.subject | Engineering | en_US |
dc.subject | Speaker identification | en_US |
dc.subject | Shouted speech | en_US |
dc.subject | Loudspeakers | en_US |
dc.subject | Mapping | en_US |
dc.subject | Signal processing | en_US |
dc.subject | Speech | en_US |
dc.subject | Emotional speech | en_US |
dc.subject | Gaussian mixture model | en_US |
dc.subject | Identification accuracy | en_US |
dc.subject | Mapping techniques | en_US |
dc.subject | Mel-frequency cepstral coefficients | en_US |
dc.subject | Recognition accuracy | en_US |
dc.subject | Text-independent speaker identification | en_US |
dc.subject | Speech recognition | en_US |
dc.title | Speaker identification from shouted speech: Analysis and compensation | en_US |
dc.type | Proceedings Paper | en_US |
dc.identifier.wos | 000329611508038 | tr_TR |
dc.identifier.scopus | 2-s2.0-84890452416 | tr_TR |
dc.relation.publicationcategory | Conference Item - International | en_US |
dc.contributor.department | Uludağ University/Faculty of Engineering/Department of Electrical and Electronics Engineering. | en_US |
dc.identifier.startpage | 8027 | tr_TR |
dc.identifier.endpage | 8031 | tr_TR |
dc.relation.journal | International Conference on Acoustics Speech and Signal Processing ICASSP, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing | en_US |
dc.contributor.buuauthor | Hanilçi, Cemal | - |
dc.contributor.buuauthor | Ertaş, Figen | - |
dc.contributor.researcherid | AAH-4188-2021 | tr_TR |
dc.contributor.researcherid | S-4967-2016 | tr_TR |
dc.relation.collaboration | International | en_US |
dc.subject.wos | Acoustics | en_US |
dc.subject.wos | Engineering, electrical & electronic | en_US |
dc.indexed.wos | CPCIS | en_US |
dc.indexed.scopus | Scopus | en_US |
dc.contributor.scopusid | 35781455400 | tr_TR |
dc.contributor.scopusid | 24724154500 | tr_TR |
dc.subject.scopus | Whispers; Speech Recognition; Public Speaking | en_US |
Appears in Collections: | Scopus; Web of Science |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Hanilçi_vd_2013.pdf | | 563.22 kB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License