Please use this identifier to cite or link to this item:
http://hdl.handle.net/11452/30193
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Rajan, Padmanabhan | - |
dc.contributor.author | Kinnunen, Tomi H. | - |
dc.contributor.author | Pohjalainen, Jouni | - |
dc.contributor.author | Alku, Paavo | - |
dc.contributor.author | Bimbot, F. | - |
dc.contributor.author | Cerisara, C. | - |
dc.contributor.author | Fougeron, C. | - |
dc.contributor.author | Gravier, G. | - |
dc.contributor.author | Lamel, L. | - |
dc.contributor.author | Pellegrino, F. | - |
dc.contributor.author | Perrier, P. | - |
dc.date.accessioned | 2022-12-30T11:58:03Z | - |
dc.date.available | 2022-12-30T11:58:03Z | - |
dc.date.issued | 2013 | - |
dc.identifier.citation | Rajan, P. et al. (2013). "Using group delay functions from all-pole models for speaker recognition". 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), 1-5, 2488-2492. | en_US |
dc.identifier.issn | 2308-457X | - |
dc.identifier.uri | http://faculty.iitmandi.ac.in/~padman/papers/padman_gdAllPole_interspeech2013.pdf | - |
dc.identifier.uri | http://hdl.handle.net/11452/30193 | - |
dc.description | This work was presented as a paper at the 14th Annual Conference of the International Speech Communication Association [Interspeech 2013], held in Lyon [France] on 25-29 August 2013. | en_US |
dc.description.abstract | Popular features for speech processing, such as mel-frequency cepstral coefficients (MFCCs), are derived from the short-term magnitude spectrum, whereas the phase spectrum remains unused. While the common argument to use only the magnitude spectrum is that the human ear is phase-deaf, phase-based features have remained less explored due to additional signal processing difficulties they introduce. A useful representation of the phase is the group delay function, but its robust computation remains difficult. This paper advocates the use of group delay functions derived from parametric all-pole models instead of their direct computation from the discrete Fourier transform. Using a subset of the vocal effort data in the NIST 2010 speaker recognition evaluation (SRE) corpus, we show that group delay features derived via parametric all-pole models improve recognition accuracy, especially under high vocal effort. Additionally, the group delay features provide comparable or improved accuracy over conventional magnitude-based MFCC features. Thus, the use of group delay functions derived from all-pole models provides an effective way to utilize information from the phase spectrum of speech signals. | en_US |
dc.description.sponsorship | Academy of Finland (253120) | en_US |
dc.description.sponsorship | Int Speech Commun Association | en_US |
dc.description.sponsorship | Amazon | en_US |
dc.description.sponsorship | Microsoft | en_US |
dc.description.sponsorship | | en_US |
dc.description.sponsorship | TcL SYTRAL | en_US |
dc.description.sponsorship | European Language Resources Association | en_US |
dc.description.sponsorship | Quaero | en_US |
dc.description.sponsorship | Imaginove | en_US |
dc.description.sponsorship | VOCAPIA Research | en_US |
dc.description.sponsorship | Acapela | en_US |
dc.description.sponsorship | Speech Ocean | en_US |
dc.description.sponsorship | ALDEBARAN | en_US |
dc.description.sponsorship | Orange | en_US |
dc.description.sponsorship | Vecsys | en_US |
dc.description.sponsorship | IBM Research | en_US |
dc.description.sponsorship | Raytheon BBN Technology | en_US |
dc.description.sponsorship | Voxygen | en_US |
dc.language.iso | en | en_US |
dc.publisher | Isc-Int Speech Communication Association | en_US |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | en_US |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | Computer science | en_US |
dc.subject | Engineering | en_US |
dc.subject | Speaker verification | en_US |
dc.subject | Group delay functions | en_US |
dc.subject | High vocal effort | en_US |
dc.subject | Additive noise | en_US |
dc.subject | Verification | en_US |
dc.subject | Discrete Fourier transforms | en_US |
dc.subject | Group delay | en_US |
dc.subject | Poles | en_US |
dc.subject | Signal processing | en_US |
dc.subject | Speech processing | en_US |
dc.subject | Direct computations | en_US |
dc.subject | Mel-frequency cepstral coefficients | en_US |
dc.subject | Recognition accuracy | en_US |
dc.subject | Speaker recognition | en_US |
dc.subject | Speaker recognition evaluations | en_US |
dc.subject | Vocal efforts | en_US |
dc.subject | Speech recognition | en_US |
dc.title | Using group delay functions from all-pole models for speaker recognition | en_US |
dc.type | Proceedings Paper | en_US |
dc.identifier.wos | 000395050001036 | tr_TR |
dc.identifier.scopus | 2-s2.0-84906257507 | tr_TR |
dc.relation.publicationcategory | Conference Item - International | en_US |
dc.contributor.department | Uludağ University/Faculty of Engineering/Department of Electrical and Electronics Engineering. | en_US |
dc.identifier.startpage | 2488 | tr_TR |
dc.identifier.endpage | 2492 | tr_TR |
dc.identifier.volume | 1-5 | tr_TR |
dc.relation.journal | 14th Annual Conference of the International Speech Communication Association (Interspeech 2013) | en_US |
dc.contributor.buuauthor | Hanilçi, Cemal | - |
dc.contributor.researcherid | S-4967-2016 | tr_TR |
dc.subject.wos | Computer science, artificial intelligence | en_US |
dc.subject.wos | Engineering, electrical & electronic | en_US |
dc.indexed.wos | CPCIS | en_US |
dc.indexed.scopus | Scopus | en_US |
dc.contributor.scopusid | 35781455400 | tr_TR |
dc.subject.scopus | Speaker Verification; Speech Enhancement; Attack | en_US |
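
The abstract in the record above contrasts group delay computed directly from the discrete Fourier transform with group delay derived from a parametric all-pole model. The following is a minimal sketch of the all-pole route, not the authors' implementation: the function names `lpc_autocorr` and `allpole_group_delay`, the use of the autocorrelation method for fitting the model, and the FFT size are all assumptions made for illustration. For an all-pole filter H(z) = 1/A(z), the group delay is the negative of the group delay of the polynomial A(z), which can be evaluated as Re{FFT(n·a[n]) / FFT(a[n])}.

```python
import numpy as np


def lpc_autocorr(frame, order):
    """Fit an all-pole model with the autocorrelation method (illustrative helper).

    Returns the coefficients [1, a1, ..., ap] of the polynomial A(z).
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order (lag 0 sits at index n - 1 in "full" mode).
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def allpole_group_delay(a, n_fft=512):
    """Group delay (in samples) of H(z) = 1/A(z) on an rfft frequency grid."""
    a = np.asarray(a, dtype=float)
    idx = np.arange(len(a))
    A = np.fft.rfft(a, n_fft)          # A(e^{jw})
    dA = np.fft.rfft(idx * a, n_fft)   # sum_n n * a[n] * e^{-jwn}
    # Group delay of A is Re{dA / A}; inverting the filter flips the sign.
    return -np.real(dA / A)


# Hypothetical usage on a synthetic frame: impulse response of 1 / (1 - 0.5 z^-1).
frame = 0.5 ** np.arange(200)
coeffs = lpc_autocorr(frame, order=1)
tau = allpole_group_delay(coeffs)
```

Because the autocorrelation method yields a minimum-phase A(z), the division in `allpole_group_delay` does not hit zeros on the unit circle, which is one reason the model-based route is smoother than differentiating the raw DFT phase. Any further processing of `tau` into recognition features is outside the scope of this sketch.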
Appears in Collections: | Scopus Web of Science |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Hanilci_vd_2013.pdf | | 123.35 kB | Adobe PDF |
This item is licensed under a Creative Commons License