Please use this identifier to cite or link to this item:
http://hdl.handle.net/11452/23860
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.date.accessioned | 2022-01-05T07:27:45Z | - |
dc.date.available | 2022-01-05T07:27:45Z | - |
dc.date.issued | 2011-01 | - |
dc.identifier.citation | Hanilci, C. et al. (2011). "Comparison of the impact of some Minkowski metrics on VQ/GMM based speaker recognition". Computers and Electrical Engineering, 37(1), 41-56. | en_US |
dc.identifier.issn | 0045-7906 | - |
dc.identifier.issn | 1879-0755 | - |
dc.identifier.uri | https://doi.org/10.1016/j.compeleceng.2010.08.001 | - |
dc.identifier.uri | https://dl.acm.org/doi/abs/10.1016/j.compeleceng.2010.08.001 | - |
dc.identifier.uri | http://hdl.handle.net/11452/23860 | - |
dc.description.abstract | This paper evaluates the impact of three special forms of the Minkowski metric (Euclidean, City Block, and Chebychev distances) on the performance of conventional vector quantization (VQ) and Gaussian mixture model (GMM) based closed-set text-independent speaker recognition systems, in terms of recognition rate and confidence in decisions. For the VQ-based system, evaluations are carried out using the two most common clustering algorithms, LBG and K-means, and they reveal which pairing of clustering algorithm and distance metric achieves the best recognition rate for a given codebook size. For the GMM-based system, we introduce the metrics into the GMM by using a concatenation of the LBG and K-means algorithms to estimate the initial mean vectors, to which system performance is sensitive, and explore their impact on system performance. We also compare results from evaluations on clean speech (TIMIT) and telephone speech (NTIMIT and NIST2001) databases with those of the modern classifiers VQ-UBM and GMM-UBM. There are cases where the conventional VQ-based system outperforms the modern systems. Moreover, the impact of the distance metrics on the performance of the conventional and modern systems depends on the recognition task (verification or identification). | en_US |
dc.language.iso | en | en_US |
dc.publisher | Pergamon-Elsevier Science | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | en_US |
dc.subject | Computer science | en_US |
dc.subject | Engineering | en_US |
dc.subject | Identification | en_US |
dc.subject | Algorithm | en_US |
dc.subject | Character recognition | en_US |
dc.subject | Speech recognition | en_US |
dc.subject | Vector quantization | en_US |
dc.subject | City block | en_US |
dc.subject | Clean speech | en_US |
dc.subject | Codebooks | en_US |
dc.subject | Distance metrics | en_US |
dc.subject | Euclidean | en_US |
dc.subject | Gaussian Mixture Model | en_US |
dc.subject | K-means | en_US |
dc.subject | k-Means algorithm | en_US |
dc.subject | Mean vector | en_US |
dc.subject | Minkowski | en_US |
dc.subject | Minkowski metrics | en_US |
dc.subject | Recognition rates | en_US |
dc.subject | Speaker recognition | en_US |
dc.subject | Speaker recognition system | en_US |
dc.subject | Special forms | en_US |
dc.subject | Telephone speech | en_US |
dc.subject | Clustering algorithms | en_US |
dc.title | Comparison of the impact of some Minkowski metrics on VQ/GMM based speaker recognition | en_US |
dc.type | Article | en_US |
dc.identifier.wos | 000287560300004 | tr_TR |
dc.identifier.scopus | 2-s2.0-79251600402 | tr_TR |
dc.relation.publicationcategory | Article - International Refereed Journal | tr_TR |
dc.contributor.department | Uludağ University/Faculty of Engineering/Department of Electrical and Electronics Engineering. | tr_TR |
dc.identifier.startpage | 41 | tr_TR |
dc.identifier.endpage | 56 | tr_TR |
dc.identifier.volume | 37 | tr_TR |
dc.identifier.issue | 1 | tr_TR |
dc.relation.journal | Computers and Electrical Engineering | en_US |
dc.contributor.buuauthor | Hanilci, Cemal | - |
dc.contributor.buuauthor | Ertaş, Figen | - |
dc.contributor.researcherid | S-4967-2016 | tr_TR |
dc.contributor.researcherid | AAH-4188-2021 | tr_TR |
dc.subject.wos | Computer science, hardware & architecture | en_US |
dc.subject.wos | Computer science, interdisciplinary applications | en_US |
dc.subject.wos | Engineering, electrical & electronic | en_US |
dc.indexed.wos | SCIE | en_US |
dc.indexed.scopus | Scopus | en_US |
dc.wos.quartile | Q3 | en_US |
dc.contributor.scopusid | 35781455400 | tr_TR |
dc.contributor.scopusid | 24724154500 | tr_TR |
dc.subject.scopus | Speaker Verification; Language Recognition; Utterance | en_US |
Appears in Collections: | Scopus Web of Science |
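The three distances named in the abstract are the p = 1 (City Block), p = 2 (Euclidean), and p → ∞ (Chebychev) cases of the Minkowski metric. The following is a minimal sketch of how the choice of metric can change which codeword a feature vector is assigned to in a VQ codebook. It is not the authors' implementation: the function names `minkowski`, `nearest_codeword`, and `average_distortion` and the toy two-dimensional data are assumptions made purely for illustration.

```python
import math
from typing import Sequence


def minkowski(x: Sequence[float], y: Sequence[float], p: float) -> float:
    """Minkowski distance of order p between two equal-length vectors.

    p = 1 gives City Block, p = 2 gives Euclidean, and p = math.inf gives
    the Chebychev (maximum coordinate difference) limit.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def nearest_codeword(frame, codebook, p):
    """Index of the codeword closest to `frame` under the order-p metric."""
    return min(range(len(codebook)),
               key=lambda i: minkowski(frame, codebook[i], p))


def average_distortion(frames, codebook, p):
    """Mean distance from each frame to its nearest codeword.

    In a closed-set VQ identifier, a score of this kind is typically computed
    against each speaker's codebook and the lowest score wins (assumed here
    for illustration only).
    """
    total = sum(min(minkowski(f, c, p) for c in codebook) for f in frames)
    return total / len(frames)


# Toy data (hypothetical): one feature vector and a two-entry codebook.
frame = [0.0, 0.0]
codebook = [[3.0, 3.0], [0.0, 4.0]]

city_block = nearest_codeword(frame, codebook, 1)        # distances 6 vs 4 -> index 1
euclidean = nearest_codeword(frame, codebook, 2)         # distances ~4.24 vs 4 -> index 1
chebychev = nearest_codeword(frame, codebook, math.inf)  # distances 3 vs 4 -> index 0
```

In this toy case the City Block and Euclidean metrics pick the second codeword while the Chebychev metric picks the first, which illustrates why the pairing of clustering algorithm and distance metric can affect recognition results.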