Please use this identifier to cite or link to this item: http://hdl.handle.net/11452/26025
Title: Comparing spectrum estimators in speaker verification under additive noise degradation
Authors: Kinnunen, Tomi H.
Saeidi, Rahim
Pohjalainen, Jouni
Alku, Paavo
Sandberg, Johan
Hansson-Sandsten, Maria
Uludağ University/Faculty of Engineering/Department of Electronics Engineering.
Hanilci, Cemal
Ertaş, Figen
AAH-4188-2021
S-4967-2016
35781455400
24724154500
Keywords: Acoustics
Engineering
Spectrum estimation
Speaker verification
Weighted linear prediction
Speech
Recognition
Acoustic noise
Additive noise
Discrete
Signal processing
Spectrum analysis
Babble noise
DFT method
Equal error rate
Linear prediction
Mel-frequency cepstral coefficients
Minimum variance distortionless response
Noise contamination
Noise degradations
Recognition performance
Speaker recognition
Fourier transforms
Spectrum estimators
Speech frames
Speech recognition
Publication Date: 2012
Publisher: IEEE
Citation: Hanilci, C. et al. (2012). "Comparing spectrum estimators in speaker verification under additive noise degradation". International Conference on Acoustics Speech and Signal Processing ICASSP, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 4769-4772.
Abstract: Different short-term spectrum estimators for speaker verification under additive noise are considered. Conventionally, mel-frequency cepstral coefficients (MFCCs) are computed from discrete Fourier transform (DFT) spectra of windowed speech frames. Recently, linear prediction (LP) and its temporally weighted variants have been substituted as the spectrum analysis method in speech and speaker recognition. In this paper, 12 different short-term spectrum estimation methods are compared for speaker verification under additive noise contamination. Experiments conducted on the NIST 2002 SRE corpus show that the spectrum estimation method has a large effect on recognition performance. The stabilized weighted LP (SWLP) and minimum variance distortionless response (MVDR) methods yield approximately 7% and 8% relative improvements in equal error rate (EER) over the standard DFT method at the -10 dB SNR level of factory and babble noise, respectively.
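As a rough illustration of the contrast the abstract draws, the sketch below computes a DFT power spectrum and a conventional LP all-pole spectrum for one windowed frame. It is not the authors' implementation and does not cover SWLP, MVDR, or the other estimators compared in the paper. The function names, the 8 kHz sampling rate, the 240-sample frame, the Hamming window, the LP order of 20, the 512-point FFT, and the synthetic test signal are all assumptions chosen for the example.

```python
import numpy as np

def dft_spectrum(frame, nfft=512):
    """Power spectrum of a Hamming-windowed frame via the DFT."""
    windowed = frame * np.hamming(len(frame))
    return np.abs(np.fft.rfft(windowed, nfft)) ** 2

def lp_spectrum(frame, order=20, nfft=512):
    """All-pole power spectrum from conventional LP (autocorrelation method)."""
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(windowed[: n - k], windowed[k:]) for k in range(order + 1)])
    # Toeplitz autocorrelation matrix with entries R[i, j] = r[|i - j|]
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    R = r[lags]
    # Normal equations R a = r[1:] give the predictor coefficients
    a = np.linalg.solve(R, r[1:])
    # Residual (prediction error) energy, used as the squared gain
    err = r[0] - np.dot(a, r[1:])
    # Inverse filter A(z) = 1 - sum_k a_k z^{-k}, evaluated on the DFT grid
    A = np.fft.rfft(np.concatenate(([1.0], -a)), nfft)
    return err / np.abs(A) ** 2

# Hypothetical test frame: a 1 kHz sinusoid in white noise, 30 ms at 8 kHz
fs = 8000
t = np.arange(240) / fs
rng = np.random.default_rng(0)
frame = np.sin(2 * np.pi * 1000 * t) + 0.1 * rng.standard_normal(t.size)

p_dft = dft_spectrum(frame)
p_lp = lp_spectrum(frame)
```

Either spectrum could then be passed through a mel filterbank, logarithm, and DCT to obtain MFCC-style features, so the estimator is the only stage that changes.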
Description: This work was presented as a paper at the IEEE International Conference on Acoustics, Speech and Signal Processing, held in Kyoto, Japan, on 25-30 March 2012.
URI: https://doi.org/10.1109/ICASSP.2012.6288985
https://ieeexplore.ieee.org/document/6288985
http://hdl.handle.net/11452/26025
ISBN: 978-1-4673-0046-9
ISSN: 1520-6149
Appears in Collections: Scopus
Web of Science

Files in this item:
File: Hanilci_vd_2012.pdf
Size: 306.62 kB
Format: Adobe PDF


This item is licensed under a Creative Commons License.