Research on Text-Independent Speaker Recognition Methods Using Wavelet Neural Network
Abstract: Speaker recognition based on neural networks can emulate the function of the human brain to some degree, and it is one of the main techniques in speaker recognition. However, with such networks it is usually difficult to determine the number of hidden-layer units, convergence is slow, and training easily falls into a local minimum. This paper studies a wavelet neural network model for speaker recognition and gives its network structure and learning algorithm. Using Mel-frequency cepstral coefficients as the feature parameters for text-independent speaker recognition, a recognition experiment on 5 speakers with this model achieved a recognition rate of 99.5%. The experimental results show that, compared with the traditional BP network, the wavelet network substantially improves both training speed and recognition rate, and that it has good application prospects and merits further research.
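The abstract does not spell out the network structure, so the following is only a minimal sketch of what a wavelet neural network classifier of this general kind can look like. It is not the paper's model. The Morlet-style mother wavelet, the single hidden layer, the 12-dimensional input (standing in for one frame of Mel-frequency cepstral coefficients), the hidden-layer size, and all names (`morlet`, `WaveletNetwork`) are assumptions made for illustration. Only the 5 output classes follow the 5-speaker experiment described above. Training is omitted; the sketch shows the forward pass only.

```python
import math
import random


def morlet(t):
    """Morlet-style mother wavelet (assumed here): cos(1.75 t) * exp(-t^2 / 2)."""
    return math.cos(1.75 * t) * math.exp(-0.5 * t * t)


class WaveletNetwork:
    """Hypothetical one-hidden-layer network whose hidden units are wavelets."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Input-to-hidden weights, one row per hidden unit.
        self.w_in = [[rng.uniform(-1, 1) for _ in range(n_in)]
                     for _ in range(n_hidden)]
        # Per-unit translation b and dilation a (a is kept away from zero).
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.a = [rng.uniform(0.5, 2.0) for _ in range(n_hidden)]
        # Hidden-to-output weights, one row per output class (speaker).
        self.w_out = [[rng.uniform(-1, 1) for _ in range(n_hidden)]
                      for _ in range(n_out)]

    def forward(self, x):
        """Return one score per speaker for a single feature vector x."""
        hidden = []
        for w, a, b in zip(self.w_in, self.a, self.b):
            net = sum(wi * xi for wi, xi in zip(w, x))
            # Each hidden unit applies a shifted and scaled mother wavelet.
            hidden.append(morlet((net - b) / a))
        return [sum(wo * h for wo, h in zip(row, hidden))
                for row in self.w_out]

    def predict(self, x):
        """Return the index of the highest-scoring speaker."""
        scores = self.forward(x)
        return max(range(len(scores)), key=scores.__getitem__)


# Usage: 12 input features (assumed), 8 wavelet units (assumed), 5 speakers.
net = WaveletNetwork(n_in=12, n_hidden=8, n_out=5)
frame = [0.1] * 12  # placeholder feature vector, not real speech data
speaker_index = net.predict(frame)
```

In a design like this, the translation and dilation of each wavelet unit are trainable alongside the weights, which is what distinguishes it from a BP network with fixed sigmoid activations.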