Speaker Identification Based on the Fusion of Classified Feature Sub-space Gaussian Mixture Models and a Neural Network
Abstract: This paper proposes a speaker identification method based on the fusion of classified feature sub-space Gaussian mixture models and a neural network (FS-GMM/NN). Clustering analysis of the feature vectors divides each speaker's training speech into several classes. A classified Gaussian mixture model (GMM) is then trained for each class, with the number of mixtures chosen according to the number of feature vectors in that class. Finally, a neural network (NN) fuses the outputs of the classified GMMs. In text-independent speaker identification experiments on 100 male speakers, the FS-GMM/NN method outperforms the unclassified baseline GMM (B-GMM) system in both identification performance and noise robustness. It also trains more efficiently and effectively reduces both the number of mixtures in the speaker models and the length of test speech required.
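The first two steps of the method (partitioning a speaker's feature vectors, then sizing each sub-space model by how many vectors it received) can be illustrated with a minimal Python sketch. The abstract does not name the clustering algorithm or the rule that maps class size to mixture count, so plain k-means and a proportional split of a mixture budget are assumptions made here for illustration only; the function names `kmeans` and `allocate_mixtures` are hypothetical. GMM training and the neural-network fusion stage are omitted.

```python
import random


def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means (an assumed choice); returns one cluster label per vector."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centers[c])))
            for v in vectors
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def allocate_mixtures(labels, k, total_mixtures, min_mixtures=1):
    """Split a mixture budget across classes in proportion to class size.

    Proportional allocation is an assumption; because of rounding and the
    per-class minimum, the result only approximately sums to total_mixtures.
    """
    n = len(labels)
    counts = [labels.count(c) for c in range(k)]
    return [max(min_mixtures, round(total_mixtures * cnt / n)) for cnt in counts]
```

For example, a speaker whose vectors split 75/25 between two classes, with a budget of 16 mixtures, would get 12 mixtures for the larger class and 4 for the smaller one under this rule. Each class would then train its own GMM with that many components, and the per-class scores would feed the fusion network.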