Combination of Acoustic Models Trained from Different Unit Sets for Chinese Continuous Speech Recognition

Zhang Hui; Du Li-min


Citation:

Zhang Hui, Du Li-min. Combination of Acoustic Models Trained from Different Unit Sets for Chinese Continuous Speech Recognition[J]. Journal of Electronics & Information Technology, 2006, 28(11): 2045-2049.


Publication history
- Received: 2005-03-08
- Revised: 2005-08-15
- Published: 2006-11-19


Abstract: This paper studies the combination of acoustic models trained from different unit sets. For Chinese continuous speech recognition, the prevailing unit sets are the context-dependent initial-final unit set and the context-dependent phone unit set. Experiments show that some Chinese syllables are recognized more accurately under the initial-final model, while others are recognized more accurately under the phone model. A method is proposed to combine the two acoustic models: both models are used during recognition, while models that lead to a low recognition rate are avoided. Experiments show that the proposed method reduces the syllable error rate by 9.60% compared with the phone model and by 6.10% compared with the initial-final model.

Keywords: Speech recognition; Acoustic model combination; Acoustic model selection; Error rate
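
The abstract does not spell out how the two acoustic models are combined, so the following is only a minimal sketch of one way the idea could work, not the authors' algorithm. It assumes a per-syllable lookup table built from development-set recognition rates. The syllables, accuracy figures, function names, and the tie-breaking and fallback rules are all hypothetical.

```python
from typing import Dict

# Hypothetical per-syllable recognition rates measured on a development set.
# These numbers are invented for illustration; they are not from the paper.
IF_ACCURACY: Dict[str, float] = {"zhang": 0.91, "hui": 0.84, "du": 0.78}
PHONE_ACCURACY: Dict[str, float] = {"zhang": 0.88, "hui": 0.89, "du": 0.80}


def build_selection_table(if_acc: Dict[str, float],
                          phone_acc: Dict[str, float]) -> Dict[str, str]:
    """For each syllable, record which model recognized it more reliably."""
    table: Dict[str, str] = {}
    for syllable in sorted(set(if_acc) | set(phone_acc)):
        a = if_acc.get(syllable, 0.0)
        b = phone_acc.get(syllable, 0.0)
        # Ties go to the initial-final model (an arbitrary choice).
        table[syllable] = "initial-final" if a >= b else "phone"
    return table


def combined_score(syllable: str, if_score: float, phone_score: float,
                   table: Dict[str, str]) -> float:
    """Return the acoustic log-score from the model preferred for this syllable.

    Syllables missing from the table fall back to the better of the two scores
    (another assumption made only for this sketch).
    """
    preferred = table.get(syllable)
    if preferred == "initial-final":
        return if_score
    if preferred == "phone":
        return phone_score
    return max(if_score, phone_score)


if __name__ == "__main__":
    table = build_selection_table(IF_ACCURACY, PHONE_ACCURACY)
    for syllable, model in table.items():
        print(f"{syllable}: use {model} model")
```

In this sketch both models still score every hypothesis, but the decoder only trusts the score from the model that performed better on that syllable during development, which mirrors the abstract's goal of avoiding the model that leads to a low recognition rate.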