

Speaker Adaptation Method Based on Eigenphone Speaker Subspace for Speech Recognition

Qu Dan, Zhang Wen-lin

Citation: Qu Dan, Zhang Wen-lin. Speaker Adaptation Method Based on Eigenphone Speaker Subspace for Speech Recognition[J]. Journal of Electronics & Information Technology, 2015, 37(6): 1350-1356. doi: 10.11999/JEIT141264

Funding: Supported by the National Natural Science Foundation of China (61175017, 61302107, 61403415)

  • Abstract: The eigenphone speaker adaptation method achieves good performance when sufficient adaptation data are available, but suffers from severe overfitting when adaptation data are scarce. To overcome this problem, this paper proposes a speaker adaptation method based on an eigenphone speaker subspace. First, the basic principle of eigenphone speaker adaptation in a speech recognition system based on the Hidden Markov Model - Gaussian Mixture Model (HMM-GMM) framework is presented. Next, a speaker subspace is introduced to model the correlations among the eigenphone matrices of different speakers, and a new eigenphone speaker-subspace adaptation method is derived by estimating a speaker-dependent coordinate vector. Finally, the proposed method is compared with the conventional speaker-subspace adaptation method. Experiments on Mandarin continuous speech recognition using the Microsoft speech corpus show that, compared with the original eigenphone adaptation method, the proposed method greatly improves performance when adaptation data are extremely limited, effectively alleviating overfitting. Compared with eigenvoice adaptation, it achieves lower space complexity at a small cost in performance, making it more practical.
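The idea sketched in the abstract — modeling correlations among per-speaker eigenphone matrices with a low-dimensional speaker subspace, then estimating only a short coordinate vector for a new speaker — can be illustrated with a minimal NumPy sketch. All dimensions, the PCA-style subspace construction, and the least-squares coordinate estimation below are illustrative assumptions, not the paper's exact estimation formulas (which operate on HMM-GMM sufficient statistics):

```python
import numpy as np

# Hypothetical dimensions, for illustration only (not from the paper).
n_phones, feat_dim = 10, 4        # each eigenphone matrix: n_phones x feat_dim
n_train_speakers = 30
subspace_dim = 5                  # dimensionality of the speaker subspace

rng = np.random.default_rng(0)

# Step 1: assume per-speaker eigenphone matrices were already estimated on
# ample training data; vectorize each one into a "supervector".
train_eigenphones = rng.normal(size=(n_train_speakers, n_phones * feat_dim))

# Step 2: capture correlations across speakers with a low-dimensional
# subspace (here: PCA via SVD of the mean-centered supervectors).
mean_vec = train_eigenphones.mean(axis=0)
centered = train_eigenphones - mean_vec
_, _, vt = np.linalg.svd(centered, full_matrices=False)
basis = vt[:subspace_dim]         # subspace_dim x (n_phones * feat_dim)

# Step 3: for a new speaker with scarce adaptation data, estimate only the
# low-dimensional coordinate vector (least squares against a noisy
# direct estimate of the speaker's eigenphone supervector).
true_coords = rng.normal(size=subspace_dim)
noisy_estimate = mean_vec + true_coords @ basis \
    + 0.01 * rng.normal(size=n_phones * feat_dim)
coords, *_ = np.linalg.lstsq(basis.T, noisy_estimate - mean_vec, rcond=None)

# Reconstruct the speaker's full eigenphone matrix from just
# subspace_dim parameters instead of n_phones * feat_dim.
adapted = (mean_vec + coords @ basis).reshape(n_phones, feat_dim)
```

The point of the construction is the parameter count: with little adaptation data, estimating `subspace_dim` coordinates is far better conditioned than estimating the full `n_phones * feat_dim` eigenphone matrix, which is where the overfitting described in the abstract arises.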
Publication History
  • Received: 2014-09-30
  • Revised: 2014-12-29
  • Published: 2015-06-19
