
Speech Bandwidth Extension Based on Restricted Boltzmann Machines

WANG Yingxue, ZHAO Shenghui, YU Yingying, KUANG Jingming

Citation: WANG Yingxue, ZHAO Shenghui, YU Yingying, KUANG Jingming. Speech Bandwidth Extension Based on Restricted Boltzmann Machines[J]. Journal of Electronics & Information Technology, 2016, 38(7): 1717-1723. doi: 10.11999/JEIT151034


doi: 10.11999/JEIT151034


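  • Abstract: Speech bandwidth extension is a technique that improves speech quality by reconstructing the high-frequency band of speech from the correlation between its low-frequency and high-frequency bands. The Gaussian mixture model (GMM) method is widely used for speech bandwidth extension. However, it assumes that the low and high bands follow Gaussian distributions and captures only the linear relationship between them, so the synthesized high-frequency speech is distorted. This paper therefore proposes a method based on restricted Boltzmann machines (RBMs). Two Gaussian-Bernoulli RBMs extract the high-order statistical characteristics contained in the low and high bands, and a feedforward neural network maps the high-order statistical parameters of the low band to those of the high band. By extracting these high-order statistics, the method captures the actual relationship between the high and low bands more deeply, models the spectral envelope distribution more accurately, and synthesizes higher-quality speech. Objective and subjective tests show that the method outperforms the conventional GMM method.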
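The pipeline summarized in the abstract (one Gaussian-Bernoulli RBM per band, plus a network that maps low-band hidden features to high-band hidden features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes unit-variance visible units, one-step contrastive divergence (CD-1) training, and a single sigmoid layer as a simplified stand-in for the paper's feedforward network. The class and function names, the feature dimensions (12 and 8), the hidden size (16), and the synthetic data are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GaussianBernoulliRBM:
    """Gaussian visible units (unit variance assumed), Bernoulli hidden units."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible bias
        self.c = np.zeros(n_hidden)   # hidden bias

    def hidden_probs(self, v):
        # p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i W_ij)
        return sigmoid(v @ self.W + self.c)

    def visible_mean(self, h):
        # E[v | h] = b + W h for unit-variance Gaussian visibles
        return h @ self.W.T + self.b

    def cd1_step(self, v0, lr=0.01):
        # Positive phase: hidden probabilities and a binary sample from the data
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs step, using the visible mean as reconstruction
        v1 = self.visible_mean(h0)
        ph1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)

    def fit(self, data, epochs=50, lr=0.01):
        for _ in range(epochs):
            self.cd1_step(data, lr)  # full-batch updates, for brevity
        return self

def fit_sigmoid_layer(x, y, epochs=200, lr=0.1):
    """Single sigmoid layer x -> y trained by gradient descent on cross-entropy
    (a simplified stand-in for a deeper feedforward network)."""
    A = np.zeros((x.shape[1], y.shape[1]))
    d = np.zeros(y.shape[1])
    for _ in range(epochs):
        err = sigmoid(x @ A + d) - y  # cross-entropy gradient w.r.t. pre-activation
        A -= lr * x.T @ err / len(x)
        d -= lr * err.mean(axis=0)
    return A, d

def extend_bandwidth(low, rbm_low, rbm_high, A, d):
    h_low = rbm_low.hidden_probs(low)      # low-band hidden features
    h_high = sigmoid(h_low @ A + d)        # mapped high-band hidden features
    return rbm_high.visible_mean(h_high)   # decoded high-band envelope features

# Synthetic stand-ins for standardized spectral-envelope features (hypothetical).
rng = np.random.default_rng(0)
low = rng.standard_normal((256, 12))
high = np.tanh(low @ rng.standard_normal((12, 8)))
high = (high - high.mean(axis=0)) / high.std(axis=0)

rbm_low = GaussianBernoulliRBM(12, 16, rng).fit(low)
rbm_high = GaussianBernoulliRBM(8, 16, rng).fit(high)
A, d = fit_sigmoid_layer(rbm_low.hidden_probs(low), rbm_high.hidden_probs(high))
estimate = extend_bandwidth(low, rbm_low, rbm_high, A, d)
print(estimate.shape)  # (256, 8)
```

In a real system the inputs would be spectral-envelope parameters extracted from narrowband and wideband speech frames, and the decoded high-band envelope would drive a synthesis stage; neither step is shown here.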
Publication history
  • Received:  2015-09-14
  • Revised:  2016-03-03
  • Published:  2016-07-19
