
Speaker Recognition Using Discriminant Neighborhood Embedding

Chunyan LIANG, Wenhao YUAN, Yanling LI, Bin XIA, Wenzhu SUN

Citation: Chunyan LIANG, Wenhao YUAN, Yanling LI, Bin XIA, Wenzhu SUN. Speaker Recognition Using Discriminant Neighborhood Embedding[J]. Journal of Electronics & Information Technology, 2019, 41(7): 1774-1778. doi: 10.11999/JEIT180761


doi: 10.11999/JEIT180761
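Details
    Author biographies:

    Chunyan LIANG: female, born 1986, lecturer. Research interests: speaker recognition, language recognition.

    Wenhao YUAN: male, born 1985, lecturer. Research interests: speech signal processing, speech enhancement.

    Yanling LI: female, born 1978, associate professor. Research interests: natural language processing, spoken language understanding, machine learning.

    Bin XIA: male, born 1973, associate professor. Research interests: deep learning, signal and information processing.

    Wenzhu SUN: male, born 1983, lecturer. Research interests: multimedia signal transmission.

    Corresponding author:

    Chunyan LIANG liangchunyan@sdut.edu.cn

  • CLC number: TP391.42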


Funds: The National Natural Science Foundation of China (11704229, 61701286, 61562068), The Shandong Provincial Natural Science Foundation (ZR2017LA011, ZR2015FL003, ZR2017MF047), The Project of Shandong Province Higher Educational Science and Technology Program (J17KA078), The Natural Science Foundation of Inner Mongolia Autonomous Region of China (2015MS0629)
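  • Abstract: This paper proposes a speaker recognition method based on the Discriminant Neighborhood Embedding (DNE) algorithm. As a manifold learning method, DNE captures the local neighborhood structure of the data by constructing an adjacency graph. It also makes full use of between-class discriminant information, which gives it stronger discriminative power. Experimental results on the telephone-telephone core test set of the 2010 National Institute of Standards and Technology Speaker Recognition Evaluation (NIST SRE 2010) demonstrate the effectiveness of the algorithm.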
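The abstract describes DNE only in words, so the following is a minimal sketch of one commonly described DNE formulation, not the authors' implementation. It assumes utterance-level feature vectors stored as rows of `X`, an adjacency weight of +1 for same-class neighbors and -1 for different-class neighbors, and a projection taken from the eigenvectors with the smallest eigenvalues. The function name `dne_projection` and the parameters `k` and `n_components` are hypothetical.

```python
import numpy as np

def dne_projection(X, labels, k=5, n_components=2):
    """Sketch of a DNE-style projection (assumed formulation).

    X: (n_samples, n_features) array, labels: (n_samples,) class ids.
    Returns an (n_features, n_components) matrix with orthonormal columns.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[0]

    # Pairwise squared Euclidean distances; the diagonal is set to
    # infinity so that a point is never its own neighbor.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)

    # Adjacency graph: i and j are linked if either one is among the
    # k nearest neighbors of the other.
    knn = np.argsort(d2, axis=1)[:, :k]
    nbr = np.zeros((n, n), dtype=bool)
    nbr[np.repeat(np.arange(n), k), knn.ravel()] = True
    nbr |= nbr.T

    # Discriminant weights (assumption): +1 for same-class neighbors,
    # -1 for different-class neighbors, 0 for non-neighbors.
    same = labels[:, None] == labels[None, :]
    F = np.where(nbr & same, 1.0, 0.0) - np.where(nbr & ~same, 1.0, 0.0)
    S = np.diag(F.sum(axis=1))

    # Minimizing sum_ij F_ij * ||P^T x_i - P^T x_j||^2 under P^T P = I
    # leads to the eigenvectors of X^T (S - F) X with smallest eigenvalues.
    M = X.T @ (S - F) @ X
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    return eigvecs[:, :n_components]
```

Projected features would then be obtained as `X @ P`, where `P` is the returned matrix. Pulling same-class neighbors together while pushing different-class neighbors apart is what gives the embedding both local-structure and class-discriminant behavior in this sketch.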
  • Table 1  Comparison of EER and minDCF for DNE and NPE on the NIST SRE 2010 telephone-telephone test set (no channel compensation)

    System          Male                Female
                    EER (%)   minDCF    EER (%)   minDCF
    NPE             5.76      0.0575    6.98      0.0744
    DNE             5.28      0.0544    6.35      0.0683

    Table 2  Comparison of EER and minDCF for DNE and NPE on the NIST SRE 2010 telephone-telephone test set (LDA channel compensation)

    System          Male                Female
                    EER (%)   minDCF    EER (%)   minDCF
    NPE+LDA         4.71      0.0492    6.11      0.0633
    DNE+LDA         4.19      0.0453    5.57      0.0604

    Table 3  Comparison of EER and minDCF for DNE and NPE on the NIST SRE 2010 telephone-telephone test set (WCCN channel compensation)

    System          Male                Female
                    EER (%)   minDCF    EER (%)   minDCF
    NPE+WCCN        5.07      0.0512    6.49      0.0677
    DNE+WCCN        4.59      0.0478    5.83      0.0617

    Table 4  Comparison of EER and minDCF for DNE and NPE on the NIST SRE 2010 telephone-telephone test set (LDA+WCCN channel compensation)

    System          Male                Female
                    EER (%)   minDCF    EER (%)   minDCF
    NPE+LDA+WCCN    4.41      0.0476    5.72      0.0584
    DNE+LDA+WCCN    4.15      0.0434    5.24      0.0553

    Table 5  Comparison of EER and minDCF for DNE and PLDA on the NIST SRE 2010 telephone-telephone test set

    System          Male                Female
                    EER (%)   minDCF    EER (%)   minDCF
    DNE+LDA+WCCN    4.15      0.0434    5.24      0.0553
    PLDA            4.12      0.0428    5.37      0.0532
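The tables report the equal error rate (EER), the operating point at which the miss rate equals the false-alarm rate. As a rough illustration of how such a figure can be obtained from verification scores, here is a minimal sketch. It assumes that higher scores indicate the target speaker, uses a simple threshold sweep without interpolation, and the function name `equal_error_rate` is hypothetical. It is not the scoring tool used in the paper.

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER by sweeping a threshold over all observed scores.

    Assumes a trial is accepted when its score is >= the threshold.
    Returns the EER as a fraction in [0, 1].
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))

    # Miss rate: fraction of target trials rejected at each threshold.
    miss = np.array([(target < t).mean() for t in thresholds])
    # False-alarm rate: fraction of non-target trials accepted.
    false_alarm = np.array([(nontarget >= t).mean() for t in thresholds])

    # Take the threshold where the two error rates are closest and
    # report their average as the EER estimate.
    idx = np.argmin(np.abs(miss - false_alarm))
    return 0.5 * (miss[idx] + false_alarm[idx])
```

Multiplying the returned value by 100 gives a percentage comparable in form to the EER (%) columns. The minDCF columns depend on cost parameters defined by the evaluation plan, which are not stated on this page, so they are not sketched here.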
Publication history
  • Received:  2018-08-03
  • Revised:  2019-01-21
  • Published online:  2019-02-24
  • Issue date:  2019-07-01
