Novel Single Channel Blind Source Separation Algorithm Based on Sparse Representation
doi: 10.11999/JEIT160888
Funds:
The National Natural Science Foundation of China (61372167), The Aviation Science Foundation of China (20152096019)
摘要 (Abstract): To address the mutual interference between sub-dictionaries that arises when sparse representation is applied to Single Channel Blind Source Separation (SCBSS), this paper introduces a new sub-dictionary, the common sub-dictionary, into the conventional union dictionary and proposes a novel sparse-representation-based SCBSS algorithm. In the new dictionary-learning objective function, the fidelity term of each source is built from its corresponding sub-dictionary together with the common sub-dictionary; the presence of the common sub-dictionary effectively avoids the mutual interference caused by one source seeking components on the sub-dictionaries of other sources. The objective function is solved by alternately performing three steps: sparse representation, dictionary updating, and weight-coefficient optimization. In the test stage, each source in the mixed signal is estimated by collecting its components on the corresponding sub-dictionary and the common sub-dictionary, thereby achieving blind source separation. Comparative experiments on a speech database show that the proposed algorithm outperforms both traditional and state-of-the-art algorithms by up to nearly 1 dB on two common evaluation metrics.
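The three-step alternating optimization described in the abstract can be sketched as follows. This is an illustrative sketch only: the concrete update rules (a greedy OMP coder for the sparse-representation step, a MOD-style least-squares fit for the dictionary-update step) are assumptions, and the weight-coefficient step is left as a placeholder because the abstract gives no formula for it.

```python
import numpy as np

def normalize(D):
    # unit-norm columns; zero columns stay zero
    return D / (np.linalg.norm(D, axis=0, keepdims=True) + 1e-12)

def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit for a single signal y."""
    residual, support = y.copy(), []
    x = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:  # residual orthogonal to all remaining atoms
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x[support] = coef
    return x

def learn_union_dictionary(Y, k_per_sub, n_sub=3, n_nonzero=8, n_iters=10):
    """Y: (dim, n_samples) training matrix. Returns a union dictionary
    [D1 | ... | D_{n_sub-1} | Dc], with the last block playing the role
    of the common sub-dictionary (an assumed layout)."""
    rng = np.random.default_rng(0)
    D = normalize(rng.standard_normal((Y.shape[0], k_per_sub * n_sub)))
    for _ in range(n_iters):
        # step 1: sparse representation over the current union dictionary
        X = np.column_stack([omp(D, Y[:, i], n_nonzero)
                             for i in range(Y.shape[1])])
        # step 2: dictionary update (MOD-style least-squares refit;
        # the paper's own update rule may differ)
        gram = X @ X.T + 1e-6 * np.eye(X.shape[0])
        D = normalize(Y @ X.T @ np.linalg.pinv(gram))
        # step 3: weight-coefficient optimization would rebalance the
        # per-source fidelity terms here; omitted (no formula in abstract)
    return D
```

The loop alternates until a fixed iteration budget is exhausted, mirroring the "iterate for a specified number of times or until convergence" schedule in the abstract.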
關(guān)鍵詞:
- 稀疏表示 /
- 單通道盲源分離 /
- 字典學(xué)習(xí) /
- 鑒別力 /
- 保真度
Abstract: The main drawback of sparse representation based Single Channel Blind Source Separation (SCBSS) is the interference between sub-dictionaries. To alleviate this drawback, an extra sub-dictionary, named the common sub-dictionary, is added to the traditional union dictionary. Each single source is reconstructed by linearly combining the sparsely activated atoms of its corresponding sub-dictionary and the common sub-dictionary. The common sub-dictionary purifies the discriminative information in each source-specific sub-dictionary, since the information shared by different sources is gathered in the common sub-dictionary. The optimization of the objective function involves three steps: sparse representation, dictionary updating, and weight-coefficient optimization; the three steps are performed alternately for a specified number of iterations or until convergence. In the test stage, single-source separation is achieved by combining the atoms of the source's corresponding sub-dictionary and the common sub-dictionary with the sparse coefficients of the mixed signal over the union dictionary. Experimental results on a speech dataset show that, compared with traditional and state-of-the-art algorithms, the proposed algorithm improves performance by up to 1 dB.
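The test stage admits a minimal sketch: the mixture is sparsely coded over the union dictionary [D1 | D2 | Dc], and each source estimate collects the components on its own sub-dictionary plus a share of the common sub-dictionary. The equal 50/50 split of the common component and the greedy OMP coder are assumptions for illustration; the abstract does not specify how the common part is apportioned.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit for a single signal y."""
    residual, support = y.copy(), []
    x = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x[support] = coef
    return x

rng = np.random.default_rng(0)
n, k = 64, 32  # signal dimension, atoms per sub-dictionary

def unit(M):
    return M / np.linalg.norm(M, axis=0, keepdims=True)

# random stand-ins for the learned sub-dictionaries
D1 = unit(rng.standard_normal((n, k)))  # source-1 sub-dictionary
D2 = unit(rng.standard_normal((n, k)))  # source-2 sub-dictionary
Dc = unit(rng.standard_normal((n, k)))  # common sub-dictionary
D = np.hstack([D1, D2, Dc])             # union dictionary

# synthetic mixture of two sparse sources
a1 = np.zeros(k); a1[rng.choice(k, 3, replace=False)] = rng.standard_normal(3)
a2 = np.zeros(k); a2[rng.choice(k, 3, replace=False)] = rng.standard_normal(3)
y = D1 @ a1 + D2 @ a2

# sparse coefficients of the mixture over the union dictionary
x = omp(D, y, n_nonzero=10)
x1, x2, xc = x[:k], x[k:2 * k], x[2 * k:]

# each estimate: its own sub-dictionary component plus half of the
# common component (the 0.5/0.5 split is an assumption)
common = Dc @ xc
est1 = D1 @ x1 + 0.5 * common
est2 = D2 @ x2 + 0.5 * common
# est1 + est2 equals the full reconstruction D @ x by construction
```

Because the common components are shared rather than duplicated, the two estimates always sum back to the full sparse reconstruction of the mixture.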
計(jì)量
- 文章訪(fǎng)問(wèn)數(shù): 1163
- HTML全文瀏覽量: 99
- PDF下載量: 367
- 被引次數(shù): 0