Speech Enhancement Denoising Algorithm Based on Parameters Estimation and Perception Improvement
doi: 10.11999/JEIT150504
Funds:
The National Natural Science Foundation of China (61473041, 11461141004, 61571044), The Beijing Higher Education Young Elite Teacher Project (YETP1202)
Keywords:
- Speech enhancement
- Noise power spectral density estimation
- A priori signal-to-noise ratio
- Harmonic reconstruction
- Phase compensation
Abstract: To improve the overall quality of single-channel speech enhancement, this paper improves the traditional speech enhancement algorithm from two perspectives, noise reduction and speech perception, and combines several processing techniques. First, for parameter estimation, a spectrum smoothing algorithm based on weak speech presence is added to the soft-decision method based on a fixed a priori signal-to-noise ratio (SNR) to solve the problem of noise spectrum overestimation. In addition, the smoothing factor is adjusted dynamically according to the speech presence probability, which improves the tracking of the a priori SNR. Second, for perceptual quality, harmonic reconstruction is used to rebuild the high-frequency harmonic components of speech segments, and phase compensation and gain smoothing are employed to remove musical noise in both speech and silent segments. Experimental results show that, compared with the traditional algorithm, the proposed algorithm improves both noise reduction and speech quality by introducing the parameter estimation module and the perceptual quality module, and that it is suitable for a variety of noise types and SNR conditions.
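The parameter estimation step described in the abstract can be illustrated with a minimal per-bin sketch. It is not the paper's implementation. The noise update follows the widely used soft-decision form with a fixed a priori SNR: a speech presence probability weights the noisy periodogram against the previous noise estimate. The a priori SNR update is the standard decision-directed rule. The 15 dB fixed prior, the smoothing constants, and the linear mapping from speech presence probability to smoothing factor are assumptions chosen for illustration, as are all function names.

```python
import math

# Fixed a priori SNR assumed under speech presence (15 dB is an illustrative choice).
XI_H1 = 10 ** (15 / 10)


def speech_presence_prob(y_pow, noise_psd, xi_h1=XI_H1, prior_h1=0.5):
    """A posteriori speech presence probability for one frequency bin."""
    prior_ratio = (1.0 - prior_h1) / prior_h1
    post_snr = y_pow / max(noise_psd, 1e-12)
    likelihood = (1.0 + xi_h1) * math.exp(-post_snr * xi_h1 / (1.0 + xi_h1))
    return 1.0 / (1.0 + prior_ratio * likelihood)


def update_noise_psd(y_pow, noise_psd, alpha=0.8):
    """Soft-decision noise PSD update; returns (new_noise_psd, presence_prob)."""
    p = speech_presence_prob(y_pow, noise_psd)
    # Under speech absence trust the noisy periodogram, otherwise keep the old estimate.
    noise_periodogram = (1.0 - p) * y_pow + p * noise_psd
    new_psd = alpha * noise_psd + (1.0 - alpha) * noise_periodogram
    return new_psd, p


def decision_directed_snr(y_pow, noise_psd, prev_clean_pow, p_frame,
                          a_min=0.90, a_max=0.98):
    """Decision-directed a priori SNR with a presence-dependent smoothing factor.

    The linear mapping from p_frame to the smoothing factor is hypothetical:
    less smoothing (faster tracking) when speech is likely present.
    """
    a = a_max - (a_max - a_min) * p_frame
    noise = max(noise_psd, 1e-12)
    post_snr = y_pow / noise
    return a * prev_clean_pow / noise + (1.0 - a) * max(post_snr - 1.0, 0.0)
```

In this sketch a bin whose power matches the current noise estimate yields a low presence probability, so the noise estimate follows the observation. A bin far above the noise floor yields a probability near one, so the noise estimate is left almost unchanged, which is how the soft decision limits overestimation during speech.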