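Novel Blind Source Separation Method for Convolutive Mixed Environment

XIE Yuan^{1, 2}, ZOU Tao^{1}, SUN Wei-Jun^{3, 4}, XIE Sheng-Li^{5, 6}

1. School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou 510006
2. Key Laboratory of Intelligent Information Processing and System Integration of Internet of Things, Ministry of Education, Guangzhou 510006
3. Guangdong Provincial Key Laboratory of Information Technology of Internet of Things, Guangzhou 510006
4. Key Laboratory of Intelligent Detection and the Internet of Things in Manufacturing, Ministry of Education, Guangzhou 510006
5. Discrete Manufacturing Intelligence Discipline Innovation and Talent Introduction Base Based on Internet of Things Technology, Guangzhou 510006
6. Guangdong-Hong Kong-Macao Joint Laboratory for Smart Discrete Manufacturing, Guangzhou 510006

doi: 10.16383/j.aas.c211207; cstr: 32138.14.j.aas.c211207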

Funds: Supported by National Key Research and Development Project (2018YFB1802400) and National Natural Science Foundation of China (62003095, 52171331)

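Author Bio:
XIE Yuan　Lecturer at the School of Mechanical and Electrical Engineering, Guangzhou University. Research interests: blind signal separation, signal processing, and machine learning. E-mail: yuanxiemath@hotmail.com

ZOU Tao　Professor at the School of Mechanical and Electrical Engineering, Guangzhou University. Research interests: industrial process modeling and simulation, model predictive control, advanced process control, and the research and application of real-time optimization technology. Corresponding author of this paper. E-mail: tzou@gzhu.edu.cn

SUN Wei-Jun　Associate professor at the Guangdong Provincial Key Laboratory of Information Technology of Internet of Things and the Key Laboratory of Intelligent Detection and the Internet of Things in Manufacturing, Ministry of Education. Research interests: pattern recognition and machine learning. E-mail: gdutswj@163.com

XIE Sheng-Li　Professor at the Discrete Manufacturing Intelligence Discipline Innovation and Talent Introduction Base Based on Internet of Things Technology and the Guangdong-Hong Kong-Macao Joint Laboratory for Smart Discrete Manufacturing. Research interests: wireless networks, automatic control, and blind signal processing. E-mail: shlxie@gdut.edu.cn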

Publication history
- Received: 2021-12-18
- Accepted: 2022-10-18
- Published online: 2022-11-27
- Issue date: 2023-05-20

Abstract

Blind source separation (BSS) in convolutive mixing environments is a challenging problem of practical importance. Within the independent component analysis framework, this paper builds a nonnegative matrix factorization (NMF) model and designs a new optimization objective function. New update rules for the model parameters are obtained through rigorous mathematical derivation, and the demixing matrix is normalized to avoid the scale ambiguity. In the source reconstruction stage, the NMF model parameters are updated in real time to avoid the permutation ambiguity. Experimental results verify the effectiveness and superiority of the proposed algorithm in separating Chinese speech mixtures, English speech mixtures, and music mixtures.

Keywords:
- blind source separation (BSS)
- convolutive mixtures
- independent component analysis
- nonnegative matrix factorization (NMF)
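
The scale and permutation ambiguities named in the abstract can be illustrated on a single frequency bin. The following is a minimal NumPy sketch, not the authors' algorithm. It assumes a noiseless two-source, two-microphone model x = A s at one STFT bin, and it uses projection back (the minimal distortion principle), a common normalization in frequency-domain BSS, as a stand-in for the paper's own normalization of the demixing matrix, which is not specified on this page. The names `A`, `S`, `W`, and `project_back` are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single STFT frequency bin: 2 sources, 2 microphones, x = A s.
n_src, n_frames = 2, 200
A = rng.normal(size=(n_src, n_src)) + 1j * rng.normal(size=(n_src, n_src))
S = rng.laplace(size=(n_src, n_frames)) + 1j * rng.laplace(size=(n_src, n_frames))
X = A @ S

# Any W = D P A^{-1} separates the sources (D diagonal, P a permutation),
# so the scale and the order of the outputs are not determined.
D = np.diag([3.0 - 1.0j, 0.2 + 0.5j])
P = np.array([[0.0, 1.0], [1.0, 0.0]])
W = D @ P @ np.linalg.inv(A)


def project_back(W):
    """Rescale the rows of W with the diagonal of W^{-1} (projection back)."""
    return np.diag(np.diag(np.linalg.inv(W))) @ W


W_norm = project_back(W)
Y = W_norm @ X

# A different arbitrary row scaling gives the same normalized demixing matrix.
W_other = np.diag([0.5j, -4.0]) @ W
same_after_norm = np.allclose(project_back(W_other), W_norm)

# Each output is one source as observed at one microphone; the order is still swapped.
out0_is_src1_at_mic0 = np.allclose(Y[0], A[0, 1] * S[1])
out1_is_src0_at_mic1 = np.allclose(Y[1], A[1, 0] * S[0])
print(same_after_norm, out0_is_src1_at_mic0, out1_is_src0_at_mic1)
```

In this sketch the normalized demixing matrix no longer depends on the arbitrary diagonal scaling, which removes the scale ambiguity. The output order remains arbitrary, so normalization alone does not resolve the permutation ambiguity; according to the abstract, the paper addresses that through the NMF model parameters updated during source reconstruction.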

Fig. 1 SDR performance comparison for BSS of Chinese speech mixtures

Fig. 2 SIR performance comparison for BSS of Chinese speech mixtures

Fig. 3 SDR performance comparison for BSS of English speech mixtures

Fig. 4 SIR performance comparison for BSS of English speech mixtures

Fig. 5 SDR performance comparison for BSS of music mixtures

Fig. 6 SIR performance comparison for BSS of music mixtures

Fig. 7 Effect of noise on SDR performance for BSS of Chinese speech mixtures

Fig. 8 Effect of noise on SIR performance for BSS of Chinese speech mixtures

Fig. 9 Effect of noise on SDR performance for BSS of music mixtures

Fig. 10 Effect of noise on SIR performance for BSS of music mixtures

Table 1 Two groups of Chinese speech sources

Chinese data	Source signal	Duration
Speech 1	IC0936W0131	5 s
Speech 2	IC0936W0134	5 s

Table 2 Two groups of English speech sources

English data	Source signal	Duration
Speech 1	dev1_female3_src_1	10 s
Speech 2	dev1_female3_src_2	10 s

Table 3 Two groups of music sources

Music data	Source signal	Duration
Music 1	dev1_wdrums_src_1	11 s
Music 2	dev1_wdrums_src_3	11 s

Table 4 Experimental results in high reverberation and high noise environment

Method	SDR ($RT_{60} = 400$ ms)	SIR ($RT_{60} = 400$ ms)	SDR (SNR = 5 dB)	SIR (SNR = 5 dB)
Full-Rank	0.1969	4.5580	−4.2087	6.7379
VolMin-AO	1.1786	4.3729	−3.8684	6.6486
Rank1-NMF	−1.8239	0.7933	−9.8632	2.7641
RBTD	−6.7646	1.2411	−9.1111	1.8784
Proposed	1.0278	5.7190	−1.8554	4.6515
