Novel Blind Source Separation Method for Convolutive Mixed Environment

Xie Yuan, Zou Tao, Sun Wei-Jun, Xie Sheng-Li

Citation: Xie Yuan, Zou Tao, Sun Wei-Jun, Xie Sheng-Li. Novel blind source separation method for convolutive mixed environment. Acta Automatica Sinica, 2023, 49(5): 1062-1072 doi: 10.16383/j.aas.c211207

doi: 10.16383/j.aas.c211207
Funds: Supported by the National Key Research and Development Program of China (2018YFB1802400) and the National Natural Science Foundation of China (62003095, 52171331)
    Author Bio:

    XIE Yuan Lecturer at the School of Mechanical and Electrical Engineering, Guangzhou University. His research interest covers blind signal separation, signal processing, and machine learning. E-mail: yuanxiemath@hotmail.com

    ZOU Tao Professor at the School of Mechanical and Electrical Engineering, Guangzhou University. His research interest covers industrial process modeling and simulation, model predictive control, advanced process control, and real-time optimization technology research and application. Corresponding author of this paper. E-mail: tzou@gzhu.edu.cn

    SUN Wei-Jun Associate professor at the Guangdong Provincial Key Laboratory of Information Technology of Internet of Things, and the Key Laboratory of Intelligent Detection and the Internet of Things in Manufacturing, Ministry of Education. His research interest covers pattern recognition and machine learning. E-mail: gdutswj@163.com

    XIE Sheng-Li Professor at the Discrete Manufacturing Intelligence Discipline Innovation and Talent Introduction Base Based on Internet of Things Technology, and the Guangdong-Hong Kong-Macao Joint Laboratory for Smart Discrete Manufacturing. His research interest covers wireless networks, automatic control, and blind signal processing. E-mail: shlxie@gdut.edu.cn

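  • Abstract: Blind source separation (BSS) in convolutive mixing environments is a highly challenging problem of practical significance. Within the independent component analysis framework, this paper builds a nonnegative matrix factorization (NMF) model, designs a new optimization objective function, and derives new update rules for the model parameters through rigorous mathematical derivation. The demixing matrix is normalized to avoid the amplitude ambiguity, and in the source reconstruction stage the NMF model parameters are updated in real time to avoid the permutation ambiguity of the sources. Experimental results verify the effectiveness and superiority of the proposed algorithm in separating mixtures of Chinese speech, English speech, and music.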
  • Fig. 1  SDR performance comparison for BSS of Chinese speech mixtures

    Fig. 2  SIR performance comparison for BSS of Chinese speech mixtures

    Fig. 3  SDR performance comparison for BSS of English speech mixtures

    Fig. 4  SIR performance comparison for BSS of English speech mixtures

    Fig. 5  SDR performance comparison for BSS of music mixtures

    Fig. 6  SIR performance comparison for BSS of music mixtures

    Fig. 7  Effect of noise on SDR performance for BSS of Chinese speech mixtures

    Fig. 8  Effect of noise on SIR performance for BSS of Chinese speech mixtures

    Fig. 9  Effect of noise on SDR performance for BSS of music mixtures

    Fig. 10  Effect of noise on SIR performance for BSS of music mixtures

    Table 1  Two groups of Chinese speech sources

    Chinese data    Source signal    Duration
    Speech 1        IC0936W0131      5 s
    Speech 2        IC0936W0134      5 s

    Table 2  Two groups of English speech sources

    English data    Source signal         Duration
    Speech 1        dev1_female3_src_1    10 s
    Speech 2        dev1_female3_src_2    10 s

    Table 3  Two groups of music sources

    Music data    Source signal        Duration
    Music 1       dev1_wdrums_src_1    11 s
    Music 2       dev1_wdrums_src_3    11 s

    Table 4  Experimental results in high reverberation and high noise environment

                 $RT_{60}=400$ ms       SNR = 5 dB
    Method       SDR        SIR         SDR        SIR
    Full-Rank    0.1969     4.5580      −4.2087    6.7379
    VolMin-AO    1.1786     4.3729      −3.8684    6.6486
    Rank1-NMF    −1.8239    0.7933      −9.8632    2.7641
    RBTD         −6.7646    1.2411      −9.1111    1.8784
    Proposed     1.0278     5.7190      −1.8554    4.6515
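
For readers unfamiliar with the NMF model named in the abstract, the following is a minimal sketch of the general idea only. It uses the classic Euclidean-distance multiplicative updates of Lee and Seung, which are an assumption made here for illustration and are not the new update rules derived in the paper. The function names `nmf_multiplicative` and `normalize_scale`, the unit L1-norm convention, and all parameter defaults are hypothetical. The second function shows one common way to remove a scaling ambiguity in a factorization; the paper normalizes the demixing matrix instead, and its exact normalization is not given on this page.

```python
import numpy as np


def nmf_multiplicative(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Factor a nonnegative matrix V (F x T) as W @ H with W, H >= 0.

    Classic Lee-Seung multiplicative updates for the Euclidean cost.
    Illustrative only: these are NOT the update rules from the paper.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so signs never flip.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def normalize_scale(W, H, eps=1e-12):
    """Fix the scaling ambiguity of W @ H.

    Each column of W is scaled to (approximately) unit L1 norm and the
    scale is moved into the matching row of H, so W @ H is unchanged.
    """
    scale = W.sum(axis=0) + eps
    return W / scale, H * scale[:, None]
```

A typical use would be to pass a nonnegative power spectrogram as `V`, call `nmf_multiplicative(V, rank)`, and then apply `normalize_scale` so that the columns of `W` are comparable across runs. The choice of `rank` and `n_iter` depends on the data and is left open here.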
Publication history
  • Received:  2021-12-18
  • Accepted:  2022-10-18
  • Published online:  2022-11-27
  • Issue date:  2023-05-20
