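Sound Source Localization Algorithm Based on Cepstral BRIR Binaural Cross-correlation in Reverberant Environment

ZHANG Yi, YAN Bo, WANG Ke-Jia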

Citation: ZHANG Yi, YAN Bo, WANG Ke-Jia. Sound Source Localization Algorithm Based on Cepstral BRIR Binaural Cross-correlation in Reverberant Environment. ACTA AUTOMATICA SINICA, 2016, 42(10): 1562-1569. doi: 10.16383/j.aas.2016.c150828

doi: 10.16383/j.aas.2016.c150828

Funds:

Chongqing Science and Technology Commission Project cstc2015jcyjBX0066

More Information
    Author Bio:

    ZHANG Yi  Professor at the School of Advanced Manufacturing Engineering, Chongqing University of Posts and Telecommunications. His research interests cover robots and their applications, speech signal processing, and sound source localization. E-mail: zhangyi@cqupt.edu.cn

    WANG Ke-Jia  Master student at the School of Automation, Chongqing University of Posts and Telecommunications. Her research interests cover speech signal processing, speech recognition, and voiceprint recognition. E-mail: qw.123woaini@foxmail.com

    Corresponding author:

    YAN Bo  Master student at the School of Automation, Chongqing University of Posts and Telecommunications. Her research interests cover speech signal processing and sound source localization. Corresponding author of this paper. E-mail: yanbo19921102@sina.com
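  • Abstract: In real enclosed environments, reverberation degrades sound source localization performance. To address this problem, a binaural cross-correlation sound source localization method based on the cepstral binaural room impulse response (BRIR) is proposed. The method subtracts the reverberation component from the cepstral BRIR, transforms the result back to the time domain to obtain an estimated impulse response, and then cross-correlates this estimate with the head-related impulse responses (HRIRs) stored in a database. The position corresponding to the maximum cross-correlation value is taken as the estimated source position. Simulation results show that the proposed algorithm reduces the localization error caused by reverberation and improves the accuracy of sound source localization.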
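The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the reverberation component is identified, so this sketch assumes it is the high-quefrency part of the real (log-magnitude) cepstrum, removes it by low-quefrency liftering, and keeps the original phase as a simplification. The names `lifter_brir`, `match_score`, `localize`, `toy_hrir`, the `cutoff` value, and the synthetic HRIR database and room response are all hypothetical.

```python
import numpy as np

def lifter_brir(brir, cutoff, n_fft=2048, eps=1e-12):
    """Attenuate late reflections in one BRIR channel by low-quefrency liftering.

    Assumption: the reverberant part lives at quefrencies >= cutoff.
    The original phase is kept, so reflections are reduced, not removed.
    The BRIR is assumed to be no longer than n_fft samples.
    """
    spec = np.fft.rfft(brir, n_fft)
    log_mag = np.log(np.abs(spec) + eps)
    cep = np.fft.irfft(log_mag, n_fft)            # real cepstrum (even sequence)
    cep[cutoff:n_fft - cutoff + 1] = 0.0          # zero the high-quefrency part
    smooth_log_mag = np.fft.rfft(cep).real        # back to a smoothed log-magnitude
    clean_spec = np.exp(smooth_log_mag) * np.exp(1j * np.angle(spec))
    return np.fft.irfft(clean_spec, n_fft)        # estimated impulse response

def match_score(est_l, est_r, hrir_l, hrir_r):
    """Peak of the summed left/right cross-correlation, normalized by signal energy.

    Using one common lag for both ears keeps the interaural time difference.
    """
    cc = (np.correlate(est_l, hrir_l, mode="full")
          + np.correlate(est_r, hrir_r, mode="full"))
    norm = np.sqrt((est_l @ est_l + est_r @ est_r)
                   * (hrir_l @ hrir_l + hrir_r @ hrir_r))
    return cc.max() / norm

def localize(brir_l, brir_r, hrir_db, cutoff=64):
    """Return the database azimuth whose HRIR pair best matches the cleaned BRIR."""
    est_l = lifter_brir(brir_l, cutoff)
    est_r = lifter_brir(brir_r, cutoff)
    scores = {az: match_score(est_l, est_r, hl, hr)
              for az, (hl, hr) in hrir_db.items()}
    return max(scores, key=scores.get), scores

def toy_hrir(azimuth_deg, length=64):
    """Hypothetical HRIR pair: a short pulse with azimuth-dependent delay and level."""
    pulse = np.array([1.0, -0.5, 0.25, -0.1])
    shift = int(round(azimuth_deg / 5))
    left, right = np.zeros(length), np.zeros(length)
    left[20 + shift:24 + shift] = pulse * (1.0 - azimuth_deg / 90)
    right[20 - shift:24 - shift] = pulse * (1.0 + azimuth_deg / 90)
    return left, right

# Synthetic room: direct path plus three discrete reflections (illustrative only).
room = np.zeros(501)
room[[0, 200, 350, 500]] = [1.0, 0.4, 0.25, 0.15]

hrir_db = {az: toy_hrir(az) for az in (-30, -15, 0, 15, 30)}
brir_l, brir_r = (np.convolve(h, room) for h in hrir_db[15])

estimate, scores = localize(brir_l, brir_r, hrir_db)
print("estimated azimuth:", estimate)
```

A measured HRIR database and a reverberation model matched to the room would replace the toy data in practice; the structure of the search (clean the BRIR, correlate against every stored HRIR pair, take the maximum) is the part that follows the abstract.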
  • Fig. 1  Localization of azimuth $15^\circ$ by the three algorithms when RT = 0 s

    Fig. 2  Localization of azimuth $15^\circ$ by the three algorithms when RT = 0.30 s

    Fig. 3  Localization of azimuth $15^\circ$ by the three algorithms when RT = 0.50 s

    Fig. 4  Localization of azimuth $15^\circ$ by the three algorithms when RT = 0.70 s

    Fig. 5  Localization of azimuth $15^\circ$ by the three algorithms when RT = 0.90 s

    Fig. 6  RMSE comparison for azimuth $15^\circ$ under different reverberation times

    Fig. 7  Schematic diagram of the experimental environment

    Table 1  Sound source azimuth estimates of the three localization methods under different reverberation times

    Method       RT (s)  Quantity             Actual angle (°)
                                              0      10     15     20     30     35
    CEP-BRIR-CC  0       Estimated angle (°)  0.08   10.24  15.06  20.23  30.15  35.23
                         Absolute error (°)   0.08   0.24   0.06   0.23   0.15   0.23
                 0.3     Estimated angle (°)  0.17   9.03   14.82  21.09  30.25  36.39
                         Absolute error (°)   0.17   0.97   1.18   1.09   0.25   1.39
                 0.5     Estimated angle (°)  -0.29  8.79   13.67  18.69  30.69  36.87
                         Absolute error (°)   0.29   1.21   1.33   1.31   0.69   1.87
    CEP-GCC-ITD  0       Estimated angle (°)  -0.08  10.67  15.92  20.86  30.42  35.37
                         Absolute error (°)   0.08   0.67   0.92   0.86   0.42   0.37
                 0.3     Estimated angle (°)  0.39   8.11   12.81  17.23  28.85  33.14
                         Absolute error (°)   0.39   1.89   2.19   2.77   1.14   1.86
                 0.5     Estimated angle (°)  -1.69  7.06   11.91  16.14  28.15  32.06
                         Absolute error (°)   1.69   2.94   3.09   3.86   1.85   2.94
    CEP-CC-ITD   0       Estimated angle (°)  0.07   10.73  15.95  21.46  30.85  35.62
                         Absolute error (°)   0.07   0.73   0.95   1.46   0.85   0.62
                 0.3     Estimated angle (°)  0.63   8.68   12.78  23.06  27.62  32.97
                         Absolute error (°)   0.63   1.32   2.22   3.06   2.38   2.03
                 0.5     Estimated angle (°)  -2.06  6.12   11.66  15.89  26.85  38.77
                         Absolute error (°)   2.06   3.88   3.34   4.11   3.15   3.77
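To compare the three methods in Table 1 at a glance, the per-angle absolute errors can be averaged for each reverberation time. The sketch below uses the absolute-error rows exactly as printed in the table and computes the mean absolute error (MAE) and a root-mean-square error over the six test angles. This RMSE is only an illustrative aggregate across angles and is not necessarily the RMSE plotted in Fig. 6; the names `abs_err` and `summarize` are hypothetical.

```python
import math

# Absolute errors (degrees) as printed in Table 1,
# for actual azimuths 0, 10, 15, 20, 30, 35 degrees; keys are RT in seconds.
abs_err = {
    "CEP-BRIR-CC": {0.0: [0.08, 0.24, 0.06, 0.23, 0.15, 0.23],
                    0.3: [0.17, 0.97, 1.18, 1.09, 0.25, 1.39],
                    0.5: [0.29, 1.21, 1.33, 1.31, 0.69, 1.87]},
    "CEP-GCC-ITD": {0.0: [0.08, 0.67, 0.92, 0.86, 0.42, 0.37],
                    0.3: [0.39, 1.89, 2.19, 2.77, 1.14, 1.86],
                    0.5: [1.69, 2.94, 3.09, 3.86, 1.85, 2.94]},
    "CEP-CC-ITD":  {0.0: [0.07, 0.73, 0.95, 1.46, 0.85, 0.62],
                    0.3: [0.63, 1.32, 2.22, 3.06, 2.38, 2.03],
                    0.5: [2.06, 3.88, 3.34, 4.11, 3.15, 3.77]},
}

def summarize(errors):
    """Return (mean absolute error, root-mean-square error) of a list of errors."""
    mae = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse

summary = {method: {rt: summarize(errs) for rt, errs in by_rt.items()}
           for method, by_rt in abs_err.items()}

for method, by_rt in summary.items():
    for rt, (mae, rmse) in by_rt.items():
        print(f"{method:12s} RT={rt:.1f} s  MAE={mae:.3f}  RMSE={rmse:.3f}")
```

With the tabulated values, CEP-BRIR-CC has the lowest mean absolute error at every listed reverberation time (0.165° at RT = 0 s), and the error of all three methods grows as RT increases.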

    Table 2  Statistical results of the three localization methods

    Actual angle   -60°             -15°             0°               30°              45°
    Method         Est.     Error   Est.     Error   Est.     Error   Est.     Error   Est.     Error
    CEP-BRIR-CC    -54.8°   5.2°    -19.6°   4.6°    -3°              35.2°    5.2°    41.1°    3.9°
    CEP-GCC-ITD    -67.6°   7.6°    -22.3°   7.3°    7.5°     7.5°    36.9°    6.9°    52.8°    7.8°
    CEP-CC-ITD     -50.9°   9.1°    -23.5°   8.5°    8.8°     8.8°    22.0°    8.0°    54.2°    9.2°
Publication history
  • Received:  2015-12-09
  • Accepted:  2016-05-17
  • Published:  2016-10-01