Adaptively Combining Color and Depth for Human Body Contour Tracking
Abstract: We present a novel human body contour tracking method that adaptively combines the color and depth cues of RGB-D images within a level set framework. The body is modeled by an active contour. A superpixel-based, locally adaptive weight map determines how much the depth cue contributes to the evolution of the active contour. Two depth-based external forces are used: the gradient vector flow (GVF) generated from the edges of the depth image, and the confidence map generated from the depth models of the object and background. The color-based external force is the confidence map generated from the color models of the object and background. The three external forces are fused through the locally adaptive weights to drive the active contour toward the object boundary. To obtain a more accurate contour and to prevent error drift, we propose two simple but effective algorithms, based on two properties of the human body surface observed in depth images, that refine the result of the level set method. Experiments on datasets acquired in indoor environments show that the method tracks more robustly and accurately than the latest depth-based body contour extraction method and the color-based contour tracking method.
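The fusion step described in the abstract can be illustrated with a small sketch. The abstract does not give the fusion formula, so the blending rule below is an assumption made purely for illustration, and the names `expand_weights`, `fuse_forces`, and `gvf_gain` are hypothetical. The sketch only shows the general idea: each superpixel carries one depth weight, and that weight decides per pixel how strongly the two depth-based terms count against the color-based term.

```python
from typing import Dict, List

Grid = List[List[float]]


def expand_weights(labels: List[List[int]], weight_of: Dict[int, float]) -> Grid:
    """Turn per-superpixel weights into a per-pixel weight map."""
    return [[weight_of[label] for label in row] for row in labels]


def fuse_forces(depth_conf: Grid, depth_gvf: Grid, color_conf: Grid,
                weight: Grid, gvf_gain: float = 1.0) -> Grid:
    """Blend the two depth-based terms with the color-based term per pixel.

    Assumed rule (not taken from the paper):
        F = w * (depth_conf + gvf_gain * depth_gvf) + (1 - w) * color_conf
    """
    fused = []
    for dc_row, gv_row, cc_row, w_row in zip(depth_conf, depth_gvf,
                                             color_conf, weight):
        fused.append([
            w * (dc + gvf_gain * gv) + (1.0 - w) * cc
            for dc, gv, cc, w in zip(dc_row, gv_row, cc_row, w_row)
        ])
    return fused


# Toy 2x2 image: superpixel 0 (top row) trusts depth only,
# superpixel 1 (bottom row) trusts color only.
labels = [[0, 0], [1, 1]]
weight = expand_weights(labels, {0: 1.0, 1: 0.0})
depth_conf = [[0.5, -0.5], [0.5, -0.5]]
depth_gvf = [[0.25, 0.25], [0.25, 0.25]]
color_conf = [[-1.0, 1.0], [-1.0, 1.0]]
fused = fuse_forces(depth_conf, depth_gvf, color_conf, weight)
```

In this toy input the top row follows the depth terms and the bottom row follows the color term, which is the behavior a locally adaptive weight is meant to provide when one cue is unreliable in part of the image.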
Key words:
- contour tracking
- human tracking
- active contour
- level set