Road Structural Feature Based Monocular Visual Localization for Intelligent Vehicle
摘要: High-precision localization is key to realizing autonomous driving. In dense urban areas, satellite positioning systems such as the Global Positioning System (GPS) suffer from occlusion, interference and multipath reflection, and cannot guarantee the localization accuracy that autonomous driving requires. Visual localization, which estimates position by matching image features, has been widely studied; however, traditional feature-point-based methods are easily disturbed by moving objects, which makes their application in highly dynamic traffic scenes challenging. In structured road scenes, line features such as lane markings are ubiquitous and provide important cues for a human driver's visual understanding and decision making. Inspired by this observation, this paper constructs a road structural feature (RSF) from three mutually perpendicular sets of line segments together with point features in the scene, and proposes an RSF-based monocular visual localization algorithm. The algorithm is validated on vehicle-mounted video data collected at typical intersections, road segments and streets in urban Beijing, with synchronously recorded data from a high-precision GPS/inertial navigation integrated positioning system as the reference and traditional visual localization algorithms as the baseline. The results show that the proposed algorithm clearly outperforms traditional algorithms in orientation estimation and is more robust to dynamic interference in the environment. In areas where satellite signals are easily disturbed, it can effectively compensate for the shortcomings of GPS and similar positioning systems, and offers an important technical means of meeting the lane-level localization requirements of autonomous driving.

Abstract: Precise localization is an essential issue for autonomous driving applications, while global positioning system (GPS)-based systems struggle to meet requirements such as lane-level accuracy, especially in crowded urban environments. This paper introduces a new vision-based localization approach for dynamic traffic environments, focusing on structured roads such as straight road segments and intersections. Such environments contain many line segments on lane markings, curbs, poles, building edges, etc., which reveal the road's longitudinal, lateral and vertical directions. Based on this observation, we define a road structural feature (RSF) as sets of line segments along three perpendicular axes together with feature points, and propose an RSF-based monocular visual localization method. Extensive experiments are conducted on three typical scenarios: highway, intersection and downtown streets. The results show better accuracy than a state-of-the-art visual localization method based on feature points. We demonstrate that the proposed method can help improve localization accuracy in GPS-restricted areas, and discuss the remaining challenges that motivate future studies.

1) Editor in charge of this article: Wang Fei-Yue
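For readers who want a concrete picture of the RSF defined above (three perpendicular sets of line segments plus feature points), the following is a minimal sketch of how such a per-frame feature could be organized. It is not the authors' implementation; the class, field and function names are illustrative assumptions only.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Image-plane line segment: endpoints (x1, y1) and (x2, y2).
Segment = Tuple[float, float, float, float]
# Feature point: pixel coordinates (u, v).
Point = Tuple[float, float]

@dataclass
class RoadStructuralFeature:
    """Hypothetical container for one frame's RSF: line segments grouped by the
    three perpendicular road axes, plus conventional feature points."""
    longitudinal: List[Segment] = field(default_factory=list)  # along the driving direction
    lateral: List[Segment] = field(default_factory=list)       # across the road
    vertical: List[Segment] = field(default_factory=list)      # poles, building edges
    points: List[Point] = field(default_factory=list)          # e.g. corner features

    def segment_count(self) -> int:
        # Total number of axis-classified segments in this frame.
        return len(self.longitudinal) + len(self.lateral) + len(self.vertical)
```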
Table 1 Translation and rotation errors in the three experiments
Each cell lists the mean error and the 95th percentile (mean, 95%).

Method      Intersection                        Congested road segment              Dense streets
            Trans. (%)      Rot. (°/m)          Trans. (%)      Rot. (°/m)          Trans. (%)      Rot. (°/m)
libviso2    3.94, 7.85      0.0144, 0.0323      18.60, 34.09    0.0249, 0.0428      6.63, 11.70     0.0137, 0.0227
ORB-SLAM    —               —                   —               —                   3.18, 5.33      0.0101, 0.0218
Ours        1.07, 2.94      0.0024, 0.0049      0.89, 1.82      0.0016, 0.0040      0.69, 1.23      0.0031, 0.0075
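Table 1 reports, for each scenario, the mean and the 95th percentile of the translation error (as a percentage of distance traveled) and of the rotation error (in degrees per meter). The exact evaluation protocol is not reproduced in this section; the sketch below shows one common KITTI-style way such per-segment errors can be computed from time-aligned ground-truth and estimated trajectories. All function names, the segment length, and the pose format are assumptions for illustration.

```python
import numpy as np

def segment_errors(gt_poses, est_poses, seg_len=100.0):
    """Per-segment translation (%) and rotation (deg/m) errors, KITTI style.

    gt_poses, est_poses: time-aligned lists of 4x4 homogeneous camera-to-world
    matrices.  seg_len: evaluation segment length in meters (assumed).
    """
    # Cumulative distance along the ground-truth trajectory.
    dists = [0.0]
    for a, b in zip(gt_poses[:-1], gt_poses[1:]):
        dists.append(dists[-1] + np.linalg.norm(b[:3, 3] - a[:3, 3]))

    t_err, r_err = [], []
    for i in range(len(gt_poses)):
        # First frame that is at least seg_len meters ahead of frame i.
        j = next((k for k in range(i, len(gt_poses)) if dists[k] - dists[i] >= seg_len), None)
        if j is None:
            break
        # Relative motion over the segment: ground truth vs. estimate.
        gt_rel = np.linalg.inv(gt_poses[i]) @ gt_poses[j]
        est_rel = np.linalg.inv(est_poses[i]) @ est_poses[j]
        err = np.linalg.inv(gt_rel) @ est_rel
        # Translation error as a fraction of distance traveled, reported in %.
        t_err.append(np.linalg.norm(err[:3, 3]) / seg_len * 100.0)
        # Rotation error: angle of the residual rotation, per meter traveled.
        angle = np.arccos(np.clip((np.trace(err[:3, :3]) - 1.0) / 2.0, -1.0, 1.0))
        r_err.append(np.degrees(angle) / seg_len)

    # Mean and 95th percentile, matching the "mean, 95%" cells in Table 1.
    return (np.mean(t_err), np.percentile(t_err, 95),
            np.mean(r_err), np.percentile(r_err, 95))
```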
[1] Li Yu-Bo, Zhu Xiao-Zhou, Lu Hui-Min, Zhang Hui. Review on visual odometry technology. Application Research of Computers, 2012, 29(8): 2801-2805, 2810. http://www.cnki.com.cn/Article/CJFDTOTAL-JSYJ201208002.htm
[2] Moravec H P. Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover. Technical Report ADA092604, Department of Computer Science, Stanford University, USA, 1980.
[3] Matthies L, Shafer S. Error modeling in stereo navigation. IEEE Journal on Robotics and Automation, 1987, 3(3): 239-248. doi: 10.1109/JRA.1987.1087097
[4] Nistér D, Naroditsky O, Bergen J. Visual odometry. In: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington, DC, USA: IEEE, 2004. 652-659
[5] Nistér D, Naroditsky O, Bergen J. Visual odometry for ground vehicle applications. Journal of Field Robotics, 2006, 23(1): 3-20. doi: 10.1002/(ISSN)1556-4967
[6] Triggs B, McLauchlan P F, Hartley R I, Fitzgibbon A W. Bundle adjustment - a modern synthesis. In: Vision Algorithms: Theory and Practice. Berlin Heidelberg, Germany: Springer-Verlag, 2000. 298-372
[7] Mouragnon E, Lhuillier M, Dhome M, Dekeyser F, Sayd P. Real time localization and 3D reconstruction. In: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York, NY, USA: IEEE, 2006. 363-370
[8] Fischler M A, Bolles R C. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 1981, 24(6): 381-395. doi: 10.1145/358669.358692
[9] Ozden K E, Schindler K, Van Gool L. Multibody structure-from-motion in practice. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(6): 1134-1141. doi: 10.1109/TPAMI.2010.23
[10] Kundu A, Krishna K M, Jawahar C V. Realtime multibody visual SLAM with a smoothly moving monocular camera. In: Proceedings of the 2011 IEEE International Conference on Computer Vision. Barcelona, Spain: IEEE, 2011. 2080-2087
[11] Engel J, Schöps T, Cremers D. LSD-SLAM: large-scale direct monocular SLAM. In: Proceedings of the 13th European Conference on Computer Vision. Berlin Heidelberg, Germany: Springer, 2014. 834-849
[12] Mur-Artal R, Montiel J M M, Tardós J D. ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Transactions on Robotics, 2015, 31(5): 1147-1163. doi: 10.1109/TRO.2015.2463671
[13] Geiger A. Libviso2: library for visual odometry 2 [Online], available: http://www.cvlibs.net/software/libviso, January 3, 2017
[14] Ramalingam S, Bouaziz S, Sturm P, Brand M. Skyline2gps: localization in urban canyons using omni-skylines. In: Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems. Taipei, China: IEEE, 2010. 3816-3823
[15] Ramalingam S, Bouaziz S, Sturm P. Pose estimation using both points and lines for geo-localization. In: Proceedings of the 2011 IEEE International Conference on Robotics and Automation. Shanghai, China: IEEE, 2011. 4716-4723
[16] Achar S, Jawahar C V, Krishna K M. Large scale visual localization in urban environments. In: Proceedings of the 2011 IEEE International Conference on Robotics and Automation. Shanghai, China: IEEE, 2011. 5642-5648
[17] Ziegler J, Lategahn H, Schreiber M, Keller C G, Knöppel C, Hipp J, Haueis M, Stiller C. Video based localization for Bertha. In: Proceedings of the 2014 IEEE Intelligent Vehicles Symposium. Dearborn, MI, USA: IEEE, 2014. 1231-1238
[18] Geiger A. Monocular road mosaicing for urban environments. In: Proceedings of the 2009 IEEE Intelligent Vehicles Symposium. Xi'an, China: IEEE, 2009. 140-145
[19] Zhang J, Song D Z. On the error analysis of vertical line pair-based monocular visual odometry in urban area. In: Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems. St. Louis, MO, USA: IEEE, 2009. 3486-3491
[20] Barinova O, Lempitsky V, Tretiak E, Kohli P. Geometric image parsing in man-made environments. In: Proceedings of the 11th European Conference on Computer Vision. Berlin Heidelberg, Germany: Springer, 2010. 57-70
[21] Tretyak E, Barinova O, Kohli P, Lempitsky V. Geometric image parsing in man-made environments. International Journal of Computer Vision, 2012, 97(3): 305-321. doi: 10.1007/s11263-011-0488-1
[22] von Gioi R G, Jakubowicz J, Morel J M, Randall G. LSD: a line segment detector [Online], available: http://www.ipol.im/pub/art/2012/gjmr-lsd/, January 3, 2017
[23] Geiger A, Ziegler J, Stiller C. StereoScan: dense 3D reconstruction in real-time. In: Proceedings of the 2011 IEEE Intelligent Vehicles Symposium. Baden-Baden, Germany: IEEE, 2011. 963-968
[24] Geiger A, Lauer M, Wojek C, Stiller C, Urtasun R. 3D traffic scene understanding from movable platforms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(5): 1012-1025. doi: 10.1109/TPAMI.2013.185
[25] Corral-Soto E R, Elder J H. Automatic single-view calibration and rectification from parallel planar curves. In: Proceedings of the 13th European Conference on Computer Vision. Berlin Heidelberg, Germany: Springer, 2014. 813-827
[26] Zhuang Yan, Chen Dong, Wang Wei, Han Jian-Da, Wang Yue-Chao. Status and development of natural scene understanding for vision-based outdoor mobile robot. Acta Automatica Sinica, 2010, 36(1): 1-11. http://www.aas.net.cn/CN/abstract/abstract13622.shtml