Synchronization of Video Sequences Through 3D Trajectory Reconstruction

WANG Xue, SHI Jian-Bo, PARK Hyun-Soo, WANG Qing

Citation: WANG Xue, SHI Jian-Bo, PARK Hyun-Soo, WANG Qing. Synchronization of Video Sequences Through 3D Trajectory Reconstruction. ACTA AUTOMATICA SINICA, 2017, 43(10): 1759-1772. doi: 10.16383/j.aas.2017.c160584

doi: 10.16383/j.aas.2017.c160584
Funds: National Natural Science Foundation of China 61531014

More Information
    Author Bio:

    WANG Xue  Ph.D. candidate at the School of Computer Science and Engineering, Northwestern Polytechnical University. Her research interests cover object tracking and human behavior analysis. E-mail: xwang@mail.nwpu.edu.cn

    SHI Jian-Bo  Professor at the School of Engineering and Applied Science, University of Pennsylvania, USA. His research interests cover human behavior analysis and image recognition and segmentation. E-mail: jshi@seas.upenn.edu

    PARK Hyun-Soo  Postdoctoral fellow at the School of Engineering and Applied Science, University of Pennsylvania, USA. His research interests cover the analysis of human interaction through visible social signals, such as gaze movements, facial expressions, and body gestures. E-mail: hypar@seas.upenn.edu

    Corresponding author:

    WANG Qing  Professor at the School of Computer Science and Engineering, Northwestern Polytechnical University. His research interests cover computer vision, image and video signal processing, light field imaging, and virtual reality. Corresponding author of this paper. E-mail: qwang@nwpu.edu.cn
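  • Abstract: This paper proposes a temporal synchronization algorithm for videos based on 3D trajectory reconstruction of moving objects. The sequences to be synchronized are captured simultaneously by different cameras in the same scene, with no restrictive constraints on scene or camera motion. Assuming the camera projection matrix of every frame is known, the 3D trajectories of the moving objects are first reconstructed using a discrete cosine transform (DCT) basis. A rank constraint on the trajectory basis coefficient matrix is then proposed to measure the spatio-temporal alignment between sub-segments of different sequences. Finally, a cost matrix is built and a graph-based method is used to achieve nonlinear temporal synchronization between the videos. The method does not rely on known point correspondences; the tracked points in different videos may even correspond to different 3D points, provided the following assumption holds: the spatial position of the 3D point corresponding to a tracked point in the observed sequence can be described as a linear combination of a subset of the 3D points corresponding to all tracked points in the reference sequence, and this linear relationship remains unchanged over time. Unlike most existing methods, which require feature points to be tracked throughout the entire image sequence, the proposed method can use image point trajectories of varying lengths. The robustness and performance of the proposed method are verified on simulated data and real datasets.
    1)  Recommended by Associate Editor HUANG Qing-Ming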
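The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the 3D trajectories are already reconstructed, uses the ratio of the smallest to the largest singular value of the stacked DCT coefficient matrix as a stand-in for the paper's rank-based measure, and replaces the graph-based search with a plain dynamic-programming path. All function names (`dct_basis`, `coeff_vector`, `rank_cost`, `cost_matrix`, `monotone_path`) and the parameters `win` and `num_coeffs` are hypothetical.

```python
import numpy as np

def dct_basis(num_frames, num_coeffs):
    """Orthonormal DCT-II basis, shape (num_frames, num_coeffs)."""
    t = np.arange(num_frames)[:, None]
    k = np.arange(num_coeffs)[None, :]
    basis = np.sqrt(2.0 / num_frames) * np.cos(np.pi * (2 * t + 1) * k / (2 * num_frames))
    basis[:, 0] /= np.sqrt(2.0)  # rescale the constant column to unit norm
    return basis

def coeff_vector(traj, num_coeffs):
    """Stack the DCT coefficients of a (frames, 3) trajectory into one vector."""
    # The basis columns are orthonormal, so projection equals least squares.
    return (dct_basis(len(traj), num_coeffs).T @ traj).ravel()

def rank_cost(ref_trajs, obs_traj, num_coeffs):
    """Smallest / largest singular value of the stacked coefficient matrix.

    Assumed stand-in for the paper's measure: if the observed point is a
    fixed linear combination of the reference points over the same time
    span, its coefficient vector lies in their span and the ratio is ~0.
    """
    cols = [coeff_vector(x, num_coeffs) for x in ref_trajs]
    cols.append(coeff_vector(obs_traj, num_coeffs))
    s = np.linalg.svd(np.stack(cols, axis=1), compute_uv=False)
    return s[-1] / s[0]

def cost_matrix(ref_trajs, obs_traj, win, num_coeffs):
    """Cost of pairing every reference window with every observed window."""
    n = ref_trajs.shape[1] - win + 1
    m = obs_traj.shape[0] - win + 1
    cost = np.empty((n, m))
    for i in range(n):
        for j in range(m):
            cost[i, j] = rank_cost(ref_trajs[:, i:i + win], obs_traj[j:j + win], num_coeffs)
    return cost

def monotone_path(cost):
    """Minimum-cost monotone path from (0, 0) to the last cell (DTW-style)."""
    n, m = cost.shape
    acc = np.full((n, m), np.inf)
    acc[0, 0] = cost[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            prev = []
            if i > 0:
                prev.append(acc[i - 1, j])
            if j > 0:
                prev.append(acc[i, j - 1])
            if i > 0 and j > 0:
                prev.append(acc[i - 1, j - 1])
            acc[i, j] = cost[i, j] + min(prev)
    # Backtrack from the last cell, preferring the cheapest predecessor.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        steps = []
        if i > 0 and j > 0:
            steps.append((acc[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            steps.append((acc[i - 1, j], i - 1, j))
        if j > 0:
            steps.append((acc[i, j - 1], i, j - 1))
        _, i, j = min(steps)
        path.append((i, j))
    return path[::-1]
```

Here `ref_trajs` is assumed to have shape (points, frames, 3) and `obs_traj` shape (frames, 3). Because truncated DCT projection is linear, a fixed linear relation between 3D points carries over to their coefficient vectors, which is why a near-zero smallest singular value signals temporal alignment in this sketch.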
  • Fig. 1  Video sequences captured by first-person cameras, to be synchronized

    Fig. 2  Singular values of the basis coefficient matrix $\overline{M}$ for a test sequence pair in the synchronized and non-synchronized cases

    Fig. 3  Cost matrix and optimal path (white solid curve)

    Fig. 4  Flow chart of the pairwise temporal alignment algorithm

    Fig. 5  Reconstruction (black) and ground truth (gray) of simulated data

    Fig. 6  Comparisons of robustness with regard to tracking error, missing data and number of image points

    Fig. 7  Comparisons of alignment accuracy of different methods at different tracking noise levels on simulated data, and representative cost matrices with the estimated optimal paths superimposed

    Fig. 8  3D reconstruction results (from left to right: block building, exercise mat, basketball #1, basketball #2 and toy train)

    Fig. 9  Synchronization results on the block building scene (from left to right: sample frames from the reference sequence, then the corresponding frames found in the second sequence (top) and the third sequence (bottom) by the proposed method, PDM, BPM, ECM, MFM and SMM, respectively)

    Fig. 10  Synchronization results on the exercise mat scene (same layout as Fig. 9)

    Fig. 11  Synchronization results on the basketball #1 scene (from left to right: sample frames from the reference sequence, then the corresponding frames found in the second sequence by the proposed method, PDM, BPM, ECM, MFM and SMM, respectively)

    Fig. 12  Synchronization results on the basketball #2 scene (same layout as Fig. 11)

    Fig. 13  Synchronization results on the toy train scene (same layout as Fig. 11)

    Fig. 14  Comparisons of alignment accuracy with different λ values for the effective rank, and cost matrices computed with different λ values

    Fig. 15  Comparisons of alignment accuracy with different frame rate ratios, and cost matrices computed when the frame rate of the observed sequence is 46 fps, 40 fps and 24 fps, respectively

    Table 1  Quantitative comparisons of normalized alignment error on real scenes (frames)

    Method                                                  Blocks #1  Blocks #2  Exercise mat #1  Exercise mat #2  Basketball #1  Basketball #2  Toy train
    BPM (manually labeled trajectories)                         39.61       9.16            12.05            15.63          16.81          12.42      56.80
    ECM (manually labeled trajectories)                         25.15      32.37            57.48            62.60          50.44          29.83      24.86
    MFM (automatically tracked trajectories)                    11.81      21.70            22.17             9.44          17.68          22.78      70.04
    SMM (SIFT)                                                 155.75     196.56           132.08           202.50           9.71          31.74     130.83
    PDM (manually labeled trajectories)                          0.85       2.53             2.96             4.60           4.29           1.49       1.28
    Proposed (manually labeled trajectories)                     0.45       1.27             2.52             2.76           3.07           1.12       1.33
    Proposed (automatically tracked trajectories)                0.52       1.74             1.35             1.48           2.84           1.54       3.18
    Proposed (manually labeled and automatically tracked)        0.56       1.40             2.07             1.99           3.75           0.92       2.01
Publication history
  • Received:  2016-08-10
  • Accepted:  2017-03-02
  • Published:  2017-10-20
