Application of Deep Learning to 3D Object Reconstruction From a Single Image

CHEN Jia^{1,2}, ZHANG Yu-Qi^1, SONG Peng^3, WEI Yan-Tao^1, WANG Yu^4

doi: 10.16383/j.aas.2018.c180236

1. School of Educational Information Technology, Central China Normal University, Wuhan 430079, China
2. Centre for Vision, Speech and Signal Processing, University of Surrey, Surrey GU2 7XH, UK
3. Computer Graphics and Geometry Laboratory, École Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland
4. Robotics Institute, the Hong Kong University of Science and Technology, Hong Kong 999077, China

Funds:

National Natural Science Foundation of China 61605054

Hubei Provincial Natural Science Foundation 2014CFB659

Fundamental Research Funds for the Central Universities, Central China Normal University CCNU16JYKX039

Fundamental Research Funds for the Central Universities, Central China Normal University CCNU19QD007

Fundamental Research Funds for the Central Universities, Central China Normal University CCNU15A05023

Fundamental Research Funds for the Central Universities, Central China Normal University CCNU19TD007

National Natural Science Foundation of China 61502195

Author Bio:
CHEN Jia  Lecturer at the School of Educational Information Technology, Central China Normal University. Research interests include visual computing, motion capture, 3D reconstruction, and educational information technology. E-mail: jc@mail.ccnu.edu.cn

ZHANG Yu-Qi  Master's student at the School of Educational Information Technology, Central China Normal University. Research interests include deep learning and 3D reconstruction. E-mail: ZYQ2046@mail.ccnu.edu.cn

SONG Peng  Postdoctoral researcher at the Computer Graphics and Geometry Laboratory, École Polytechnique Fédérale de Lausanne, Switzerland. Research interests include computer graphics and 3D reconstruction. E-mail: peng.song@epfl.ch

WANG Yu  Director of and professor at the Robotics Institute, the Hong Kong University of Science and Technology. Research interests include geometric modeling and design, and robotics. E-mail: mywang@ust.hk

Corresponding author:
WEI Yan-Tao  Associate professor at the School of Educational Information Technology, Central China Normal University. Research interests include deep learning and computer vision. Corresponding author of this paper. E-mail: weiyantaoccnu@163.com

Publication history
- Received: 2018-04-20
- Accepted: 2018-08-30
- Published: 2019-04-20

Abstract: 3D object reconstruction from a single image is an important problem in computer vision that has attracted wide attention over the past decades. With the continued development of deep learning, single-image 3D object reconstruction has made remarkable progress in recent years. This paper reviews the research progress and applications of deep learning in 3D object reconstruction from a single image. First, we introduce the research background of single-image 3D reconstruction and the current state of traditional methods. Next, we give a brief overview of deep learning and review in detail its applications to 3D object reconstruction from a single image. We then outline the public data sets commonly used for 3D object reconstruction. Finally, we analyze and summarize the field, pointing out current problems and future research directions.

Keywords: 3D reconstruction; deep learning; computer vision; single image
Note:

1) Editorial board member responsible for this paper: WU Yi-Hong


Table 1 Comparison of reconstruction results of different methods on objects in PASCAL VOC images^[20]

Method	Airplane	Bicycle	Boat	Bus	Car	Chair	Motorbike	Sofa	Train	TV	Mean
Twarog et al.^[39]	9.73	10.39	11.68	15.40	11.77	8.58	8.99	8.62	23.68	9.45	11.83
Vicente et al.^[19]	5.07	6.03	8.80	8.76	4.38	5.74	4.86	6.49	17.52	8.37	7.60
Kar et al.^[20]	5.00	6.27	9.94	6.22	5.18	5.20	4.98	6.58	12.60	9.64	7.16

Table 2 Comparison of reconstruction results of existing traditional methods and 3D-R2N2^[8]

Method	Airplane	Bicycle	Boat	Bus	Car	Chair	Motorbike	Sofa	Train	TV	Mean
Kar et al.^[20]	0.298	0.114	0.188	0.501	0.472	0.234	0.361	0.149	0.249	0.492	0.318
Choy et al.^[8]	0.544	0.499	0.560	0.816	0.699	0.280	0.649	0.332	0.672	0.574	0.571

Table 3 Comparison of reconstruction accuracy of different methods, using mean IoU as the evaluation metric

	Choy et al.^[8]	Yan et al.^[79]	Kar et al.^[74]	Fan et al.^[9]	Kato et al.^[25]
Mean IoU	0.556	0.574	0.605	0.640	0.602
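Table 3 reports mean IoU (intersection over union) but this page does not spell out how it is computed. The sketch below shows one common formulation for voxel occupancy grids, offered as an illustration rather than as the evaluation protocol of any cited paper. The function names `voxel_iou` and `mean_iou`, the flattened-list grid layout, and the 0.5 binarization threshold are all assumptions made for this example.

```python
from typing import Iterable, Sequence, Tuple


def voxel_iou(pred: Sequence[float], target: Sequence[int],
              threshold: float = 0.5) -> float:
    """IoU between two flattened voxel occupancy grids.

    pred      -- predicted occupancy probabilities, one per voxel
    target    -- ground-truth occupancy (0 or 1), same length as pred
    threshold -- assumed cut-off for binarizing predictions (illustrative)
    """
    if len(pred) != len(target):
        raise ValueError("grids must contain the same number of voxels")
    intersection = 0
    union = 0
    for p, t in zip(pred, target):
        occupied_pred = p > threshold
        occupied_true = t == 1
        if occupied_pred and occupied_true:
            intersection += 1
        if occupied_pred or occupied_true:
            union += 1
    # Two empty grids are treated as a perfect match (a convention chosen here).
    return intersection / union if union else 1.0


def mean_iou(pairs: Iterable[Tuple[Sequence[float], Sequence[int]]]) -> float:
    """Unweighted average of voxel_iou over (prediction, ground truth) pairs."""
    scores = [voxel_iou(p, t) for p, t in pairs]
    return sum(scores) / len(scores)
```

For instance, `voxel_iou([0.9, 0.8, 0.2, 0.6], [1, 1, 1, 0])` gives 0.5: two voxels are occupied in both grids, and four are occupied in at least one of them.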

References

[1]	Rezende D J, Ali Eslami S M, Mohamed S, Battaglia P, Jaderberg M, Heess N. Unsupervised learning of 3D structure from images. In: Proceedings of the 30th Conference on Neural Information Processing Systems (NIPS 2016). New York, USA: Curran Associates, Inc., 2016. 4996-5004
[2]	Haming K, Peters G. The structure-from-motion reconstruction pipeline-a survey with focus on short image sequences. Kybernetika, 2010, 46(5):926-937 https://dml.cz/bitstream/handle/10338.dmlcz/141400/Kybernetika_46-2010-5_8.pdf
[3]	Lhuillier M, Quan L. A quasi-dense approach to surface reconstruction from uncalibrated images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(3):418-433 doi: 10.1109/TPAMI.2005.44
[4]	Habbecke M, Kobbelt L. A surface-growing approach to multi-view stereo reconstruction. In: Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition. Minneapolis, MN, USA: IEEE, 2007. 1-8
[5]	Oswald M R, Töppe E, Nieuwenhuis C, Cremers D. A review of geometry recovery from a single image focusing on curved object reconstruction. Innovations for Shape Analysis. Berlin, Germany: Springer-Verlag, 2013. 343-378
[6]	Yi L, Shao L, Savva M, Huang H B, Zhou Y, Wang Q R, et al. Large-scale 3D shape reconstruction and segmentation from ShapeNet Core55. arXiv preprint arXiv: 1710.06104, 2017.
[7]	Aspert N, Santa-Cruz D, Ebrahimi T. MESH: measuring errors between surfaces using the Hausdorff distance. In: Proceedings of the 2002 IEEE International Conference on Multimedia and Expo. Lausanne, Switzerland: IEEE, 2002. 705-708
[8]	Choy C B, Xu D F, Gwak J Y, Chen K, Savarese S. 3D-R2N2: a unified approach for single and multi-view 3D object reconstruction. In: Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer, 2016. 628-644
[9]	Fan H Q, Su H, Guibas L. A point set generation network for 3D object reconstruction from a single image. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, Hawaii, USA: IEEE, 2017. 2463-2471
[10]	Kemelmacher-Shlizerman I, Basri R. 3D face reconstruction from a single image using a single reference face shape. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(2):394-405 doi: 10.1109/TPAMI.2010.63
[11]	Wang H K, Stout D B, Chatziioannou A F. Mouse atlas registration with non-tomographic imaging modalities-a pilot study based on simulation. Molecular Imaging and Biology, 2012, 14(4):408-419 doi: 10.1007/s11307-011-0519-x
[12]	Dworzak J, Lamecker H, Von Berg J, Klinder T, Lorenz C, Kainmüller D, et al. 3D reconstruction of the human rib cage from 2D projection images using a statistical shape model. International Journal of Computer Assisted Radiology and Surgery, 2010, 5(2):111-124 doi: 10.1007/s11548-009-0390-2
[13]	Baka N, Kaptein B L, De Bruijne M, Van Walsum T, Giphart J E, Niessen W J, et al. 2D-3D shape reconstruction of the distal femur from stereo X-ray imaging using statistical shape models. Medical Image Analysis, 2011, 15(6):840-850 doi: 10.1016/j.media.2011.04.001
[14]	Blanz V, Vetter T. A morphable model for the synthesis of 3D faces. In: Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques. New York, USA: ACM Press, 1999. 187-194
[15]	Cashman T J, Fitzgibbon A W. What shape are dolphins? Building 3D morphable models from 2D images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(1):232-244 doi: 10.1109/TPAMI.2012.68
[16]	Bakshi S, Yang Y H. Shape from shading for non-Lambertian surfaces. In: Proceedings of the 1st International Conference on Image Processing. Austin, TX, USA: IEEE, 1994. 130-134
[17]	Ahmed A, Farag A. Shape from shading for hybrid surfaces. In: Proceedings of the 2007 IEEE International Conference on Image Processing. San Antonio, TX, USA: IEEE, 2007. Ⅱ-525-Ⅱ-528
[18]	Jin H L, Soatto S, Yezzi A J. Multi-view stereo reconstruction of dense shape and complex appearance. International Journal of Computer Vision, 2005, 63(3):175-189 doi: 10.1007/s11263-005-6876-7
[19]	Vicente S, Carreira J, Agapito L, Batista J. Reconstructing PASCAL VOC. In: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE, 2014. 41-48
[20]	Kar A, Tulsiani S, Carreira J, Malik J. Category-specific object reconstruction from a single image. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Boston, MA, USA: IEEE, 2015. 1966-1974
[21]	Prasad M, Zisserman A, Fitzgibbon A W. Fast and controllable 3D modelling from silhouettes. In: Proceedings of the 2005 Eurographics. Hamburg, Federal Republic of Germany: Elsevier Science Publishing Company, 2005. 9-12
[22]	Ikeuchi K, Horn B K P. Numerical shape from shading and occluding boundaries. Artificial Intelligence, 1981, 17(1-3):141-184 doi: 10.1016/0004-3702(81)90023-0
[23]	Prasad M, Fitzgibbon A. Single view reconstruction of curved surfaces. In: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06). New York, NY, USA: IEEE, 2006. 1345-1354
[24]	Daum M, Dudek G. On 3-D surface reconstruction using shape from shadows. In: Proceedings of the 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Santa Barbara, CA, USA: IEEE, 1998. 461-468
[25]	Kato H, Ushiku Y, Harada T. Neural 3D mesh renderer. In: Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 37-44
[26]	Rother D, Sapiro G. Seeing 3D objects in a single 2D image. In: Proceedings of the 12th IEEE International Conference on Computer Vision. Kyoto, Japan: IEEE, 2009. 1819-1826
[27]	Nevatia R, Binford T O. Description and recognition of curved objects. Artificial Intelligence, 1977, 8(1):77-98 https://dl.acm.org/citation.cfm?id=3015410.3015415
[28]	Gupta A, Efros A A, Hebert M. Blocks world revisited: image understanding using qualitative geometry and mechanics. In: Proceedings of the 11th European Conference on Computer Vision. Heraklion, Crete, Greece: Springer-Verlag, 2010. 482-496
[29]	Xiao J X, Russell B C, Torralba A. Localizing 3D cuboids in single-view images. In: Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada, USA: Curran Associates Inc., 2012. 746-754
[30]	Pentland A P. Perceptual organization and the representation of natural form. Artificial Intelligence, 1986, 28(3):293-331 doi: 10.1016/0004-3702(86)90052-4
[31]	Haag M, Nagel H H. Combination of edge element and optical flow estimates for 3D-model-based vehicle tracking in traffic image sequences. International Journal of Computer Vision, 1999, 35(3):295-319 doi: 10.1023/A:1008112528134
[32]	Koller D, Daniilidis K, Nagel H H. Model-based object tracking in monocular image sequences of road traffic scenes. International Journal of Computer Vision, 1993, 10(3):257-281 doi: 10.1007/BF01539538
[33]	Lim J J, Pirsiavash H, Torralba A. Parsing Ikea objects: fine pose estimation. In: Proceedings of the 2013 IEEE International Conference on Computer Vision. Sydney, NSW, Australia: IEEE, 2013. 2992-2999
[34]	Satkin S, Rashid M, Lin J, Hebert M. 3DNN: 3D nearest neighbor. International Journal of Computer Vision, 2015, 111(1):69-97 doi: 10.1007/s11263-014-0734-4
[35]	Pepik B, Stark M, Gehler P, Ritschel T, Schiele B. 3D object class detection in the wild. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Boston, MA, USA: IEEE, 2015. 1-10
[36]	Huang Q X, Wang H, Koltun V. Single-view reconstruction via joint analysis of image and shape collections. ACM Transactions on Graphics (TOG), 2015, 34(4): Article No. 87
[37]	Liu F, Zeng D, Li J, Zhao Q J. Cascaded regressor based 3D face reconstruction from a single arbitrary view image. [Online], available: https://arxiv.org/abs/1509.06161v1, March 25, 2019
[38]	Blanz V, Vetter T. Face recognition based on fitting a 3D morphable model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, 25(9):1063-1074 doi: 10.1109/TPAMI.2003.1227983
[39]	Twarog N R, Tappen M F, Adelson E H. Playing with puffball: simple scale-invariant inflation for use in vision and graphics. In: Proceedings of the 2012 ACM Symposium on Applied Perception. Los Angeles, California, USA: ACM, 2012. 47-54
[40]	Aloimonos J. Shape from texture. Biological Cybernetics, 1988, 58(5):345-360 doi: 10.1007/BF00363944
[41]	Marinos C, Blake A. Shape from texture: the homogeneity hypothesis. In: Proceedings of the 3rd International Conference on Computer Vision. Osaka, Japan: IEEE, 1990. 350-353
[42]	Loh A M, Hartley R I. Shape from non-homogeneous, non-stationary, anisotropic, perspective texture. In: Proceedings of the 2005 British Machine Vision Conference. Oxford, UK: BMVC, 2005. 69-78
[43]	Horn B K P. Obtaining Shape from Shading Information. Cambridge: MIT Press, 1989. 123-171
[44]	Robles-Kelly A, Hancock E R. An eigenvector method for shape-from-shading. In: Proceedings of the 12th International Conference on Image Analysis and Processing. Mantova, Italy: IEEE, 2003. 474-479
[45]	Cheung W P, Lee C K, Li K C. Direct shape from shading with improved rate of convergence. Pattern Recognition, 1997, 30(3):353-365 doi: 10.1016/S0031-3203(96)00097-0
[46]	Yang L, Han J Q. 3D shape reconstruction of medical images using a perspective shape-from-shading method. Measurement Science and Technology, 2008, 19(6): Article No. 065502
[47]	Tankus A, Kiryati N. Photometric stereo under perspective projection. In: Proceedings of the 10th IEEE International Conference on Computer Vision. Beijing, China: IEEE, 2005. 611-616
[48]	Saxena A, Chung S H, Ng A Y. Learning depth from single monocular images. In: Proceedings of the 18th International Conference on Neural Information Processing Systems. Vancouver, British Columbia, Canada: MIT Press, 2005. 1161-1168
[49]	Saxena A, Sun M, Ng A Y. Make3D: learning 3D scene structure from a single still image. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, 31(5):824-840 doi: 10.1109/TPAMI.2008.132
[50]	Delage E, Lee H, Ng A Y. A dynamic Bayesian network model for autonomous 3D reconstruction from a single indoor image. In: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06). New York, USA: IEEE, 2006. 2418-2428
[51]	Tulsiani S, Kar A, Carreira J, Malik J. Learning category-specific deformable 3D models for object reconstruction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4):719-731 doi: 10.1109/TPAMI.2016.2574713
[52]	Wang Wei, Gao Wei, Zhu Hai, Hu Zhan-Yi. Rapid and robust piecewise planar reconstruction of urban scenes. Acta Automatica Sinica, 2017, 43(4):674-684 http://www.aas.net.cn/CN/abstract/abstract19045.shtml
[53]	Miao Jun, Chu Jun, Zhang Gui-Mei, Wang Lu. Dense multi-planar scene reconstruction from sparse point cloud. Acta Automatica Sinica, 2015, 41(4):813-822 http://www.aas.net.cn/CN/abstract/abstract18655.shtml
[54]	Zhang Feng, Shi Li-Min, Sun Feng-Mei, Hu Zhan-Yi. An image based 3D reconstruction system for large indoor scenes. Acta Automatica Sinica, 2010, 36(5):625-633 http://www.aas.net.cn/CN/abstract/abstract13353.shtml
[55]	LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521(7553):436-444 doi: 10.1038/nature14539
[56]	Rumelhart D E, Hinton G E, Williams R J. Learning representations by back-propagating errors. Nature, 1986, 323(6088):533-536 doi: 10.1038/323533a0
[57]	Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks. Science, 2006, 313(5786):504-507 doi: 10.1126/science.1127647
[58]	Jiao Li-Cheng, Yang Shu-Yuan, Liu Fang, Wang Shi-Gang, Feng Zhi-Xi. Seventy years beyond neural networks: retrospect and prospect. Chinese Journal of Computers, 2016, 39(8):1697-1716 http://d.old.wanfangdata.com.cn/Periodical/jsjxb201608015
[59]	Feng X, Zhang Y D, Glass J. Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition. In: Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Florence, Italy: IEEE, 2014. 1759-1763
[60]	Graves A, Mohamed A R, Hinton G. Speech recognition with deep recurrent neural networks. In: Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Vancouver, BC, Canada: IEEE, 2013. 6645-6649
[61]	Collobert R, Weston J. A unified architecture for natural language processing: deep neural networks with multitask learning. In: Proceedings of the 25th International Conference on Machine Learning. Helsinki, Finland: ACM, 2008. 160-167
[62]	Huang E H, Socher R, Manning C D, Ng A Y. Improving word representations via global context and multiple word prototypes. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics. Jeju Island, Korea: Association for Computational Linguistics, 2012. 873-882
[63]	Mikolov T, Chen K, Corrado G S, Dean J. Efficient estimation of word representations in vector space. [Online], available: http://www.oalib.com/paper/4057741, March 25, 2019
[64]	Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada, USA: Curran Associates Inc., 2012. 1097-1105
[65]	Le Q V. Building high-level features using large scale unsupervised learning. In: Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Vancouver, BC, Canada: IEEE, 2013. 8595-8598
[66]	Socher R, Huval B, Bath B, Manning C D, Ng A Y. Convolutional-recursive deep learning for 3D object classification. In: Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada, USA: Curran Associates Inc., 2012. 656-664
[67]	Wu Z R, Song S R, Khosla A, Yu F, Zhang L G, Tang X O, et al. 3D ShapeNets: a deep representation for volumetric shapes. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Boston, MA, USA: IEEE, 2015. 1912-1920
[68]	Gupta S, Girshick R, Arbeláez P, Malik J. Learning rich features from RGB-D images for object detection and segmentation. In: Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer-Verlag, 2014. 345-360
[69]	Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets. Neural Computation, 2006, 18(7):1527-1554 doi: 10.1162/neco.2006.18.7.1527
[70]	Bengio Y, Lamblin P, Popovici D, Larochelle H. Greedy layer-wise training of deep networks. In: Proceedings of the 19th International Conference on Neural Information Processing Systems. Canada: MIT Press, 2006. 153-160
[71]	LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998, 86(11):2278-2324 doi: 10.1109/5.726791
[72]	Williams R J, Zipser D. A learning algorithm for continually running fully recurrent neural networks. Neural Computation, 1989, 1(2):270-280 doi: 10.1162/neco.1989.1.2.270
[73]	Girdhar R, Fouhey D F, Rodriguez M, Gupta A. Learning a predictable and generative vector representation for objects. In: Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer-Verlag, 2016. 484-499
[74]	Kar A, Häne C, Malik J. Learning a multi-view stereo machine. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017). New York, USA: Curran Associates, Inc., 2017. 364-375
[75]	Wu J J, Wang Y F, Xue T F, Sun X Y, Freeman W T, Tenenbaum J B. MarrNet: 3D shape reconstruction via 2.5D sketches. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017). New York, USA: Curran Associates, Inc., 2017. 8-15
[76]	Kanazawa A, Jacobs D W, Chandraker M. WarpNet: weakly supervised matching for single-view reconstruction. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016. 3253-3261
[77]	Tulsiani S, Zhou T H, Efros A A, Malik J. Multi-view supervision for single-view reconstruction via differentiable ray consistency. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, Hawaii, USA: IEEE, 2017. 209-217
[78]	Tulsiani S. Learning Single-view 3D Reconstruction of Objects and Scenes[Ph. D. dissertation], UC Berkeley, USA, 2018
[79]	Yan X C, Yang J M, Yumer E, Guo Y J, Lee H. Perspective transformer nets: learning single-view 3D object reconstruction without 3D supervision. In: Proceedings of the 30th Conference on Neural Information Processing Systems (NIPS 2016). New York, USA: Curran Associates, Inc., 2016. 1696-1704
[80]	Gwak J Y, Choy C B, Garg A, Chandraker M, Savarese S. Weakly supervised generative adversarial networks for 3D reconstruction. arXiv preprint arXiv: 1705.10904, 2017. 263-272
[81]	Rosca M, Lakshminarayanan B, Warde-Farley D, Mohamed S. Variational approaches for auto-encoding generative adversarial networks. arXiv preprint arXiv: 1706.04987, 2017.
[82]	Zhu R, Galoogahi H K, Wang C Y, Lucey S. Rethinking reprojection: closing the loop for pose-aware shape reconstruction from a single image. In: Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 57-65
[83]	Liu J, Yu F, Funkhouser T. Interactive 3D modeling with a generative adversarial network. In: Proceedings of the 2017 International Conference on 3D Vision (3DV). Qingdao, China: IEEE, 2018. 126-134
[84]	Wu J J, Zhang C K, Xue T F, Freeman W T, Tenenbaum J B. Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. In: Proceedings of the 30th Conference on Neural Information Processing Systems (NIPS 2016). New York, USA: Curran Associates, Inc., 2016. 82-90
[85]	Gadelha M, Maji S, Wang R. 3D shape induction from 2D views of multiple objects. In: Proceedings of the 2017 International Conference on 3D Vision (3DV). Qingdao, China: IEEE, 2017. 402-411
[86]	Wang P S, Liu Y, Guo Y X, Sun C Y, Tong X. O-CNN: octree-based convolutional neural networks for 3D shape analysis. ACM Transactions on Graphics (TOG), 2017, 36(4): Article No. 72
[87]	Sun Y B, Liu Z W, Wang Y, Sarma S E. Im2avatar: Colorful 3D reconstruction from a single image. [Online], available: https://arxiv.org/abs/1804.06375, March 25, 2019
[88]	Tatarchenko M, Dosovitskiy A, Brox T. Octree generating networks: efficient convolutional architectures for high-resolution 3D outputs. In: Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 2107-2115
[89]	Riegler G, Ulusoy A O, Geiger A. Octnet: learning deep 3D representations at high resolutions. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, Hawaii, USA: IEEE, 2017. 6620-6629
[90]	Häne C, Tulsiani S, Malik J. Hierarchical surface prediction for 3D object reconstruction. In: Proceedings of the 2017 International Conference on 3D Vision (3DV). Qingdao, China: IEEE, 2017. 76-84
[91]	Qi C R, Su H, Mo K, Guibas L J. PointNet: deep learning on point sets for 3D classification and segmentation. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, Hawaii, USA: IEEE, 2017. 77-85
[92]	Qi C R, Yi L, Su H, Guibas L J. Pointnet++: deep hierarchical feature learning on point sets in a metric space. In: Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017). New York, USA: Curran Associates, Inc., 2017. 5099-5108
[93]	Klokov R, Lempitsky V. Escape from cells: deep Kd-networks for the recognition of 3D point cloud models. In: Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 863-872
[94]	Newell A, Yang K Y, Deng J. Stacked hourglass networks for human pose estimation. In: Proceedings of the 14th European Conference on Computer Vision. Amsterdam, the Netherlands: Springer, 2016. 483-499
[95]	Lin C H, Kong C, Lucey S. Learning efficient point cloud generation for dense 3D object reconstruction. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence. San Francisco, California, USA: AAAI, 2017. 3-11
[96]	Pontes J K, Kong C, Sridharan S, Lucey S, Eriksson A, Fookes C. Image2mesh: A learning framework for single image 3D reconstruction. [Online], available: https://arxiv.org/abs/1711.10669v1, March 25, 2019
[97]	Wang N Y, Zhang Y D, Li Z W, Fu Y W, Liu W, Jiang Y G. Pixel2mesh: Generating 3D mesh models from single rgb images. [Online], available: https://arxiv.org/abs/1804.01654v1, March 25, 2019
[98]	Xiang Y, Mottaghi R, Savarese S. Beyond PASCAL: a benchmark for 3D object detection in the wild. In: Proceedings of the 2014 IEEE Winter Conference on Applications of Computer Vision. Steamboat Springs, CO, USA: IEEE, 2014. 75-82
[99]	Everingham M, Van Gool L, Williams C K I, Winn J, Zisserman A. The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 2010, 88(2):303-338 doi: 10.1007/s11263-009-0275-4
[100]	Deng J, Dong W, Socher R, Li L J, Li K, Li F F. ImageNet: a large-scale hierarchical image database. In: Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, FL, USA: IEEE, 2009. 248-255
[101]	Chang A X, Funkhouser T, Guibas L, Hanrahan P, Huang Q X, Li Z M, et al. Shapenet: An information-rich 3d model repository. [Online], available: https://arxiv.org/abs/1512.03012v1, March 25, 2019
[102]	Miller G A. WordNet: a lexical database for English. Communications of the ACM, 1995, 38(11):39-41 doi: 10.1145/219717.219748
[103]	Song H O, Xiang Y, Jegelka S, Savarese S. Deep metric learning via lifted structured feature embedding. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016. 4004-4012
[104]	Shilane P, Min P, Kazhdan M, Funkhouser T. The princeton shape benchmark. In: Proceedings of the 2004 Shape Modeling Applications. Genova, Italy: IEEE, 2004. 167-178
