CVPR 2016 Tutorial: 3D Deep Learning with Marvin


In recent years, deep learning has made unprecedented progress on artificial intelligence tasks ranging from speech recognition to image retrieval. However, although language (one-dimensional) and images (two-dimensional) have received a great deal of attention, existing methods poorly serve the three-dimensional data that drives a broad range of critical applications, such as 3D object recognition, medical imaging, neuroscience, autonomous driving, and scientific simulation. In this tutorial, we teach the basic concepts of three-dimensional deep learning and show how to use Marvin, our 3D deep learning software framework. We also give two examples of applying 3D deep learning algorithms to computer vision tasks, one using a discriminative model and the other using a generative model. Finally, we introduce a few datasets for training these algorithms.


8:30 - 9:45     Introduction to Marvin and Deep Learning (Jianxiong Xiao)
9:45 - 10:00    Training 3D Network (Fisher Yu)
10:00 - 10:40   Coffee Break
10:40 - 11:15   Primitive-Level 3D Deep Learning (Andy Zeng)
11:15 - 11:50   Object-Level 3D Deep Learning (Shuran Song)
11:50 - 12:25   Scene-Level 3D Deep Learning (Yinda Zhang)
12:25 - 12:30   Closing Remarks



[1] J. Xiao, S. Song, D. Suo, and F. Yu, Marvin: A minimalist GPU-only N-dimensional ConvNet framework
[2] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao, 3D ShapeNets: A Deep Representation for Volumetric Shapes, CVPR 2015
[3] S. Song and J. Xiao, Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images, CVPR 2016
[4] D. Maturana and S. Scherer, VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition, IROS 2015
[5] A. Zeng, S. Song, M. Nießner, M. Fisher, and J. Xiao, 3DMatch: Learning the Matching of Local 3D Geometry in Range Scans, arXiv:1603.08182
[6] Y. Zhang, M. Bai, P. Kohli, S. Izadi, and J. Xiao, DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding, arXiv:1603.04922