CVPR 2016 Tutorial: 3D Deep Learning with Marvin

Introduction

In recent years, deep learning has made unprecedented progress in artificial intelligence tasks ranging from speech recognition to image retrieval. However, although language (one-dimensional) and images (two-dimensional) have received a great deal of attention, existing methods poorly serve the three-dimensional data that drives a broad range of critical applications, such as 3D object recognition, medical imaging, neuroscience, autonomous driving, and scientific simulation. In this tutorial, we will teach the basic concepts of three-dimensional deep learning and show how to use Marvin, our 3D deep learning software framework. We will also give examples of applying 3D deep learning algorithms to computer vision tasks, one using a discriminative model and the other using a generative model. Finally, we will introduce a few datasets for training these algorithms.

Schedule

Time	Topic	Speaker	Slides
8:30 - 9:45	Introduction to Marvin and Deep Learning	Jianxiong Xiao	link
9:45 - 10:00	Training 3D Network	Fisher Yu	link code
10:00 - 10:40	Coffee Break
10:40 - 11:15	Primitive-Level 3D Deep Learning	Andy Zeng	link
11:15 - 11:50	Object-Level 3D Deep Learning	Shuran Song	link
11:50 - 12:25	Scene-Level 3D Deep Learning	Yinda Zhang	link
12:25 - 12:30	Closing Remarks

Organizers

Fisher Yu - Princeton University
Shuran Song - Princeton University
Yinda Zhang - Princeton University
Andy Zeng - Princeton University
Jianxiong Xiao - AutoX, Inc.

References

[1] J. Xiao, S. Song, D. Suo, and F. Yu, Marvin: A minimalist GPU-only N-dimensional ConvNet framework, http://marvin.is
[2] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang and J. Xiao, 3D ShapeNets: A Deep Representation for Volumetric Shapes, CVPR 2015
[3] S. Song and J. Xiao, Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images, CVPR 2016
[4] D. Maturana and S. Scherer, VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition, IROS 2015
[5] A. Zeng, S. Song, M. Nießner, M. Fisher, and J. Xiao, 3DMatch: Learning the Matching of Local 3D Geometry in Range Scans, arXiv:1603.08182
[6] Y. Zhang, M. Bai, P. Kohli, S. Izadi, and J. Xiao, DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding, arXiv:1603.04922