In recent years, deep learning has made unprecedented progress on artificial intelligence tasks ranging from speech recognition to image retrieval. However, although language (one-dimensional) and images (two-dimensional) have received a great deal of attention, existing methods poorly serve the three-dimensional data that drives a broad range of critical applications, from 3D object recognition, medical imaging, neuroscience, and autonomous driving to scientific simulations. In this tutorial, we will teach the basic concepts of three-dimensional deep learning and show how to use Marvin, our 3D deep learning software framework. We will also give two examples of applying 3D deep learning algorithms to computer vision tasks, one using a discriminative model and the other using a generative model. Finally, we will introduce a few datasets for training these algorithms.
Time | Topic | Speaker | Slides |
---|---|---|---|
8:30 - 9:45 | Introduction to Marvin and Deep Learning | Jianxiong Xiao | link |
9:45 - 10:00 | Training 3D Network | Fisher Yu | link code |
10:00 - 10:40 | Coffee Break | | |
10:40 - 11:15 | Primitive-Level 3D Deep Learning | Andy Zeng | link |
11:15 - 11:50 | Object-Level 3D Deep Learning | Shuran Song | link |
11:50 - 12:25 | Scene-Level 3D Deep Learning | Yinda Zhang | link |
12:25 - 12:30 | Closing Remarks | | |
[1] J. Xiao, S. Song, D. Suo, and F. Yu, Marvin: A minimalist GPU-only N-dimensional ConvNet framework, http://marvin.is
[2] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang and J. Xiao, 3D ShapeNets: A Deep Representation for Volumetric Shapes, CVPR 2015
[3] S. Song and J. Xiao, Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images, CVPR 2016
[4] D. Maturana and S. Scherer, VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition, IROS 2015
[5] A. Zeng, S. Song, M. Nießner, M. Fisher, and J. Xiao, 3DMatch: Learning the Matching of Local 3D Geometry in Range Scans, arXiv:1603.08182
[6] Y. Zhang, M. Bai, P. Kohli, S. Izadi, and J. Xiao, DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding, arXiv:1603.04922