Syllabus
Introduction
- Organization
- Introduction
- History
- Primitives and transformations
- Geometric image formation
- Photometric image formation
- Image sensing pipeline
Image Processing
- Linear filters
- Fourier transformation
- Edge detection
- Image pyramids
Features and Matching
- Feature descriptor
- SIFT descriptor
- Matching local descriptors
- RANSAC
- Homography
Structure from Motion
- Camera calibration
- Epipolar geometry
- Triangulation
- Factorization
- Bundle adjustment
Stereo Reconstruction
- Recap epipolar geometry
- Image rectification
- Disparity estimation
- Block matching
- Spatial Regularization
- End-to-end learning
Machine Learning Crash Course
- Linear regression/classifier
- Trees and forests
- XGBoost and AdaBoost
- SVM
Probabilistic Graphical Models
- Structured prediction
- Markov random field
- Factor graph
- Belief propagation
- Examples
- And-or graph
Deep Neural Networks
- Linear classifier
- Loss function & regularization
- Activation function
- Back-propagation
- Optimization
- Training neural networks
Convolutional Neural Networks
- Brief history of CNN
- Convolution layer
- Pooling layer
- CNN architectures
- CNN normalization
- Case studies
Sequence Models
- RNN
- Sequence-to-sequence modeling
- LSTM
- GRU
- Transformer
Visual Recognition
- Single-stage detectors
- Two-stage detectors
- Semantic segmentation
Visual Representation Learning
- What are good representations?
- Supervised learning & fine-tuning
- Unsupervised learning
- Self-supervised learning
- Discrete & sequence learning
3D Scene Understanding
- Holistic 3D scene understanding
- 3D scene model
- 3D human body model
- Interaction model
Vision and Language
- How does language help computer vision?
- What is language grounding?
- Explicit grounding
- Implicit grounding
- Grounding language on visual concepts and programs
- Emergence of language & communication (optional)
Vision meets Cognition
- Physical commonsense
- Social commonsense
- Advanced topics
Vision for Embodied Agents
- Egocentric vision
- 3D representations for embodied agents
- Active vision
- Reinforcement learning
- Visual policy learning
Last updated on Aug 1, 2022