Syllabus & Logistics

Syllabus

Introduction

  • Organization
  • Introduction
  • History

Image Formation

  • Primitives and transformations
  • Geometric image formation
  • Photometric image formation
  • Image sensing pipeline

Image Processing

  • Linear filters
  • Fourier transform
  • Edge detection
  • Image pyramids

Features and Matching

  • Feature descriptor
  • SIFT descriptor
  • Matching local descriptors
  • RANSAC
  • Homography

Structure from Motion

  • Camera calibration
  • Epipolar geometry
  • Triangulation
  • Factorization
  • Bundle adjustment

Stereo Reconstruction

  • Recap of epipolar geometry
  • Image rectification
  • Disparity estimation
  • Block matching
  • Spatial regularization
  • End-to-end learning

Machine Learning Crash Course

  • Linear regression/classifier
  • Trees and forests
  • XGBoost and AdaBoost
  • SVM

Probabilistic Graphical Models

  • Structured prediction
  • Markov random field
  • Factor graph
  • Belief propagation
  • Examples
  • And-or graph

Deep Neural Networks

  • Linear classifier
  • Loss function & regularization
  • Activation function
  • Back-propagation
  • Optimization
  • Training neural networks

Convolutional Neural Networks

  • Brief history of CNNs
  • Convolution layer
  • Pooling layer
  • CNN architectures
  • CNN normalization
  • Case studies

Sequence Models

  • RNN
  • Sequence-to-sequence modeling
  • LSTM
  • GRU
  • Transformer

Visual Recognition

  • Single-stage detectors
  • Two-stage detectors
  • Semantic segmentation

Visual Representation Learning

  • What are good representations?
  • Supervised learning & fine-tuning
  • Unsupervised learning
  • Self-supervised learning
  • Discrete & sequence learning

3D Scene Understanding

  • Holistic 3D scene understanding
  • 3D scene model
  • 3D human body model
  • Interaction model

Vision and Language

  • How does language help computer vision?
  • What is language grounding?
  • Explicit grounding
  • Implicit grounding
  • Grounding language on visual concepts and programs
  • Emergence of language & communication (optional)

Vision Meets Cognition

  • Physical commonsense
  • Social commonsense
  • Advanced topics

Vision for Embodied Agents

  • Egocentric vision
  • 3D representations for embodied agents
  • Active vision
  • Reinforcement learning
  • Visual policy learning