Dr. Yixin Zhu received his Ph.D. ('18) from UCLA, advised by Prof. Song-Chun Zhu. His research builds interactive AI by integrating high-level common sense (functionality, affordance, physics, causality, intent) with raw sensory inputs (pixels and haptic signals) to enable richer representation and abstract reasoning on objects, scenes, shapes, numbers, and agents. He is a co-organizer of the Vision Meets Cognition (FPIC) workshops, the 3D Scene Understanding for Vision, Graphics, and Robotics workshops, and the Virtual Reality Meets Physical Reality workshops.

During his Ph.D. and postdoc studies, his work was supported by DARPA MSEE, DARPA SIMPLEX, DARPA XAI, ONR MURI on Scene Understanding, and ONR Cognitive Systems for Human-Machine Teaming.

His group is looking for highly motivated undergrads, Ph.D. students, and postdocs with exceptional programming skills and solid math backgrounds to work on 3D computer vision, abstract reasoning, physics-based simulation, and cognitive robotics.


  • computer vision
  • artificial intelligence
  • human-robot interaction

  • Ph.D. in statistics, 2018
  • M.S. in computer science, 2013
  • B.Eng. in software engineering, 2012
    Xi'an Jiaotong University


[RA-L/IROS22] Object Gathering with a Tethered Robot Duo
[NeurIPS21] Unsupervised Foreground Extraction via Deep Region Competition
[ICCV21] YouRefIt: Embodied Reference Understanding with Language and Gesture
[IROS21] Consolidating Kinematic Models to Promote Coordinated Mobile Manipulations
[IROS21] Efficient Task Planning for Mobile Manipulation: a Virtual Kinematic Chain Perspective
[CogSci21] Individual vs. Joint Perception: a Pragmatic Model of Pointing as Communicative Smithian Helping
[CVPR21] Learning Triadic Belief Dynamics in Nonverbal Communication from Videos
[CVPR21] ACRE: Abstract Causal Reasoning Beyond Covariation
[ICRA21] Reconstructing Interactive 3D Scene by Panoptic Mapping and CAD Model Alignments
[ICRA21] Congestion-aware Multi-agent Trajectory Prediction for Collision Avoidance
[ECCV20] LEMMA: A Multi-view Dataset for Learning Multi-agent Multi-task Activities
[IROS20] Human-Robot Interaction in a Shared Augmented Reality Workspace
[Engineering20] Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense
[SIGGRAPH20] A Massively Parallel and Scalable Multi-GPU Material Point Method
[SIGGRAPH20] IQ-MPM: An Interface Quadrature Material Point Method for Non-sticky Strongly Two-Way Coupled Nonlinear Solids and Fluids
[ICRA20] Joint Inference of States, Robot Knowledge, and Human (False-)Beliefs
[ICRA20] Congestion-aware Evacuation Routing using Augmented Reality Devices
[ScienceRobotics19] A tale of two explanations: Enhancing human trust by explaining robot behavior
[AAAI20] Theory-based Causal Transfer: Integrating Instance-level Induction and Abstract-level Structure Learning
[NeurIPS19] Learning Perceptual Inference by Contrasting
[NeurIPS19] PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points



  • Wenhe Zhang, Master, Computer Science, UCLA, 2020 Fall
  • Xiaojian Ma, Master, Computer Science, UCLA, 2019 Fall
  • Xiaolin Fang, Ph.D., CSAIL, MIT, 2019 Fall
  • Shu Wang, Ph.D., Statistics, UCLA, 2018 Fall
  • Wenwen Si, Master, Computer Vision, CMU, 2018 Fall
  • Hangxin Liu, Ph.D., Computer Science, UCLA, 2018 Spring
  • Jenny Lin, Ph.D., Computer Science, CMU, 2017 Fall
  • Mark Edmonds, Ph.D., Computer Science, UCLA, 2017 Fall
  • Tian Ye, Master, Robotics, CMU, 2017 Fall
  • Feng Gao, Master, Statistics, UCLA, 2017 Fall
  • Xu Xie, Master, Statistics, UCLA, 2017 Fall
  • Xingwen Guo, Master, Computer Science, Yale, 2017 Fall
  • Chi Zhang, Master, Computer Science, UCLA, 2017 Fall
  • Jingyu Shao, Master, Statistics, UCLA, 2016 Winter
  • Yutong Zhang, Master, Computer Science, UCLA, 2015 Fall