Affordance, Functionality, and HOIs

Contributors:

Chao Xu (Affordance)
Zeyu Zhang (Functionality)
Tengyu Liu (HOI/HSI)
Yuyang Li (HOI/HSI)

Reading list

Entry types: survey/review/perspective, paper, book, GitHub

Required - Affordance

Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense (Section 5), Engineering 2020
From 3D Scene Geometry to Human Workspace, CVPR 2011
Inferring Forces and Learning Human Utilities From Videos, CVPR 2016
Act the Part: Learning Interaction Strategies for Articulated Object Part Discovery, ICCV 2021

Required - HOI/HSI

Essay Option 1: A Deep Dive into the State of the Field

Photograph: Pete Souza/The White House

“The picture above is funny. But for me it is also one of those examples that make me sad about the outlook for AI and for Computer Vision. What would it take for a computer to understand this image as you or I do? I challenge you to think explicitly of all the pieces of knowledge that have to fall in place for it to make sense. … I hate to say it but the state of CV and AI is pathetic when we consider the task ahead, and when we think about how we can ever go from here to there. The road ahead is long, uncertain and unclear. … In any case, we are very, very far and this depresses me. What is the way forward?”

-- Andrej Karpathy, Director of AI and Autopilot Vision at Tesla

The image above was taken in 2010, and the comment was made in 2012. AI technology has advanced significantly since then, and I’m wondering whether Karpathy’s assessment still holds true today.

Background

In a blog post by Andrej Karpathy, the complexities of computer vision are explored through the lens of a humorous image featuring President Obama and a man standing on a scale. Karpathy outlines the numerous layers of understanding that a human applies almost instantaneously when viewing the image, from recognizing the 3D structure of the scene to understanding the implications of Obama’s foot on the scale. This serves as a stark contrast to the current state of computer vision, which struggles with such multi-layered interpretations.

Assignment

Write an essay that delves into the complexities of computer vision as outlined by Karpathy. Discuss the various tasks that an algorithm must solve to “get the joke” in the image, and how far current technology is from achieving this level of understanding.

Guidelines

Introduction: Introduce the topic of computer vision and its significance in the field of AI. Reference Karpathy’s blog post as a starting point for the discussion.
List of Tasks for Understanding the Image: Enumerate and elaborate on the tasks that, according to Karpathy, an algorithm must solve to interpret the image as a human does. These include but are not limited to:
- Recognizing 3D structure
- Understanding visual elements like mirrors
- Identifying people and their roles
- Understanding physics and how objects interact
- Reasoning about the state of mind of people in the image
Current State of Computer Vision: Discuss the current state-of-the-art techniques in computer vision. How do they compare to the list of tasks needed for full understanding?
Challenges in Data and Training: Address the issue of data collection and training algorithms. How can we gather data that supports complex inferences? Is “more data” the solution?
The Role of Embodiment: Explore Karpathy’s notion that embodiment—experiencing the world as humans do—might be necessary for algorithms to understand complex scenes.
Future Directions: What are the potential paths forward in this field? Is the road ahead “long, uncertain, and unclear,” as Karpathy suggests, or are there promising avenues for research?
Conclusion: Summarize the complexities involved in achieving a computer vision system that can understand the world as humans do and offer your own insights into the way forward.
References: Cite any sources, articles, or studies you use to support your arguments.

Evaluation Criteria

Clarity and organization of thoughts
Depth of analysis
Use of case studies and examples
Quality of writing, including grammar and syntax
Proper citation of sources

Additional Resources

State of Computer Vision by Andrej Karpathy

Good luck, and may your essay contribute to the ongoing dialogue in this fascinating field!

Essay Option 2: AI for Autonomous Driving

Background

Recent incidents involving Tesla’s Autopilot and Full Self-Driving (FSD) technologies have raised questions about the challenges of building an AI system capable of driving a car autonomously. In one case, a Tesla Model 3’s Autopilot system mistook a truck hauling deactivated traffic lights for an endless trail of actual traffic lights on the road. In another instance, Tesla’s FSD technology mistook the moon for a yellow traffic light, causing the car to apply the brakes unnecessarily. These incidents highlight how difficult it is to train AI systems to understand the complexities of the physical world they operate in.

Assignment

Write an essay that explores the challenges and considerations in building an AI system for autonomous driving. Specifically, focus on the aspects of the physical world that an AI should understand to operate safely and efficiently. Use the recent Tesla incidents as case studies to illustrate your points.

Guidelines

Introduction: Introduce the topic and the importance of building reliable AI systems for autonomous driving. Mention the recent Tesla incidents as examples of the challenges involved.
Understanding the Physical World: Discuss the various aspects of the physical world that an AI system should understand, such as:
- Traffic signals and signs
- Road conditions and infrastructure
- Weather conditions
- Other vehicles and pedestrians
- Unusual scenarios (e.g., a truck hauling traffic lights, the moon appearing as a traffic light, etc.)
Limitations of Current Technologies: Examine the limitations of current AI technologies in understanding the physical world. Use the Tesla incidents to demonstrate these limitations.
The Role of Data and Training: Discuss the importance of data and training in building a robust AI system. Address the argument that simply collecting ‘more data’ may not be sufficient for achieving full driving autonomy.
Ethical and Safety Considerations: Explore the ethical implications and safety concerns that arise when AI systems fail to understand the physical world correctly.
Conclusion: Sum up the challenges and considerations in building an AI system for autonomous driving and suggest possible solutions or future directions for research and development.
References: Cite any sources, studies, or news articles you’ve used to support your arguments.

Evaluation Criteria

Clarity and organization of thoughts
Depth of analysis
Use of case studies and examples
Quality of writing, including grammar and syntax
Proper citation of sources

Additional Resources

Good luck, and happy writing!
