Submission Instructions
Template: Use the PKU-IAI Technical Report (TR) Template available on the LaTeX Template page. Do not use the Course Project and Essay Template, which differs from the TR template.
Project List
Affordance and Functionality
1. Replicate GPNN Paper Results
- Objective: Replicate findings from the GPNN paper.
- Evaluation: Match paper’s tables and figures; discuss discrepancies.
2. Compare Neural Parts and Cuboid Shape Abstraction on PartNet
- Objective: Implement and compare Neural Parts and Cuboid Shape Abstraction on PartNet’s chair category.
- Evaluation: Report IoU, Precision, and Recall for the quantitative comparison, and include qualitative (visual) comparisons of the predicted part abstractions.
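For the quantitative side, all three metrics can be computed directly on binary occupancy grids. A minimal sketch (the voxelization of the PartNet meshes and predicted primitives is assumed to have happened upstream):

```python
import numpy as np

def voxel_metrics(pred, gt):
    """IoU, precision, and recall for two binary occupancy grids of equal shape."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou = inter / union if union else 1.0
    precision = inter / pred.sum() if pred.sum() else 1.0
    recall = inter / gt.sum() if gt.sum() else 1.0
    return float(iou), float(precision), float(recall)

# Toy example: two 4x4x4 grids whose occupied slabs overlap in one layer.
pred = np.zeros((4, 4, 4)); pred[:2] = 1   # layers 0-1 occupied
gt = np.zeros((4, 4, 4)); gt[1:3] = 1      # layers 1-2 occupied
iou, prec, rec = voxel_metrics(pred, gt)   # 1/3, 0.5, 0.5
```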
3. Learn the Concept of a Daily Object (e.g., a Cup)
- Objective: Develop a machine learning model to understand and represent the concept of a daily object like a cup.
- Subtasks:
- Define the key components of the object’s concept.
- Formulate the problem and method.
- Collect necessary training data.
- Design evaluation metrics.
- Evaluation: Use qualitative and quantitative metrics to prove the model’s effectiveness in learning the object’s concept.
Intuitive Physics
1. Model Visually Grounded VoEs with SOTA Algorithms
- Objective: Use SOTA computer vision algorithms to model Visually Grounded Violations of Expectation (VoEs).
- Evaluation: Accurately measure the model’s “surprise” metric when a VoE is presented.
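A common way to operationalize "surprise" is Shannon surprisal: the negative log-probability the model assigns to the outcome actually observed. A minimal sketch (the probabilities below are hypothetical stand-ins for a vision model's predictive distribution):

```python
import math

def surprisal(p_obs):
    """Shannon surprisal (in nats): -log of the probability the model
    assigned to what was actually observed."""
    return -math.log(p_obs)

# Hypothetical predictive probabilities from a vision model:
expected_event = surprisal(0.9)    # object reappears from behind the occluder
voe_event = surprisal(0.01)        # object has vanished: a violation of expectation
```

The VoE event yields a much larger surprisal, which is the quantity one would correlate with the stimulus condition.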
2. Probabilistic Model for Water-Pouring Task
- Objective: Create a probabilistic model to solve the water-pouring task.
- Reference Models: work from the groups of Tom Griffiths and Josh Tenenbaum.
- Evaluation: Predict the glass angle for water pouring; compare with human performance and reference models.
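As a starting point for the forward model, one can idealize the glass as a rigid cylinder and compute the tilt at which the water surface first reaches the rim; a full probabilistic model would place distributions over such parameters and compare posterior predictions with human judgments. This geometry is an assumption of the sketch, not the reference models' formulation:

```python
import math

def critical_tilt_deg(fill_height, glass_height, radius):
    """Tilt from vertical (degrees) at which water first reaches the rim of an
    idealized cylindrical glass. Valid while the tilted surface still covers
    the base, i.e. fill_height >= radius * tan(theta)."""
    return math.degrees(math.atan2(glass_height - fill_height, radius))

# A 10 cm glass of radius 3 cm filled to 7 cm starts pouring at 45 degrees.
angle = critical_tilt_deg(7.0, 10.0, 3.0)
```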
Causality
Model-based RL for Causal Transfer in OpenLock
- Background:
- Question the “Reward is all you need” paradigm in RL.
- Focus on causal transfer in the OpenLock task, a virtual escaping game.
- Model-free RL has limitations in understanding abstract causal structures.
- Objective:
- Design a model-based RL method for OpenLock.
- Address the task’s focus on understanding abstract causal structures and utilizing implicit meta-rules.
- Task Details:
- Clearly state your model construction.
- Compare with model-free RL.
- Optional: Handle probabilistic OpenLock scenarios.
- Evaluation:
- Compare your model-based RL with model-free methods.
- Optional: Include results for probabilistic OpenLock scenarios.
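To make the model-based/model-free contrast concrete, here is a minimal Dyna-Q-style sketch on a toy chain MDP: real transitions are stored in a learned model and replayed as simulated planning updates. This only illustrates the mechanism; OpenLock itself would need a far richer model over causal structure and meta-rules:

```python
import random
random.seed(0)

# Toy chain MDP: states 0..4, reward 1.0 for reaching the right end (state 4).
N_STATES, ACTIONS = 5, (-1, +1)
GAMMA, ALPHA = 0.95, 0.5

def step(s, a):
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
model = {}                              # learned model: (s, a) -> (reward, next state)

for _ in range(100):                    # episodes with a random exploration policy
    s = 0
    while s != N_STATES - 1:
        a = random.choice(ACTIONS)
        s2, r = step(s, a)
        model[(s, a)] = (r, s2)         # record real experience in the model
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        for _ in range(10):             # planning: replay simulated transitions
            ps, pa = random.choice(list(model))
            pr, ps2 = model[(ps, pa)]
            Q[(ps, pa)] += ALPHA * (pr + GAMMA * max(Q[(ps2, b)] for b in ACTIONS) - Q[(ps, pa)])
        s = s2

# After training, the greedy policy moves right toward the rewarded end.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)}
```

The planning loop is what a model-free baseline lacks: with the same real experience but no replay from the learned model, value propagation is much slower.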
Tool, Mirroring, and Imitation
Virtual Tool Game with Compositional Concepts
- Background:
- Explore the Virtual Tool Game and understand the baselines in the referenced paper.
- Objective:
- Design a new scenario leveraging compositional concepts (e.g., Bridge + Catapult).
- Propose a model that can solve this new compositional problem while learning individual concepts.
- Task Details:
- Reproduce baselines from the referenced paper.
- Create a new scenario involving at least two compositional concepts.
- Develop a model to solve the new problem.
- Evaluation:
- Compare your model’s performance with baseline models.
- Validate its ability to understand and apply compositional concepts.
Communication
1. Cooperation through Nonverbal Cues
- Background:
- Studies show that chimpanzees use nonverbal cues in cooperation tasks.
- Objective:
- Build a simulated scenario to computationally reproduce these experimental results.
- Task Details:
- Include nonverbal communication cues like gaze and pointing.
- Develop a policy for nonverbal communication under a shared goal.
- Evaluation:
- Validate the model’s ability to use nonverbal cues effectively in a cooperative setting.
2. Emergent Languages in Multi-Agent Systems
- Background:
- Multi-agent systems can develop emergent languages.
- Objective:
- Design a task and environment for agents to develop an emergent language.
- Task Details:
- Use the EGG toolkit for training.
- Develop evaluation metrics and report results.
- Evaluation:
- Assess whether agents successfully solve the task through emergent communication.
3. Rational Speech Acts (RSA) Model
- Background:
- Familiarize yourself with key papers on the Rational Speech Acts (RSA) Model.
- Objective:
- Implement a literal and a pragmatic agent based on the RSA model.
- Task Details:
- Use the TUNA Corpus for experiments.
- Choose a category (people or furniture) from the singular portion for your experiments.
- Evaluation:
- Use mean accuracy and multiset Dice as formulated in Eqn. (6) to evaluate the agents.
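The RSA recursion itself is compact. Below is a minimal sketch on a hypothetical two-object reference game (not the TUNA data): the literal listener normalizes the lexicon, the pragmatic speaker soft-maximizes informativeness, and the pragmatic listener inverts the speaker:

```python
import numpy as np

# Toy reference game: r1 wears only glasses, r2 wears glasses and a hat.
# Lexicon rows are messages, columns are referents; entries are literal truth.
messages, referents = ["glasses", "hat"], ["r1", "r2"]
L = np.array([[1.0, 1.0],    # "glasses" is true of both objects
              [0.0, 1.0]])   # "hat" is true of r2 only

def normalize(M):
    return M / M.sum(axis=1, keepdims=True)

alpha = 1.0                                # speaker rationality
L0 = normalize(L)                          # literal listener   P(r | m) ∝ [[m]](r)
S1 = normalize(L0.T ** alpha)              # pragmatic speaker  P(m | r) ∝ L0(r | m)^alpha
L1 = normalize(S1.T)                       # pragmatic listener P(r | m) ∝ S1(m | r)

# The pragmatic listener resolves "glasses" to r1 (0.75 vs. 0.25): a speaker
# who meant r2 would have preferred the more informative "hat".
```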
Intentionality
Multi-agent Activity Parsing and Prediction on LEMMA
- Background:
- The focus is on activity parsing and prediction in multi-agent scenarios using LEMMA.
- Objective:
- Use grammar parsing or planning methods in a neural-symbolic way.
- Task Details:
- Address challenges like multi-agent activity representation and symbolic plan structures.
- Explore evaluation methods beyond future activity prediction.
- Evaluation:
- Assess the model’s ability to parse and predict multi-agent activities.
Animacy
Generate Animate and Inanimate Dot Motion
- Background:
- Explore the unified problem of animate and inanimate motion.
- Objective:
- Generate diverse animate and inanimate dot motion stimuli.
- Task Details:
- Formulate the problem for both synthesis and discriminative tasks.
- Perform a human study for model verification.
- Evaluation:
- Assess the model’s classification accuracy on provided stimuli.
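As a sketch of the synthesis side, one simple contrast (an assumption of this illustration, not a prescribed stimulus design) is Brownian drift for inanimate dots versus self-propelled, goal-directed motion for animate ones:

```python
import numpy as np

rng = np.random.default_rng(0)

def inanimate_motion(T=50, sigma=0.05):
    """Inanimate dot: a Brownian random walk with no persistent goal."""
    return np.cumsum(rng.normal(0.0, sigma, size=(T, 2)), axis=0)

def animate_motion(T=50, goal=(1.0, 1.0), speed=0.05, heading_noise=0.2):
    """Animate dot: self-propelled at roughly constant speed, steering toward a goal."""
    pos, traj = np.zeros(2), []
    for _ in range(T):
        d = np.asarray(goal) - pos                      # desired heading
        d = d / (np.linalg.norm(d) + 1e-9) + rng.normal(0.0, heading_noise, 2)
        pos = pos + speed * d / (np.linalg.norm(d) + 1e-9)
        traj.append(pos.copy())
    return np.array(traj)

inanimate = inanimate_motion()
animate = animate_motion()
```

A discriminative counterpart would then be trained to classify trajectories from the two generators, and the human study would test whether people's animacy judgments track the same cues.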
Theory of Mind (ToM)
1. Build a Mini Theory of Mind System
- Background:
- Theory of Mind involves understanding mental states like desires, beliefs, and intents.
- Objective:
- Build a system that models these mental states in real or simulated scenarios.
- Task Details:
- Do NOT use open-source ToM projects.
- Include modules for desire, belief, and intent.
- Implement inverse mental inference and forward planning processes.
- Evaluation:
- Validate the system’s capability in modeling interactions based on ToM.
2. Hanabi Challenge with Official Environment
- Background:
- The Hanabi challenge focuses on cooperative multi-agent systems.
- Objective:
- Build an AI agent to tackle the Hanabi challenge.
- Task Details:
- Use the official environment.
- Evaluation:
- Assess the agent’s performance in the Hanabi environment.
3. Action Understanding as Inverse Planning
- Background:
- Familiarize yourself with the food-truck paper and understand the Bayesian inverse planning framework.
- Objective:
- Implement the Bayesian inverse planning model to infer agents’ goals and beliefs.
- Validate the model through psychophysical experiments using animated stimuli.
- Task Details:
- Implement the Bayesian inverse planning model based on Markov decision processes (MDPs).
- Create animated stimuli of agents moving in simple mazes as described in the paper.
- Conduct experiments to measure online goal inferences, retrospective goal inferences, and prediction of future actions.
- Evaluation:
- Assess the model’s ability to accurately infer agents’ goals and beliefs.
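The core computation can be sketched in a few lines: for each candidate goal, solve the MDP, model the agent as Boltzmann-rational, and score the observed state-action sequence under each goal's policy. The corridor world below is a toy stand-in for the paper's mazes:

```python
import numpy as np

N = 7                         # corridor states 0..6; candidate goals at the two ends
ACTIONS = (-1, +1)
GAMMA, BETA = 0.95, 3.0       # discount, Boltzmann rationality of the observed agent

def q_values(goal):
    """Value iteration for the deterministic corridor MDP with the given goal."""
    V = np.zeros(N)
    Q = np.zeros((N, 2))
    for _ in range(200):
        for s in range(N):
            for i, a in enumerate(ACTIONS):
                s2 = min(max(s + a, 0), N - 1)
                r = 1.0 if s2 == goal else -0.05        # goal reward, step cost
                Q[s, i] = r + (0.0 if s2 == goal else GAMMA * V[s2])
        V = Q.max(axis=1)
    return Q

def policy(goal):
    e = np.exp(BETA * q_values(goal))
    return e / e.sum(axis=1, keepdims=True)             # Boltzmann P(a | s, goal)

def goal_posterior(traj, goals=(0, N - 1)):
    """P(goal | trajectory) ∝ uniform prior x prod_t P(a_t | s_t, goal)."""
    post = np.ones(len(goals))
    for i, g in enumerate(goals):
        pi = policy(g)
        for s, a in traj:
            post[i] *= pi[s, ACTIONS.index(a)]
    return post / post.sum()

# Two rightward steps from the middle implicate the right-end goal.
post = goal_posterior([(3, +1), (4, +1)])
```

Online inference replays growing prefixes of the trajectory; retrospective inference scores the whole path; action prediction marginalizes the next action over the goal posterior.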
Abstract Reasoning
1. Probabilistic PDDL Solver for Block Stacking
- Background:
- Explore probabilistic PDDL solvers in the context of block stacking.
- Objective:
- Implement the solver and reproduce block stacking experiments.
- Task Details:
- Follow the experiments outlined in the paper by De-An Huang et al.
- Evaluation:
- Validate the solver’s performance in block stacking tasks.
2. Minimal Differentiable Engine for Convex Optimization
- Background:
- The focus is on supporting implicit convex optimization and differentiation.
- Objective:
- Design a minimal differentiable engine.
- Task Details:
- Implement the engine and validate its capabilities.
- Evaluation:
- Assess the engine’s performance in optimization tasks.
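The key idea such an engine needs is implicit differentiation: differentiate the optimality conditions of the inner problem instead of unrolling solver iterations. For an unconstrained convex QP this is fully explicit, which makes a convenient correctness check (the example problem below is hypothetical):

```python
import numpy as np

Qm = np.array([[3.0, 1.0],    # positive-definite quadratic term
               [1.0, 2.0]])

def solve(q):
    """Inner problem: x*(q) = argmin_x 0.5 x^T Qm x + q^T x, i.e. x* = -Qm^{-1} q."""
    return -np.linalg.solve(Qm, q)

def jacobian():
    """Implicit function theorem on the stationarity condition Qm x* + q = 0
    gives dx*/dq = -Qm^{-1}, with no need to differentiate through the solver."""
    return -np.linalg.inv(Qm)

q = np.array([1.0, -2.0])
J = jacobian()

# Finite-difference check: column i approximates dx*/dq_i.
eps = 1e-6
J_fd = np.stack([(solve(q + eps * e) - solve(q - eps * e)) / (2 * eps)
                 for e in np.eye(2)], axis=1)
```

A general engine would apply the same recipe to the KKT system of a constrained problem.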
3. Implement BPL + Language Model and Reproduce Experiments
- Background:
- Ellis (2023) introduces a Bayesian reasoning process in which a language model first proposes candidate hypotheses expressed in natural language, which are then re-weighted by a prior and a likelihood.
- Objective:
- Implement the model in Python/PyTorch and reproduce at least one experiment.
- Task Details:
- Follow the proposed framework.
- Choose an experiment (number game or logical concepts learning) to reproduce.
- Evaluation:
- Validate the implementation by comparing your results with the original experiment.
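For the number game, the Bayesian core is small: hypotheses are extensions over 1-100, and the likelihood follows the size principle (smaller consistent hypotheses win). A minimal sketch with a hand-picked hypothesis space; in the full framework a language model proposes the hypotheses:

```python
import numpy as np

# Hand-picked hypothesis space over the integers 1..100 (illustrative subset).
H = {
    "even":        set(range(2, 101, 2)),
    "odd":         set(range(1, 101, 2)),
    "powers of 2": {2 ** k for k in range(1, 7)},
    "mult of 10":  set(range(10, 101, 10)),
}
prior = {h: 1.0 / len(H) for h in H}       # uniform prior for simplicity

def posterior(data):
    """P(h | data) ∝ P(h) * prod_x [1/|h| if x in h else 0]  (size principle)."""
    score = {h: prior[h] * np.prod([1.0 / len(ext) if x in ext else 0.0 for x in data])
             for h, ext in H.items()}
    Z = sum(score.values())
    return {h: s / Z for h, s in score.items()}

post = posterior([16, 8, 2, 64])   # the small "powers of 2" hypothesis dominates
```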
Utility
Learning Human Utility for Object Arrangement
- Background:
- The agent aims to infer human utility for arranging objects according to different user preferences.
- Objective:
- Develop a utility function to represent common norms and individual preferences.
- Task Details:
- Use a simulation environment.
- Input:
- Training: a set of arranged examples with user IDs.
- Testing: examples provided by a specific user.
- Output:
- A utility function representing common norms and individual preferences.
- A policy for object arrangement based on the learned utility function.
- Evaluation:
- Validate the learned utility function and policy against user preferences.
- Reference:
- Organizing objects by predicting user preferences through collaborative filtering, IJRR 2016
- Configurable 3D Scene Synthesis and 2D Image Rendering with Per-Pixel Ground Truth using Stochastic Grammars, IJCV 2018
- Example-based Synthesis of 3D Object Arrangements, TOG 2012
- My House, My Rules: Learning Tidying Preferences with Graph Neural Networks, CoRL 2021
- B-Pref: Benchmarking Preference-Based Reinforcement Learning, NeurIPS 2021
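One baseline in the spirit of the collaborative-filtering reference is low-rank matrix factorization over user-by-placement scores: shared factors capture common norms while per-user factors capture individual preferences. The data below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings: rows are users, columns are candidate (object, location)
# placements; 0 marks an unobserved entry.
R = np.array([[5.0, 4.0, 0.0, 1.0],
              [4.0, 0.0, 3.0, 1.0],
              [1.0, 1.0, 0.0, 5.0],
              [0.0, 1.0, 4.0, 4.0]])
mask = R > 0

k, lam, lr = 2, 0.1, 0.02
U = 0.1 * rng.standard_normal((R.shape[0], k))   # per-user preference factors
V = 0.1 * rng.standard_normal((R.shape[1], k))   # per-placement factors

def loss():
    err = mask * (R - U @ V.T)
    return float((err ** 2).sum() + lam * ((U ** 2).sum() + (V ** 2).sum()))

loss_before = loss()
for _ in range(500):                 # gradient steps on the observed entries only
    err = mask * (R - U @ V.T)
    U += lr * (err @ V - lam * U)
    V += lr * (err.T @ U - lam * V)
loss_after = loss()

# U @ V.T now fills in the unobserved entries; the argmax per user induces a
# simple placement policy consistent with that user's inferred preferences.
```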
XAI and Teaming
Watch-And-Help Benchmark in Virtual Home Environment
- Background:
- The Watch-And-Help benchmark focuses on human-robot teaming with goals and intents.
- Objective:
- Develop algorithms to solve the human-robot teaming problem considering goals and intents.
- Task Details:
- Reproduce baselines for the Watch-And-Help benchmark.
- Propose new algorithms considering goals and intents.
- Evaluation:
- Compare your algorithms with existing baselines.