Lab 3: Block Tower Stability Predictor | Yixin Zhu

Initial Codebase for Gradescope Submission

Background

Predicting structural stability—such as whether a block tower will collapse—is not only an intellectual exercise but also a critical challenge in architecture, robotics, and disaster prevention. Humans, guided by visual intuition and experience, can swiftly judge stability and even anticipate collapse dynamics, a skill rooted in implicit physical knowledge and causal reasoning.

Current AI models struggle with this task: physics engine simulations are accurate but computationally costly and narrow in scope, while deep learning excels at visual recognition but lacks physical inference. Bridging perception, physics, and reasoning to achieve “intuitive physical perception” is thus essential for progress toward general intelligence. The ShapeStacks dataset advances this goal by providing synthetic block towers with stability annotations, enabling the training of models that approximate human-like intuitive reasoning.

Learning Objectives

By completing this assignment, you will:

Understand the concept of visual stability prediction in block towers
Implement an InceptionV4-based model for binary classification (stable vs. unstable)
Train and evaluate the model using the ShapeStacks dataset

Environment Setup

Prerequisites

Python 3.6
Linux environment (recommended)
NVIDIA GPU with CUDA support

Installation

To run the intuitive physics models efficiently on a GPU, install the NVIDIA drivers, CUDA and cuDNN versions that are compatible with TensorFlow, and all other required packages.

conda create -n stability_predictor python=3.6
conda activate stability_predictor
conda install cudatoolkit=10.0
conda install cudnn==7.6.5
pip install tensorflow-gpu==1.15.0

Assignment Tasks

Task 1: Data Provider

Objective:
The goal of this task is to build a robust data loading pipeline for the ShapeStacks dataset in shapestacks_provider.py.

Description: Complete the data loading pipeline in shapestacks_provider.py:

Implement _get_filenames_with_labels() to assign stability labels
Complete _create_dataset() to create TensorFlow datasets
Implement _parse_record() to read and preprocess images
Finish the main shapestacks_input_fn() function
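
As a rough illustration of the labeling step, the sketch below derives a binary label from a scenario name such as those shown in the dataset layout. It assumes that a scenario is stable only when both the vcom and vpsf fields in its name are 0, and that stable maps to 1.0. Both are assumptions read off the naming pattern, not something this handout confirms, and parse_scenario_fields and label_from_scenario are hypothetical helper names. Follow the comments in the TODO sections for the actual convention.

```python
import re

# Matches "-key=value" fields in names such as
# "env_ccs-hard-h=6-vcom=5-vpsf=0-v=120".
_FIELD_RE = re.compile(r"-([a-z]+)=(\d+)")


def parse_scenario_fields(scenario_name):
    """Return the integer fields encoded in a scenario name as a dict."""
    return {key: int(value) for key, value in _FIELD_RE.findall(scenario_name)}


def label_from_scenario(scenario_name):
    """Return 1.0 for stable and 0.0 for unstable (ASSUMED convention)."""
    fields = parse_scenario_fields(scenario_name)
    # Assumption: vcom=0 and vpsf=0 together mean the tower is stable.
    # A name that lacks either field is treated as unstable here.
    is_stable = fields.get("vcom") == 0 and fields.get("vpsf") == 0
    return 1.0 if is_stable else 0.0
```

Every image recorded for a scenario would then share the label of that scenario.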

You can download the data here (pickup code: pq3L). The dataset is organized as follows:

${DATASET}/
|__ meta/
    |__ blacklist_stable.txt
    |__ blacklist_unstable.txt
|__ mjcf/
    |__ meshes/
    |__ textures/
    |__ assets.xml
    |__ env_blocks-easy-h=2-vcom=0-vpsf=0-v=1.xml
    |__ ...
    |__ env_ccs-hard-h=6-vcom=5-vpsf=0-v=120.xml
    |__ world_blocks-easy-h=2-vcom=0-vpsf=0-v=1.xml
    |__ ...
    |__ world_ccs-hard-h=6-vcom=5-vpsf=0-v=120.xml
|__ recordings/
    |__ env_blocks-easy-h=2-vcom=0-vpsf=0-v=1/
    |__ ...
    |__ env_ccs-hard-h=6-vcom=5-vpsf=0-v=120/
|__ splits/
    |__ blocks_all/
        |__ ...
    |__ ccs_all/
        |__ eval.txt
        |__ test.txt
        |__ train.txt
        |__ eval_bgr_mean.npy
        |__ test_bgr_mean.npy
        |__ train_bgr_mean.npy
    |__ default/
        |__ ...
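
Given this layout, a split file can be resolved to recording directories along the following lines. This is a minimal sketch that assumes each split file (for example splits/ccs_all/train.txt) lists one scenario name per line and that each name matches a directory under recordings/. The function names read_split and recording_dirs are hypothetical.

```python
import os


def read_split(dataset_dir, split_name, mode):
    """Return the scenario names listed in splits/<split_name>/<mode>.txt.

    Assumption: the file holds one scenario name per line.
    """
    split_file = os.path.join(dataset_dir, "splits", split_name, mode + ".txt")
    with open(split_file) as f:
        # Skip blank lines and strip trailing newlines.
        return [line.strip() for line in f if line.strip()]


def recording_dirs(dataset_dir, scenarios):
    """Map scenario names to their directories under recordings/."""
    return [os.path.join(dataset_dir, "recordings", name) for name in scenarios]
```

Here mode would be one of train, eval, or test, matching the file names in the split directory.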

Task 2: Model Implementation

Objective:
The aim of this task is to develop and complete the model architecture in inception_model.py, enabling both multi-class and binary classification for block tower stability prediction.

Description: Complete the model definition in inception_model.py:

Implement inception_v4_model_fn() for multi-class classification
Implement inception_v4_logregr_model_fn() for binary classification
Add proper loss functions and evaluation metrics
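
For the binary case, one common choice of loss is sigmoid cross-entropy on a single logit, with accuracy as the evaluation metric. The plain-Python sketch below shows the arithmetic only; it uses the numerically stable form documented for tf.nn.sigmoid_cross_entropy_with_logits. Whether the starter code expects exactly this loss is an assumption, and the function names are hypothetical. In the model function you would use the TensorFlow ops rather than this code.

```python
import math


def sigmoid_cross_entropy(logit, label):
    """Numerically stable binary cross-entropy for a single logit.

    Computes max(x, 0) - x * z + log(1 + exp(-|x|)), which equals
    -z * log(sigmoid(x)) - (1 - z) * log(1 - sigmoid(x)).
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def predict_stable(logit):
    """Threshold at probability 0.5, which corresponds to logit 0."""
    return 1.0 if logit > 0.0 else 0.0


def accuracy(logits, labels):
    """Fraction of thresholded predictions that match the 0/1 labels."""
    correct = sum(predict_stable(x) == z for x, z in zip(logits, labels))
    return correct / len(labels)
```

The stable form avoids overflow in exp() for logits of large magnitude, which a direct evaluation of log(sigmoid(x)) does not.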

Task 3: Training Loop

Objective:
The purpose of this task is to implement an effective training workflow in train_inception_v4_shapestacks.py.

Description: The script train_inception_v4_shapestacks.py can be used to train a visual stability predictor on the ShapeStacks dataset. The main parameters are:

--data_dir: path to the dataset location (SHAPESTACKS_DATASET)
--model_dir: the MODEL_DIR in which all TensorFlow output and snapshots are stored during training
--split_name: the data split to use (ccs_all or blocks_all)
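
The three flags above could be declared with argparse roughly as follows. This is a minimal sketch: the provided script may define additional flags, defaults, or help strings, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Declare the three main flags described above (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Train a stability predictor on ShapeStacks.")
    parser.add_argument("--data_dir", required=True,
                        help="Path to the dataset location.")
    parser.add_argument("--model_dir", required=True,
                        help="Directory for TensorFlow output and snapshots.")
    parser.add_argument("--split_name", required=True,
                        help="Split of the data to use, e.g. ccs_all.")
    return parser
```

Calling build_parser().parse_args() then yields an object with data_dir, model_dir, and split_name attributes.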

Before training and evaluation, set SHAPESTACKS_CODE_HOME to your lab root path:

export SHAPESTACKS_CODE_HOME=./

An example run of the training script looks like this:

(stability_predictor) $ python intuitive_physics/stability_predictor/train_inception_v4_shapestacks.py --data_dir lab3_dataset/shapestacks --model_dir ./output --split_name ccs_all

You can track the training progress by pointing TensorBoard at the model’s root directory:

(stability_predictor) $ tensorboard --logdir=stability_predictor:${MODEL_DIR}

The most recent model checkpoints are kept in the model’s root directory during training. If the training script finds existing checkpoints in MODEL_DIR, it automatically loads the most recent one and resumes training from there.

During training, the checkpoints that perform best on the validation set are also saved to the snapshots/ subdirectory. The number of best checkpoints to keep can be set via --n_best_eval.
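
The bookkeeping behind keeping the best checkpoints can be pictured with a small min-heap. The sketch below is one possible approach and not the logic of the provided script. It assumes validation accuracy is the ranking metric, and BestCheckpoints is a hypothetical class name.

```python
import heapq


class BestCheckpoints:
    """Track the n_best checkpoints with the highest validation accuracy."""

    def __init__(self, n_best):
        self.n_best = n_best
        self._heap = []  # min-heap of (accuracy, step); worst kept entry first

    def update(self, accuracy, step):
        """Record a checkpoint and return the step to discard, or None.

        If the returned step equals the one passed in, the new checkpoint
        did not make it into the best n and need not be saved.
        """
        entry = (accuracy, step)
        if len(self._heap) < self.n_best:
            heapq.heappush(self._heap, entry)
            return None
        # heappushpop adds the entry, then removes and returns the smallest.
        evicted = heapq.heappushpop(self._heap, entry)
        return evicted[1]

    def steps(self):
        """Steps of the kept checkpoints, best first."""
        return [step for _, step in sorted(self._heap, reverse=True)]
```

With n_best set from --n_best_eval, each evaluation would call update() and delete the snapshot belonging to the returned step, if any.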

Complete the training script in train_inception_v4_shapestacks.py:

Set up the Estimator with proper configuration
Implement the training loop with evaluation

Task 4: Testing and Evaluation

Objective:
The goal of this task is to implement a comprehensive testing and evaluation workflow in test_inception_v4_shapestacks.py.

Description: After a stability predictor has been trained, the latest checkpoint or a particular snapshot can be loaded back into a tf.estimator.Estimator. To load the weights of a particular snapshot, set the model_dir parameter of tf.estimator.Estimator to MODEL_DIR/snapshots/<snapshot_name>.

Complete the testing script in test_inception_v4_shapestacks.py:

Load and evaluate trained models
Generate result files for different splits

Evaluation and Grading

Your implementation will be evaluated based on the accuracy achieved on the test sets. After completing your implementation, run the script test_inception_v4_shapestacks.py twice: once with --split_name set to ccs_all and once with it set to blocks_all.

This will generate two result files:

results_ccs_all.py

results_blocks_all.py

Each result will be graded separately, with 50 points assigned to each split:

ccs_all dataset accuracy (50 points)
blocks_all dataset accuracy (50 points)

The final score will be the sum of the two.

Implementation Guidelines

Only modify code within the TODO sections.
Do not change the filenames of any files required for submission; submit them exactly as specified.
You may need to adjust environment setup steps (such as Python version, CUDA, or cuDNN) according to your own system configuration.

Hand-in Requirements

Submit the following files as a .zip archive to Gradescope:

data_provider/shapestacks_provider.py
intuitive_physics/stability_predictor/train_inception_v4_shapestacks.py
intuitive_physics/stability_predictor/test_inception_v4_shapestacks.py
tf_models/inception/inception_model.py
results_ccs_all.py
results_blocks_all.py
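
If you prefer to build the archive with a script, the sketch below zips the six files listed above and fails early when one is missing. It assumes the paths are relative to your lab root and that the archive should preserve them as listed; confirm the expected archive layout with the course staff. The name build_submission is hypothetical.

```python
import os
import zipfile

REQUIRED_FILES = [
    "data_provider/shapestacks_provider.py",
    "intuitive_physics/stability_predictor/train_inception_v4_shapestacks.py",
    "intuitive_physics/stability_predictor/test_inception_v4_shapestacks.py",
    "tf_models/inception/inception_model.py",
    "results_ccs_all.py",
    "results_blocks_all.py",
]


def build_submission(root_dir, archive_path):
    """Zip the required files, keeping their paths relative to root_dir."""
    missing = [path for path in REQUIRED_FILES
               if not os.path.isfile(os.path.join(root_dir, path))]
    if missing:
        raise FileNotFoundError("Missing files: " + ", ".join(missing))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for rel_path in REQUIRED_FILES:
            archive.write(os.path.join(root_dir, rel_path), arcname=rel_path)
    return archive_path
```

For example, build_submission(".", "lab3_submission.zip") run from the lab root would produce an archive with the six entries above.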

Academic Integrity

This assignment must be completed individually. You may talk about general ideas and concepts with your classmates, but all code you submit must be written by you alone. Any form of plagiarism will result in failing the course.

Permitted:

Discussing theoretical concepts
Exchanging debugging approaches (without sharing code)
Referring to course materials and recommended resources

Prohibited:

Sharing or duplicating code solutions
Using external code implementations without explicit permission
Working together on the actual coding tasks

Good luck with your assignment!