Step 1 — Add Tactile Modality
Extending the Policy Config
The tactile observation can be added as either the full pressure map (spatial, 64 values per frame) or the aggregated scalar (total_force_n). Start with the scalar — it is easier to normalize and sufficient for most manipulation tasks. Add the pressure map if your task requires spatial contact information (e.g., distinguishing edge vs. center contact).
```yaml
policy:
  name: act
dataset:
  repo_id: local/paxini-grasp-place
  observation_features:
    - observation.state
    - observation.images.wrist
    - observation.tactile.total_force_n
    - observation.tactile.contact_area_mm2
  action_features:
    - action
```
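To make the config concrete, here is a hypothetical sketch of what a single dataset sample looks like once the tactile features are included. The keys mirror the `observation_features` list above; the exact container types and values are illustrative, not from a real dataset.

```python
# Hypothetical sample structure (values are placeholders).
# Keys follow the observation_features / action_features names in the config.
sample = {
    "observation.state": [0.12, -0.45, 1.57, 0.0, 0.3, -0.8],  # joint positions (rad)
    "observation.images.wrist": "<H x W x 3 uint8 array>",      # wrist camera frame
    "observation.tactile.total_force_n": 4.2,        # aggregated force, newtons
    "observation.tactile.contact_area_mm2": 118.0,   # contact patch area
    "action": [0.13, -0.44, 1.55, 0.0, 0.31, -0.79],
}
print(sample["observation.tactile.total_force_n"])  # 4.2
```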
Step 2 — Compute Normalization Statistics
Normalization Statistics from Your Dataset
Always compute normalization statistics from your specific dataset — do not use hardcoded values. The Paxini SDK provides a utility for this:
```python
from paxini.data import compute_dataset_stats

stats = compute_dataset_stats("./tactile_dataset/")
print(stats.to_yaml())  # inspect the computed statistics
stats.save("./act_paxini_stats.yaml")
```
Why normalization matters for tactile
Force values (0–20 N) are on a completely different scale from joint positions (typically ±π radians) and pixel values (0–255). Without normalization, the model will learn to ignore the tactile signal because its magnitude is swamped by the visual loss term during training.
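The scale disparity is easy to see numerically. This standalone sketch (not part of the Paxini SDK) samples values at the raw ranges quoted above and applies z-score normalization, the standard fix: every channel ends up with mean ≈ 0 and standard deviation ≈ 1, so no single modality dominates the loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative raw scales of the three modalities before normalization.
force_n = rng.uniform(0.0, 20.0, size=1000)          # tactile force, newtons
joints_rad = rng.uniform(-np.pi, np.pi, size=1000)   # joint positions, radians
pixels = rng.uniform(0.0, 255.0, size=1000)          # raw pixel intensities

def zscore(x):
    """Shift to zero mean, scale to unit standard deviation."""
    return (x - x.mean()) / (x.std() + 1e-8)

for name, x in [("force", force_n), ("joints", joints_rad), ("pixels", pixels)]:
    z = zscore(x)
    print(f"{name:6s} raw range [{x.min():7.2f}, {x.max():7.2f}] -> "
          f"normalized mean {z.mean():+.2f}, std {z.std():.2f}")
```

This is exactly what the dataset statistics from Step 2 are used for at training time.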
Step 3 — Training
Training Command
Launch ACT training with the tactile config. Training time: ~45 minutes on an 8GB GPU, ~3 hours on CPU.
```bash
pip install lerobot

python lerobot/scripts/train.py \
  --config-name act_paxini \
  --dataset-path ./tactile_dataset/ \
  --stats-path ./act_paxini_stats.yaml \
  --output-dir ./checkpoints/act_tactile/ \
  --num-epochs 300 \
  --batch-size 8 \
  --seed 42

# Monitor training progress
tensorboard --logdir ./checkpoints/act_tactile/logs/
```
Training is complete when the validation loss plateaus. For 50 episodes at 100 Hz, expect convergence around epoch 200–250. The tactile_obs_loss metric in TensorBoard should decrease steadily alongside the main action loss.
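"Plateaus" can be made precise with a simple patience rule, the same idea behind most early-stopping implementations. This is a hypothetical helper, not part of LeRobot: it reports a plateau when the best validation loss has not improved by more than `min_delta` within the last `patience` epochs.

```python
def has_plateaued(val_losses, patience=20, min_delta=1e-4):
    """Return True if the best loss in the last `patience` epochs is not
    at least `min_delta` better than the best loss seen before them."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    return recent_best > best_before - min_delta

# A curve that drops, then flattens: plateau detected.
losses = [1.0] * 5 + [0.5] * 50
print(has_plateaued(losses))  # True
```

Apply the same check to `tactile_obs_loss` to confirm the tactile channel is still contributing before you stop.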
Step 4 — Evaluate vs. Vision-Only Baseline
Evaluating Improvement Over Vision-Only
Train a second policy using only vision + proprioception (no tactile channels) on the same dataset. This is your baseline. Run evaluation over 20 rollouts for each policy on your target task:
```bash
# Train the vision-only baseline on the same dataset
python lerobot/scripts/train.py \
  --config-name act_vision_only \
  --dataset-path ./tactile_dataset/ \
  --output-dir ./checkpoints/act_baseline/

# Evaluate the tactile policy
python lerobot/scripts/eval.py \
  --checkpoint ./checkpoints/act_tactile/best.ckpt \
  --num-episodes 20 \
  --eval-name "tactile"

# Evaluate the vision-only baseline
python lerobot/scripts/eval.py \
  --checkpoint ./checkpoints/act_baseline/best.ckpt \
  --num-episodes 20 \
  --eval-name "baseline"
```
Look for improvement in grasp stability metrics — specifically task success rate on slip-prone objects and average hold duration during the placement phase. Tactile-aware policies typically show the largest improvement on deformable and variably-weighted objects.
What to do if tactile does not improve performance
First, verify your dataset's contact events align with video (Unit 4 quality check). Second, confirm normalization statistics were computed correctly — incorrect normalization is the most common training failure mode for new modalities. Third, try adding the pressure_map as an observation instead of just the scalar — it provides richer spatial context.
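The first check, tactile/video alignment, can be automated. This is a hypothetical helper (not from the Paxini SDK or Unit 4): for each video frame timestamp it finds the nearest tactile timestamp and flags the episode if the worst-case offset exceeds half a tactile sample period (100 Hz gives a 5 ms budget).

```python
def max_alignment_error_s(frame_ts, tactile_ts):
    """Worst-case gap (seconds) between a video frame timestamp and its
    nearest tactile timestamp."""
    return max(min(abs(f - t) for t in tactile_ts) for f in frame_ts)

frame_ts = [i / 30.0 for i in range(30)]      # 30 fps video, 1 s of data
tactile_ts = [i / 100.0 for i in range(100)]  # 100 Hz tactile, 1 s of data

err = max_alignment_error_s(frame_ts, tactile_ts)
print(f"worst-case offset: {err * 1000:.2f} ms")
assert err <= 0.005, "tactile/video streams appear misaligned"
```

The brute-force nearest-neighbor search is fine at these sizes; switch to `bisect` on sorted timestamps for long recordings.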
What's Next
Path Complete
You have gone from first sensor reading to a trained tactile-aware manipulation policy. You now have a complete pipeline — hardware, data, and policy — that you can apply to any contact-rich task.
Unit 5 Complete When...
- Training completes without errors and produces a checkpoint.
- The tactile validation loss decreases and plateaus during training.
- You have run 20 evaluation rollouts for both the tactile policy and the vision-only baseline and recorded the success rates for comparison.