M.Sc. Thesis · Phase 1 · Robotics & AI · NUST, Pakistan

DLO Perception & Tracking
in MuJoCo Simulation

Real-time cable keypoint detection, 3D lifting, and Kalman filtering at ~100 Hz — CPU-only

Muhammad Mahad · Supervised by Dr. Karam Dad Kallu

Demo

Dual Franka Panda arms holding a bright-red DLO cable in MuJoCo. Left: interactive 3D viewer. Right: 6-panel perception pipeline running at every timestep.

👁

Perceive

HSV segmentation isolates the cable from background; skeleton is extracted every frame

📍

Localize

10 keypoints sampled along skeleton and lifted to 3D via depth buffer

📈

Track

Kalman filter maintains temporal consistency; spline interpolation handles occlusion

What Phase 1 Proves

Phase 1 delivers a real-time DLO perception and tracking pipeline running entirely on CPU in MuJoCo simulation — the foundation for Phase 2 shape control and CBF safety.

01

~100 Hz Pipeline

Full perception loop (segment → skeleton → 3D lift → Kalman update) at roughly 10× the 10 Hz target rate, on CPU.

02

1.9% DTC Shape Error

Distance-to-Centreline metric confirms the estimated cable shape matches ground truth with high fidelity.

03

Occlusion Robustness

Cubic spline interpolation recovers tracking through 30% keypoint occlusion without loss of cable shape.
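
The occlusion handling described above can be sketched as follows. This is a minimal illustration, assuming the keypoints are ordered along the cable and that a boolean visibility mask is available per frame; the function name `fill_occluded` and the "fewer than four anchors" fallback are assumptions made for this example, not the project's code.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def fill_occluded(keypoints, visible):
    """Fill occluded keypoints by cubic-spline interpolation over keypoint index.

    keypoints: (N, 3) array; rows where `visible` is False are ignored.
    visible:   (N,) boolean mask of keypoints detected in this frame.
    """
    keypoints = np.asarray(keypoints, dtype=float)
    visible = np.asarray(visible, dtype=bool)
    idx = np.arange(len(keypoints))
    out = keypoints.copy()
    if visible.sum() < 4:
        # Too few anchors for a meaningful cubic fit: mark the gaps as NaN.
        out[~visible] = np.nan
        return out
    # One spline per coordinate, parameterised by position along the cable.
    spline = CubicSpline(idx[visible], keypoints[visible], axis=0)
    out[~visible] = spline(idx[~visible])
    return out
```

With 10 keypoints, hiding 3 of them corresponds to the 30% occlusion case: the 7 visible points anchor the spline and the hidden ones are read off the fitted curve.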

MuJoCo RGB-D → Perception + Kalman → DLO State

Perception Pipeline

A four-stage classical CV pipeline — no deep learning required. Each stage is CPU-native and runs inside the MuJoCo physics loop.

Stage 1

HSV Segmentation

OpenCV HSV thresholding isolates the bright-red DLO cable from the scene

Color-tuned mask · Morphological clean-up · CPU-native OpenCV
Output: binary mask
Stage 2

Zhang-Suen Skeletonization

Thinning algorithm extracts the cable centreline; 10 keypoints sampled uniformly along it

scikit-image skeleton · Ordered traversal · 10 keypoints
Output: 2D keypoints
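
The uniform sampling step can be sketched as an arc-length resampling. This assumes the skeleton pixels have already been ordered from one cable end to the other (the "ordered traversal" above); `sample_keypoints` is a hypothetical helper, not the project's function.

```python
import numpy as np


def sample_keypoints(ordered_pixels, n_keypoints=10):
    """Resample an ordered skeleton polyline at equal arc-length spacing.

    ordered_pixels: (M, 2) array of skeleton pixel coordinates, ordered
                    from one cable end to the other.
    Returns an (n_keypoints, 2) array including both end points.
    """
    pts = np.asarray(ordered_pixels, dtype=float)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)   # segment lengths
    arc = np.concatenate([[0.0], np.cumsum(seg)])        # cumulative arc length
    targets = np.linspace(0.0, arc[-1], n_keypoints)     # equally spaced stations
    # Interpolate each coordinate against arc length.
    return np.column_stack(
        [np.interp(targets, arc, pts[:, k]) for k in range(pts.shape[1])]
    )
```

Sampling by arc length rather than by pixel index keeps the keypoints evenly spread even where diagonal skeleton steps are longer than axis-aligned ones.
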
Stage 3

3D Depth Lifting

Each 2D keypoint takes the minimum depth in a 7×7 patch of the MuJoCo depth buffer and is back-projected to the world frame

MuJoCo Renderer · Pinhole projection · NaN-safe API
Output: 3D points
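
A sketch of the lifting step under stated assumptions: the depth image is metric (metres along the optical axis), the camera follows the MuJoCo convention of looking along its local −Z axis with +X right and +Y up, and the camera pose is passed in explicitly. The function name and argument layout are hypothetical.

```python
import numpy as np


def lift_keypoint(u, v, depth, fovy_deg, cam_pos, cam_mat, patch=7):
    """Back-project pixel (u, v) to the world frame using the nearest depth in a patch.

    depth:   (H, W) metric depth image in metres.
    cam_pos: (3,) camera position in the world frame.
    cam_mat: (3, 3) camera-to-world rotation matrix.
    Returns a (3,) point, or NaNs if no valid depth is available.
    """
    h, w = depth.shape
    r = patch // 2
    u_i, v_i = int(round(u)), int(round(v))
    if not (0 <= u_i < w and 0 <= v_i < h):
        return np.full(3, np.nan)
    window = depth[max(v_i - r, 0):v_i + r + 1, max(u_i - r, 0):u_i + r + 1]
    valid = window[np.isfinite(window) & (window > 0)]
    if valid.size == 0:
        return np.full(3, np.nan)
    d = valid.min()  # the thin cable is the surface nearest the camera
    f = 0.5 * h / np.tan(0.5 * np.radians(fovy_deg))  # focal length in pixels
    cx, cy = 0.5 * (w - 1), 0.5 * (h - 1)             # principal point
    # Image rows grow downward while camera +Y points up, hence the sign flip.
    p_cam = np.array([(u - cx) * d / f, -(v - cy) * d / f, -d])
    return np.asarray(cam_pos, dtype=float) + np.asarray(cam_mat, dtype=float) @ p_cam
```

Taking the minimum over the patch, rather than the single centre pixel, guards against a skeleton pixel that lands just off the cable and would otherwise return the depth of the floor behind it.
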
Stage 4

Kalman Filter Tracking

6-state constant-velocity Kalman filter per keypoint; greedy nearest-neighbour (NN) association across frames

NumPy/SciPy · Spline occlusion fill · Phase 2-ready API
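
The tracking stage can be sketched as below. The state layout `[x, y, z, vx, vy, vz]` follows the 6-state constant-velocity description; the noise levels `q` and `r`, the time step, and both names (`ConstantVelocityKF`, `greedy_associate`) are assumptions for illustration only.

```python
import numpy as np


class ConstantVelocityKF:
    """Kalman filter with state [x, y, z, vx, vy, vz] and position-only measurements."""

    def __init__(self, p0, dt=0.01, q=1e-3, r=1e-2):
        self.x = np.concatenate([np.asarray(p0, dtype=float), np.zeros(3)])
        self.P = np.eye(6)
        self.F = np.eye(6)
        self.F[:3, 3:] = dt * np.eye(3)      # position += velocity * dt
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])
        self.Q = q * np.eye(6)               # process noise (assumed isotropic)
        self.R = r * np.eye(3)               # measurement noise (assumed isotropic)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3]

    def update(self, z):
        y = np.asarray(z, dtype=float) - self.H @ self.x   # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)           # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P
        return self.x[:3]


def greedy_associate(predicted, measured):
    """Greedily pair each track with its nearest unused measurement.

    Returns a list whose entry i is the measurement index for track i, or -1.
    """
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    dist = np.linalg.norm(predicted[:, None, :] - measured[None, :, :], axis=2)
    match = [-1] * len(predicted)
    for _ in range(min(dist.shape)):
        i, j = np.unravel_index(np.argmin(dist), dist.shape)
        match[i] = int(j)
        dist[i, :] = np.inf   # track i is taken
        dist[:, j] = np.inf   # measurement j is taken
    return match
```

In a per-frame loop, each filter would call `predict()`, the predictions would be matched to the new 3D points with `greedy_associate`, and matched filters would call `update()`; unmatched tracks keep their prediction.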
🔌

Phase 2 Interface: get_dlo_state(rgb, depth) → (10, 3)

The pipeline is packaged as a single reusable function. Phase 2 shape control and CBF safety will consume this output directly — no changes to the perception code required.

Phase 1 Results

All metrics measured CPU-only on a single core. Ground truth from mj_data.site_xpos.

~100 Hz · Pipeline Speed · Target: >10 Hz ✅
1.9% · DTC Shape Error · Kalman (best tracker)
30% · Occlusion Handled · Spline interpolation ✅
3 · Trackers Evaluated · Kalman, PF, EMA

Tracker            RMSE (mm)   DTC (%)   Speed (Hz)   Occlusion RMSE (mm)
Kalman ✅          213         1.9       1674         199
Particle Filter    209         2.5       306          194
EMA                210         2.5       5004         203

Note: The high RMSE (mm) is a camera-resolution artefact: the overhead camera at 2.5 m gives ~7.5 mm/pixel. DTC (shape accuracy relative to cable length) is the more meaningful metric.
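
The page does not spell out how DTC is computed. As one plausible reading, offered purely as an assumption, the sketch below takes the mean distance from each estimated keypoint to the ground-truth centreline polyline and divides it by the cable length. Both function names are hypothetical.

```python
import numpy as np


def point_to_polyline(p, poly):
    """Shortest distance from point p to a polyline given as (M, D) vertices."""
    a, b = poly[:-1], poly[1:]
    ab = b - a
    denom = np.maximum(np.einsum("ij,ij->i", ab, ab), 1e-12)
    # Projection parameter of p onto each segment, clamped to the segment.
    t = np.clip(np.einsum("ij,ij->i", p - a, ab) / denom, 0.0, 1.0)
    closest = a + t[:, None] * ab
    return np.linalg.norm(closest - p, axis=1).min()


def dtc_percent(estimated, ground_truth):
    """Mean keypoint-to-centreline distance as a percentage of cable length.

    This definition is an assumption made for illustration only.
    """
    est = np.asarray(estimated, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    length = np.linalg.norm(np.diff(gt, axis=0), axis=1).sum()
    mean_dist = np.mean([point_to_polyline(p, gt) for p in est])
    return 100.0 * mean_dist / length
```

Under this reading, a keypoint that slides along the cable still scores near zero, whereas a point-to-point RMSE would penalise it. That is one way a shape metric can stay small while RMSE is large.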

Tech Stack

🔬

MuJoCo 3.x

CPU-native physics + off-screen RGB-D rendering via Python API. No GPU required.

🤖

Dual Franka Panda

Two 7-DoF arms and a 20-segment DER ball-joint DLO cable, bundled in models/

👁

OpenCV

HSV segmentation, morphological ops, multi-panel visualization at runtime

🦴

scikit-image

Zhang-Suen skeletonization extracts cable centreline from binary mask

📐

NumPy / SciPy

Kalman filter, particle filter, cubic spline occlusion interpolation

🛡️

CBF-ready API

get_dlo_state() → (10,3) output feeds directly into Phase 2 shape control and CBF-QP

6-Month Timeline

Month 1 — This Repo
✅ Completed

DLO Perception & Tracking

Real-time pipeline: HSV segmentation → skeletonization → 3D lifting → Kalman filter. ~100 Hz, CPU-only, 1.9% DTC. Live MuJoCo + OpenCV demo.

Month 2
🔲 Planned

DLO Shape Control & Jacobian

Jacobian-based incremental shape control consuming get_dlo_state(). Online Jacobian estimation from keypoint deltas.
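
Since this milestone is still planned, the following is only a sketch of one common way to estimate a Jacobian online from deltas: a Broyden rank-one update. It assumes the DLO state is flattened to a 30-vector (10 keypoints × 3) and the joint increment is a 14-vector; `broyden_update` and its parameters are hypothetical, and the thesis may adopt a different estimator.

```python
import numpy as np


def broyden_update(J, dq, ds, gain=1.0, eps=1e-9):
    """Rank-one Broyden update of a shape Jacobian J, where ds ≈ J @ dq.

    J:  (m, n) current Jacobian estimate.
    dq: (n,) joint increment applied during the last step.
    ds: (m,) observed change of the flattened keypoint vector.
    """
    dq = np.asarray(dq, dtype=float)
    ds = np.asarray(ds, dtype=float)
    denom = dq @ dq
    if denom < eps:
        return J  # motion too small to carry information
    # Correct J only along the direction that was actually excited.
    return J + gain * np.outer(ds - J @ dq, dq) / denom
```

With `gain=1.0` the updated estimate reproduces the latest observation exactly (`J_new @ dq == ds`); a smaller gain trades that for robustness to noisy keypoint deltas.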

Month 3
🔲 Planned

Global Motion Planner

DIT* or JIT adapted for 14-DoF + DLO configuration space with catenary-aware cost function.

Month 4
🔲 Planned

Contact-Aware CBF Safety

CBF-QP safety filter: h_arm-arm ≥ 0, h_arm-env ≥ 0, h_cable-env ≥ 0. Mode-switching for clip insertions.
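
For intuition, a CBF-QP with a single constraint has a closed-form solution. The sketch below assumes single-integrator dynamics (the command `u` is the velocity), one barrier function `h` with gradient `grad_h`, and a linear class-K gain `alpha`. The planned filter combines three constraints and would need a QP solver instead; `cbf_filter` is a hypothetical name.

```python
import numpy as np


def cbf_filter(u_nom, grad_h, h, alpha=5.0):
    """Minimally modify u_nom so that grad_h @ u + alpha * h >= 0.

    Closed-form solution of
        min ||u - u_nom||^2   s.t.   grad_h @ u + alpha * h >= 0
    for a single constraint.
    """
    u_nom = np.asarray(u_nom, dtype=float)
    grad_h = np.asarray(grad_h, dtype=float)
    margin = grad_h @ u_nom + alpha * h
    if margin >= 0.0:
        return u_nom  # nominal command already satisfies the constraint
    g2 = grad_h @ grad_h
    if g2 == 0.0:
        return u_nom  # constraint cannot be influenced by u
    # Project u_nom onto the boundary of the safe half-space.
    return u_nom - margin * grad_h / g2
```

For example, with a wall at x = 0 and h(x) = x₀, a command pushing toward the wall is slowed just enough that the constraint holds with equality, while the component parallel to the wall is left untouched.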

Month 5
🔲 Planned

Dynamic Stress Testing

Benchmark hierarchical CBF vs standard MPC vs vanilla RRT on fast motions and dynamic obstacles.

Month 6
🔲 Planned

Thesis Writing & Defense

Data analysis, final manuscript, and defense presentation formatting.

Key References

  1. Caporali, A., et al. (2023). "FASTDLO: Fast Deformable Linear Objects Instance Segmentation." IEEE RA-L.
  2. Yu, M., et al. (2023). "A coarse-to-fine framework for dual-arm manipulation of deformable linear objects with whole-body obstacle avoidance." ICRA 2023.
  3. Chen, K., Bing, Z., et al. (2023). "Contact-aware Shaping and Maintenance of Deformable Linear Objects With Fixtures." IROS 2023.
  4. Müller, M., et al. (2012). "Physically based shape matching." International Journal of Non-Linear Mechanics.
  5. Zeng, Q., et al. (2023). "Accurate Simulation and Parameter Identification of Deformable Linear Objects in MuJoCo." arXiv:2310.00911.
  6. Zhang, T. Y., & Suen, C. Y. (1984). "A fast parallel algorithm for thinning digital patterns." CACM.
  7. Welch, G., & Bishop, G. (1995). "An introduction to the Kalman filter." UNC Tech Report.
  8. Yin, H., Varava, A., & Kragic, D. (2021). "Modeling, learning, perception, and control methods for deformable object manipulation." Science Robotics, 6(54).