
Fusion2Drive: Waymo Perception Fusion to Ego Action

2025 · Autonomous driving · multi-sensor fusion

Overview

Fusion2Drive is a clean, reproducible multi-sensor fusion model for autonomous driving, trained on the Waymo Open Dataset. It fuses camera and LiDAR inputs in a shared bird's-eye-view (BEV) space and jointly performs two tasks: predicting ego waypoints for closed-loop control and detecting vehicles, pedestrians, and cyclists in 3D. A PointPillars LiDAR encoder and a Lift-Splat-Shoot camera encoder feed a BEV fusion backbone, which branches into a CenterPoint-style detection head and a transformer planning head.
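To make the shared BEV space concrete, here is a minimal sketch of the first step of a PointPillars-style encoder: binning LiDAR points into vertical columns (pillars) on an x-y grid. The function name pillarize, the 100 m x 100 m grid extent, and the 0.5 m cell size are illustrative assumptions, not values taken from the Fusion2Drive configuration.

```python
from collections import defaultdict
from math import floor


def pillarize(points, x_range=(-50.0, 50.0), y_range=(-50.0, 50.0), cell=0.5):
    """Group (x, y, z) LiDAR points into BEV grid cells (pillars).

    Grid extent and cell size are assumed values for illustration only.
    Returns a dict mapping (row, col) -> list of points in that cell.
    Points outside the grid are dropped.
    """
    pillars = defaultdict(list)
    for x, y, z in points:
        inside_x = x_range[0] <= x < x_range[1]
        inside_y = y_range[0] <= y < y_range[1]
        if not (inside_x and inside_y):
            continue  # outside the BEV grid
        col = floor((x - x_range[0]) / cell)  # index along x
        row = floor((y - y_range[0]) / cell)  # index along y
        pillars[(row, col)].append((x, y, z))
    return dict(pillars)


points = [
    (0.1, 0.1, 1.0),    # near the ego vehicle
    (0.2, 0.3, 0.5),    # falls in the same 0.5 m cell as the point above
    (10.0, -5.0, 0.0),  # a different cell
    (100.0, 0.0, 0.0),  # beyond the grid, dropped
]
pillars = pillarize(points)
```

In a full encoder, the points in each pillar would be summarized into a learned feature vector and scattered back to their (row, col) position. Because the camera branch lifts image features onto a grid of the same layout, the two feature maps can then be combined cell by cell.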

Key features

Expected metrics (full training)

Metric                 LiDAR-only   Camera-only   Fusion
Vehicle mAP (L1)       0.65         0.42          0.71
Pedestrian mAP (L1)    0.58         0.35          0.64
Cyclist mAP (L1)       0.52         0.30          0.58
Waypoint ADE (m)       0.85         1.20          0.72
Waypoint FDE (m)       1.45         2.10          1.25

Target metrics from the full-training configuration, illustrating the expected gain of fusion over single-sensor baselines on both detection and planning.
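
For reference, the two planning metrics follow their standard definitions: ADE (average displacement error) is the mean Euclidean distance between predicted and ground-truth waypoints over the horizon, and FDE (final displacement error) is that distance at the last waypoint only. The sketch below assumes 2D waypoints in metres and a hypothetical helper name, displacement_errors; it is not taken from the project's evaluation code.

```python
from math import hypot


def displacement_errors(pred, target):
    """Return (ADE, FDE) for two equal-length lists of (x, y) waypoints.

    ADE: mean Euclidean distance over all waypoints.
    FDE: Euclidean distance at the final waypoint.
    """
    if not pred or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and equal length")
    dists = [
        hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, target)
    ]
    ade = sum(dists) / len(dists)
    fde = dists[-1]
    return ade, fde


# Toy trajectory: per-waypoint errors are 0 m, 1 m, and 2 m.
pred = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
target = [(1.0, 0.0), (2.0, 1.0), (3.0, 2.0)]
ade, fde = displacement_errors(pred, target)
```

Lower is better for both. FDE is typically larger than ADE because prediction error tends to grow with the horizon, which matches the pattern in the table.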