
Depth Completion and Up-Sampling by Deep LiDAR-Camera Fusion

Concept of depth completion by fusing LiDAR and stereo camera images. Image taken from https://doi.org/10.1007/s00138-023-01426-x

Problem

Robots often face a sensing trade-off: 3D LiDAR is metrically accurate but sparse (it misses thin objects such as branches and wires), while stereo/RGB-D cameras yield dense but noisier depth. The goal is to fuse these modalities into a dense, per-frame point cloud that preserves LiDAR's metric accuracy while filling gaps guided by image structure. The resulting dense depth can then support perception, navigation, and other tasks on the robot.
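As a concrete starting point, the sketch below shows the usual way of projecting LiDAR points into the camera image to obtain a sparse depth map aligned with the RGB pixels, which is the common first step before any fusion. The intrinsic matrix K and extrinsic transform T_cam_lidar are hypothetical placeholders that would normally come from calibration; this is a minimal illustration, not the project's implementation.

import numpy as np

def lidar_to_sparse_depth(points_lidar, K, T_cam_lidar, h, w):
    """Project Nx3 LiDAR points into an h-by-w image as a sparse depth map.

    K: 3x3 camera intrinsics; T_cam_lidar: 4x4 LiDAR-to-camera extrinsics
    (both assumed to come from calibration).
    """
    n = points_lidar.shape[0]
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])   # homogeneous coordinates
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]           # into the camera frame
    pts_cam = pts_cam[pts_cam[:, 2] > 0.1]               # keep points in front of camera
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                          # pinhole projection
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    depth = np.full((h, w), np.inf)
    # If several points land on one pixel, keep the nearest (smallest depth).
    np.minimum.at(depth, (v[valid], u[valid]), pts_cam[valid, 2])
    depth[np.isinf(depth)] = 0.0                         # 0 marks pixels with no return
    return depth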

Research question

How can LiDAR and stereo/RGB images be fused to generate dense and metrically accurate depth in real time on a mobile robot?

1.        What fusion strategy (classical vs. learning-based) best balances accuracy, robustness to thin structures, and compute budget?

2.        How sensitive is the resulting dense depth to errors in camera–LiDAR extrinsic calibration and time synchronization, and to sensor motion, and how can these effects be mitigated? (A small sensitivity sketch follows below.)
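To give a feel for the calibration question, the following sketch, assuming the lidar_to_sparse_depth setup above, perturbs the extrinsic rotation by a small yaw error and reports how far the projected LiDAR pixels move. It is only an illustrative proxy for sensitivity, not an evaluation protocol from the project.

import numpy as np

def yaw_perturbation(deg):
    """4x4 transform rotating by `deg` degrees around the z-axis (toy error model)."""
    a = np.deg2rad(deg)
    T = np.eye(4)
    T[:2, :2] = [[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]]
    return T

def project(pts_cam, K):
    """Pinhole projection of Nx3 camera-frame points to Nx2 pixel coordinates."""
    uv = (K @ pts_cam.T).T
    return uv[:, :2] / uv[:, 2:3]

def mean_pixel_shift(points_lidar, K, T_cam_lidar, err_deg=0.5):
    """Mean pixel displacement caused by an `err_deg` yaw error in the extrinsics."""
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    cam_err = (T_cam_lidar @ yaw_perturbation(err_deg) @ pts_h.T).T[:, :3]
    keep = (cam[:, 2] > 0.1) & (cam_err[:, 2] > 0.1)     # points in front of the camera
    shift = np.linalg.norm(project(cam[keep], K) - project(cam_err[keep], K), axis=1)
    return float(shift.mean())

For a rough sense of scale: with a focal length around 700 px, a 0.5° rotation error shifts projections by roughly f·tan(0.5°) ≈ 6 px, easily enough to misplace thin structures.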

Project activities

1.        Literature review on LiDAR-camera depth fusion, focusing on accuracy, thin-object recovery, robustness, runtime, and open-source availability.

2.        Benchmarking of selected open-source fusion methods on the KITTI Depth Prediction dataset, using the standard error metrics sketched after this list.

3.        Verification of the most promising methods on data recorded with our own sensor setup.

4.        Providing a recommendation for the most suitable existing solution for integration into our robotic perception pipeline.
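For the benchmarking step, the error metrics commonly reported on KITTI-style depth evaluations are straightforward to compute. The minimal sketch below assumes HxW depth maps in metres, with 0 marking pixels that have no ground truth (or no prediction); it is illustrative, not the project's evaluation code.

import numpy as np

def depth_metrics(pred, gt):
    """Standard depth errors: RMSE/MAE in metres, iRMSE/iMAE in 1/km.

    pred, gt: HxW depth maps in metres; pixels where either map is 0 are
    excluded from the evaluation.
    """
    mask = (gt > 0) & (pred > 0)
    err = pred[mask] - gt[mask]
    inv_err = 1000.0 / pred[mask] - 1000.0 / gt[mask]    # inverse depth in 1/km
    return {
        "RMSE":  float(np.sqrt(np.mean(err ** 2))),
        "MAE":   float(np.mean(np.abs(err))),
        "iRMSE": float(np.sqrt(np.mean(inv_err ** 2))),
        "iMAE":  float(np.mean(np.abs(inv_err))),
    }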


Contact

Le Viet Duc – Pervasive Systems, EEMCS, University of Twente

Hojat Mirtajadini, SMART Mechatronics and Robotics Research Group, Saxion University of Applied Sciences