LiDAR De-skewing Using Global Shutter Cameras for Visual-LiDAR SLAM
LiDAR-based SLAM is a core component of modern autonomous systems because 3D LiDAR provides accurate geometry that is robust to lighting and appearance changes. At the same time, visual-LiDAR SLAM is becoming standard: cameras provide texture and color, LiDAR provides precise structure, and together they enable (1) colored point clouds, (2) denser and more accurate reconstructions that combine depth and RGB, (3) easier semantic segmentation and AI-based scene understanding in image space, (4) more robust SLAM through fused visual and geometric cues, and (5) more reliable geometry estimation even for fast-moving robots and drones.
Most LiDAR sensors are not true single-shot devices; they scan the scene over a short time window. When the platform moves during this scan, the point cloud becomes motion distorted, creating rolling-shutter-like artifacts in which different parts of the cloud correspond to different poses (Figure a). This degrades geometric accuracy, harms scan registration, and complicates precise image-LiDAR correspondence. Current deskewing methods typically rely on IMU data to unwarp points to a common time, but they depend heavily on accurate LiDAR–IMU time synchronization, good extrinsic calibration, and detailed timing information from the LiDAR (per-point or per-ring timestamps); as a result, they still cannot recover the correct geometry of the scene under all conditions and for every sensor setup (Figure c).
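For context, conventional IMU-based deskewing amounts to interpolating the sensor pose at each point's timestamp and re-expressing the point at a common reference time. The following is a minimal Python sketch, assuming per-point timestamps and an already-integrated pose trajectory; all function and variable names are illustrative, not from a specific library:

import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def deskew_to_reference(points, t_points, pose_times, positions, quats, t_ref):
    """Warp each LiDAR point to the reference time t_ref.

    points:     (N, 3) points in the sensor frame at their capture time
    t_points:   (N,)   per-point timestamps (within the pose trajectory's range)
    pose_times: (M,)   timestamps of known sensor poses in the world frame
    positions:  (M, 3) sensor positions; quats: (M, 4) orientations (x, y, z, w)
    """
    slerp = Slerp(pose_times, Rotation.from_quat(quats))

    def pose_at(t):
        # Interpolate orientation (slerp) and translation (linear) at time t.
        R = slerp([t])[0]
        p = np.array([np.interp(t, pose_times, positions[:, i]) for i in range(3)])
        return R, p

    R_ref, p_ref = pose_at(t_ref)
    out = np.empty_like(points, dtype=float)
    # Per-point loop for clarity; a real implementation would vectorize this.
    for i, (pt, t) in enumerate(zip(points, t_points)):
        R_t, p_t = pose_at(t)                   # pose when this point was measured
        pw = R_t.apply(pt) + p_t                # lift the point into the world frame
        out[i] = R_ref.inv().apply(pw - p_ref)  # re-express it at the reference time
    return out

Every step in this sketch leans on the quantities the paragraph above identifies as fragile: the per-point timestamps, the time offset between LiDAR and IMU, and the extrinsics folded into the pose trajectory.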
a) The current LiDAR-camera projection, which suffers from motion distortion.
b) Our visual-LiDAR platform for developing autonomous navigation solutions.
c) A top view of the LiDAR point cloud from a Hesai JT16 sensor, showing the motion distortion and the inability of IMU-based deskewing (using the LiDAR's internal IMU) to recover the correct shape of the environment.
Research Questions
In this project, we want to explore an alternative. Can a global-shutter camera be used as a reference to deskew LiDAR scans? Instead of relying only on IMU data, we ask whether we can use image-based constraints to remove LiDAR motion distortion. Concretely, the research questions include:
· Is it possible to use a global-shutter camera frame as the reference "frozen" view and warp the LiDAR scan to its capture time?
· Can we achieve pixel-level accurate correspondence between LiDAR points and image pixels in real time, over the entire shared field of view, even under fast motion? (See the sketch after this list.)
· Are there deep learning methods that can perform end-to-end LiDAR deskewing, for example by predicting per-point corrections or learning a joint image-LiDAR motion model?
· How do the accuracy and robustness of image-based deskewing compare to IMU-aided deskewing across different motion profiles (slow vs. aggressive motion, car vs. drone) and environments?
· To what extent can improved deskewing translate into better SLAM performance (trajectory error, map consistency) and better downstream perception (segmentation, detection)?
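To make the first two questions concrete: once a scan has been deskewed to the camera's exposure time (for instance with a routine like deskew_to_reference above), pixel-level correspondence reduces to a standard projection through the LiDAR-camera extrinsics and the camera intrinsics. A minimal sketch, with assumed rather than sensor-specific calibration matrices:

import numpy as np

def project_to_image(points_lidar, K, R_cl, t_cl):
    """Project LiDAR points (already deskewed to the camera timestamp) into pixels.

    points_lidar: (N, 3) points in the LiDAR frame at the camera exposure time
    K:            (3, 3) camera intrinsic matrix
    R_cl, t_cl:   extrinsics mapping the LiDAR frame into the camera frame
    Returns (M, 2) pixel coordinates of the points in front of the camera.
    """
    p_cam = points_lidar @ R_cl.T + t_cl   # transform points into the camera frame
    in_front = p_cam[:, 2] > 0.1           # keep only points ahead of the lens
    p_cam = p_cam[in_front]
    uv = p_cam @ K.T                       # apply the pinhole projection
    return uv[:, :2] / uv[:, 2:3]          # perspective divide to pixel coordinates

Residual misalignment between these projected points and image features is precisely the signal an image-based deskewing method could exploit, and measuring it across motion profiles is one way to quantify the comparisons the questions above ask for.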
Activities
Students working on this assignment should ideally be comfortable with Python and/or C++, have basic experience with ROS2 and Git, and be familiar with computer vision concepts. Experience with deep learning (PyTorch/TensorFlow, CNNs/transformers) is highly desirable.
Contact persons:
Le Viet Duc – Pervasive Systems, EEMCS, University of Twente
Hojat Mirtajadini, SMART Mechatronics and Robotics Research Group, Saxion University of Applied Sciences


