3D object detection from arbitrary RGB camera rigs

Most self-driving cars carry a set of RGB cameras. We explored using these camera rigs for 3D object detection. Because self-driving companies differ in the number of cameras on their rigs and in how those cameras are laid out, we based our deep network on Lift-Splat-Shoot, a network originally demonstrated for trajectory planning that also works on arbitrary rigs.
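
To illustrate why this kind of approach is independent of the rig, here is a minimal sketch of the "splat" step: features from every camera are sum-pooled into one shared bird's-eye-view grid, so the number and order of cameras does not matter. This is not our network code. The function name `splat_to_bev`, the `(x, y, feature)` tuple layout, and the 0.5 m cell size are assumptions made for illustration, and the points are assumed to be already lifted into the ego-vehicle frame.

```python
from collections import defaultdict


def splat_to_bev(cameras, cell_size=0.5):
    """Sum-pool per-camera feature points into a shared bird's-eye-view grid.

    cameras: a list with one entry per camera; each entry is a list of
    (x, y, feature) tuples, with x and y in metres in the ego-vehicle frame
    (hypothetical layout, chosen for this sketch).
    """
    bev = defaultdict(float)
    for points in cameras:  # works for any number of cameras
        for x, y, feature in points:
            cell = (int(x // cell_size), int(y // cell_size))
            bev[cell] += feature  # sum pooling ignores camera order
    return dict(bev)


# Two made-up cameras whose fields of view overlap in one grid cell.
front = [(1.2, 0.1, 1.0), (1.3, 0.2, 2.0)]
left = [(1.4, 0.3, 0.5), (-3.0, 4.0, 1.5)]

two_cam = splat_to_bev([front, left])
swapped = splat_to_bev([left, front])
one_cam = splat_to_bev([front])
```

The same function handles one camera or many, and swapping the camera order leaves the grid unchanged, which is the property that lets a single network serve different rigs.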

We also experimented with using LiDAR as privileged information (PI), i.e. data that is available only during training and not at inference. Privileged information has previously been used to improve performance and reduce training time. We observed performance gains during the first few epochs, but the gains disappeared after ~10 epochs.
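
One common way to use a training-only signal is an auxiliary loss term that is simply absent at inference. The sketch below shows that pattern under stated assumptions: it is not necessarily how our network consumed the LiDAR data, and the function `training_loss`, the L1 depth term, and the `weight` parameter are all hypothetical.

```python
def training_loss(det_loss, pred_depth, lidar_depth=None, weight=0.1):
    """Detection loss plus an optional LiDAR depth term (hypothetical sketch).

    pred_depth:  per-pixel depths predicted from RGB.
    lidar_depth: per-pixel LiDAR depths, with None where no return exists.
                 Pass None for the whole argument to train on RGB only.
    """
    if lidar_depth is None:
        return det_loss  # RGB-only training: no privileged signal
    # Keep only pixels that actually have a LiDAR measurement.
    valid = [(p, t) for p, t in zip(pred_depth, lidar_depth) if t is not None]
    if not valid:
        return det_loss
    aux = sum(abs(p - t) for p, t in valid) / len(valid)  # mean L1 depth error
    return det_loss + weight * aux


rgb_only = training_loss(1.0, [10.0, 20.0])
with_pi = training_loss(1.0, [10.0, 20.0, 5.0], [12.0, None, 5.0], weight=0.5)
```

Because the LiDAR term only shapes the gradients during training, the deployed network still takes RGB images alone as input.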

A report and a short presentation are available for further reading.

Below are detection outputs from the network trained on RGB only and from the network trained with RGB plus privileged information (RGB+PI).

RGB only training and inference | RGB + LiDAR training / RGB only inference