Boosting Monocular Depth Estimation with Lightweight 3D Point Fusion

In this paper, we propose enhancing monocular depth estimation by adding 3D points as depth guidance. Unlike existing depth completion methods, our approach performs well on extremely sparse and unevenly distributed point clouds, which makes it agnostic to the source of the 3D points. We achieve this by introducing a novel multi-scale 3D point fusion network that is both lightweight and efficient. We demonstrate its versatility on two different depth estimation problems where the 3D points have been acquired with conventional structure-from-motion and LiDAR. In both cases, our network performs on par with state-of-the-art depth completion methods and achieves significantly higher accuracy when only a small number of points is used, while being more compact in terms of the number of parameters. We show that our method outperforms some contemporary deep-learning-based multi-view stereo and structure-from-motion methods in both accuracy and compactness.

Lam Huynh, Phong Nguyen, Jiri Matas, Esa Rahtu, Janne Heikkilä

A4 Article in conference proceedings

2021 IEEE/CVF International Conference on Computer Vision (ICCV)

L. Huynh, P. Nguyen, J. Matas, E. Rahtu and J. Heikkilä, "Boosting Monocular Depth Estimation with Lightweight 3D Point Fusion," 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 12747-12756, doi: 10.1109/ICCV48922.2021.01253

https://doi.org/10.1109/ICCV48922.2021.01253
http://urn.fi/urn:nbn:fi-fe2022030421975