180° panorama images are rendered above. In under-calibrated multi-camera systems, NeRF quality degrades significantly, exhibiting color discrepancies, ghosting artifacts, and incorrect geometry. Our UC-NeRF achieves high-quality rendering and accurate geometry in these challenging cases. (Slide the white line to compare the results of Zip-NeRF and ours.)
We present UC-NeRF, a novel method tailored for novel view synthesis with the under-calibrated multi-view camera systems used in autonomous driving.
Multi-camera setups find widespread use across various applications, such as autonomous driving, as they greatly expand sensing capabilities. Despite the fast development of neural radiance field (NeRF) techniques and their wide application in both indoor and outdoor scenes, applying NeRF to multi-camera systems remains very challenging. This is primarily due to the inherent under-calibration issues in multi-camera setups, including inconsistent imaging effects stemming from separately calibrated image signal processing units in different cameras, and systematic errors arising from mechanical vibrations during driving that perturb the relative camera poses. In this paper, we present UC-NeRF, a novel method tailored for novel view synthesis in under-calibrated multi-view camera systems. First, we propose a layer-based color correction to rectify the color inconsistency across different image regions (see the sketch after this abstract). Second, we propose virtual warping to generate more viewpoint-diverse yet color-consistent virtual views for color correction and 3D recovery. Finally, a spatiotemporally constrained pose refinement is designed for more robust and accurate pose calibration in multi-camera systems.
Our method not only achieves state-of-the-art novel view synthesis in multi-camera setups, but also effectively facilitates depth estimation in large-scale outdoor scenes using the synthesized novel views.
Note that no depth supervision is used in our UC-NeRF.
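To make the first component concrete, below is a minimal PyTorch sketch of the layer-based color correction idea: each training image gets learnable per-layer affine color transforms (a 3x3 matrix plus a bias) that are applied to the rendered colors and blended by soft layer weights. The names and the two-layer split (e.g., foreground vs. sky) are illustrative assumptions, not the paper's exact implementation.

import torch
import torch.nn as nn

class LayerColorCorrection(nn.Module):
    def __init__(self, num_images: int, num_layers: int = 2):
        super().__init__()
        # One 3x3 color matrix and one 3-vector bias per image and per layer,
        # initialized to the identity transform (no correction).
        eye = torch.eye(3).expand(num_images, num_layers, 3, 3)
        self.A = nn.Parameter(eye.clone())
        self.b = nn.Parameter(torch.zeros(num_images, num_layers, 3))

    def forward(self, rgb, layer_weights, image_ids):
        # rgb:           (N, 3)            raw colors rendered by the NeRF
        # layer_weights: (N, num_layers)   soft assignment of each sample
        #                                  to a layer (e.g., foreground/sky)
        # image_ids:     (N,)              index of the source image per sample
        A = self.A[image_ids]                                  # (N, L, 3, 3)
        b = self.b[image_ids]                                  # (N, L, 3)
        per_layer = torch.einsum('nlij,nj->nli', A, rgb) + b   # (N, L, 3)
        corrected = (layer_weights.unsqueeze(-1) * per_layer).sum(dim=1)
        return corrected.clamp(0.0, 1.0)

In a setup like this, one camera's transforms could be pinned to the identity so its color space serves as the shared reference at render time; that is a design choice assumed here for illustration.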
UC-NeRF (ours, left) vs. Zip-NeRF (right)
More Comparisons
Per-image color correction of the side-view images may overfit, so that different regions of a rendered panorama no longer share the same color space; for example, the colors of the trees and flowers on the two sides may not match reality. UC-NeRF solves this problem.
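A common way to obtain the viewpoint-diverse yet color-consistent virtual views mentioned in the abstract is depth-based image warping. The sketch below backward-warps a source image into a virtual viewpoint using a depth map rendered at that viewpoint; the function name, the shared-intrinsics assumption, and the omission of occlusion handling are simplifications, not the paper's exact procedure.

import torch
import torch.nn.functional as F

def warp_to_virtual_view(src_img, virt_depth, K, T_virt_to_src):
    """Backward-warp a source image into a virtual viewpoint.

    src_img:        (3, H, W) source camera image
    virt_depth:     (H, W)    depth rendered at the virtual view (e.g., by NeRF)
    K:              (3, 3)    camera intrinsics (assumed shared by both views)
    T_virt_to_src:  (4, 4)    rigid transform from virtual to source camera
    """
    H, W = virt_depth.shape
    device = src_img.device
    # Pixel grid of the virtual view, lifted to 3D with the rendered depth.
    v, u = torch.meshgrid(torch.arange(H, device=device),
                          torch.arange(W, device=device), indexing='ij')
    pix = torch.stack([u, v, torch.ones_like(u)], dim=0).float()   # (3, H, W)
    pts = torch.linalg.inv(K) @ pix.reshape(3, -1) * virt_depth.reshape(1, -1)
    # Move the points into the source camera frame and reproject.
    pts_h = torch.cat([pts, torch.ones(1, pts.shape[1], device=device)], dim=0)
    pts_src = (T_virt_to_src @ pts_h)[:3]
    proj = K @ pts_src
    uv = proj[:2] / proj[2].clamp(min=1e-6)
    # Normalize to [-1, 1] for grid_sample and fetch source colors.
    # Masking of occluded or out-of-frustum pixels is omitted here.
    grid = torch.stack([uv[0] / (W - 1) * 2 - 1,
                        uv[1] / (H - 1) * 2 - 1], dim=-1).reshape(1, H, W, 2)
    return F.grid_sample(src_img.unsqueeze(0), grid, align_corners=True)[0]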
Errors in the relative transformations between different cameras can cause blurring. UC-NeRF alleviates these artifacts.
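One way to encode a spatiotemporal constraint on the poses is to factor each camera's pose into a per-frame rig pose (the temporal part) composed with a time-invariant per-camera extrinsic shared across all frames (the spatial part). The sketch below shows such a parameterization with axis-angle rotations; the names and zero initialization are assumptions, and in practice these parameters would refine an initial calibration rather than start from the identity.

import torch
import torch.nn as nn

def axis_angle_to_matrix(r):
    # Rodrigues' formula for a batch of axis-angle vectors r: (..., 3).
    theta = r.norm(dim=-1, keepdim=True).clamp(min=1e-8)
    k = r / theta
    K = torch.zeros(*r.shape[:-1], 3, 3, device=r.device)
    K[..., 0, 1], K[..., 0, 2] = -k[..., 2], k[..., 1]
    K[..., 1, 0], K[..., 1, 2] = k[..., 2], -k[..., 0]
    K[..., 2, 0], K[..., 2, 1] = -k[..., 1], k[..., 0]
    eye = torch.eye(3, device=r.device).expand_as(K)
    s, c = theta.sin().unsqueeze(-1), theta.cos().unsqueeze(-1)
    return eye + s * K + (1 - c) * (K @ K)

class RigPoses(nn.Module):
    """Per-frame rig pose plus time-invariant per-camera extrinsics.

    Sharing one extrinsic per camera across all frames is the spatial
    constraint; optimizing one rig pose per timestamp is the temporal part.
    """
    def __init__(self, num_frames: int, num_cams: int):
        super().__init__()
        self.rig_r = nn.Parameter(torch.zeros(num_frames, 3))
        self.rig_t = nn.Parameter(torch.zeros(num_frames, 3))
        self.cam_r = nn.Parameter(torch.zeros(num_cams, 3))
        self.cam_t = nn.Parameter(torch.zeros(num_cams, 3))

    def forward(self, frame_id, cam_id):
        # Compose camera-to-rig with rig-to-world: R = R_rig R_cam,
        # t = R_rig t_cam + t_rig.
        R_rig = axis_angle_to_matrix(self.rig_r[frame_id])
        R_cam = axis_angle_to_matrix(self.cam_r[cam_id])
        R = R_rig @ R_cam
        t = (R_rig @ self.cam_t[cam_id].unsqueeze(-1)).squeeze(-1) \
            + self.rig_t[frame_id]
        return R, t  # camera-to-world rotation and translation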
With the obtained 3D NeRF, we can generate additional photo-realistic images from novel viewpoints. The synthesized images can facilitate downstream perception tasks such as monocular depth estimation. We first train VA-DepthNet (ICLR 2023), a state-of-the-art monocular depth estimation model, on the original real images. We then train the model on a combination of the original real images and the newly synthesized images (VA-DepthNet*). As Tab. 1 illustrates, the accuracy of the estimated depth improves with this data augmentation. Fig. 1 also shows that this operation leads to sharper edges and more accurate predictions. This demonstrates the great potential of generating more training images for large perception models such as Metric3D.
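For reference, the data-augmentation setup can be as simple as concatenating the real and synthesized training sets. The sketch below assumes PNG images paired with .npy depth maps (LiDAR depth for real frames, NeRF-rendered depth for synthesized ones); the class, paths, and file layout are hypothetical placeholders, not the authors' training code.

from pathlib import Path
import numpy as np
from PIL import Image
import torch
from torch.utils.data import Dataset, ConcatDataset, DataLoader

class DepthPairs(Dataset):
    """Loads (image, depth) pairs stored as PNG + .npy files."""
    def __init__(self, root: str):
        self.imgs = sorted(Path(root).glob('*.png'))

    def __len__(self):
        return len(self.imgs)

    def __getitem__(self, i):
        img = torch.from_numpy(
            np.asarray(Image.open(self.imgs[i]), dtype=np.float32) / 255.0
        ).permute(2, 0, 1)                       # (3, H, W), values in [0, 1]
        depth = torch.from_numpy(np.load(self.imgs[i].with_suffix('.npy')))
        return img, depth

# Real frames plus UC-NeRF renderings, mixed in one loader; the depth
# network (e.g., VA-DepthNet*) is then trained on this combined set with
# the same losses and schedule as the real-only baseline.
train_set = ConcatDataset([DepthPairs('data/real'), DepthPairs('data/synth')])
train_loader = DataLoader(train_set, batch_size=8, shuffle=True, num_workers=4)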
@article{cheng2023uc,
  title   = {UC-NeRF: Neural Radiance Field for Under-Calibrated Multi-View Cameras in Autonomous Driving},
  author  = {Cheng, Kai and Long, Xiaoxiao and Yin, Wei and Wang, Jin and Wu, Zhiqiang and Ma, Yuexin and Wang, Kaixuan and Chen, Xiaozhi and Chen, Xuejin},
  journal = {arXiv preprint arXiv:2311.16945},
  year    = {2023}
}