Neural radiance fields (NeRFs) have emerged as an effective method for novel-view synthesis and 3D scene reconstruction. However, conventional training methods require access to all training views during scene optimization. This assumption may be prohibitive in continual learning scenarios, where new data is acquired in a sequential manner and a continuous update of the NeRF is desired, as in automotive or remote sensing applications. When naively trained in such a continual setting, traditional scene representation frameworks suffer from catastrophic forgetting, where previously learned knowledge is corrupted after training on new data. Prior work on alleviating forgetting in NeRFs suffers from low reconstruction quality and high latency, making it impractical for real-world applications. We propose a continual learning framework for training NeRFs that combines replay-based methods with a hybrid explicit--implicit scene representation. Our method outperforms previous approaches in reconstruction quality when trained in a continual setting, while being an order of magnitude faster.
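To make the "hybrid explicit--implicit scene representation" concrete, here is a minimal NumPy sketch of the general idea: a learnable feature grid (the explicit part) queried by trilinear interpolation, followed by a tiny MLP decoder (the implicit part). All resolutions, feature dimensions, and layer sizes below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

# Illustrative sketch (NOT the authors' implementation) of a hybrid
# explicit--implicit representation: explicit grid features + tiny MLP.
rng = np.random.default_rng(0)
RES, FDIM = 16, 4                                   # grid resolution, feature dim (assumed)
grid = rng.normal(0, 0.1, (RES, RES, RES, FDIM))    # explicit, trainable features
W1 = rng.normal(0, 0.1, (FDIM, 16))                 # small implicit decoder
W2 = rng.normal(0, 0.1, (16, 4))                    # outputs (r, g, b, sigma)

def interp_features(x):
    """Trilinearly interpolate grid features at a point x in [0, 1)^3."""
    p = x * (RES - 1)
    i0 = np.floor(p).astype(int)
    i1 = np.minimum(i0 + 1, RES - 1)
    t = p - i0                                      # fractional offsets
    f = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((t[0] if dx else 1 - t[0])
                     * (t[1] if dy else 1 - t[1])
                     * (t[2] if dz else 1 - t[2]))
                idx = (i1[0] if dx else i0[0],
                       i1[1] if dy else i0[1],
                       i1[2] if dz else i0[2])
                f = f + w * grid[idx]
    return f

def query(x):
    """Decode interpolated features into colour and density."""
    h = np.maximum(interp_features(x) @ W1, 0.0)    # ReLU hidden layer
    return h @ W2                                   # shape (4,): rgb + sigma

out = query(np.array([0.3, 0.7, 0.5]))
```

The intuition is that most scene capacity lives in the fast-to-query explicit grid, so the decoder MLP can stay small, which is what makes training and replay queries cheap.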
Overview of our method. The scene representation is trained on sequentially acquired views. After each stage of training, a frozen copy of the scene parameters is stored. While optimizing on the next set of incoming images, the frozen network is queried to obtain pseudo ground truth values. The current network is trained with a mixed objective that minimizes the photometric loss with respect to ground truth images from the current task and pseudo ground truth values for previous tasks.
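The mixed objective described above can be sketched in a few lines. The toy linear "renderer" below stands in for a full NeRF, and all names and values are illustrative assumptions; the point is only the structure of the loss: a photometric term on the current task's real ground truth plus a replay term supervised by the frozen copy's pseudo ground truth.

```python
import numpy as np

def render(params, rays):
    """Toy stand-in for NeRF rendering: a per-ray linear model."""
    return rays * params[0] + params[1]

def mixed_loss(params, frozen_params, cur_rays, cur_gt, old_rays):
    """Photometric loss on current task + replay loss on previous tasks."""
    pseudo_gt = render(frozen_params, old_rays)          # query frozen copy
    cur = np.mean((render(params, cur_rays) - cur_gt) ** 2)
    old = np.mean((render(params, old_rays) - pseudo_gt) ** 2)
    return cur + old

rng = np.random.default_rng(1)
frozen = np.array([2.0, -1.0])        # frozen parameters after the previous task
params = frozen.copy()                # current network starts from the frozen copy

cur_rays = rng.uniform(0, 1, 8)
cur_gt = 3.0 * cur_rays + 0.5         # ground truth for the new task (toy data)
old_rays = rng.uniform(0, 1, 8)       # replayed rays from earlier tasks

loss = mixed_loss(params, frozen, cur_rays, cur_gt, old_rays)
```

Note that when the current parameters equal the frozen copy, the replay term is exactly zero, so gradient updates are driven by the new task until they would disturb previously learned regions.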
Reconstructed views from a previously supervised (forgotten) task across different methods. Our method consistently outperforms all other baselines in visual quality, retaining high-frequency details for earlier tasks through the use of explicit features, while being an order of magnitude faster than the closest-performing baseline (MEIL-NeRF).
Reconstructed views from an earlier supervised (forgotten) task for our method and MEIL-NeRF, each trained with a fixed time budget per task. Our method consistently outperforms MEIL-NeRF given an equal time budget. With only 5 s per task, our method already reconstructs the scene with reasonable fidelity, illustrating that it is well suited for real-time continual scene fitting.
@inproceedings{Po2023InstantCL,
  title={Instant Continual Learning of Neural Radiance Fields},
  author={Ryan Po and Zhengyang Dong and Alexander W. Bergman and Gordon Wetzstein},
  year={2023},
  url={https://api.semanticscholar.org/CorpusID:261531118}
}