NeSF: Neural Semantic Fields for Generalizable Semantic Segmentation of 3D Scenes
Abstract
We present NeSF, a method for producing 3D semantic fields from pre-trained density fields and sparse 2D semantic supervision.
Our method side-steps traditional scene representations by building on implicit neural representations, in which 3D structure is captured by point-wise neural fields.
Despite being supervised by 2D signals alone, our method generates 3D-consistent semantic maps from novel camera poses and can be queried at arbitrary 3D points.
Notably, NeSF is compatible with any method that produces a density field, and its accuracy improves as the quality of the pre-trained density field improves.
Our empirical analysis demonstrates quality comparable to competitive 2D and 3D semantic segmentation baselines on complex, realistically rendered synthetic scenes, while also offering capabilities unavailable to existing methods.
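
To make the 2D supervision concrete, the following is a minimal sketch (not the paper's implementation; the function and variable names here are hypothetical) of how per-point semantic logits can be alpha-composited along a camera ray using weights derived from the frozen, pre-trained density field. The rendered 2D logits can then be compared against ground-truth segmentation maps with a standard cross-entropy loss:

    import numpy as np

    def render_semantics(density, logits, deltas):
        # density: (n_samples,) sigma values queried from the frozen density field
        # logits:  (n_samples, n_classes) semantic logits predicted at the same points
        # deltas:  (n_samples,) distances between adjacent samples along the ray
        alpha = 1.0 - np.exp(-density * deltas)                         # per-sample opacity
        trans = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1])))  # transmittance to each sample
        weights = trans * alpha                                         # NeRF-style compositing weights
        return (weights[:, None] * logits).sum(axis=0)                  # rendered 2D semantic logits

Under this compositing scheme, gradients from the 2D segmentation loss flow only into the semantic branch while the density field stays fixed, which is consistent with the claim above that NeSF is compatible with any method producing a density field.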