NeLU3D: Neural Inverse Structured Light Scanning without Modeling the Projector

1New York University, Courant Institute of Mathematical Sciences, 2New York University, Tandon School of Engineering, 3New York University, Center for Data Science
Preprint

Neural LookUp3D (NeLU3D) is a neural inverse structured light scanning method without explicit projector modeling.

Abstract

We propose NeLU3D, a neural inverse structured light method that forgoes explicit modeling of the projector. With as few as four monochromatic images (or two RGB images), our approach extracts accurate surfaces and normal orientations using arbitrary light sources, such as extremely low-cost projectors or high-speed analog projectors with a fixed pattern for slow-motion capture. We show scans of over twenty-five objects of different shapes, sizes, reflectances, and textures to demonstrate the feasibility and quality of our method in a handful of projector-camera configurations. We also showcase results with sub-millimeter accuracy using suboptimal patterns, where previous methods either fail or recover 3D shapes covered by noisy outliers.

NeLU3D Pipeline

Our goal is to recover a surface from captured structured light images of a scene without explicitly modeling the light source. We describe below how we bypass the explicit calibration of the light source and then use inverse rendering to reconstruct a surface.
NeLU3D Pipeline.

NeLU3D Calibration

We move a calibration plane across a volume while projecting a structured light pattern onto it. Because the camera is calibrated, we can cast camera rays into the calibration volume and train a color MLP to predict, given a 3D coordinate, the color intensity of the pattern at that point.
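As an illustration of how training pairs for the color MLP could be gathered, the sketch below back-projects pixels through a calibrated pinhole camera and intersects the resulting rays with one pose of the calibration plane. The function name `backproject_to_plane`, the plane parameterization (normal `n` and offset `d` with n·X = d in the camera frame), and the intrinsics values are assumptions made for illustration and are not taken from the NeLU3D implementation. Lens distortion is ignored.

```python
import numpy as np

def backproject_to_plane(pixels, K, plane_normal, plane_offset):
    """Intersect camera rays through `pixels` with the plane n . X = d.

    pixels:       (N, 2) pixel coordinates (u, v)
    K:            (3, 3) pinhole intrinsics (assumed known from camera calibration)
    plane_normal: (3,) plane normal n, expressed in the camera frame
    plane_offset: scalar d such that n . X = d for points X on the plane
    Returns an (N, 3) array of 3D points in the camera frame.
    """
    pixels = np.asarray(pixels, dtype=float)
    ones = np.ones((pixels.shape[0], 1))
    homog = np.hstack([pixels, ones])          # (N, 3) homogeneous pixels
    dirs = homog @ np.linalg.inv(K).T          # ray directions with z = 1
    denom = dirs @ plane_normal                # n . direction for each ray
    t = plane_offset / denom                   # ray parameter at the plane
    return dirs * t[:, None]

# Hypothetical intrinsics and a fronto-parallel plane one unit from the camera.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
points = backproject_to_plane([[320, 240], [820, 240]], K,
                              np.array([0.0, 0.0, 1.0]), 1.0)
```

Each returned 3D point, paired with the intensity observed at its pixel, would give one (position, color) sample. Repeating this for every plane pose would fill the calibration volume with samples for the color MLP.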

NeLU3D Reconstruction

As described above, the trained color MLP replaces explicit light source calibration; its weights remain frozen during reconstruction. For reconstruction, we cast the problem as inverse rendering by fitting scene geometry, represented by an MLP mapping 3D points to a signed distance function (SDF), and scene appearance, represented by an MLP mapping 2D image coordinates to global illumination contributions. The two MLPs are optimized to explain the captured images via volumetric rendering. At the end, we extract the surface as the zero level set of the SDF.
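To make the rendering step more concrete, here is a minimal sketch of volumetric rendering along a single camera ray. The text does not state which SDF-to-density conversion is used, so the Laplace-CDF density below (in the style of VolSDF) is an assumption. The names `sdf_to_density` and `render_ray`, the sharpness parameter `beta`, and the additive per-pixel `ambient` term standing in for the global illumination contribution are likewise hypothetical. In the actual method the SDF values and pattern intensities would come from the geometry MLP and the frozen color MLP; here they are plain arrays.

```python
import numpy as np

def sdf_to_density(sdf, beta=0.01):
    """Assumed Laplace-CDF density: close to 1/beta inside the surface, ~0 outside."""
    s = -np.asarray(sdf, dtype=float)
    e = 0.5 * np.exp(-np.abs(s) / beta)        # never overflows
    cdf = np.where(s <= 0, e, 1.0 - e)         # Laplace CDF evaluated at -sdf
    return cdf / beta

def render_ray(sdf_vals, pattern_vals, t_vals, ambient=0.0, beta=0.01):
    """Alpha-composite pattern intensities along one ray.

    sdf_vals:     (N,) SDF at the ray samples (geometry MLP output in practice)
    pattern_vals: (N,) pattern intensity at the samples (frozen color MLP output)
    t_vals:       (N,) increasing sample depths along the ray
    ambient:      scalar per-pixel offset (assumed form of the global illumination term)
    """
    deltas = np.diff(t_vals)                             # (N-1,) segment lengths
    sigma = sdf_to_density(sdf_vals, beta)[:-1]
    alpha = 1.0 - np.exp(-sigma * deltas)                # opacity per segment
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans                              # contribution per segment
    color = np.sum(weights * np.asarray(pattern_vals)[:-1]) + ambient
    depth = np.sum(weights * t_vals[:-1])                # expected hit depth
    return color, depth, weights

# Toy ray that hits a surface at depth 1.0, with a constant pattern intensity.
t = np.linspace(0.5, 1.5, 201)
color, depth, weights = render_ray(1.0 - t, np.full_like(t, 0.7), t, ambient=0.1)
```

In an optimization loop, the rendered `color` for each pixel and pattern would be compared against the captured image, and the gradients would update only the geometry and appearance MLPs. The weights concentrate near the zero crossing of the SDF, which is why the final surface can be read off as its zero level set.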

Results

Some results using NeLU3D with as few as three monochromatic patterns.

Acknowledgments