What is a Gaussian Splat (i.e., a 3D Gaussian Function)?
A Gaussian splat is a 3D ellipsoid defined by:
- Position (x, y, z) - The center point of the splat in 3D space
- Covariance matrix - Defines the size, orientation, and shape of the ellipsoid
- Color (RGB) - The color contribution of the splat
- Opacity (α) - How transparent or opaque the splat is
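The four attributes above can be sketched as a small data structure. This is a minimal illustration, not a reference implementation: the class name `Splat`, the helper `quat_to_rotation`, and the choice to store the covariance as a per-axis scale plus a rotation quaternion are assumptions made for the example. Under that assumed parameterization, the covariance is Σ = R · diag(s²) · Rᵀ, which is always symmetric and describes a valid ellipsoid.

```python
import math
from dataclasses import dataclass


def quat_to_rotation(q):
    """Convert a quaternion (w, x, y, z) into a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n  # normalize to a unit quaternion
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


@dataclass
class Splat:
    position: tuple   # (x, y, z) center of the splat
    scale: tuple      # assumed: standard deviation along each local axis
    rotation: tuple   # assumed: orientation as a quaternion (w, x, y, z)
    color: tuple      # (r, g, b), each in [0, 1]
    opacity: float    # alpha in [0, 1]

    def covariance(self):
        """Build the 3x3 covariance: Sigma = R * diag(scale^2) * R^T."""
        R = quat_to_rotation(self.rotation)
        return [
            [sum(R[i][k] * self.scale[k] ** 2 * R[j][k] for k in range(3))
             for j in range(3)]
            for i in range(3)
        ]


# Usage: an unrotated splat stretched along the z axis.
splat = Splat((0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (1.0, 0.0, 0.0, 0.0),
              (1.0, 0.5, 0.0), 0.8)
sigma = splat.covariance()
```

With the identity rotation, the covariance is diagonal and holds the squared scales; rotating the splat mixes those values across axes without changing the shape of the ellipsoid.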
Each splat can be thought of as a small, semi-transparent cloud of color that contributes to the final rendered image. When millions of these splats are combined, they create detailed 3D scenes.
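To make "contributes to the final rendered image" concrete, here is a hedged sketch of how overlapping semi-transparent contributions could be blended for a single pixel. It assumes standard front-to-back alpha compositing over splats already sorted nearest-first; the function name `composite` and the early-exit threshold are illustrative choices, not details taken from the text above.

```python
def composite(contributions):
    """Blend (color, alpha) pairs for one pixel, ordered nearest first.

    Returns the blended RGB color and the remaining transmittance,
    i.e. the fraction of light that passes through every splat.
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # nothing has blocked the pixel yet
    for color, alpha in contributions:
        weight = alpha * transmittance  # share of the pixel this splat claims
        for channel in range(3):
            pixel[channel] += weight * color[channel]
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:  # illustrative cutoff: pixel is effectively opaque
            break
    return pixel, transmittance


# Usage: a half-transparent red splat in front of a half-transparent blue one.
pixel, remaining = composite([((1.0, 0.0, 0.0), 0.5), ((0.0, 0.0, 1.0), 0.5)])
```

In this example the red splat claims half of the pixel and the blue splat claims half of what remains, so nearer splats dominate the result.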
Gaussian functions are chosen because they have several advantageous properties:
- Smooth falloff - They fade smoothly from the center to the edges
- Differentiable - Essential for optimization during training
- Efficient rendering - Can be rasterized quickly on modern GPUs
- Compact representation - Each splat requires only a small amount of data
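The first two properties can be illustrated with a few lines of code. The sketch below assumes an axis-aligned splat (a diagonal covariance given by per-axis standard deviations) to keep the math short; `gaussian_weight` and `gaussian_weight_grad_center` are hypothetical helper names. The weight is exp(-0.5 · d²), where d is the distance from the center measured in standard deviations, and its gradient with respect to the center has a simple closed form.

```python
import math


def gaussian_weight(point, center, sigma):
    """Unnormalized Gaussian falloff: 1.0 at the center, fading smoothly outward."""
    d2 = sum(((p - c) / s) ** 2 for p, c, s in zip(point, center, sigma))
    return math.exp(-0.5 * d2)


def gaussian_weight_grad_center(point, center, sigma):
    """Analytic gradient of the weight with respect to the splat center."""
    w = gaussian_weight(point, center, sigma)
    return [w * (p - c) / (s * s) for p, c, s in zip(point, center, sigma)]


# Usage: sample the falloff along the x axis of a unit-sized splat.
center, sigma = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
falloff = [gaussian_weight((x, 0.0, 0.0), center, sigma) for x in (0.0, 1.0, 2.0, 3.0)]
```

The sampled values decrease steadily and never jump, and the closed-form gradient is what lets an optimizer nudge a splat's position toward a better fit.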
Creating 3D Gaussian Splats
The process of turning multiple 2D images into 3D Gaussian splats can be broken down into two main steps:
- Structure from Motion (SfM) - Input images are analyzed to identify key points in the scene, estimate the camera pose of each image, and create an initial 3D point cloud.
- Training - The algorithm performs the following over many thousands of iterations:
  - Adds splats in areas that need more detail
  - Removes splats that don’t contribute significantly
  - Adjusts splat parameters to better match the input images
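The add and remove operations can be sketched as a single "density control" pass that runs periodically during training. This is a simplified illustration under stated assumptions: the `TrainableSplat` class, the `grad_norm` field (an accumulated measure of how strongly the optimizer wants to move a splat), and both threshold values are hypothetical. The parameter-adjustment step is omitted because it requires a differentiable renderer to produce gradients.

```python
from dataclasses import dataclass, replace


@dataclass
class TrainableSplat:
    position: tuple
    scale: float
    opacity: float
    grad_norm: float  # hypothetical: accumulated position-gradient magnitude


def density_control(splats, min_opacity=0.005, grad_threshold=0.0002):
    """Prune nearly invisible splats and duplicate those in under-fitted areas.

    Threshold values are illustrative defaults, not prescribed settings.
    """
    result = []
    for splat in splats:
        if splat.opacity < min_opacity:
            continue  # remove: contributes almost nothing to any pixel
        result.append(splat)
        if splat.grad_norm > grad_threshold:
            # add: a large gradient suggests this region needs more detail,
            # so place a copy here; later optimization moves the two apart
            result.append(replace(splat, grad_norm=0.0))
    return result


# Usage: one faint splat, one under-fitted splat, one well-fitted splat.
splats = [
    TrainableSplat((0.0, 0.0, 0.0), 1.0, 0.001, 0.0),
    TrainableSplat((1.0, 0.0, 0.0), 1.0, 0.8, 0.001),
    TrainableSplat((2.0, 0.0, 0.0), 1.0, 0.8, 0.0),
]
splats = density_control(splats)
```

Here the faint splat is dropped and the under-fitted one is duplicated, so the total count stays at three while the detail shifts toward the region that needs it.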