Neural radiance fields (NeRF) have emerged as a prominent method for synthesizing realistic images of novel views. Neural radiance representations based on voxels or on meshes each offer a distinct advantage, excelling in either rendering quality or rendering speed, but each is limited in the other respect. In response, we propose Vosh, a hybrid representation that combines voxel and mesh components in hybrid rendering for view synthesis. Vosh is constructed by optimizing the voxel grid of a NeRF and selectively converting a portion of the volumetric density field into a surface mesh. As a result, it renders regions with simple geometry and texture quickly through its mesh component, while preserving high-quality rendering in intricate regions through its voxel component. The hybrid ratio of Vosh is adjustable, which lets users control the balance between rendering quality and speed according to their needs. Experimental results demonstrate that our method achieves a favorable trade-off between rendering quality and speed, and notably reaches real-time performance on mobile devices.
An overview of the proposed method. The training phase starts with grid training to obtain the initial voxels. Then, a portion of the voxels is converted to mesh in the voxels-to-mesh conversion. Subsequently, the combination of voxels and mesh is optimized through hybrid rendering and voxel adjustment to obtain the final hybrid representation, Vosh. The inference phase performs real-time hybrid rendering with Vosh, even on mobile phones.
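To make the idea of hybrid rendering more concrete, the following is a minimal single-ray sketch, not the authors' implementation. It assumes that the mesh is fully opaque, that voxel samples are uniformly spaced along the ray, and that voxel contributions are accumulated with standard front-to-back alpha compositing until the ray reaches the nearest mesh surface. The function name `composite_hybrid_ray`, its parameters, and the default step size are hypothetical and chosen only for illustration.

```python
import math


def composite_hybrid_ray(samples, mesh_hit=None, step=0.1):
    """Composite voxel samples front to back, stopping at an opaque mesh hit.

    samples:  list of (depth, density, (r, g, b)), sorted by increasing depth.
    mesh_hit: optional (depth, (r, g, b)) for the nearest mesh intersection.
    step:     spacing between voxel samples (assumed uniform; hypothetical).
    Returns the accumulated color and the remaining transmittance.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0

    for depth, density, rgb in samples:
        # Assumption: the mesh is opaque, so samples behind it are occluded.
        if mesh_hit is not None and depth >= mesh_hit[0]:
            break
        # Standard volume-rendering opacity for one sample interval.
        alpha = 1.0 - math.exp(-density * step)
        weight = transmittance * alpha
        for c in range(3):
            color[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha

    if mesh_hit is not None:
        # The surface receives whatever light the voxels in front let through.
        for c in range(3):
            color[c] += transmittance * mesh_hit[1][c]
        transmittance = 0.0

    return tuple(color), transmittance


if __name__ == "__main__":
    # One half-transparent green voxel sample in front of a red mesh surface.
    half_opaque_density = math.log(2.0) / 0.1
    voxel_samples = [(0.5, half_opaque_density, (0.0, 1.0, 0.0))]
    print(composite_hybrid_ray(voxel_samples, mesh_hit=(1.0, (1.0, 0.0, 0.0))))
```

Under these assumptions, a ray that passes only through empty space before hitting the mesh costs a single surface lookup, whereas a ray through an intricate region accumulates many voxel samples. This is one way to read the quality and speed trade-off that the hybrid ratio controls.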