Init Release

This commit is contained in:
JeffreyXiang
2025-12-09 09:21:05 +00:00
parent 7f9ccee603
commit de38fdd408
235 changed files with 21120 additions and 0 deletions

3
.gitmodules vendored Normal file

@@ -0,0 +1,3 @@
[submodule "o-voxel/third_party/eigen"]
path = o-voxel/third_party/eigen
url = https://gitlab.com/libeigen/eigen.git

21
LICENSE Normal file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) Microsoft Corporation.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

222
README.md Normal file

@@ -0,0 +1,222 @@
![](assets/teaser.webp)
# Native and Compact Structured Latents for 3D Generation
<a href="https://microsoft.github.io/trellis.2"><img src="https://img.shields.io/badge/Paper-Arxiv-b31b1b.svg" alt="Paper"></a>
<a href="https://huggingface.co/microsoft/TRELLIS.2-4B"><img src="https://img.shields.io/badge/Hugging%20Face-Model-yellow" alt="Hugging Face"></a>
<a href="https://microsoft.github.io/trellis.2"><img src="https://img.shields.io/badge/Project-Website-blue" alt="Project Page"></a>
<a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-green" alt="License"></a>
https://github.com/user-attachments/assets/5ee056e4-73a9-4fd8-bf60-59cae90d3dfc
*(Compressed version due to GitHub size limits. See the full-quality video on our project page!)*
**TRELLIS.2** is a state-of-the-art large 3D generative model (4B parameters) designed for high-fidelity **image-to-3D** generation. It leverages a novel "field-free" sparse voxel structure termed **O-Voxel** to reconstruct and generate arbitrary 3D assets with complex topologies, sharp features, and full PBR materials.
## ✨ Features
### 1. High Quality, Resolution & Efficiency
Our 4B-parameter model generates high-resolution fully textured assets with exceptional fidelity and efficiency using vanilla DiTs. It utilizes a Sparse 3D VAE with 16× spatial downsampling to encode assets into a compact latent space.
| Resolution | Total Time* | Breakdown (Shape + Mat) |
| :--- | :--- | :--- |
| **512³** | **~3s** | 2s + 1s |
| **1024³** | **~17s** | 10s + 7s |
| **1536³** | **~60s** | 35s + 25s |
<small>*Tested on NVIDIA H100 GPU.</small>
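As a back-of-envelope illustration of the 16× spatial downsampling mentioned above (the dense-grid arithmetic below is only an upper-bound sketch; the actual latent is sparse and much smaller):

```python
# Sketch: a 16x spatial downsampling maps an N^3 voxel grid to an (N/16)^3
# latent grid. Real assets occupy a sparse subset of the grid, so actual
# latent sizes are far below these dense upper bounds.
DOWNSAMPLE = 16

for res in (512, 1024, 1536):
    latent_res = res // DOWNSAMPLE
    print(f"{res}^3 voxels -> {latent_res}^3 latent grid "
          f"({latent_res ** 3:,} cells if dense)")
```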
### 2. Arbitrary Topology Handling
The **O-Voxel** representation breaks the limits of iso-surface fields. It robustly handles complex structures without lossy conversion:
* **Open Surfaces** (e.g., clothing, leaves)
* **Non-manifold Geometry**
* **Internal Enclosed Structures**
### 3. Rich Texture Modeling
Beyond basic colors, TRELLIS.2 models arbitrary surface attributes including **Base Color, Roughness, Metallic, and Opacity**, enabling photorealistic rendering and transparency support.
### 4. Minimalist Processing
Data processing is streamlined for instant conversions that are fully **rendering-free** and **optimization-free**.
* **< 10s** (Single CPU): Textured Mesh → O-Voxel
* **< 100ms** (CUDA): O-Voxel → Textured Mesh
## 🗺️ Roadmap
- [x] Paper release
- [x] Release image-to-3D inference code
- [x] Release pretrained checkpoints (4B)
- [x] Hugging Face Spaces demo
- [ ] Release shape-conditioned texture generation inference code (current schedule: before 12/24/2025)
- [ ] Release training code (current schedule: before 12/31/2025)
## 🛠️ Installation
### Prerequisites
- **System**: The code is currently tested only on **Linux**.
- **Hardware**: An NVIDIA GPU with at least 24GB of memory is necessary. The code has been verified on NVIDIA A100 and H100 GPUs.
- **Software**:
- The [CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit-archive) is needed to compile certain packages. Recommended version is 12.4.
- [Conda](https://docs.anaconda.com/miniconda/install/#quick-command-line-install) is recommended for managing dependencies.
- Python version 3.8 or higher is required.
### Installation Steps
1. Clone the repo:
```sh
git clone -b main https://github.com/microsoft/TRELLIS.2.git --recursive
cd TRELLIS.2
```
2. Install the dependencies:
**Before running the following command, there are some things to note:**
- By adding `--new-env`, a new conda environment named `trellis2` will be created. If you want to use an existing conda environment, please remove this flag.
- By default the `trellis2` environment will use pytorch 2.6.0 with CUDA 12.4. If you want to use a different version of CUDA, you can remove the `--new-env` flag and manually install the required dependencies. Refer to [PyTorch](https://pytorch.org/get-started/previous-versions/) for the installation command.
- If you have multiple CUDA Toolkit versions installed, `CUDA_HOME` should be set to the correct version before running the command. For example, if you have CUDA Toolkit 12.4 and 13.0 installed, you can run `export CUDA_HOME=/usr/local/cuda-12.4` before running the command.
- By default, the code uses the `flash-attn` backend for attention. For GPUs that do not support `flash-attn` (e.g., NVIDIA V100), you can install `xformers` manually and set the `ATTN_BACKEND` environment variable to `xformers` before running the code. See the [Minimal Example](#minimal-example) for more details.
- The installation may take a while due to the large number of dependencies. Please be patient. If it fails partway, you can try installing the dependencies one by one, specifying one flag at a time.
- If you encounter any issues during the installation, feel free to open an issue or contact us.
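For reference, the two environment variables mentioned in the notes above can be set like this (the CUDA path and version are examples; adjust them to your system):

```sh
# Example environment overrides (values are illustrative):
export CUDA_HOME=/usr/local/cuda-12.4   # select a specific CUDA Toolkit
export ATTN_BACKEND=xformers            # fallback backend for GPUs without flash-attn support
```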
Create a new conda environment named `trellis2` and install the dependencies:
```sh
. ./setup.sh --new-env --basic --flash-attn --nvdiffrast --nvdiffrec --cumesh --o-voxel --flexgemm
```
The detailed usage of `setup.sh` can be found by running `. ./setup.sh --help`.
```sh
Usage: setup.sh [OPTIONS]
Options:
-h, --help Display this help message
--new-env Create a new conda environment
--basic Install basic dependencies
--flash-attn Install flash-attention
--cumesh Install cumesh
--o-voxel Install o-voxel
--flexgemm Install flexgemm
--nvdiffrast Install nvdiffrast
--nvdiffrec Install nvdiffrec
```
## 📦 Pretrained Weights
The pretrained model **TRELLIS.2-4B** is available on Hugging Face. Please refer to the model card there for more details.
| Model | Parameters | Resolution | Link |
| :--- | :--- | :--- | :--- |
| **TRELLIS.2-4B** | 4 Billion | 512³ - 1536³ | [Hugging Face](https://huggingface.co/microsoft/TRELLIS.2-4B) |
## 🚀 Usage
### 1. Image to 3D Generation
#### Minimal Example
Here is an [example](example.py) of how to use the pretrained models for 3D asset generation.
```python
import os
os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"  # can save GPU memory
import cv2
import imageio
from PIL import Image
import torch
from trellis2.pipelines import Trellis2ImageTo3DPipeline
from trellis2.utils import render_utils
from trellis2.renderers import EnvMap
import o_voxel

# 1. Set up the environment map
envmap = EnvMap(torch.tensor(
    cv2.cvtColor(cv2.imread('assets/hdri/forest.exr', cv2.IMREAD_UNCHANGED), cv2.COLOR_BGR2RGB),
    dtype=torch.float32, device='cuda'
))

# 2. Load the pipeline
pipeline = Trellis2ImageTo3DPipeline.from_pretrained("microsoft/TRELLIS.2-4B")
pipeline.cuda()

# 3. Load the image & run
image = Image.open("assets/example_image/T.png")
mesh = pipeline.run(image)[0]
mesh.simplify(16777216)  # nvdiffrast limit

# 4. Render a video
video = render_utils.make_pbr_vis_frames(render_utils.render_video(mesh, envmap=envmap))
imageio.mimsave("sample.mp4", video, fps=15)

# 5. Export to GLB
glb = o_voxel.postprocess.to_glb(
    vertices=mesh.vertices,
    faces=mesh.faces,
    attr_volume=mesh.attrs,
    coords=mesh.coords,
    attr_layout=mesh.layout,
    voxel_size=mesh.voxel_size,
    aabb=[[-0.5, -0.5, -0.5], [0.5, 0.5, 0.5]],
    decimation_target=1000000,
    texture_size=4096,
    remesh=True,
    remesh_band=1,
    remesh_project=0,
    verbose=True,
)
glb.export("sample.glb", extension_webp=True)
```
Upon execution, the script generates the following files:
- `sample.mp4`: A video visualizing the generated 3D asset with PBR materials and environmental lighting.
- `sample.glb`: The extracted PBR-ready 3D asset in GLB format.
**Note:** The `.glb` file is exported in `OPAQUE` mode by default. Although the alpha channel is preserved within the texture map, it is not active initially. To enable transparency, import the asset into your 3D software and manually connect the texture's alpha channel to the material's opacity or alpha input.
#### Web Demo
[app.py](app.py) provides a simple web demo for image to 3D asset generation. You can run the demo with the following command:
```sh
python app.py
```
Then, you can access the demo at the address shown in the terminal.
### 2. PBR Texture Generation
Will be released soon. Please stay tuned!
## 🧩 Related Packages
TRELLIS.2 is built upon several specialized high-performance packages developed by our team:
* **[O-Voxel](o-voxel):**
Core library handling the logic for converting between textured meshes and the O-Voxel representation, ensuring instant bidirectional transformation.
* **[FlexGEMM](https://github.com/JeffreyXiang/FlexGEMM):**
Efficient sparse convolution implementation based on Triton, enabling rapid processing of sparse voxel structures.
* **[CuMesh](https://github.com/JeffreyXiang/CuMesh):**
CUDA-accelerated mesh utilities used for high-speed post-processing, remeshing, decimation, and UV-unwrapping.
## ⚖️ License
This model and code are released under the **[MIT License](LICENSE)**.
Please note that certain dependencies operate under separate license terms:
- [**nvdiffrast**](https://github.com/NVlabs/nvdiffrast): Utilized for rendering generated 3D assets. This package is governed by its own [License](https://github.com/NVlabs/nvdiffrast/blob/main/LICENSE.txt).
- [**nvdiffrec**](https://github.com/NVlabs/nvdiffrec): Implements the split-sum renderer for PBR materials. This package is governed by its own [License](https://github.com/NVlabs/nvdiffrec/blob/main/LICENSE.txt).
## 📚 Citation
If you find this model useful for your research, please cite our work:
```bibtex
@article{
xiang2025trellis2,
title={Native and Compact Structured Latents for 3D Generation},
author={Xiang, Jianfeng and Chen, Xiaoxue and Xu, Sicheng and Wang, Ruicheng and Lv, Zelong and Deng, Yu and Zhu, Hongyuan and Dong, Yue and Zhao, Hao and Yuan, Nicholas Jing and Yang, Jiaolong},
journal={Tech report},
year={2025}
}
```

14
SECURITY.md Normal file

@@ -0,0 +1,14 @@
<!-- BEGIN MICROSOFT SECURITY.MD V1.0.0 BLOCK -->
## Security
Microsoft takes the security of our software products and services seriously, which
includes all source code repositories in our GitHub organizations.
**Please do not report security vulnerabilities through public GitHub issues.**
For security reporting information, locations, contact information, and policies,
please review the latest guidance for Microsoft repositories at
[https://aka.ms/SECURITY.md](https://aka.ms/SECURITY.md).
<!-- END MICROSOFT SECURITY.MD BLOCK -->

645
app.py Normal file

@@ -0,0 +1,645 @@
import gradio as gr
import os
os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
from datetime import datetime
import shutil
import cv2
from typing import Tuple
import torch
import numpy as np
from PIL import Image
import base64
import io
from trellis2.modules.sparse import SparseTensor
from trellis2.pipelines import Trellis2ImageTo3DPipeline
from trellis2.renderers import EnvMap
from trellis2.utils import render_utils
import o_voxel
MAX_SEED = np.iinfo(np.int32).max
TMP_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'tmp')
MODES = [
{"name": "Normal", "icon": "assets/app/normal.png", "render_key": "normal"},
{"name": "Clay render", "icon": "assets/app/clay.png", "render_key": "clay"},
{"name": "Base color", "icon": "assets/app/basecolor.png", "render_key": "base_color"},
{"name": "HDRI forest", "icon": "assets/app/hdri_forest.png", "render_key": "shaded_forest"},
{"name": "HDRI sunset", "icon": "assets/app/hdri_sunset.png", "render_key": "shaded_sunset"},
{"name": "HDRI courtyard", "icon": "assets/app/hdri_courtyard.png", "render_key": "shaded_courtyard"},
]
STEPS = 8
DEFAULT_MODE = 3
DEFAULT_STEP = 3
css = """
/* Overwrite Gradio Default Style */
.stepper-wrapper {
padding: 0;
}
.stepper-container {
padding: 0;
align-items: center;
}
.step-button {
flex-direction: row;
}
.step-connector {
transform: none;
}
.step-number {
width: 16px;
height: 16px;
}
.step-label {
position: relative;
bottom: 0;
}
.wrap.center.full {
inset: 0;
height: 100%;
}
.wrap.center.full.translucent {
background: var(--block-background-fill);
}
.meta-text-center {
display: block !important;
position: absolute !important;
top: unset !important;
bottom: 0 !important;
right: 0 !important;
transform: unset !important;
}
/* Previewer */
.previewer-container {
position: relative;
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif;
width: 100%;
height: 722px;
margin: 0 auto;
padding: 20px;
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
}
.previewer-container .tips-icon {
position: absolute;
right: 10px;
top: 10px;
z-index: 10;
border-radius: 10px;
color: #fff;
background-color: var(--color-accent);
padding: 3px 6px;
user-select: none;
}
.previewer-container .tips-text {
position: absolute;
right: 10px;
top: 50px;
color: #fff;
background-color: var(--color-accent);
border-radius: 10px;
padding: 6px;
text-align: left;
max-width: 300px;
z-index: 10;
transition: all 0.3s;
opacity: 0%;
user-select: none;
}
.previewer-container .tips-text p {
font-size: 14px;
line-height: 1.2;
}
.tips-icon:hover + .tips-text {
display: block;
opacity: 100%;
}
/* Row 1: Display Modes */
.previewer-container .mode-row {
width: 100%;
display: flex;
gap: 8px;
justify-content: center;
margin-bottom: 20px;
flex-wrap: wrap;
}
.previewer-container .mode-btn {
width: 24px;
height: 24px;
border-radius: 50%;
cursor: pointer;
opacity: 0.5;
transition: all 0.2s;
border: 2px solid #ddd;
object-fit: cover;
}
.previewer-container .mode-btn:hover { opacity: 0.9; transform: scale(1.1); }
.previewer-container .mode-btn.active {
opacity: 1;
border-color: var(--color-accent);
transform: scale(1.1);
}
/* Row 2: Display Image */
.previewer-container .display-row {
margin-bottom: 20px;
min-height: 400px;
width: 100%;
flex-grow: 1;
display: flex;
justify-content: center;
align-items: center;
}
.previewer-container .previewer-main-image {
max-width: 100%;
max-height: 100%;
flex-grow: 1;
object-fit: contain;
display: none;
}
.previewer-container .previewer-main-image.visible {
display: block;
}
/* Row 3: Custom HTML Slider */
.previewer-container .slider-row {
width: 100%;
display: flex;
flex-direction: column;
align-items: center;
gap: 10px;
padding: 0 10px;
}
.previewer-container input[type=range] {
-webkit-appearance: none;
width: 100%;
max-width: 400px;
background: transparent;
}
.previewer-container input[type=range]::-webkit-slider-runnable-track {
width: 100%;
height: 8px;
cursor: pointer;
background: #ddd;
border-radius: 5px;
}
.previewer-container input[type=range]::-webkit-slider-thumb {
height: 20px;
width: 20px;
border-radius: 50%;
background: var(--color-accent);
cursor: pointer;
-webkit-appearance: none;
margin-top: -6px;
box-shadow: 0 2px 5px rgba(0,0,0,0.2);
transition: transform 0.1s;
}
.previewer-container input[type=range]::-webkit-slider-thumb:hover {
transform: scale(1.2);
}
/* Overwrite Previewer Block Style */
.gradio-container .padded:has(.previewer-container) {
padding: 0 !important;
}
.gradio-container:has(.previewer-container) [data-testid="block-label"] {
position: absolute;
top: 0;
left: 0;
}
"""
head = """
<script>
function refreshView(mode, step) {
// 1. Find current mode and step
const allImgs = document.querySelectorAll('.previewer-main-image');
for (let i = 0; i < allImgs.length; i++) {
const img = allImgs[i];
if (img.classList.contains('visible')) {
const id = img.id;
const [_, m, s] = id.split('-');
if (mode === -1) mode = parseInt(m.slice(1));
if (step === -1) step = parseInt(s.slice(1));
break;
}
}
// 2. Hide ALL images
// We select all elements with class 'previewer-main-image'
allImgs.forEach(img => img.classList.remove('visible'));
// 3. Construct the specific ID for the current state
// Format: view-m{mode}-s{step}
const targetId = 'view-m' + mode + '-s' + step;
const targetImg = document.getElementById(targetId);
// 4. Show ONLY the target
if (targetImg) {
targetImg.classList.add('visible');
}
// 5. Update Button Highlights
const allBtns = document.querySelectorAll('.mode-btn');
allBtns.forEach((btn, idx) => {
if (idx === mode) btn.classList.add('active');
else btn.classList.remove('active');
});
}
// --- Action: Switch Mode ---
function selectMode(mode) {
refreshView(mode, -1);
}
// --- Action: Slider Change ---
function onSliderChange(val) {
refreshView(-1, parseInt(val));
}
</script>
"""
empty_html = f"""
<div class="previewer-container">
<svg style=" opacity: .5; height: var(--size-5); color: var(--body-text-color);"
xmlns="http://www.w3.org/2000/svg" width="100%" height="100%" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" class="feather feather-image"><rect x="3" y="3" width="18" height="18" rx="2" ry="2"></rect><circle cx="8.5" cy="8.5" r="1.5"></circle><polyline points="21 15 16 10 5 21"></polyline></svg>
</div>
"""
def image_to_base64(image):
buffered = io.BytesIO()
image = image.convert("RGB")
image.save(buffered, format="jpeg", quality=85)
img_str = base64.b64encode(buffered.getvalue()).decode()
return f"data:image/jpeg;base64,{img_str}"
def start_session(req: gr.Request):
user_dir = os.path.join(TMP_DIR, str(req.session_hash))
os.makedirs(user_dir, exist_ok=True)
def end_session(req: gr.Request):
user_dir = os.path.join(TMP_DIR, str(req.session_hash))
shutil.rmtree(user_dir)
def preprocess_image(image: Image.Image) -> Image.Image:
"""
Preprocess the input image.
Args:
image (Image.Image): The input image.
Returns:
Image.Image: The preprocessed image.
"""
processed_image = pipeline.preprocess_image(image)
return processed_image
def pack_state(latents: Tuple[SparseTensor, SparseTensor, int]) -> dict:
shape_slat, tex_slat, res = latents
return {
'shape_slat_feats': shape_slat.feats.cpu().numpy(),
'tex_slat_feats': tex_slat.feats.cpu().numpy(),
'coords': shape_slat.coords.cpu().numpy(),
'res': res,
}
def unpack_state(state: dict) -> Tuple[SparseTensor, SparseTensor, int]:
shape_slat = SparseTensor(
feats=torch.from_numpy(state['shape_slat_feats']).cuda(),
coords=torch.from_numpy(state['coords']).cuda(),
)
tex_slat = shape_slat.replace(torch.from_numpy(state['tex_slat_feats']).cuda())
return shape_slat, tex_slat, state['res']
def get_seed(randomize_seed: bool, seed: int) -> int:
"""
Get the random seed.
"""
return np.random.randint(0, MAX_SEED) if randomize_seed else seed
def image_to_3d(
image: Image.Image,
seed: int,
resolution: str,
ss_guidance_strength: float,
ss_guidance_rescale: float,
ss_sampling_steps: int,
ss_rescale_t: float,
shape_slat_guidance_strength: float,
shape_slat_guidance_rescale: float,
shape_slat_sampling_steps: int,
shape_slat_rescale_t: float,
tex_slat_guidance_strength: float,
tex_slat_guidance_rescale: float,
tex_slat_sampling_steps: int,
tex_slat_rescale_t: float,
req: gr.Request,
progress=gr.Progress(track_tqdm=True),
) -> str:
# --- Sampling ---
outputs, latents = pipeline.run(
image,
seed=seed,
preprocess_image=False,
sparse_structure_sampler_params={
"steps": ss_sampling_steps,
"guidance_strength": ss_guidance_strength,
"guidance_rescale": ss_guidance_rescale,
"rescale_t": ss_rescale_t,
},
shape_slat_sampler_params={
"steps": shape_slat_sampling_steps,
"guidance_strength": shape_slat_guidance_strength,
"guidance_rescale": shape_slat_guidance_rescale,
"rescale_t": shape_slat_rescale_t,
},
tex_slat_sampler_params={
"steps": tex_slat_sampling_steps,
"guidance_strength": tex_slat_guidance_strength,
"guidance_rescale": tex_slat_guidance_rescale,
"rescale_t": tex_slat_rescale_t,
},
pipeline_type={
"512": "512",
"1024": "1024_cascade",
"1536": "1536_cascade",
}[resolution],
return_latent=True,
)
mesh = outputs[0]
mesh.simplify(16777216) # nvdiffrast limit
images = render_utils.render_snapshot(mesh, resolution=1024, r=2, fov=36, nviews=STEPS, envmap=envmap)
state = pack_state(latents)
torch.cuda.empty_cache()
# --- HTML Construction ---
# The Stack of 48 Images
images_html = ""
for m_idx, mode in enumerate(MODES):
for s_idx in range(STEPS):
# ID Naming Convention: view-m{mode}-s{step}
unique_id = f"view-m{m_idx}-s{s_idx}"
# Logic: only the default mode/step (DEFAULT_MODE, DEFAULT_STEP) is visible initially
is_visible = (m_idx == DEFAULT_MODE and s_idx == DEFAULT_STEP)
vis_class = "visible" if is_visible else ""
# Image Source
img_base64 = image_to_base64(Image.fromarray(images[mode['render_key']][s_idx]))
# Render the Tag
images_html += f"""
<img id="{unique_id}"
class="previewer-main-image {vis_class}"
src="{img_base64}"
loading="eager">
"""
# Button Row HTML
btns_html = ""
for idx, mode in enumerate(MODES):
active_class = "active" if idx == DEFAULT_MODE else ""
# Note: onclick calls the JS function defined in Head
btns_html += f"""
<img src="{mode['icon_base64']}"
class="mode-btn {active_class}"
onclick="selectMode({idx})"
title="{mode['name']}">
"""
# Assemble the full component
full_html = f"""
<div class="previewer-container">
<div class="tips-wrapper">
<div class="tips-icon">💡Tips</div>
<div class="tips-text">
<p>● <b>Render Mode</b> - Click on the circular buttons to switch between different render modes.</p>
<p>● <b>View Angle</b> - Drag the slider to change the view angle.</p>
</div>
</div>
<!-- Row 1: Viewport containing 48 static <img> tags -->
<div class="display-row">
{images_html}
</div>
<!-- Row 2 -->
<div class="mode-row" id="btn-group">
{btns_html}
</div>
<!-- Row 3: Slider -->
<div class="slider-row">
<input type="range" id="custom-slider" min="0" max="{STEPS - 1}" value="{DEFAULT_STEP}" step="1" oninput="onSliderChange(this.value)">
</div>
</div>
"""
return state, full_html
def extract_glb(
state: dict,
decimation_target: int,
texture_size: int,
req: gr.Request,
progress=gr.Progress(track_tqdm=True),
) -> Tuple[str, str]:
"""
Extract a GLB file from the 3D model.
Args:
state (dict): The state of the generated 3D model.
decimation_target (int): The target face count for decimation.
texture_size (int): The texture resolution.
Returns:
str: The path to the extracted GLB file.
"""
user_dir = os.path.join(TMP_DIR, str(req.session_hash))
shape_slat, tex_slat, res = unpack_state(state)
mesh = pipeline.decode_latent(shape_slat, tex_slat, res)[0]
glb = o_voxel.postprocess.to_glb(
vertices=mesh.vertices,
faces=mesh.faces,
attr_volume=mesh.attrs,
coords=mesh.coords,
attr_layout=pipeline.pbr_attr_layout,
grid_size=res,
aabb=[[-0.5, -0.5, -0.5], [0.5, 0.5, 0.5]],
decimation_target=decimation_target,
texture_size=texture_size,
remesh=True,
remesh_band=1,
remesh_project=0,
use_tqdm=True,
)
now = datetime.now()
timestamp = now.strftime("%Y-%m-%dT%H%M%S") + f".{now.microsecond // 1000:03d}"
os.makedirs(user_dir, exist_ok=True)
glb_path = os.path.join(user_dir, f'sample_{timestamp}.glb')
glb.export(glb_path, extension_webp=True)
torch.cuda.empty_cache()
return glb_path, glb_path
with gr.Blocks(delete_cache=(600, 600)) as demo:
gr.Markdown("""
## Image to 3D Asset with [TRELLIS.2](https://microsoft.github.io/trellis.2)
* Upload an image (preferably with an alpha-masked foreground object) and click Generate to create a 3D asset.
* If you're satisfied with the result, click Extract GLB to export and download the generated GLB file. Otherwise, generate again.
""")
with gr.Row():
with gr.Column(scale=1, min_width=360):
image_prompt = gr.Image(label="Image Prompt", format="png", image_mode="RGBA", type="pil", height=400)
resolution = gr.Radio(["512", "1024", "1536"], label="Resolution", value="1024")
seed = gr.Slider(0, MAX_SEED, label="Seed", value=0, step=1)
randomize_seed = gr.Checkbox(label="Randomize Seed", value=True)
decimation_target = gr.Slider(100000, 1000000, label="Decimation Target", value=500000, step=10000)
texture_size = gr.Slider(1024, 4096, label="Texture Size", value=2048, step=1024)
generate_btn = gr.Button("Generate")
with gr.Accordion(label="Advanced Settings", open=False):
gr.Markdown("Stage 1: Sparse Structure Generation")
with gr.Row():
ss_guidance_strength = gr.Slider(1.0, 10.0, label="Guidance Strength", value=7.5, step=0.1)
ss_guidance_rescale = gr.Slider(0.0, 1.0, label="Guidance Rescale", value=0.7, step=0.01)
ss_sampling_steps = gr.Slider(1, 50, label="Sampling Steps", value=12, step=1)
ss_rescale_t = gr.Slider(1.0, 6.0, label="Rescale T", value=5.0, step=0.1)
gr.Markdown("Stage 2: Shape Generation")
with gr.Row():
shape_slat_guidance_strength = gr.Slider(1.0, 10.0, label="Guidance Strength", value=7.5, step=0.1)
shape_slat_guidance_rescale = gr.Slider(0.0, 1.0, label="Guidance Rescale", value=0.5, step=0.01)
shape_slat_sampling_steps = gr.Slider(1, 50, label="Sampling Steps", value=12, step=1)
shape_slat_rescale_t = gr.Slider(1.0, 6.0, label="Rescale T", value=3.0, step=0.1)
gr.Markdown("Stage 3: Material Generation")
with gr.Row():
tex_slat_guidance_strength = gr.Slider(1.0, 10.0, label="Guidance Strength", value=1.0, step=0.1)
tex_slat_guidance_rescale = gr.Slider(0.0, 1.0, label="Guidance Rescale", value=0.0, step=0.01)
tex_slat_sampling_steps = gr.Slider(1, 50, label="Sampling Steps", value=12, step=1)
tex_slat_rescale_t = gr.Slider(1.0, 6.0, label="Rescale T", value=3.0, step=0.1)
with gr.Column(scale=10):
with gr.Walkthrough(selected=0) as walkthrough:
with gr.Step("Preview", id=0):
preview_output = gr.HTML(empty_html, label="3D Asset Preview", show_label=True, container=True)
extract_btn = gr.Button("Extract GLB")
with gr.Step("Extract", id=1):
glb_output = gr.Model3D(label="Extracted GLB", height=724, show_label=True, display_mode="solid", clear_color=(0.25, 0.25, 0.25, 1.0))
download_btn = gr.DownloadButton(label="Download GLB")
with gr.Column(scale=1, min_width=172):
examples = gr.Examples(
examples=[
f'assets/example_image/{image}'
for image in os.listdir("assets/example_image")
],
inputs=[image_prompt],
fn=preprocess_image,
outputs=[image_prompt],
run_on_click=True,
examples_per_page=18,
)
output_buf = gr.State()
# Handlers
demo.load(start_session)
demo.unload(end_session)
image_prompt.upload(
preprocess_image,
inputs=[image_prompt],
outputs=[image_prompt],
)
generate_btn.click(
get_seed,
inputs=[randomize_seed, seed],
outputs=[seed],
).then(
lambda: gr.Walkthrough(selected=0), outputs=walkthrough
).then(
image_to_3d,
inputs=[
image_prompt, seed, resolution,
ss_guidance_strength, ss_guidance_rescale, ss_sampling_steps, ss_rescale_t,
shape_slat_guidance_strength, shape_slat_guidance_rescale, shape_slat_sampling_steps, shape_slat_rescale_t,
tex_slat_guidance_strength, tex_slat_guidance_rescale, tex_slat_sampling_steps, tex_slat_rescale_t,
],
outputs=[output_buf, preview_output],
)
extract_btn.click(
lambda: gr.Walkthrough(selected=1), outputs=walkthrough
).then(
extract_glb,
inputs=[output_buf, decimation_target, texture_size],
outputs=[glb_output, download_btn],
)
# Launch the Gradio app
if __name__ == "__main__":
os.makedirs(TMP_DIR, exist_ok=True)
# Construct UI components: pre-encode mode icons as base64
for mode in MODES:
mode['icon_base64'] = image_to_base64(Image.open(mode['icon']))
pipeline = Trellis2ImageTo3DPipeline.from_pretrained('microsoft/TRELLIS.2-4B')
pipeline.cuda()
envmap = {
'forest': EnvMap(torch.tensor(
cv2.cvtColor(cv2.imread('assets/hdri/forest.exr', cv2.IMREAD_UNCHANGED), cv2.COLOR_BGR2RGB),
dtype=torch.float32, device='cuda'
)),
'sunset': EnvMap(torch.tensor(
cv2.cvtColor(cv2.imread('assets/hdri/sunset.exr', cv2.IMREAD_UNCHANGED), cv2.COLOR_BGR2RGB),
dtype=torch.float32, device='cuda'
)),
'courtyard': EnvMap(torch.tensor(
cv2.cvtColor(cv2.imread('assets/hdri/courtyard.exr', cv2.IMREAD_UNCHANGED), cv2.COLOR_BGR2RGB),
dtype=torch.float32, device='cuda'
)),
}
demo.launch(css=css, head=head)

BIN assets/app/basecolor.png Normal file (4.8 KiB)

BIN assets/app/clay.png Normal file (4.4 KiB)

BIN assets/app/hdri_city.png Normal file (8.6 KiB)


BIN assets/app/hdri_forest.png Normal file (10 KiB)


BIN assets/app/hdri_night.png Normal file (6.8 KiB)

BIN assets/app/hdri_studio.png Normal file (6.2 KiB)

BIN assets/app/hdri_sunrise.png Normal file (8.2 KiB)

BIN assets/app/hdri_sunset.png Normal file (7.9 KiB)

BIN assets/app/normal.png Normal file (3.8 KiB)


BIN
assets/example_image/T.png Executable file

Binary file not shown.


Binary files not shown.

BIN
assets/hdri/city.exr Normal file

Binary file not shown.

BIN
assets/hdri/courtyard.exr Normal file

Binary file not shown.

BIN
assets/hdri/forest.exr Normal file

Binary file not shown.

BIN
assets/hdri/interior.exr Normal file

Binary file not shown.

15
assets/hdri/license.txt Normal file
View File

@@ -0,0 +1,15 @@
All HDRIs are licensed as CC0.
These were created by Greg Zaal (Poly Haven https://polyhaven.com).
Originals used for each HDRI:
- City: https://polyhaven.com/a/portland_landing_pad
- Courtyard: https://polyhaven.com/a/courtyard
- Forest: https://polyhaven.com/a/ninomaru_teien
- Interior: https://polyhaven.com/a/hotel_room
- Night: Probably https://polyhaven.com/a/moonless_golf
- Studio: Probably https://polyhaven.com/a/studio_small_01
- Sunrise: https://polyhaven.com/a/spruit_sunrise
- Sunset: https://polyhaven.com/a/venice_sunset
1K resolution of each was taken, and compressed with oiiotool:
oiiotool input.exr --ch R,G,B -d float --compression dwab:300 --clamp:min=0.0:max=32000.0 -o output.exr

BIN
assets/hdri/night.exr Normal file

Binary file not shown.

BIN
assets/hdri/studio.exr Normal file

Binary file not shown.

BIN
assets/hdri/sunrise.exr Normal file

Binary file not shown.

BIN
assets/hdri/sunset.exr Normal file

Binary file not shown.

BIN
assets/teaser.webp Normal file

Binary file not shown.


48
example.py Normal file
View File

@@ -0,0 +1,48 @@
import os
os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True" # Can save GPU memory
import cv2
import imageio
from PIL import Image
import torch
from trellis2.pipelines import Trellis2ImageTo3DPipeline
from trellis2.utils import render_utils
from trellis2.renderers import EnvMap
import o_voxel
# 1. Setup Environment Map
envmap = EnvMap(torch.tensor(
cv2.cvtColor(cv2.imread('assets/hdri/forest.exr', cv2.IMREAD_UNCHANGED), cv2.COLOR_BGR2RGB),
dtype=torch.float32, device='cuda'
))
# 2. Load Pipeline
pipeline = Trellis2ImageTo3DPipeline.from_pretrained("microsoft/TRELLIS.2-4B")
pipeline.cuda()
# 3. Load Image & Run
image = Image.open("assets/example_image/T.png")
mesh = pipeline.run(image)[0]
mesh.simplify(16777216) # nvdiffrast limit
# 4. Render Video
video = render_utils.make_pbr_vis_frames(render_utils.render_video(mesh, envmap=envmap))
imageio.mimsave("sample.mp4", video, fps=15)
# 5. Export to GLB
glb = o_voxel.postprocess.to_glb(
vertices = mesh.vertices,
faces = mesh.faces,
attr_volume = mesh.attrs,
coords = mesh.coords,
attr_layout = mesh.layout,
voxel_size = mesh.voxel_size,
aabb = [[-0.5, -0.5, -0.5], [0.5, 0.5, 0.5]],
decimation_target = 1000000,
texture_size = 4096,
remesh = True,
remesh_band = 1,
remesh_project = 0,
verbose = True
)
glb.export("sample.glb", extension_webp=True)

174
o-voxel/README.md Normal file
View File

@@ -0,0 +1,174 @@
# O-Voxel: A Native 3D Representation
**O-Voxel** is a sparse, voxel-based native 3D representation designed for high-quality 3D generation and reconstruction. Unlike traditional methods that rely on implicit fields (e.g., occupancy fields, SDFs), O-Voxel uses a **Flexible Dual Grid** formulation to robustly represent surfaces of arbitrary topology (including non-manifold and open surfaces), together with **volumetric surface properties** such as Physically-Based Rendering (PBR) material attributes.
This library provides an efficient implementation for the instant bidirectional conversion between Meshes and O-Voxels, along with tools for sparse voxel compression, serialization, and rendering.
![Overview](assets/overview.webp)
## Key Features
- **🧱 Flexible Dual Grid**: A geometry representation that solves an enhanced QEF (Quadratic Error Function) to accurately capture sharp features and open boundaries without requiring watertight meshes.
- **🎨 Volumetric PBR Attributes**: Native support for physically-based rendering properties (Base Color, Metallic, Roughness, Opacity) aligned with the sparse voxel grid.
- **⚡ Instant Bidirectional Conversion**: Rapid `Mesh <-> O-Voxel` conversion without expensive SDF evaluation, flood-filling, or iterative optimization.
- **💾 Efficient Compression**: Supports custom `.vxz` format for compact storage of sparse voxel structures using Z-order/Hilbert curve encoding.
- **🛠️ Production Ready**: Tools to export converted assets directly to `.glb` with UV unwrapping and texture baking.
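For intuition, the QEF at the heart of the dual grid can be viewed as a regularized least-squares problem: each surface sample crossing a voxel contributes a plane constraint, and the dual vertex is the point that best satisfies all of them. A minimal NumPy sketch (illustrative only; the plane data, the regularization weight, and the `solve_qef` helper are made up for this example and are not this library's API):

```python
import numpy as np

# Each constraint is a plane (normal n_i, offset d_i) sampled from the
# surface crossing a voxel; the dual vertex minimizes sum((n_i . x - d_i)^2)
# plus a small regularization term pulling x toward the voxel center c.
def solve_qef(normals, offsets, center, reg=1e-2):
    A = np.vstack([normals, reg * np.eye(3)])
    b = np.concatenate([offsets, reg * center])
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# Three axis-aligned planes meeting at (0.25, 0.5, 0.75)
normals = np.eye(3)
offsets = np.array([0.25, 0.5, 0.75])
x = solve_qef(normals, offsets, center=np.array([0.5, 0.5, 0.5]))
```

With a small regularization weight the solution snaps to the plane intersection; a larger weight pulls it toward the voxel center, which keeps the system well-posed when the constraints are nearly coplanar.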
## Installation
```bash
git clone -b main https://github.com/microsoft/TRELLIS.2.git --recursive
pip install TRELLIS.2/o-voxel --no-build-isolation
```
## Quick Start
> See also the [examples](examples) directory for more detailed usage.
### 1. Convert Mesh to O-Voxel [[link]](examples/mesh2ovox.py)
Convert a standard 3D mesh (with textures) into the O-Voxel representation.
```python
import torch
import trimesh
import o_voxel

RES = 512  # grid resolution; an example value, choose to match your pipeline

asset = trimesh.load("path/to/mesh.glb")
# 1. Geometry Voxelization (Flexible Dual Grid)
# Returns: occupied voxel indices, dual vertices (QEF solution), and edge intersection flags
mesh = asset.to_mesh()
vertices = torch.from_numpy(mesh.vertices).float()
faces = torch.from_numpy(mesh.faces).long()
voxel_indices, dual_vertices, intersected = o_voxel.convert.mesh_to_flexible_dual_grid(
vertices, faces,
grid_size=RES, # Resolution
aabb=[[-0.5,-0.5,-0.5],[0.5,0.5,0.5]], # Axis-aligned bounding box
face_weight=1.0, # Face term weight in QEF
boundary_weight=0.2, # Boundary term weight in QEF
regularization_weight=1e-2, # Regularization term weight in QEF
timing=True
)
## sort to ensure alignment between geometry and material voxelization
vid = o_voxel.serialize.encode_seq(voxel_indices)
mapping = torch.argsort(vid)
voxel_indices = voxel_indices[mapping]
dual_vertices = dual_vertices[mapping]
intersected = intersected[mapping]
# 2. Material Voxelization (Volumetric Attributes)
# Returns: dict containing 'base_color', 'metallic', 'roughness', etc.
voxel_indices_mat, attributes = o_voxel.convert.textured_mesh_to_volumetric_attr(
asset,
grid_size=RES,
aabb=[[-0.5,-0.5,-0.5],[0.5,0.5,0.5]],
timing=True
)
## sort to ensure alignment between geometry and material voxelization
vid_mat = o_voxel.serialize.encode_seq(voxel_indices_mat)
mapping_mat = torch.argsort(vid_mat)
attributes = {k: v[mapping_mat] for k, v in attributes.items()}
# Save to compressed .vxz format
## packing
dual_vertices = dual_vertices * RES - voxel_indices
dual_vertices = (torch.clamp(dual_vertices, 0, 1) * 255).type(torch.uint8)
intersected = (intersected[:, 0:1] + 2 * intersected[:, 1:2] + 4 * intersected[:, 2:3]).type(torch.uint8)
attributes['dual_vertices'] = dual_vertices
attributes['intersected'] = intersected
o_voxel.io.write("ovoxel_helmet.vxz", voxel_indices, attributes)
```
### 2. Recover Mesh from O-Voxel [[link]](examples/ovox2mesh.py)
Reconstruct the surface mesh from the sparse voxel data.
```python
import torch
import o_voxel

RES = 512  # must match the resolution used at conversion time

# Load data
coords, data = o_voxel.io.read("path/to/ovoxel.vxz")
dual_vertices = data['dual_vertices']
intersected = data['intersected']
base_color = data['base_color']
## ... other attributes omitted for brevity
# Unpack
dual_vertices = dual_vertices / 255
intersected = torch.cat([
intersected % 2,
intersected // 2 % 2,
intersected // 4 % 2,
], dim=-1).bool()
# Extract Mesh
# O-Voxel connects dual vertices to form quads, optionally splitting them
# based on geometric features.
rec_verts, rec_faces = o_voxel.convert.flexible_dual_grid_to_mesh(
coords.cuda(),
dual_vertices.cuda(),
intersected.cuda(),
split_weight=None, # Auto-split based on min angle if None
grid_size=RES,
aabb=[[-0.5,-0.5,-0.5],[0.5,0.5,0.5]],
)
```
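The byte packing used when saving and the unpacking above are exact inverses: the three edge-intersection booleans fit in one `uint8` per voxel. A small NumPy round trip makes this concrete (illustrative data; the library's snippets operate on torch tensors):

```python
import numpy as np

# Three per-voxel edge-intersection flags, shape (N, 3), folded into one
# byte per voxel with the same weights (1, 2, 4) as in the snippets above.
flags = np.array([[1, 0, 1], [0, 1, 1], [0, 0, 0]], dtype=np.uint8)

packed = (flags[:, 0:1] + 2 * flags[:, 1:2] + 4 * flags[:, 2:3]).astype(np.uint8)

# Recover the booleans bit by bit, mirroring the unpacking step above
unpacked = np.concatenate(
    [packed % 2, packed // 2 % 2, packed // 4 % 2], axis=-1
).astype(bool)

assert (unpacked == flags.astype(bool)).all()
```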
### 3. Export to GLB [[link]](examples/ovox2glb.py)
For visualization in standard 3D viewers, you can clean, UV-unwrap, and bake the volumetric attributes into textures.
```python
# Assuming you have the reconstructed verts/faces and volume attributes
mesh = o_voxel.postprocess.to_glb(
vertices=rec_verts,
faces=rec_faces,
attr_volume=attr_tensor, # Concatenated attributes
coords=coords,
attr_layout={'base_color': slice(0,3), 'metallic': slice(3,4), ...},
grid_size=RES,
aabb=[[-0.5,-0.5,-0.5],[0.5,0.5,0.5]],
decimation_target=100000,
texture_size=2048,
verbose=True,
)
mesh.export("rec_helmet.glb")
```
### 4. Voxel Rendering [[link]](examples/render_ovox.py)
Render the voxel representation directly.
```python
import o_voxel

RES = 512  # must match the stored voxel resolution
# extr / intr: camera extrinsics and intrinsics, prepared elsewhere

# Load data
coords, data = o_voxel.io.read("ovoxel_helmet.vxz")
position = (coords / RES - 0.5).cuda()
base_color = (data['base_color'] / 255).cuda()
# Render
renderer = o_voxel.rasterize.VoxelRenderer(
rendering_options={"resolution": 512, "ssaa": 2}
)
output = renderer.render(
position=position, # Voxel centers
attrs=base_color, # Color/Opacity etc.
voxel_size=1.0/RES,
extrinsics=extr,
intrinsics=intr
)
# output.attr contains the rendered image (C, H, W)
```
## API Overview
### `o_voxel.convert`
Core algorithms for the conversion between meshes and O-Voxels.
* `mesh_to_flexible_dual_grid`: Identifies the active sparse voxels and solves the QEF to place a dual vertex inside each one, based on mesh-voxel grid intersections.
* `flexible_dual_grid_to_mesh`: Reconnects dual vertices to form a surface.
* `textured_mesh_to_volumetric_attr`: Samples texture maps into voxel space.
### `o_voxel.io`
Handles sparse voxel file I/O operations.
* **Formats**: `.npz` (NumPy), `.ply` (Point Cloud), `.vxz` (Custom compressed, recommended).
* **Functions**: `read()`, `write()`.
### `o_voxel.serialize`
Utilities for spatial hashing and ordering.
* `encode_seq` / `decode_seq`: Converts 3D coordinates to/from Morton codes (Z-order) or Hilbert curves for efficient storage and processing.
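To illustrate the idea behind `encode_seq` (a conceptual sketch, not the library's actual implementation), a Z-order key interleaves the coordinate bits so that spatially adjacent voxels land near each other in the 1-D ordering:

```python
def morton_encode(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z into one Z-order key."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)       # x occupies bits 0, 3, 6, ...
        code |= ((y >> i) & 1) << (3 * i + 1)   # y occupies bits 1, 4, 7, ...
        code |= ((z >> i) & 1) << (3 * i + 2)   # z occupies bits 2, 5, 8, ...
    return code

# Neighboring voxels map to nearby codes
assert morton_encode(0, 0, 0) == 0
assert morton_encode(1, 0, 0) == 1
assert morton_encode(0, 1, 0) == 2
assert morton_encode(1, 1, 1) == 7
```

Sorting voxels by such keys is what lets the `.vxz` format store only compact deltas between consecutive coordinates.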
### `o_voxel.rasterize`
* `VoxelRenderer`: A lightweight renderer for sparse voxel visualization during training.
### `o_voxel.postprocess`
* `to_glb`: A comprehensive pipeline for mesh cleaning, remeshing, UV unwrapping, and texture baking.

Some files were not shown because too many files have changed in this diff.