UVTR

Unifying Voxel-based Representation with Transformer for 3D Object Detection

Yanwei Li, Yilun Chen, Xiaojuan Qi, Zeming Li, Jian Sun, Jiaya Jia

[arXiv] [BibTeX]

This project provides an implementation of the NeurIPS 2022 paper "Unifying Voxel-based Representation with Transformer for 3D Object Detection", based on mmDetection3D. UVTR aims to unify multi-modality representations in the voxel space for accurate and robust single- or cross-modality 3D detection.

Preparation

This project is built on mmDetection3D, which can be set up as follows.

Install PyTorch v1.7.1 and mmDetection3D v0.17.3 following the instructions.
Copy our project and related files into the installed mmDetection3D directory:

cp -r projects mmdetection3d/
cp -r extra_tools mmdetection3d/

Prepare the nuScenes dataset following the structure.
Generate the unified data info and sampling database for the nuScenes dataset:

python3 extra_tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes_unified
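
Before moving on, it can help to confirm that the steps above left the expected directories in place. The following is a minimal sketch, not part of the UVTR repository: the function name `check_uvtr_layout` is hypothetical, and it only checks for the paths named in the steps above (`projects`, `extra_tools`, `data/nuscenes`) relative to the mmDetection3D root.

```shell
# Hypothetical helper (not part of UVTR): report which of the paths named in
# the preparation steps are missing under a given mmDetection3D root.
check_uvtr_layout() {
    local root="${1:-.}"
    local missing=0
    local rel
    for rel in projects extra_tools data/nuscenes; do
        if [ ! -d "${root}/${rel}" ]; then
            echo "missing: ${root}/${rel}" >&2
            missing=1
        fi
    done
    # Returns 0 when every directory exists, 1 otherwise.
    return "${missing}"
}
```

For example, `check_uvtr_layout /path/to/mmdetection3d && echo "layout looks complete"` prints the confirmation only when all three directories are present.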

Training

You can train the model following the instructions. You can find the pretrained models here if you want to train the model from scratch. For example, to launch UVTR training on multiple GPUs, run:

cd /path/to/mmdetection3d
bash extra_tools/dist_train.sh ${CFG_FILE} ${NUM_GPUS}

or train with a single GPU:

python3 extra_tools/train.py ${CFG_FILE}
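
The two launch modes differ only in the script that is called. As an illustration, the sketch below prints the matching command for a given GPU count. `uvtr_train_cmd` is a hypothetical wrapper, not something shipped with UVTR, and it assumes the current directory is the mmDetection3D root. It only echoes the command so that it can be inspected before running.

```shell
# Hypothetical wrapper (not part of UVTR): print the training command from the
# text above, choosing the distributed launcher when more than one GPU is used.
uvtr_train_cmd() {
    local cfg_file="$1"
    local num_gpus="${2:-1}"
    if [ "${num_gpus}" -gt 1 ]; then
        echo "bash extra_tools/dist_train.sh ${cfg_file} ${num_gpus}"
    else
        echo "python3 extra_tools/train.py ${cfg_file}"
    fi
}
```

Calling `uvtr_train_cmd "${CFG_FILE}" "${NUM_GPUS}"` prints the multi-GPU command when `NUM_GPUS` is greater than 1 and the single-GPU command otherwise.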

Evaluation

You can evaluate the model following the instructions. For example, to launch UVTR evaluation with a pretrained checkpoint on multiple GPUs, run:

bash extra_tools/dist_test.sh ${CFG_FILE} ${CKPT} ${NUM_GPUS} --eval=bbox

or evaluate with a single GPU:

python3 extra_tools/test.py ${CFG_FILE} ${CKPT} --eval=bbox

nuScenes 3D Object Detection Results

We provide results on the nuScenes val set with pretrained models.

Model	NDS(%)	mAP(%)	mATE↓	mASE↓	mAOE↓	mAVE↓	mAAE↓	download
Camera-based
UVTR-C-R50-H5	40.1	31.3	0.810	0.281	0.486	0.793	0.187	GoogleDrive
UVTR-C-R50-H11	41.8	33.3	0.795	0.276	0.452	0.761	0.196	GoogleDrive
UVTR-C-R101	44.1	36.1	0.761	0.271	0.409	0.756	0.203	GoogleDrive
UVTR-CS-R50	47.2	36.2	0.756	0.276	0.399	0.467	0.189	GoogleDrive
UVTR-CS-R101	48.3	37.9	0.739	0.267	0.350	0.510	0.200	GoogleDrive
UVTR-L2C-R101	45.0	37.2	0.735	0.269	0.397	0.761	0.193	GoogleDrive
UVTR-L2CS3-R101	48.8	39.2	0.720	0.268	0.354	0.534	0.206	GoogleDrive
LiDAR-based
UVTR-L-V0075	67.6	60.8	0.335	0.257	0.303	0.206	0.183	GoogleDrive
Multi-modality
UVTR-M-V0075-R101	70.2	65.4	0.333	0.258	0.270	0.216	0.176	GoogleDrive
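
As a reading aid for the table: in the nuScenes benchmark, NDS combines mAP with the five true-positive error metrics as NDS = (5 * mAP + sum of (1 - min(1, error)) over mATE, mASE, mAOE, mAVE, mAAE) / 10. The sketch below recomputes NDS from one table row using that definition. The function name `nds` is hypothetical, and the result can differ from the reported value in the last digit because the table entries are already rounded.

```shell
# Hypothetical helper: recompute NDS (%) from one table row.
# Arguments: mAP(%) mATE mASE mAOE mAVE mAAE
nds() {
    awk -v map="$1" -v ate="$2" -v ase="$3" -v aoe="$4" -v ave="$5" -v aae="$6" '
        # Each error metric is clipped to 1 before being turned into a score.
        function clip(x) { return (x > 1) ? 1 : x }
        BEGIN {
            tp = (1 - clip(ate)) + (1 - clip(ase)) + (1 - clip(aoe))
            tp = tp + (1 - clip(ave)) + (1 - clip(aae))
            # mAP is given in percent, so convert it to a fraction first.
            printf "%.1f\n", 100 * (5 * map / 100 + tp) / 10
        }'
}
```

For the UVTR-C-R50-H5 row, `nds 31.3 0.810 0.281 0.486 0.793 0.187` prints `40.1`, which matches the NDS column.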

Acknowledgement

We would like to thank the authors of mmDetection3D and DETR3D for their open-source release.

License

UVTR is released under the Apache 2.0 license.

Citing UVTR

Please consider citing UVTR in your publications if it helps your research.

@inproceedings{li2022uvtr,
  title={Unifying Voxel-based Representation with Transformer for 3D Object Detection},
  author={Li, Yanwei and Chen, Yilun and Qi, Xiaojuan and Li, Zeming and Sun, Jian and Jia, Jiaya},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}
