# DiffPose: Toward More Reliable 3D Pose Estimation (CVPR 2023)

<sup>1</sup>Jia Gong\*, <sup>1</sup>Lin Geng Foo\*, <sup>2</sup>Zhipeng Fan, <sup>3</sup>Qiuhong Ke, <sup>4</sup>Hossein Rahmani, <sup>1</sup>Jun Liu (\* equal contribution)

<sup>1</sup>Singapore University of Technology and Design, <sup>2</sup>New York University, <sup>3</sup>Monash University, <sup>4</sup>Lancaster University

[Paper] | [Project Page] | [SUTD-VLG Lab]

*Figure: DiffPose model architecture.*

*Figure: DiffPose diffusion process.*

Our code is built on top of DDIM. (An illustrative DDIM sampling sketch is included at the end of this README.)

## Environment

The code is developed and tested under the following environment:

- Python 3.8.2
- PyTorch 1.7.1
- CUDA 11.0

You can create the environment via:

```bash
conda env create -f environment.yml
```

## Dataset

Our datasets are based on 3d-pose-baseline and Video3D data. We provide the GMM-format data generated from the above datasets here; put the downloaded files into the `./data` directory. Note that we only change the format of the Video3D data to make it compatible with our GMM-based DiffPose training strategy; the 2D pose values in our dataset are identical to the originals. (An illustrative GMM sketch is included at the end of this README.)

## Frame-based experiments

### Evaluating pre-trained models

We provide the pre-trained diffusion model (with CPN-detected 2D poses as input) here. To evaluate it, put it into the `./checkpoints` directory and run:

```bash
CUDA_VISIBLE_DEVICES=0 python main_diffpose_frame.py \
--config human36m_diffpose_uvxyz_cpn.yml --batch_size 1024 \
--model_pose_path checkpoints/gcn_xyz_cpn.pth \
--model_diff_path checkpoints/diffpose_uvxyz_cpn.pth \
--doc t_human36m_diffpose_uvxyz_cpn --exp exp --ni \
> exp/t_human36m_diffpose_uvxyz_cpn.out 2>&1 &
```

We also provide the pre-trained diffusion model (with ground-truth 2D poses as input) here. To evaluate it, put it into the `./checkpoints` directory and run:

```bash
CUDA_VISIBLE_DEVICES=0 python main_diffpose_frame.py \
--config human36m_diffpose_uvxyz_gt.yml --batch_size 1024 \
--model_pose_path checkpoints/gcn_xyz_gt.pth \
--model_diff_path checkpoints/diffpose_uvxyz_gt.pth \
--doc t_human36m_diffpose_uvxyz_gt --exp exp --ni \
> exp/t_human36m_diffpose_uvxyz_gt.out 2>&1 &
```

### Training new models

To train a model from scratch (CPN-detected 2D poses as input), run:

```bash
CUDA_VISIBLE_DEVICES=0 python main_diffpose_frame.py --train \
--config human36m_diffpose_uvxyz_cpn.yml --batch_size 1024 \
--model_pose_path checkpoints/gcn_xyz_cpn.pth \
--doc human36m_diffpose_uvxyz_cpn --exp exp --ni \
> exp/human36m_diffpose_uvxyz_cpn.out 2>&1 &
```

To train a model from scratch (ground-truth 2D poses as input), run:

```bash
CUDA_VISIBLE_DEVICES=0 python main_diffpose_frame.py --train \
--config human36m_diffpose_uvxyz_gt.yml --batch_size 1024 \
--model_pose_path checkpoints/gcn_xyz_gt.pth \
--doc human36m_diffpose_uvxyz_gt --exp exp --ni \
> exp/human36m_diffpose_uvxyz_gt.out 2>&1 &
```

## Video-based experiments

Refer to https://github.com/GONGJIA0208/Diffpose_video.

## Bibtex

If you find our work useful in your research, please consider citing:

```bibtex
@InProceedings{gong2023diffpose,
    author    = {Gong, Jia and Foo, Lin Geng and Fan, Zhipeng and Ke, Qiuhong and Rahmani, Hossein and Liu, Jun},
    title     = {DiffPose: Toward More Reliable 3D Pose Estimation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
}
```

## Acknowledgement

Part of our code is borrowed from DDIM, VideoPose3D, GraFormer, MixSTE, and PoseFormer. We thank the authors for releasing their code.
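
## Illustrative DDIM sampling sketch

Since our code is built on top of DDIM, the following minimal, hypothetical sketch shows what a DDIM-style reverse process for lifting a 2D pose to a 3D pose looks like. Everything here (`ToyDenoiser`, `ddim_sample`, the schedule constants) is illustrative only and is **not** the actual DiffPose API; the real model, schedules, and conditioning live in `main_diffpose_frame.py` and the config files.

```python
# Minimal, hypothetical sketch of DDIM-style reverse sampling for 3D pose.
# All names are illustrative and are NOT the actual DiffPose implementation.
import torch
import torch.nn as nn


class ToyDenoiser(nn.Module):
    """Illustrative noise predictor: (noisy 3D pose, timestep, 2D pose) -> predicted noise."""

    def __init__(self, num_joints=17):
        super().__init__()
        self.num_joints = num_joints
        self.net = nn.Sequential(
            nn.Linear(num_joints * 3 + num_joints * 2 + 1, 256),
            nn.ReLU(),
            nn.Linear(256, num_joints * 3),
        )

    def forward(self, x_t, t, pose_2d):
        b = x_t.shape[0]
        inp = torch.cat(
            [x_t.reshape(b, -1), pose_2d.reshape(b, -1), t.float().reshape(b, 1)],
            dim=1,
        )
        return self.net(inp).reshape(b, self.num_joints, 3)


@torch.no_grad()
def ddim_sample(model, pose_2d, num_steps=50, num_train_steps=1000, num_joints=17):
    """Deterministic DDIM reverse process (eta = 0)."""
    b = pose_2d.shape[0]
    # Linear beta schedule, as in the original DDPM/DDIM papers.
    betas = torch.linspace(1e-4, 2e-2, num_train_steps)
    alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)
    timesteps = torch.linspace(num_train_steps - 1, 0, num_steps).long()

    x = torch.randn(b, num_joints, 3)  # start from Gaussian noise
    for i, t in enumerate(timesteps):
        a_t = alphas_cumprod[t]
        a_prev = alphas_cumprod[timesteps[i + 1]] if i + 1 < num_steps else torch.tensor(1.0)
        eps = model(x, t.repeat(b), pose_2d)
        # Predict x_0 from the noise estimate, then step along the DDIM trajectory.
        x0 = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()
        x = a_prev.sqrt() * x0 + (1 - a_prev).sqrt() * eps
    return x


if __name__ == "__main__":
    model = ToyDenoiser()
    pose_2d = torch.randn(4, 17, 2)  # a batch of (fake) 2D detections
    pose_3d = ddim_sample(model, pose_2d)
    print(pose_3d.shape)  # torch.Size([4, 17, 3])
```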
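
## Illustrative GMM sketch

The "GMM format" mentioned in the Dataset section refers to data prepared for our GMM-based training strategy. As a rough, hypothetical illustration only (not the actual preprocessing script, and under the assumption that the mixture characterizes per-joint 2D detection uncertainty), fitting a Gaussian mixture with scikit-learn could look like this:

```python
# Hypothetical illustration of fitting a Gaussian mixture to per-joint 2D pose
# residuals (detected minus ground-truth keypoints). This is NOT the actual
# DiffPose preprocessing; it only sketches the general idea behind "GMM format" data.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Fake residuals for one joint over 5000 frames: (detected_uv - gt_uv).
residuals = rng.normal(scale=[3.0, 5.0], size=(5000, 2))

gmm = GaussianMixture(n_components=5, covariance_type="full", random_state=0)
gmm.fit(residuals)

# The fitted weights/means/covariances could then be stored alongside each
# joint's 2D pose to characterize detector uncertainty per joint.
print(gmm.weights_.shape, gmm.means_.shape, gmm.covariances_.shape)
# (5,) (5, 2) (5, 2, 2)
```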