3D human pose estimation in video with temporal convolutions
This implementation trains VideoPose3D in a supervised manner. It is adapted from https://github.com/facebookresearch/VideoPose3D
VideoPose3D Details
Because Conv1d is implemented inefficiently in Ascend-Python, this version replaces every Conv1d layer with an equivalent Conv2d layer.
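The substitution can be illustrated with a minimal PyTorch sketch. It is not taken from this repository: the tensor layout (a trailing singleton dimension), the channel counts, and the variable names are assumptions chosen for illustration. It shows that a Conv2d with a `(k, 1)` kernel produces the same output as a Conv1d with kernel size `k` when both share the same weights.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes for illustration: 17 joints x 2 coordinates in, 8 channels out.
in_channels, out_channels, kernel_size, dilation = 34, 8, 3, 3

conv1d = nn.Conv1d(in_channels, out_channels, kernel_size,
                   dilation=dilation, bias=False)
# Assumed layout: time stays on the first spatial axis, the second axis has size 1.
conv2d = nn.Conv2d(in_channels, out_channels, (kernel_size, 1),
                   dilation=(dilation, 1), bias=False)

# Share weights: (out, in, k) -> (out, in, k, 1).
with torch.no_grad():
    conv2d.weight.copy_(conv1d.weight.unsqueeze(-1))

x = torch.randn(2, in_channels, 27)  # (batch, channels, frames)
with torch.no_grad():
    y1d = conv1d(x)
    y2d = conv2d(x.unsqueeze(-1)).squeeze(-1)  # add, then drop, the dummy axis

outputs_match = torch.allclose(y1d, y2d, atol=1e-4)
print(tuple(y1d.shape), outputs_match)
```

The actual model code may arrange the extra dimension differently; the point is only that the two layers are numerically interchangeable.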
Results on Human3.6M
Under Protocol 1 (mean per-joint position error).
| MPJPE | FPS | Epochs | AMP_Type | Device(s) |
|---|---|---|---|---|
| 46.5 mm | 5544 | 80 | O1 | 1p NPU |
| 46.79 mm | 40757 | 80 | O1 | 8p NPU |
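The metric reported above, mean per-joint position error, is the average Euclidean distance between each predicted joint and its ground-truth position. The following is a minimal pure-Python sketch of that definition, not the repository's evaluation code; the function name `mpjpe` and the two-joint pose are hypothetical.

```python
import math

def mpjpe(predicted, target):
    """Mean Euclidean distance between matching joints, in the input unit."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same number of joints")
    distances = [
        math.sqrt(sum((p - t) ** 2 for p, t in zip(pred_joint, true_joint)))
        for pred_joint, true_joint in zip(predicted, target)
    ]
    return sum(distances) / len(distances)

# Hypothetical two-joint pose in millimetres, not data from Human3.6M.
predicted = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
target = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

error_mm = mpjpe(predicted, target)
print(error_mm)  # joint errors are 0 mm and 5 mm, so the mean is 2.5
```

In a full evaluation this value would additionally be averaged over all frames and test sequences.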
Quick start
To get started as quickly as possible, follow the instructions in this section. They should allow you to train a model from scratch, test our pretrained models, and produce basic visualizations. For more detailed instructions, please refer to DOCUMENTATION.md.
Dependencies
Make sure you have the following dependencies installed before proceeding:
- Python 3+ distribution
- PyTorch 1.5.0
pip install -r requirements.txt
Dataset setup
Before training, copy the 'data' file we provided into the VideoPose3D root directory.
Train
# 1p train perf
bash test/train_performance_1p.sh
# 8p train perf
bash test/train_performance_8p.sh
# 1p train full
bash test/train_full_1p.sh
# 8p train full
bash test/train_full_8p.sh
# 8p evaluate full
bash test/eval_full_8p.sh
License
This work is licensed under CC BY-NC. See LICENSE for details. Third-party datasets are subject to their respective licenses. If you use our code/models in your research, please cite our paper:
@inproceedings{pavllo:videopose3d:2019,
title={3D human pose estimation in video with temporal convolutions and semi-supervised training},
author={Pavllo, Dario and Feichtenhofer, Christoph and Grangier, David and Auli, Michael},
booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2019}
}
Statement
For details about the public addresses used by the code in this repository, see public_address_statement.md.