Implementation of a 3D human pose estimation research paper

Overview of the 3D human pose estimation pipeline

During my internship at Telekinesis AI, I worked with a teammate to read and implement a research paper on 3D human pose estimation. The paper, “XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera” by Mehta et al., presents a real-time approach to multi-person 3D pose estimation. The method consists of three stages and introduces a new CNN architecture called SelecSLS Net. More information about the paper can be found here.

Our project involved working through each section of the paper and implementing its methodology from scratch in Python. We then trained our PyTorch model on the same datasets and with the same configurations specified in the paper. The results we obtained were comparable to those reported by the authors.