Publication | Open Access
A generic diffusion-based approach for 3D human pose prediction in the wild
Citations: 35
References: 40
Year: 2023
Unknown Venue
Engineering, Machine Learning, Human Pose Estimation, 3D Pose Estimation, Human Poses, Human Pose Prediction, Pose Estimations, 3D Computer Vision, Generic Diffusion-based Approach, Image Analysis, Data Science, Pattern Recognition, Biostatistics, Motion Prediction, Robot Learning, Computational Geometry, Human Pose Forecasting, Machine Vision, Predictive Analytics, Structure From Motion, Deep Learning, Medical Image Computing, Computer Vision, Natural Sciences, Scene Modeling
Predicting 3D human poses in real-world scenarios, also known as human pose forecasting, is inevitably subject to noisy inputs arising from inaccurate 3D pose estimations and occlusions. To address these challenges, we propose a diffusion-based approach that can predict future poses from noisy observations. We frame the prediction task as a denoising problem, where both observation and prediction are considered as a single sequence containing missing elements (whether in the observation or prediction horizon). All missing elements are treated as noise and denoised with our conditional diffusion model. To better handle long-term forecasting horizons, we present a temporal cascaded diffusion model. We demonstrate the benefits of our approach on four publicly available datasets (Human3.6M, HumanEva-I, AMASS, and 3DPW), outperforming the state-of-the-art. Additionally, we show that our framework is generic enough to improve any 3D pose prediction model, serving as a pre-processing step to repair its inputs and a post-processing step to refine its outputs. The code is available online: https://github.com/vita-epfl/DePOSit.
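The framing in the abstract, where observation and prediction form one sequence whose missing elements are treated as noise, can be illustrated with a small sketch. This is not the authors' implementation: the function name `build_masked_sequence`, the array shapes, and the use of standard Gaussian noise for missing entries are all assumptions made for illustration, and the conditional diffusion model that would denoise the result is omitted.

```python
import numpy as np


def build_masked_sequence(observed, obs_mask, horizon, rng):
    """Concatenate observation and prediction horizon into one noisy sequence.

    observed: (T_obs, J, 3) array of 3D joint positions (hypothetical layout).
    obs_mask: (T_obs, J) boolean array, True where a joint was actually observed.
    horizon:  number of future frames to predict.
    Returns the full sequence and a mask marking which entries are known.
    """
    t_obs, n_joints, _ = observed.shape
    total = t_obs + horizon

    # Known entries exist only in the observed part; every future frame is missing.
    mask = np.zeros((total, n_joints), dtype=bool)
    mask[:t_obs] = obs_mask

    # Assumption: missing entries (occluded joints and all future frames)
    # start as standard Gaussian noise, to be denoised by a model later.
    sequence = rng.standard_normal((total, n_joints, 3))
    sequence[mask] = observed[obs_mask]
    return sequence, mask


rng = np.random.default_rng(0)
observed = np.ones((4, 5, 3))          # 4 observed frames, 5 joints
obs_mask = np.ones((4, 5), dtype=bool)
obs_mask[1, 2] = False                 # one occluded joint in frame 1
sequence, mask = build_masked_sequence(observed, obs_mask, horizon=6, rng=rng)
print(sequence.shape, int(mask.sum()))
```

A denoising model conditioned on `mask` would then keep the known entries fixed and iteratively refine the rest. Under this framing, repairing occluded inputs and forecasting future frames are the same operation applied to different parts of the sequence.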