TR2026-040
Heatmap-to-SMPL Multi-View Radar Transformer for Multi-Person 3D Pose Estimation
- S. Kato, P. Wang, T. Fujihashi, A. Markham, "Heatmap-to-SMPL Multi-View Radar Transformer for Multi-Person 3D Pose Estimation", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2026.
@inproceedings{Kato2026may,
  author = {Kato, Sorachi and Wang, Pu and Fujihashi, Takuya and Markham, Andrew},
  title = {{Heatmap-to-SMPL Multi-View Radar Transformer for Multi-Person 3D Pose Estimation}},
  booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
  year = 2026,
  month = may,
  url = {https://www.merl.com/publications/TR2026-040}
}
Research Areas:
Artificial Intelligence, Computational Sensing, Signal Processing
Abstract:
Radar-based 3D human pose estimation can be achieved using either sparse radar point clouds or, more recently, high-resolution multi-view radar heatmaps. Point-cloud approaches typically leverage strong body-shape priors, e.g., the Skinned Multi-Person Linear (SMPL) model, but depend on point-based backbones and potentially temporal aggregation to compensate for weak features; heatmap approaches preserve richer, reflectivity-level radar features, yet usually regress only 3D keypoints, ignoring body-shape priors. In this paper, by retaining heatmap fidelity while simultaneously exploiting shape priors, we propose RHAMP: a Radar HeAtmap-to-SMPL Pose transformer for 3D human pose estimation. Specifically, each radar view is encoded by a backbone network, and a set of person queries cross-attends to the multi-view radar features to produce per-instance SMPL parameters in a single end-to-end stage. Experiments on the public HIBER dataset confirm the effectiveness of the proposed approach over a range of baselines.
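The abstract's decoder design, where a set of person queries cross-attends to concatenated multi-view radar features and a head regresses per-instance SMPL parameters, can be sketched in a toy single-layer form. The code below is a hedged illustration only, not the paper's implementation: the function name `rhamp_decoder_sketch`, the feature dimension, the number of queries, and the random weights are all assumptions, and the 85-dimensional output (72 pose + 10 shape + 3 translation) is one common SMPL parameterization.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def rhamp_decoder_sketch(view_feats, num_queries=5, d=64, seed=0):
    """Toy single-layer cross-attention decoder (illustrative only).

    view_feats: list of (tokens_i, d) arrays, one per radar view, standing in
    for backbone-encoded heatmap features. Person queries attend to the
    concatenated multi-view tokens; a linear head then regresses per-instance
    SMPL parameters (assumed 72 pose + 10 shape + 3 translation = 85 dims).
    All weights are random here; in a real model they would be learned.
    """
    rng = np.random.default_rng(seed)
    mem = np.concatenate(view_feats, axis=0)            # (T, d) fused multi-view tokens
    queries = rng.standard_normal((num_queries, d))     # stand-in for learnable person queries
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    # Scaled dot-product cross-attention: queries attend over all view tokens.
    attn = softmax((queries @ Wq) @ (mem @ Wk).T / np.sqrt(d), axis=-1)
    decoded = attn @ (mem @ Wv)                         # (num_queries, d)
    W_smpl = rng.standard_normal((d, 85)) / np.sqrt(d)  # hypothetical SMPL regression head
    return decoded @ W_smpl                             # (num_queries, 85)

# Two hypothetical radar views with different numbers of feature tokens.
views = [np.random.default_rng(i).standard_normal((s, 64)) for i, s in enumerate((48, 32))]
params = rhamp_decoder_sketch(views)
print(params.shape)  # (5, 85): one SMPL parameter vector per person query
```

In the single-stage design the abstract describes, each query would be decoded directly into one person's SMPL parameters, so no separate detection or keypoint-lifting step is needed.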
