Aerial View 3D Human Pose Estimation Using Double Vector Quantized-Variational AutoEncoders
Citations
Web of Science: 1; Scopus: 1

Abstract

This study introduces a novel method for accurately estimating the three-dimensional (3D) pose of individuals from images captured from aerial viewpoints, particularly top-to-bottom viewpoints. Motion capture systems used for surveillance are often limited in their ability to capture dynamic scenarios, primarily because of the limited field of view of a third-person-view camera. To overcome these spatial constraints, various approaches employ aerial views. However, when a person is observed from an unmanned aerial vehicle (UAV), the lower body commonly appears diminished and occluded by the upper body, which makes pose estimation highly unreliable and inaccurate. To overcome this limitation, we present a novel approach that uses the Vector Quantized-Variational AutoEncoder (VQ-VAE) to accurately predict and optimize the 3D human pose from aerial images. Specifically, we introduce a novel pipeline for pose estimation and optimization that uses the codebook obtained by learning aerial image features and pose features from large human pose datasets with VQ-VAE. The vector quantizer of the VQ-VAEs can help improve the generalization capability of 3D pose estimation from aerial top-to-bottom viewpoints. In comparative experiments, our method demonstrated a substantial improvement in performance over existing state-of-the-art methods. © 2024 IEEE.
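
For readers unfamiliar with the vector quantizer mentioned in the abstract, the following is a minimal sketch of the generic VQ-VAE quantization step: each continuous feature vector is replaced by its nearest codebook entry. It is not the authors' implementation. The function name `quantize`, the toy codebook, and the feature values are all illustrative assumptions, and the record does not describe how the image and pose codebooks are actually built or combined.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry (squared L2 distance).

    features: array of shape (N, D), hypothetical encoder outputs
    codebook: array of shape (K, D), hypothetical learned code vectors
    """
    # Pairwise differences via broadcasting, shape (N, K, D)
    diffs = features[:, None, :] - codebook[None, :, :]
    # Squared distances between every feature and every code, shape (N, K)
    dists = np.sum(diffs ** 2, axis=-1)
    # Index of the closest code for each feature, shape (N,)
    indices = np.argmin(dists, axis=1)
    # Quantized features are the selected codebook rows, shape (N, D)
    return indices, codebook[indices]

# Toy values chosen only for illustration
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]])
features = np.array([[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]])
indices, quantized = quantize(features, codebook)
```

In a trained VQ-VAE the codebook is learned jointly with the encoder and decoder; this sketch covers only the lookup and omits training losses and gradient handling.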

Title
Aerial View 3D Human Pose Estimation Using Double Vector Quantized-Variational AutoEncoders
Authors
Hwang, Juheon; Kang, Jiwoo
DOI
10.1109/WACVW60836.2024.00042
Publication date
2024-01
Type
Proceedings Paper
Source title
Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision Workshops, WACVW 2024
Pages
341-350