Abstract: In 3D human pose estimation, binocular vision typically relies on stereo matching to obtain depth information and calculates 3D keypoints using the disparity principle. However, the high ...
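As a rough illustration of the disparity principle mentioned in this abstract, the sketch below back-projects one matched keypoint from a rectified stereo pair using the standard pinhole relation Z = f * B / d. It is not taken from the paper: the function name `keypoint_from_disparity` and all parameters (focal length in pixels, baseline in metres, principal point `cx`, `cy`) are assumptions chosen for illustration, and it presumes the two images are already rectified and the keypoint already matched.

```python
import numpy as np


def keypoint_from_disparity(u_left, v_left, u_right, focal_px, baseline_m, cx, cy):
    """Back-project one matched keypoint from a rectified stereo pair.

    Hypothetical helper for illustration only: assumes identical pinhole
    cameras, horizontal baseline, and pixel coordinates in the left image.
    """
    # Disparity is the horizontal pixel shift between the two views.
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")

    # Depth from similar triangles: Z = f * B / d.
    z = focal_px * baseline_m / disparity
    # Recover X and Y by inverting the pinhole projection of the left camera.
    x = (u_left - cx) * z / focal_px
    y = (v_left - cy) * z / focal_px
    return np.array([x, y, z])


if __name__ == "__main__":
    # Made-up values: 800 px focal length, 10 cm baseline, 40 px disparity.
    print(keypoint_from_disparity(700, 400, 660, 800, 0.1, 640, 360))
```

With these made-up inputs the disparity is 40 px, so the depth comes out at about 2 m. Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant keypoints than for near ones.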
Abstract: Owing to its strong ability to extract temporal features, the transformer has been widely used in monocular 3D human pose estimation. However, due to its global perspective, it performs ...