Token-Based Dual-Codebook Learning for Robust 3D Pose Lifting

Minsu Jeon, Janghyun Kim, Jinsun Park
Eurographics 2026 (Short Paper)

Abstract

3D human pose estimation from monocular images is inherently challenging because frequent occlusions introduce significant ambiguity in joint visibility. Regression-based methods are particularly sensitive to these ambiguities and often produce unstable, jittery pose estimates. To overcome these limitations, recent token-based methods discretize poses into structured representations that better capture joint dependencies. However, most existing approaches operate frame by frame, neglecting temporal continuity and consequently producing temporally inconsistent predictions. We therefore propose a spatio-temporal token-based framework for 3D human pose estimation that explicitly models both spatial and temporal dependencies. Specifically, a Spatio-Temporal Tokenizer decomposes 3D pose sequences into discrete spatial and temporal tokens via a dual-codebook design. To predict these tokens from 2D pose sequences, we further develop a token classifier based on a SemGCN–GraphGRU architecture, enabling effective temporal reasoning while preserving skeletal structure. Extensive experiments on the Human3.6M dataset demonstrate that our method achieves state-of-the-art performance among short-sequence methods while significantly reducing high-frequency jitter and producing smooth, physically plausible 3D pose sequences.
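As a rough illustration of what a dual-codebook tokenizer does at lookup time, the sketch below maps continuous feature vectors to the index of their nearest codebook entry, once against a spatial codebook and once against a temporal one. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names (`nearest_code`, `tokenize`, `detokenize`), the tiny hand-written codebooks, and the feature vectors are all hypothetical, and a real tokenizer would learn the codebooks and produce the features with an encoder.

```python
import math


def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (Euclidean)."""
    return min(range(len(codebook)), key=lambda k: math.dist(vec, codebook[k]))


def tokenize(features, codebook):
    """Map each feature vector to a discrete token (a codebook index)."""
    return [nearest_code(f, codebook) for f in features]


def detokenize(tokens, codebook):
    """Map tokens back to their codebook vectors (the quantized features)."""
    return [codebook[k] for k in tokens]


# Hypothetical codebooks: one for spatial features, one for temporal features.
spatial_codebook = [(0.0, 0.0), (1.0, 1.0)]
temporal_codebook = [(0.0,), (0.5,), (1.0,)]

# Hypothetical encoder outputs for a short pose sequence.
spatial_feats = [(0.1, -0.2), (0.9, 1.2)]
temporal_feats = [(0.4,), (0.95,)]

spatial_tokens = tokenize(spatial_feats, spatial_codebook)
temporal_tokens = tokenize(temporal_feats, temporal_codebook)
print(spatial_tokens, temporal_tokens)
```

Keeping two separate codebooks means the spatial and temporal tokens are chosen independently, so a downstream classifier can predict each token stream on its own.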

Hyperparameter

Framework Overview

Figure 1: Overview of the full spatio-temporal tokenization framework. The codebooks and decoder learned in Stage 1 are frozen during Stage 2 training so that they retain a valid structural prior.

Supplementary Results Videos

Methods (f: frames) | MPJPE (mm) ↓ | P-MPJPE (mm) ↓
PCT [GWW 23] (single frame) | 50.8 | 41.9
Cai et al. [CFS 21] (f=7) | 45.6 | 35.5
Ours (f=20) | 45.5 | 36.6
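For readers unfamiliar with the metric in the table, MPJPE is the mean Euclidean distance between predicted and ground-truth joint positions, and P-MPJPE is the same error measured after a Procrustes (similarity) alignment of the prediction to the ground truth. The sketch below computes plain MPJPE only; the function name `mpjpe` and the nested-list input layout (frames, then joints, then x/y/z) are assumptions made for illustration, and the alignment step needed for P-MPJPE is omitted.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error, in the units of the inputs.

    pred, gt: sequences of frames; each frame is a sequence of (x, y, z)
    joint positions. Both must have the same shape.
    """
    total, count = 0.0, 0
    for pred_frame, gt_frame in zip(pred, gt):
        for p, g in zip(pred_frame, gt_frame):
            total += math.dist(p, g)  # Euclidean distance for one joint
            count += 1
    if count == 0:
        raise ValueError("mpjpe needs at least one joint")
    return total / count


# One frame, two joints: the first is exact, the second is off by 5 units.
pred = [[(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]]
gt = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
print(mpjpe(pred, gt))
```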

Temporal Stability Analysis

Jitter Analysis

Figure 2:
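The page does not state how jitter is measured. One common proxy, assumed here purely for illustration, is the mean magnitude of the second-order finite difference of joint positions over time (a per-frame acceleration). The function name `mean_acceleration` and the input layout are hypothetical.

```python
import math


def mean_acceleration(seq):
    """Mean magnitude of p[t+1] - 2*p[t] + p[t-1] over joints and frames.

    seq: list of frames; each frame is a list of (x, y, z) joint positions.
    Larger values indicate more high-frequency motion between frames.
    """
    if len(seq) < 3:
        raise ValueError("need at least 3 frames for a second difference")
    total, count = 0.0, 0
    for t in range(1, len(seq) - 1):
        for prev, cur, nxt in zip(seq[t - 1], seq[t], seq[t + 1]):
            acc = [n - 2 * c + p for p, c, n in zip(prev, cur, nxt)]
            total += math.hypot(*acc)
            count += 1
    return total / count


# A joint moving at constant velocity has zero acceleration ...
smooth = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
# ... while a joint that jumps out and back does not.
jittery = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)]]
print(mean_acceleration(smooth), mean_acceleration(jittery))
```

Under this proxy, a smoother predicted sequence yields a lower value for the same underlying motion.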

BibTeX

@inproceedings{jeon2026token,
  title={Token-Based Dual-Codebook Learning for Robust 3D Pose Lifting},
  author={Jeon, Minsu and Kim, Janghyun and Park, Jinsun},
  booktitle={Proceedings of Eurographics (Short Papers)},
  year={2026}
}