22 June 2021

3D Skeletal Gesture Recognition via Discriminative Coding on Time-Warping Invariant Riemannian Trajectories

Learning 3D skeleton-based representations for gesture recognition has attracted growing attention because such representations are invariant to the viewpoint and background dynamics of video. Existing techniques typically use absolute joint coordinates to derive human motion features. Gesture recognition, however, should not depend on the position of the performer, and the extracted features should be invariant to body size. In addition, when gestures are compared and classified, variations in temporal dynamics can greatly distort the distance metric. In this paper, we represent a 3D skeleton as a point in a product space of the special orthogonal group SO(3), which explicitly models the 3D geometric relationships between body parts. A skeletal gesture sequence can therefore be described as a trajectory on a Riemannian manifold. We then generalize the transported square-root vector field to obtain a time-warping invariant metric for comparing these trajectories, and thus for identifying the gestures. Moreover, by explicitly incorporating label information into the encoding, we present a sparse coding scheme for skeletal trajectories that enforces the discriminative power of the atoms in the dictionary. Experimental results indicate that the proposed approach achieves state-of-the-art performance on multiple challenging gesture recognition benchmarks.
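
To make the idea of a skeleton as a point in a product of SO(3) groups more concrete, the following is a minimal sketch and not the paper's implementation. It assumes that the relationship between two body parts is encoded as the minimal rotation aligning one bone direction with another; the abstract does not specify this construction. The toy skeleton and the helper names (`rotation_between`, `skeleton_to_so3`, `geodesic_distance`, `product_distance`) are hypothetical. The sketch covers only the per-frame representation and its geodesic distance, not the transported square-root vector field or the sparse coding stage.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix K such that K @ x == np.cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rotation_between(u, v, eps=1e-8):
    """Minimal rotation in SO(3) taking direction u to direction v (Rodrigues)."""
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    w = np.cross(u, v)
    s, c = np.linalg.norm(w), float(np.dot(u, v))
    if s < eps:
        if c > 0:
            return np.eye(3)  # directions already aligned
        # Antiparallel case: rotate by pi about any axis perpendicular to u.
        a = np.cross(u, [1.0, 0.0, 0.0])
        if np.linalg.norm(a) < eps:
            a = np.cross(u, [0.0, 1.0, 0.0])
        a = a / np.linalg.norm(a)
        return 2.0 * np.outer(a, a) - np.eye(3)
    K = skew(w)
    return np.eye(3) + K + (K @ K) * ((1.0 - c) / s**2)

def skeleton_to_so3(joints, bones):
    """Map one skeleton frame to a stack of rotations, one per ordered bone pair.

    Assumption: each pairwise relation is the rotation between bone directions.
    """
    dirs = [joints[j] - joints[i] for i, j in bones]
    return np.stack([rotation_between(dirs[m], dirs[n])
                     for m in range(len(dirs))
                     for n in range(len(dirs)) if m != n])

def geodesic_distance(R1, R2):
    """Rotation angle of R1^T R2, i.e. the geodesic distance on SO(3)."""
    cos_angle = (np.trace(R1.T @ R2) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

def product_distance(X, Y):
    """Distance on the product space: root of summed squared geodesic distances."""
    return float(np.sqrt(sum(geodesic_distance(a, b) ** 2 for a, b in zip(X, Y))))

# Hypothetical toy skeleton: 4 joints connected by 3 bones.
joints = np.array([[0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [1.0, 1.0, 1.0]])
bones = [(0, 1), (1, 2), (2, 3)]

# The same pose performed elsewhere by a larger performer.
moved = 2.5 * joints + np.array([10.0, -3.0, 4.0])

same_pose_distance = product_distance(skeleton_to_so3(joints, bones),
                                      skeleton_to_so3(moved, bones))
print(same_pose_distance)  # numerically close to zero
```

Because only bone directions enter the representation, translating or uniformly scaling the skeleton leaves it unchanged, which is the position and body-size invariance the abstract asks for. A gesture would then be a sequence of such stacks, one per frame, forming a trajectory on the product manifold.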