TR2015-069

Real-time Head Pose and Facial Landmark Estimation from Depth Images Using Triangular Surface Patch Features


Abstract:

We present a real-time system for 3D head pose estimation and facial landmark localization using a commodity depth sensor. We introduce a novel triangular surface patch (TSP) descriptor, which encodes the shape of the 3D surface of the face within a triangular area. The proposed descriptor is viewpoint invariant, and it is robust to noise and to variations in the data resolution. Using a fast nearest neighbor lookup, TSP descriptors from an input depth map are matched to the most similar ones that were computed from synthetic head models in a training phase. The matched triangular surface patches in the training set are used to compute estimates of the 3D head pose and facial landmark positions in the input depth map. By sampling many TSP descriptors, many votes for pose and landmark positions are generated which together yield robust final estimates. We evaluate our approach on the publicly available Biwi Kinect Head Pose Database to compare it against state-of-the-art methods. Our results show a significant improvement in the accuracy of both pose and landmark location estimates while maintaining real-time speed.

 

  • Related News & Events

    •  EVENT    Tim Marks to give lunch talk at Face and Gesture 2017 conference
      Date: Thursday, June 1, 2017
      Location: IEEE Conference on Automatic Face and Gesture Recognition (FG 2017), Washington, DC
      Speaker: Tim K. Marks
      MERL Contact: Tim K. Marks
      Research Area: Machine Learning
      Brief
      • MERL Senior Principal Research Scientist Tim K. Marks will give the invited lunch talk on Thursday, June 1, at the IEEE International Conference on Automatic Face and Gesture Recognition (FG 2017). The talk is entitled "Robust Real-Time 3D Head Pose and 2D Face Alignment.".
    •  
    •  NEWS    MERL Researcher Tim Marks presents an invited talk at MIT Lincoln Laboratory
      Date: April 27, 2017
      Where: Lincoln Laboratory, Massachusetts Institute of Technology
      MERL Contact: Tim K. Marks
      Research Area: Machine Learning
      Brief
      • MERL researcher Tim K. Marks presented an invited talk as part of the MIT Lincoln Laboratory CORE Seminar Series on Biometrics. The talk was entitled "Robust Real-Time 2D Face Alignment and 3D Head Pose Estimation."

        Abstract: Head pose estimation and facial landmark localization are key technologies, with widespread application areas including biometrics and human-computer interfaces. This talk describes two different robust real-time face-processing methods, each using a different modality of input image. The first part of the talk describes our system for 3D head pose estimation and facial landmark localization using a commodity depth sensor. The method is based on a novel 3D Triangular Surface Patch (TSP) descriptor, which is viewpoint-invariant as well as robust to noise and to variations in the data resolution. This descriptor, combined with fast nearest-neighbor lookup and a joint voting scheme, enable our system to handle arbitrary head pose and significant occlusions. The second part of the talk describes our method for face alignment, which is the localization of a set of facial landmark points in a 2D image or video of a face. Face alignment is particularly challenging when there are large variations in pose (in-plane and out-of-plane rotations) and facial expression. To address this issue, we propose a cascade in which each stage consists of a Mixture of Invariant eXperts (MIX), where each expert learns a regression model that is specialized to a different subset of the joint space of pose and expressions. We also present a method to include deformation constraints within the discriminative alignment framework, which makes the algorithm more robust. Both our 3D head pose and 2D face alignment methods outperform the previous results on standard datasets. If permitted, I plan to end the talk with a live demonstration.
    •