Face detection/alignment methods have reached a satisfactory state in static images captured under arbitrary conditions. Such methods typically perform (joint) fitting for each frame and are used in commercial applications; however, in the majority of real-world scenarios, dynamic scenes are of interest. We argue that generic fitting per frame is suboptimal (it discards the informative correlation of sequential frames) and propose to learn person-specific statistics from the video to improve the generic results. To that end, we introduce a meticulously studied pipeline, which we name PD2T, that performs person-specific detection and landmark localisation. We carry out extensive experimentation with a diverse set of i) generic fitting results and ii) different objects (human faces, animal faces) that illustrate the powerful properties of our proposed pipeline, and experimentally verify that PD2T outperforms all the compared methods.

Chrysos Grigorios G., Zafeiriou Stefanos

A1 Journal article – refereed

G. G. Chrysos and S. Zafeiriou, "PD2T: Person-Specific Detection, Deformable Tracking," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 11, pp. 2555-2568, 1 Nov. 2018. doi: 10.1109/TPAMI.2017.2769654

https://doi.org/10.1109/TPAMI.2017.2769654
http://urn.fi/urn:nbn:fi-fe201902276473