Face alignment is a crucial step in many face analysis and recognition tasks. The current state of the art consists of very slow deep learning methods that require computationally heavy inference, and very fast methods based on cascades of regressors that cope poorly with complicated cases or extreme poses. The authors show that collecting a small subset of unlabelled domain-specific data can improve the accuracy of fast-inference models when that data is annotated by a slower model within a teacher–student architecture. In the proposed solution, a small subset of facial images from two challenging domains is annotated using a slow but more accurate model, and this data is then used to incrementally train a fast one. Their results show that adding as little as 5% of challenging data can reduce the error rate in a specific domain by up to 30% without any loss of generalisation ability. This training scheme is applicable to numerous computer vision and engineering problems where computational power and model size are constrained by the application and platform, or where real-time operation is a requirement.
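
The data-preparation side of such a teacher–student scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `pseudo_label` and `build_incremental_set` are hypothetical, the teacher is assumed to be any callable that maps one image to an array of landmark coordinates, and the 5% figure is assumed here to be measured relative to the size of the original training set.

```python
import numpy as np


def pseudo_label(teacher, images):
    """Annotate unlabelled images with the slow teacher's predicted landmarks.

    `teacher` is a hypothetical callable: one image in, one landmark array out.
    """
    return np.stack([teacher(img) for img in images])


def build_incremental_set(base_x, base_y, domain_x, teacher, fraction=0.05, seed=0):
    """Extend a labelled training set with teacher-annotated domain samples.

    Assumption: `fraction` is relative to the size of the base set, so
    fraction=0.05 adds 5% extra samples drawn from the challenging domain.
    """
    rng = np.random.default_rng(seed)
    # Number of domain samples to add, capped by how many are available.
    n_extra = min(len(domain_x), max(1, int(round(fraction * len(base_x)))))
    idx = rng.choice(len(domain_x), size=n_extra, replace=False)
    extra_x = domain_x[idx]
    # The teacher supplies the labels that the unlabelled domain data lacks.
    extra_y = pseudo_label(teacher, extra_x)
    return np.concatenate([base_x, extra_x]), np.concatenate([base_y, extra_y])
```

The fast student model would then be trained further on the returned arrays. How that incremental update is performed depends on the student (for a cascade of regressors it is not a plain gradient step), so it is deliberately left out of this sketch.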