Joint Clustering and Discriminative Feature Alignment for Unsupervised Domain Adaptation

Unsupervised Domain Adaptation (UDA) aims to learn a classifier for an unlabeled target domain by leveraging knowledge from a labeled source domain with a different but related distribution. Many existing approaches learn a domain-invariant representation space by directly matching the marginal distributions of the two domains. However, they neither explore the underlying discriminative structure of the target data nor align the cross-domain discriminative features, which may lead to suboptimal performance. To tackle these two issues simultaneously, this paper presents a Joint Clustering and Discriminative Feature Alignment (JCDFA) approach for UDA, which naturally unifies the mining of discriminative features and the alignment of class-discriminative features in a single framework. Specifically, to mine the intrinsic discriminative information of the unlabeled target data, JCDFA jointly learns a shared encoding representation for two tasks: supervised classification of the labeled source data and discriminative clustering of the unlabeled target data, where the classification on the source domain guides the clustering on the target domain to locate the object categories. We then conduct cross-domain discriminative feature alignment by separately optimizing two new criteria: 1) an extended supervised contrastive learning objective, i.e., semi-supervised contrastive learning, and 2) an extended Maximum Mean Discrepancy (MMD), i.e., conditional MMD, which explicitly minimizes the intra-class dispersion and maximizes the inter-class separation. When these two procedures, i.e., discriminative feature mining and alignment, are integrated into one framework, they benefit from each other and enhance the final performance from a cooperative learning perspective. Experiments are conducted on four real-world benchmarks, i.e., Office-31, ImageCLEF-DA, Office-Home and VisDA-C. All the results demonstrate that our JCDFA obtains remarkable margins over state-of-the-art domain adaptation methods. Comprehensive ablation studies also verify the importance of each key component of the proposed algorithm and the effectiveness of combining the two learning strategies into one framework.
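
For readers unfamiliar with the alignment terms named above, the following is a minimal sketch of how a class-conditional MMD loss and the overall weighted objective could be written. The trade-off weights $\alpha, \beta, \gamma$, the loss names, and the use of cluster-based target pseudo-labels $\hat{y}^{t}$ are illustrative assumptions rather than details stated in this abstract.

\[
\mathcal{L}_{\mathrm{total}} \;=\; \mathcal{L}_{\mathrm{cls}}^{s} \;+\; \alpha\,\mathcal{L}_{\mathrm{clu}}^{t} \;+\; \beta\,\mathcal{L}_{\mathrm{sscl}} \;+\; \gamma\,\mathcal{L}_{\mathrm{cmmd}},
\]
\[
\mathcal{L}_{\mathrm{cmmd}} \;=\; \frac{1}{C}\sum_{c=1}^{C}
\Bigl\| \frac{1}{n_s^{c}} \sum_{i:\,y_i^{s}=c} \phi\bigl(f(x_i^{s})\bigr)
\;-\; \frac{1}{n_t^{c}} \sum_{j:\,\hat{y}_j^{t}=c} \phi\bigl(f(x_j^{t})\bigr) \Bigr\|_{\mathcal{H}}^{2},
\]

where $f(\cdot)$ is the shared encoder, $\phi(\cdot)$ maps features into a reproducing kernel Hilbert space $\mathcal{H}$, $y_i^{s}$ are source labels, $\hat{y}_j^{t}$ are target pseudo-labels obtained from the clustering branch, and $n_s^{c}$, $n_t^{c}$ count the samples of class $c$ in each domain. Intuitively, matching per-class feature means across domains pulls same-class samples together (reducing intra-class dispersion), while the contrastive term keeps features of different classes apart.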