Learning from Yourself to Others for Unsupervised Visible-Infrared Re-Identification

Unsupervised visible-infrared person re-identification (US-VI-ReID) aims to match unlabeled pedestrian images captured under varying lighting conditions. The key challenge lies in generating accurate pseudo-labels while bridging the significant gap between the visible and infrared modalities. Existing methods mainly focus on mitigating the effects of noisy labels through loss functions during backward propagation. However, these noisy labels already affect forward propagation, leading to incorrect cross-modality correspondences. To address this issue, we propose a Hierarchical Centrality Collaborative Learning (HCCL) framework for US-VI-ReID, which proactively identifies noisy labels during forward propagation. The rationale behind HCCL is that intra-modality refinement serves as the foundation for establishing cross-modality correspondences, reflecting the principle of learning from yourself to others. For intra-modality learning, we propose Closeness Centrality Selection (CCS), which quantifies sample confidence via closeness centrality to identify noisy samples. By discarding these noisy samples during forward propagation, CCS mitigates their adverse effects and ensures identity-consistent representation learning. For cross-modality learning, we propose Hierarchical Consistency Matching (HCM), which establishes local instance-level label associations by enforcing bidirectional consistency with the most reliable samples identified during intra-modality learning. These local associations are then propagated to guide the global cluster-level cross-modality correspondences. Extensive experiments demonstrate that our HCCL achieves competitive performance on mainstream datasets, even surpassing some supervised counterparts. Additionally, outstanding results on corrupted datasets verify its generalizability and robustness.
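To make the two selection ideas concrete, the following is a minimal sketch, not the paper's exact formulation: it assumes L2-normalized features, cosine distance on a fully connected intra-cluster graph, and an illustrative `keep_ratio` threshold; the function names `closeness_centrality_select` and `bidirectional_consistency_pairs` are hypothetical stand-ins for CCS and the instance-level step of HCM.

```python
import numpy as np

def closeness_centrality_select(feats, keep_ratio=0.8):
    """Sketch of CCS-style selection: rank the samples of one pseudo-labeled
    cluster by closeness centrality (inverse of their total distance to the
    other members) and keep the most central ones as reliable; the rest are
    treated as noisy and excluded from forward propagation."""
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)  # L2-normalize
    dist = 1.0 - feats @ feats.T                                  # pairwise cosine distances
    n = len(feats)
    centrality = (n - 1) / (dist.sum(axis=1) + 1e-12)             # closeness centrality per sample
    order = np.argsort(-centrality)                               # most central first
    keep = order[: max(1, int(np.ceil(keep_ratio * n)))]
    return np.sort(keep)                                          # indices of reliable samples

def bidirectional_consistency_pairs(vis_feats, ir_feats):
    """Sketch of the instance-level bidirectional check in HCM: a visible
    sample and an infrared sample are associated only if each is the other's
    nearest neighbor among the reliable samples of the opposite modality."""
    vis = vis_feats / np.linalg.norm(vis_feats, axis=1, keepdims=True)
    ir = ir_feats / np.linalg.norm(ir_feats, axis=1, keepdims=True)
    sim = vis @ ir.T
    v2i = sim.argmax(axis=1)   # nearest infrared sample for each visible sample
    i2v = sim.argmax(axis=0)   # nearest visible sample for each infrared sample
    return [(v, int(v2i[v])) for v in range(len(vis)) if i2v[v2i[v]] == v]
```

In the framework described above, the reliable indices returned by the first step would feed the second, and the resulting instance-level pairs would then be aggregated to vote for cluster-level cross-modality correspondences; that propagation step is omitted here.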