Hierarchically Contrastive Hard Sample Mining for Graph Self-Supervised Pretraining.
Journal
IEEE Transactions on Neural Networks and Learning Systems
ISSN: 2162-2388
Abbreviated title: IEEE Trans Neural Netw Learn Syst
Country: United States
NLM ID: 101616214
Publication information
Publication date: 17 Aug 2023
History:
pubmed: 17 Aug 2023
medline: 17 Aug 2023
entrez: 17 Aug 2023
Status: ahead of print
Abstract
Contrastive learning has recently emerged as a powerful technique for graph self-supervised pretraining (GSP). By maximizing the mutual information (MI) between a positive sample pair, the network is forced to extract discriminative information from graphs to generate high-quality sample representations. However, we observe that, in the process of MI maximization (Infomax), existing contrastive GSP algorithms suffer from at least one of the following problems: 1) they treat all samples equally during optimization and 2) they fall into a single contrasting pattern within the graph. Consequently, the vast number of well-categorized samples overwhelms the representation learning process, and only limited information is accumulated, deteriorating the learning capability of the network. To solve these issues, in this article, by fusing information from different views and conducting hard sample mining in a hierarchically contrastive manner, we propose a novel GSP algorithm called hierarchically contrastive hard sample mining (HCHSM). The hierarchical property of this algorithm is manifested in two aspects. First, according to the results of multilevel MI estimation in different views, the MI-based hard sample selection (MHSS) module keeps filtering out the easy nodes and drives the network to focus more on hard nodes. Second, to collect more comprehensive information for hard sample learning, we introduce a hierarchically contrastive scheme that sequentially forces the learned node representations to involve multilevel intrinsic graph features. In this way, as the contrastive granularity becomes finer, the complementary information from different levels can be uniformly encoded to boost the discrimination of hard samples and enhance the quality of the learned graph embedding. Extensive experiments on seven benchmark datasets show that HCHSM outperforms competing methods on node classification and node clustering tasks. The source code of HCHSM is available at https://github.com/WxTu/HCHSM.
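The core mechanism the abstract describes, scoring each node by a cross-view MI estimate and concentrating the contrastive loss on poorly matched (hard) nodes while filtering easy ones, can be illustrated with a short sketch. This is not the authors' implementation (see the GitHub link above for that); it is a minimal PyTorch illustration assuming an InfoNCE-style MI lower bound as the per-node score, and the function names and the `keep_ratio` knob are hypothetical. The multilevel (hierarchical) contrasting of HCHSM is omitted here.

```python
import torch
import torch.nn.functional as F

def infonce_scores(z1, z2, temperature=0.5):
    # Per-node InfoNCE log-likelihoods between two augmented views.
    # z1, z2: (N, d) node embeddings; row i of each view is a positive pair.
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / temperature       # (N, N) cross-view similarities
    log_prob = F.log_softmax(sim, dim=1)  # contrast each node against all others
    return log_prob.diag()                # high value = easy, well-matched node

def hard_sample_contrastive_loss(z1, z2, keep_ratio=0.5, temperature=0.5):
    # Keep only the hardest nodes (lowest scores) in the loss, so easy,
    # well-categorized nodes stop dominating the gradients.
    # keep_ratio is a hypothetical knob, not a parameter from the paper.
    scores = infonce_scores(z1, z2, temperature)
    k = max(1, int(keep_ratio * scores.numel()))
    hard_idx = torch.topk(-scores, k).indices   # k lowest-scoring nodes
    return -scores[hard_idx].mean()

# Toy usage: random embeddings standing in for two GNN-encoded graph views.
z_view1 = torch.randn(100, 64, requires_grad=True)
z_view2 = torch.randn(100, 64, requires_grad=True)
loss = hard_sample_contrastive_loss(z_view1, z_view2, keep_ratio=0.3)
loss.backward()
```

Selecting hard nodes by a top-k cut on the negated score is one simple thresholding choice; the MHSS module described in the abstract instead filters progressively, using MI estimates at multiple levels and in different views.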
Identifiers
pubmed: 37590106
doi: 10.1109/TNNLS.2023.3297607
Publication types
Journal Article
Languages
eng
Citation subsets
IM