In semisupervised community detection, the membership of a set of revealed nodes is known in addition to the graph structure and can be leveraged to achieve better inference accuracies. While… Click to show full abstract
In semisupervised community detection, the membership of a set of revealed nodes is known in addition to the graph structure and can be leveraged to achieve better inference accuracies. While previous works investigated the case where the revealed nodes are selected at random, this paper focuses on correlated subsets leading to atypically high accuracies. In the framework of the dense stochastic block model, we employ statistical physics methods to derive a large deviation analysis of the number of these rare subsets, as characterized by their free energy. We find theoretical evidence of a nonmonotonic relationship between reconstruction accuracy and the free energy associated to the posterior measure of the inference problem. We further discuss possible implications for active learning applications in community detection.
               
Click one of the above tabs to view related content.