Skip-gram models are popular in large-scale network embedding for their cost-effectiveness. The objectives of many skip-gram based methods derive from the word2vec model, which is closely related to Noise Contrastive Estimation (NCE). Among existing embedding methods, the differences mostly lie in how the node neighborhood is modeled, e.g., by different random walk schemes, which leads to different learning strategies. Orthogonal to these efforts, we take a unified view that NCE-based methods commonly involve two basic NCE components in the learning objective. This perspective allows a natural generalization of the objectives by taking different forms of scoring function in the NCE components. We theoretically analyze how vanilla NCE-based objectives suffer from slow convergence and from difficulty in preserving first-/second-order proximity. We also prove the fundamental difficulty for NCE methods to capture the non-linearity of complex networks. To mitigate these issues, we devise a general distance-based term, inspired by its physical meaning, that is added to the standard NCE term. The distance functions include Wasserstein-
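To make the "two NCE components" concrete, the following is a minimal sketch of a negative-sampling skip-gram objective of the word2vec style referenced above: one component attracts an observed (node, neighbor) pair and the other repels sampled noise nodes. The function and variable names, the pluggable `score` argument, and the toy data are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_skipgram_loss(u, v_pos, v_negs, score=np.dot):
    """Negative log-likelihood with two NCE components:
    - a positive term pulling node u toward its observed neighbor v_pos,
    - a negative term pushing u away from k sampled noise nodes v_negs.
    `score` is the scoring function the unified view generalizes
    (here the usual inner product; hypothetical placeholder)."""
    pos = np.log(sigmoid(score(u, v_pos)))
    neg = sum(np.log(sigmoid(-score(u, v))) for v in v_negs)
    return -(pos + neg)

# Toy embeddings: a center node, a nearby true neighbor, and 5 noise nodes.
d = 16
u = rng.normal(size=d)
v_pos = u + 0.1 * rng.normal(size=d)      # close to u -> high positive score
v_negs = [rng.normal(size=d) for _ in range(5)]

loss = nce_skipgram_loss(u, v_pos, v_negs)
print(float(loss) > 0.0)
```

Swapping `score` for another function (e.g., a negated distance) is the kind of generalization of the NCE components that the abstract describes.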