Random walk is widely used to sample large-scale graphs because it is simple to implement and its bias analysis rests on solid theoretical foundations. However, its computational efficiency is heavily limited by its slow convergence rate (i.e., a long burn-in period). To address this issue, we propose a common neighbor aware random walk framework called CNARW, which speeds up convergence through weighted walking that differentiates among the next-hop candidate nodes. Specifically, in each walking step CNARW takes into consideration the common neighbors between previously visited nodes and next-hop candidate nodes. Based on CNARW, we further develop two efficient “unbiased sampling” schemes, and we also design two variant algorithms that reduce the sampling cost and speed up convergence. Experimental results on real-world network datasets show that our approach converges remarkably faster than state-of-the-art random walk sampling algorithms and that, to achieve the same estimation accuracy, it reduces the query cost significantly. Finally, we use two case studies to demonstrate the effectiveness of our sampling framework in solving large-scale graph analysis tasks.
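As a rough illustration of the weighted-walking idea, the sketch below picks each next hop with a weight that shrinks as the candidate shares more neighbors with the current node. The abstract does not give the weighting formula, so the weight used here (one minus the common-neighbor count divided by the smaller of the two degrees) is an assumption made for illustration, as are the function names `common_neighbor_aware_step` and `walk` and the adjacency-set graph representation. The sketch assumes a simple undirected graph with no self-loops and no isolated nodes, and it omits the re-weighting that an unbiased sampling scheme would require.

```python
import random


def common_neighbor_aware_step(graph, current, rng):
    """Choose the next hop, favoring candidates with few neighbors in common.

    `graph` maps each node to the set of its neighbors (hypothetical layout).
    """
    candidates = sorted(graph[current])  # sorted for reproducible sampling
    weights = []
    for candidate in candidates:
        common = len(graph[current] & graph[candidate])
        smaller_degree = min(len(graph[current]), len(graph[candidate]))
        # Assumed illustrative weight: more shared neighbors -> lower weight.
        # In a simple graph, common < smaller_degree, so the weight is positive.
        weights.append(1.0 - common / smaller_degree)
    return rng.choices(candidates, weights=weights, k=1)[0]


def walk(graph, start, steps, seed=0):
    """Run a walk of `steps` hops from `start` and return the visited nodes."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        path.append(common_neighbor_aware_step(graph, path[-1], rng))
    return path


if __name__ == "__main__":
    # Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
    toy_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(walk(toy_graph, start=0, steps=10, seed=1))
```

In the toy graph, a walker at node 2 gives node 3 a higher weight than nodes 0 and 1, because node 3 shares no neighbors with node 2; this is the intuition behind steering the walk away from densely interconnected regions it has effectively already explored.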
               