Abstract The promise of big data is enormous, and nowhere is it more critical than in its potential to contain important, undiscovered interdependence among thousands of variables. Networks have arisen as a powerful tool to detect how different variables are interconnected and how these interconnections mediate the internal workings and dynamics of various physical, chemical, biological, and social systems. Although a number of statistical methods have been developed for network reconstruction, using networks to excavate useful information from complex big data poses conceptual, technical, and computational challenges. Here, we describe recent advances in a statistical formalism that can recover mechanistically interpretable and practically applicable networks from a wide range of big data domains. Traditional approaches can only infer an overall network from a number of samples, failing to reveal sample-specific differences. Identifying meaningful networks typically requires high-density temporal or perturbed data and their dynamic fitting over time, requirements that are rarely met in most big data sectors. The new formalism can extract and exploit dynamic information hidden in static data to reconstruct mobile networks without the need for temporal data and thereby track how topological architecture changes from sample to sample and across time and space scales. We review the founding principles of this new formalism, which is derived from the seamless integration of elements from various disciplines, such as allometric scaling laws, evolutionary game theory, and developmental modularity theory. We show how this formalism can infer fully informative networks, encapsulated by bidirectional, signed, and weighted pairwise and high-order interactions, overcoming the intrinsic limitation of static data for network identification.
We propose a general framework to augment a generalized argument for inferring omnidirectional, multilayer, and multispace networks from data of any high dimension and to fill gaps in network reconstruction between standard techniques and the current state of the art. This formalism represents a paradigm-shifting emerging technology that can make sophisticated networks a more efficient, effective, and widespread tool to disentangle natural complexities in the era of big data.
               