Recent Submissions

  • Universal patterns in egocentric communication networks

    Iñiguez, Gerardo; Heydari, Sara; Kertész, János; Saramäki, Jari; Department of Network and Data Science (Springer Science and Business Media LLC, 2023-08-26)
    Tie strengths in social networks are heterogeneous, with strong and weak ties playing different roles at the network and individual levels. Egocentric networks, networks of relationships around an individual, exhibit few strong ties and more weaker ties, as evidenced by electronic communication records. Mobile phone data has also revealed persistent individual differences within this pattern. However, the generality and driving mechanisms of social tie strength heterogeneity remain unclear. Here, we study tie strengths in egocentric networks across multiple datasets of interactions between millions of people during months to years. We find universality in tie strength distributions and their individual-level variation across communication modes, even in channels not reflecting offline social relationships. Via a simple model of egocentric network evolution, we show that the observed universality arises from the competition between cumulative advantage and random choice, two tie reinforcement mechanisms whose balance determines the diversity of tie strengths. Our results provide insight into the driving mechanisms of tie strength heterogeneity in social networks and have implications for the understanding of social network structure and individual behavior.
  • Dynamics of cascades on burstiness-controlled temporal networks

    Unicomb, Samuel; Iñiguez, Gerardo; Gleeson, James P.; Karsai, Márton; Department of Network and Data Science (Springer Nature, 2021)
    Burstiness, the tendency of interaction events to be heterogeneously distributed in time, is critical to information diffusion in physical and social systems. However, an analytical framework capturing the effect of burstiness on generic dynamics is lacking. Here we develop a master equation formalism to study cascades on temporal networks with burstiness modelled by renewal processes. Supported by numerical and data-driven simulations, we describe the interplay between heterogeneous temporal interactions and models of threshold-driven and epidemic spreading. We find that increasing interevent time variance can both accelerate and decelerate spreading for threshold models, but can only decelerate epidemic spreading. When accounting for the skewness of different interevent time distributions, spreading times collapse onto a universal curve. Our framework uncovers a deep yet subtle connection between generic diffusion mechanisms and underlying temporal network structures that impacts a broad class of networked phenomena, from spin interactions to epidemic contagion and language dynamics.
  • Priority areas for protection of plant-pollinator interaction networks in the Atlantic Forest

    Pereira, Juliana; Battiston, Federico; Jordán, Ferenc; Department of Network and Data Science (Elsevier, 2022)
    Quantitative methods of prioritization are necessary to optimize the selection of protected areas for biodiversity conservation. Reserve selection is traditionally based on single species, considers representative habitats or, occasionally, spatial configuration but mostly the needs of the society. However, protecting particular species as independent entities is not enough to ensure effective conservation of ecological communities, since their functioning depends on the interactions between species. We propose a strategy to identify priority areas for protection based on species interaction networks. Similar local networks are grouped according to two different sets of network features: interacting species pairs and overall network structure. These groups or clusters of networks are used to delimitate ecological subregions, which are then compared to current nature reserves. Subregions with a lower proportion of protected area are given higher priority. Results from species pairs and network structure are finally combined to obtain the network protection priority index. We present a case study applying this strategy to the Brazilian Atlantic Forest, using plant-pollinator networks. We found that subregions based on network structure show a more grainy pattern, and approach spatial patterns related to forest formation types, while subregions based on species pairs show more distinct patches and a higher level of detail in the division, especially for interior forests. Highest priority is given to portions of the seasonal semi-deciduous and deciduous forest, especially NE São Paulo, NW Paraná, N Rio Grande do Sul and E Minas Gerais and, secondarily, W São Paulo and the São Francisco region. The approach we suggest here goes beyond the level of species, seeking to perpetuate the ecological interactions and networks that make up biological communities. It is our hope and conviction that this strategy contributes to the development of more effective conservation planning.
  • A systematic comprehensive longitudinal evaluation of dietary factors associated with acute myocardial infarction and fatal coronary heart disease

    Milanlouei, Soodabeh; Menichetti, Giulia; Li, Yanping; Loscalzo, Joseph; Willett, Walter C.; Barabási, Albert-László; Department of Network and Data Science (Springer Nature, 2020)
    Environmental factors, and in particular diet, are known to play a key role in the development of Coronary Heart Disease. Many of these factors were unveiled by detailed nutritional epidemiology studies, focusing on the role of a single nutrient or food at a time. Here, we apply an Environment-Wide Association Study approach to Nurses’ Health Study data to explore comprehensively and agnostically the association of 257 nutrients and 117 foods with coronary heart disease risk (acute myocardial infarction and fatal coronary heart disease). After accounting for multiple testing, we identify 16 food items and 37 nutrients that show statistically significant association – while adjusting for potential confounding and control variables such as physical activity, smoking, calorie intake, and medication use – among which 38 associations were validated in Nurses’ Health Study II. Our implementation of Environment-Wide Association Study successfully reproduces prior knowledge of diet-coronary heart disease associations in the epidemiological literature, and helps us detect new associations that were only marginally studied, opening potential avenues for further extensive experimental validation. We also show that Environment-Wide Association Study allows us to identify a bipartite food-nutrient network, highlighting which foods drive the associations of specific nutrients with coronary heart disease risk.
  • Elites, communities and the limited benefits of mentorship in electronic music

    Janosov, Milán; Musciotto, Federico; Battiston, Federico; Iñiguez, Gerardo; Department of Network and Data Science (Springer Nature, 2020)
    While the emergence of success in creative professions, such as music, has been studied extensively, the link between individual success and collaboration is not yet fully uncovered. Here we aim to fill this gap by analyzing longitudinal data on the co-releasing and mentoring patterns of popular electronic music artists appearing in the annual Top 100 ranking of DJ Magazine. We find that while this ranking list of popularity publishes 100 names, only the top 20 is stable over time, showcasing a lock-in effect on the electronic music elite. Based on the temporal co-release network of top musicians, we extract a diverse community structure characterizing the electronic music industry. These groups of artists are temporally segregated, sequentially formed around leading musicians, and represent changes in musical genres. We show that a major driving force behind the formation of music communities is mentorship: around half of musicians entering the top 100 have been mentored by current leading figures before they entered the list. We also find that mentees are unlikely to break into the top 20, yet have much higher expected best ranks than those who were not mentored. This implies that mentorship helps rising talents, but becoming an all-time star requires more. Our results provide insights into the intertwined roles of success and collaboration in electronic music, highlighting the mechanisms shaping the formation and landscape of artistic elites in electronic music.
  • Bridging the gap between graphs and networks

    Iñiguez, Gerardo; Battiston, Federico; Karsai, Márton; Department of Network and Data Science (Springer Nature, 2020)
    Network science has become a powerful tool to describe the structure and dynamics of real-world complex physical, biological, social, and technological systems. Largely built on empirical observations to tackle heterogeneous, temporal, and adaptive patterns of interactions, its intuitive and flexible nature has contributed to the popularity of the field. With pioneering work on the evolution of random graphs, graph theory is often cited as the mathematical foundation of network science. Despite this narrative, the two research communities are still largely disconnected. In this commentary, we discuss the need for further cross-pollination between fields – bridging the gap between graphs and networks – and how network science can benefit from such influence. A more mathematical network science may clarify the role of randomness in modeling, hint at underlying laws of behavior, and predict yet unobserved complex networked phenomena in nature.
  • Exploring food contents in scientific literature with FoodMine

    Hooton, Forrest; Menichetti, Giulia; Barabási, Albert-László; Department of Network and Data Science (Springer Nature, 2020)
    Thanks to the many chemical and nutritional components it carries, diet critically affects human health. However, the currently available comprehensive databases on food composition cover only a tiny fraction of the total number of chemicals present in our food, focusing on the nutritional components essential for our health. Indeed, thousands of other molecules, many of which have well documented health implications, remain untracked. To explore the body of knowledge available on food composition, we built FoodMine, an algorithm that uses natural language processing to identify papers from PubMed that potentially report on the chemical composition of garlic and cocoa. After extracting from each paper information on the reported quantities of chemicals, we find that the scientific literature carries extensive information on the detailed chemical components of food that is currently not integrated in databases. Finally, we use unsupervised machine learning to create chemical embeddings, finding that the chemicals identified by FoodMine tend to have direct health relevance, reflecting the scientific community’s focus on health-related chemicals in our food.
  • Temporal social network reconstruction using wireless proximity sensors: model selection and consequences

    Dai, Sicheng; Bouchet, Hélène; Nardy, Aurélie; Fleury, Eric; Chevrot, Jean-Pierre; Karsai, Márton; Department of Network and Data Science (Springer, 2020)
    The emerging technologies of wearable wireless devices open entirely new ways to record various aspects of human social interactions in a broad range of settings. Such technologies allow to log the temporal dynamics of face-to-face interactions by detecting the physical proximity of participants. However, despite the wide usage of this technology and the collected datasets, precise reconstruction methods transforming the raw recorded communication data packets to social interactions are still missing. In this study we analyse a proximity dataset collected during a longitudinal social experiment aiming to understand the co-evolution of children’s language development and social network. Physical proximity and verbal communication of hundreds of pre-school children and their teachers are recorded over three years using autonomous wearable low power wireless devices. The dataset is accompanied with three annotated ground truth datasets, which record the time, distance, relative orientation, and interaction state of interacting children for validation purposes. We use this dataset to explore several pipelines of dynamical event reconstruction including earlier applied naïve approaches, methods based on Hidden Markov Model, or on Long Short-Term Memory models, some of them combined with supervised pre-classification of interaction packets. We find that while naïve models propose the worst reconstruction, Long Short-Term Memory models provide the most precise way to reconstruct real interactions up to ∼90% accuracy. Finally, we simulate information spreading on the reconstructed networks obtained by the different methods. Results indicate that small improvement of network reconstruction accuracy may lead to significantly different spreading dynamics, while sometimes large differences in accuracy have no obvious effects on the dynamics. This not only demonstrates the importance of precise network reconstruction but also the careful choice of the reconstruction method in relation with the data collected. Missing this initial step in any study may seriously mislead conclusions made about the emerging properties of the observed network or any dynamical process simulated on it.
  • “Born in Rome” or “Sleeping Beauty”: Emergence of hashtag popularity on the Chinese microblog Sina Weibo

    Cui, Hao; Kertész, János; Department of Network and Data Science (Elsevier, 2023)
    To understand the emergence of hashtag popularity in online social networking complex systems, we study the largest Chinese microblogging site Sina Weibo, which has a Hot Search List (HSL) showing in real time the ranking of the 50 most popular hashtags based on search activity. We investigate the prehistory of successful hashtags from 17 July 2020 to 17 September 2020 by mapping out the related interaction network preceding the selection to HSL. We have found that the circadian activity pattern has an impact on the time needed to get to the HSL. When analyzing this time we distinguish two extreme categories: (a) “Born in Rome”, which means hashtags are mostly first created by superhubs or reach superhubs at an early stage during their propagation and thus gain immediate wide attention from the broad public, and (b) “Sleeping Beauty”, meaning the hashtags gain little attention at the beginning and reach system-wide popularity after a considerable time lag. The evolution of the repost networks of successful hashtags before getting to the HSL shows two types of growth patterns: “smooth” and “stepwise”. The former is usually dominated by a superhub and the latter results from consecutive waves of contributions of smaller hubs. The repost networks of unsuccessful hashtags exhibit a simple evolution pattern.
  • Networks beyond pairwise interactions: Structure and dynamics

    Battiston, Federico; Cencetti, Giulia; Iacopini, Iacopo; Latora, Vito; Lucas, Maxime; Patania, Alice; Young, Jean-Gabriel; Petri, Giovanni; Department of Network and Data Science (Elsevier, 2020)
    The complexity of many biological, social and technological systems stems from the richness of the interactions among their units. Over the past decades, a variety of complex systems has been successfully described as networks whose interacting pairs of nodes are connected by links. Yet, from human communications to chemical reactions and ecological systems, interactions can often occur in groups of three or more nodes and cannot be described simply in terms of dyads. Until recently little attention has been devoted to the higher-order architecture of real complex systems. However, a mounting body of evidence is showing that taking the higher-order structure of these systems into account can enhance our modeling capacities and help us understand and predict their dynamical behavior. Here we present a complete overview of the emerging field of networks beyond pairwise interactions. We discuss how to represent higher-order interactions and introduce the different frameworks used to describe higher-order systems, highlighting the links between the existing concepts and representations. We review the measures designed to characterize the structure of these systems and the models proposed to generate synthetic structures, such as random and growing bipartite graphs, hypergraphs and simplicial complexes. We introduce the rapidly growing research on higher-order dynamical systems and dynamical topology, discussing the relations between higher-order interactions and collective behavior. We focus in particular on new emergent phenomena characterizing dynamical processes, such as diffusion, synchronization, spreading, social dynamics and games, when extended beyond pairwise interactions. We conclude with a summary of empirical applications, and an outlook on current modeling and conceptual frontiers.
  • Automating Terror: The Role and Impact of Telegram Bots in the Islamic State’s Online Ecosystem

    Alrhmoun, Abdullah; Winter, Charlie; Kertész, János; Department of Network and Data Science (Taylor & Francis, 2023)
    In this article, we use network science to explore the topology of the Islamic State’s “terrorist bot” network on the online social media platform Telegram, empirically identifying its connections to the Islamic State supporter-run groups and channels that operate across the platform, with which these bots form bipartite structures. As part of this, we examine the diverse activities of the bots to determine the extent to which they operate in synchrony with one another as well as explore their impacts. We show that these bots are mainly clustered around two communities of Islamic State supporters, or “munasirun,” with one community focusing on facilitating discussion and exchange, and the other one augmenting content distribution efforts. Operating as such, this network of bots is used to lubricate and augment the Islamic State’s influence activities, including facilitating content amplification and community cultivation efforts, and connecting people with the movement based on common behaviors, shared interests, and/or ideological proximity while minimizing risk for the broader organization.
  • Stability of Imbalanced Triangles in Gene Regulatory Networks of Cancerous and Normal Cells

    Rizi, Abbas Karimi; Zamani, Mina; Shirazi, Amirhossein; Jafari, G. Reza; Kertész, János; Department of Network and Data Science (Frontiers, 2021)
    Genes communicate with each other through different regulatory effects, which lead to the emergence of complex network structures in cells, and such structures are expected to be different for normal and cancerous cells. To study these differences, we have investigated the Gene Regulatory Network (GRN) of cells as inferred from RNA-sequencing data. The GRN is a signed weighted network corresponding to the inductive or inhibitory interactions. Here we focus on a particular type of motif in the GRN, the triangles, which are imbalanced if the number of negative interactions is odd. By studying the stability of imbalanced triangles in the GRN, we show that the network of cancerous cells has fewer imbalanced triangles compared to normal cells. Moreover, in the normal cells, imbalanced triangles are isolated from the main part of the network, while such motifs are part of the network's giant component in cancerous cells. Our result demonstrates that due to genes' collective behavior the structure of the complex networks is different in cancerous cells from those in normal ones.
  • Revealing Consensus and Dissensus between Network Partitions

    Peixoto, Tiago P.; Department of Network and Data Science (American Physical Society, 2021)
    Community detection methods attempt to divide a network into groups of nodes that share similar properties, thus revealing its large-scale structure. A major challenge when employing such methods is that they are often degenerate, typically yielding a complex landscape of competing answers. As an attempt to extract understanding from a population of alternative solutions, many methods exist to establish a consensus among them in the form of a single partition “point estimate” that summarizes the whole distribution. Here, we show that it is, in general, not possible to obtain a consistent answer from such point estimates when the underlying distribution is too heterogeneous. As an alternative, we provide a comprehensive set of methods designed to characterize and summarize complex populations of partitions in a manner that captures not only the existing consensus but also the dissensus between elements of the population. Our approach is able to model mixed populations of partitions, where multiple consensuses can coexist, representing different competing hypotheses for the network structure. We also show how our methods can be used to compare pairs of partitions, how they can be generalized to hierarchical divisions, and how they can be used to perform statistical model selection between competing hypotheses.
  • The anatomy of social dynamics in escape rooms

    O. Szabo, Rebeka; Department of Network and Data Science (Springer Nature, 2022)
    From sport and science production to everyday life, higher-level pursuits demand collaboration. Despite an increase in the number of data-driven studies on human behavior, the social dynamics of collaborative problem solving are still largely unexplored with network science and other computational and quantitative tools. Here we introduce escape rooms as a non-interventional and minimally biased social laboratory, which allows us to capture at a high resolution real-time communications in small project teams. Our analysis portrays a nuanced picture of different dimensions of social dynamics. We reveal how socio-demographic characteristics impact problem solving and the importance of prior relationships for enhanced interactions. We extract key conversation rules from motif analysis and discuss turn-usurping gendered behavior, a phenomenon particularly strong in male-dominated teams. We investigate the temporal evolution of signed and group interactions, finding that a minimum level of tense communication might be beneficial for collective problem solving, and revealing differences in the behavior of successful and failed teams. Our work unveils the innovative potential of escape rooms to study teams in their complexity, contributing to a deeper understanding of the micro-dynamics of collaborative team processes.