In an effort to better understand meaning from natural language texts, we explore methods aimed at organizing lexical objects into contexts. A number of these methods for organization fall into a family defined by word ordering. Unlike demographic or spatial partitions of data, these collocation models are of special importance for their universal applicability in the presence of ordered symbolic data (e.g., text, speech, genes). Our approach focuses on the phrase (whether word or larger) as the primary meaning-bearing lexical unit and object of study. To do so, we employ our previously developed framework for generating word-conserving phrase-frequency data. Upon training our model with the Wiktionary (an extensive, online, collaborative, and open-source dictionary that contains over 100,000 phrasal definitions), we develop highly effective filters for the identification of meaningful, missing phrase entries. With our predictions we then engage the editorial community of the Wiktionary and propose short lists of potential missing entries for definition, developing a breakthrough lexical extraction technique and expanding our knowledge of the defined English lexicon of phrases.
Social media, such as blogs, are often seen as democratic entities that allow more voices to be heard than the conventional mainstream media as well as a balancing force against the arguably slanted elite media. A systematic comparison between social and mainstream media is necessary but challenging due to the scale and dynamic nature of modern communication. We propose empirical
Despite recent advances in uncovering the quantitative features of stationary human activity patterns, many applications, from pandemic prediction to emergency response, require an understanding of how these patterns change when the population encounters unfamiliar conditions. To explore societal response to external perturbations, we identified real-time changes in communication and mobility patterns in the vicinity of eight emergencies, such as bomb attacks and earthquakes, comparing these with eight non-emergencies, like concerts and sporting events. We find that communication spikes accompanying emergencies are both spatially and temporally localized, but information about emergencies spreads globally, resulting in communication avalanches that significantly engage the social network of eyewitnesses. These results offer a quantitative view of behavioral changes in human activity under extreme conditions, with potential long-term impact on emergency detection and response.
The individual movements of large numbers of people are important in many contexts, from urban planning to disease spreading. Datasets that capture human mobility are now available and many interesting features have been discovered, including the ultra-slow spatial growth of individual mobility. However, the detailed substructures and spatiotemporal flows of mobility--the sets and sequences of visited locations--have not been well studied. We show that individual mobility is dominated by small groups of frequently visited, dynamically close locations, forming primary "habitats" capturing typical daily activity, along with subsidiary habitats representing additional travel. These habitats do not correspond to typical contexts such as home or work. The temporal evolution of mobility within habitats, which constitutes most motion, is universal across habitats and exhibits scaling patterns both distinct from all previous observations and unpredicted by current models. The delay to enter subsidiary habitats is a primary factor in the spatiotemporal growth of human travel. Interestingly, habitats correlate with non-mobility dynamics such as communication activity, implying that habitats may influence processes such as information spreading and revealing new connections between human mobility and social networks.
One of the key challenges in modeling the dynamics of contagion phenomena is to understand how the structure of social interactions shapes the time course of a disease. Complex network theory has provided significant advances in this context. However, awareness of an epidemic in a population typically yields behavioral changes that correspond to changes in the network structure on which the disease evolves. This feedback mechanism has not been investigated in depth. For example, one would intuitively expect susceptible individuals to avoid infected individuals. However, doctors treating patients or parents tending sick children may instead increase their contact with infected individuals in an effort to speed up recovery, thereby exposing themselves to higher risks of infection. We study the role of these caretaker links in an adaptive network model where individuals react to a disease by increasing or decreasing the amount of contact they make with infected individuals. We find that, for both homogeneous networks and networks possessing large topological variability, disease prevalence is decreased for low concentrations of caretakers, whereas a high prevalence emerges if caretaker concentration passes a well-defined critical value.
Until recently, little quantitative data regarding collective human behavior during dangerous events such as bombings and riots have been available, despite their importance for emergency management, safety and urban planning. Understanding how populations react to danger is critical for prediction, detection and intervention strategies. Using a large telecommunications dataset, we study for the first time the spatiotemporal, social and demographic
Identifying modular network structure is generally a problem of finding the correct community membership of each node in a network. An alternative approach, clustering links, naturally accounts for real-world characteristics such as strong community overlap, multi-partite structure, and hierarchical organization. By introducing a pairwise link similarity, we use a hierarchical clustering method to identify relevant communities in real-world examples
Many systems, from power grids and the internet to the brain and society, can be modeled using networks of coupled overlapping modules. The elements of these networks perform individual and collective tasks such as generating and consuming electrical load or transmitting data. We study the robustness of these systems using percolation theory: a random fraction of the elements fail which
Papers by James Bagrow