Information diffusion in online social networks is affected by the underlying network topology, but it also has the power to change it. Online users constantly create new links when exposed to new information sources, and these links in turn alter the way information spreads. However, these two highly intertwined stochastic processes, information diffusion and network evolution, have been predominantly studied separately, ignoring their co-evolutionary dynamics. We propose a temporal point process model, COEVOLVE, for such joint dynamics, allowing the intensity of one process to be modulated by that of the other. This model allows us to efficiently simulate interleaved diffusion and network events, and to generate traces obeying common diffusion and network patterns observed in real-world networks. Furthermore, we develop a convex optimization framework to learn the parameters of the model from historical diffusion and network evolution traces. We experimented with...
When a piece of malicious information becomes rampant in an information diffusion network, can we identify the source node that originally introduced it into the network and infer the time at which it did so? Being able to do so is critical for curtailing the spread of malicious information and reducing the potential losses incurred. This is a very challenging problem, since typically only incomplete traces are observed and we need to unroll the incomplete traces into the past in order to pinpoint the source. In this paper, we tackle this problem by developing a two-stage framework, which first learns a continuous-time diffusion network model based on historical diffusion traces and then identifies the source of an incomplete diffusion trace by maximizing the likelihood of the trace under the learned model. Experiments on both large synthetic and real-world data show that our framework can effectively “go back to the past” and pinpoint the source node and its initiation time significantly more accurately than previous state-of-the-art methods.
The real social network and its associated communities are often hidden beneath the declared friend or group lists in social networks. We usually observe the manifestation of these hidden networks and communities in the form of individuals’ recurrent, time-stamped activities in the social network. Inferring the underlying network and finding coherent communities are therefore two key challenges in social network analysis. In this paper, we address the following question: Could we simultaneously detect community structure and network infectivity among individuals from their activities? Because the two characteristics are intertwined, and knowing one helps reveal the other, we propose a multidimensional Hawkes process that can address them simultaneously. To this end, we parametrize the network infectivity in terms of individuals’ participation in communities and the popularity of each individual. We show that this modeling approach has many benefits, both conceptually and experimentally. We use Bayesian variational inference to design NetCodec, an efficient inference algorithm, which is verified on both synthetic and real-world data sets. The experiments show that NetCodec can discover the underlying network infectivity and community structure more accurately than the baseline method.
Events in an online social network can be categorized roughly into endogenous events, where users just respond to the actions of their neighbors within the network, and exogenous events, where users take actions due to drives external to the network. How much external drive should be provided to each user so that the network activity can be steered towards a target state? In this paper, we model social events using multivariate Hawkes processes, which can capture both endogenous and exogenous event intensities, and derive a time-dependent linear relation between the intensity of exogenous events and the overall network activity. Exploiting this connection, we develop a convex optimization framework for determining the level of external drive required for the network to reach a desired activity level. In experiments with event data gathered from Twitter, we show that our method can steer the activity of the network more accurately than alternatives.
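As a rough illustration of the endogenous/exogenous split described in this abstract, the sketch below evaluates the intensity of a multivariate Hawkes process as an exogenous base rate plus an endogenous term excited by past events. It is a minimal sketch under stated assumptions, not the paper's implementation: the exponential decay kernel, the function name `hawkes_intensity`, and the parameters `mu`, `alpha`, and `omega` are all illustrative choices that do not appear in the abstract.

```python
import math


def hawkes_intensity(t, mu, alpha, events, omega=1.0):
    """Return the intensity of every user at time t.

    Assumed (hypothetical) parametrization:
      mu[i]        exogenous base rate of user i
      alpha[i][j]  influence of user j's past events on user i
      events       list of (time, user) pairs observed so far
      omega        decay rate of the assumed exponential kernel
    """
    n = len(mu)
    lam = list(mu)  # exogenous part: external drive on each user
    for t_k, j in events:
        if t_k < t:  # only events strictly before t contribute
            decay = math.exp(-omega * (t - t_k))
            for i in range(n):
                # endogenous part: user j's event excites user i
                lam[i] += alpha[i][j] * decay
    return lam


if __name__ == "__main__":
    mu = [0.5, 0.2]
    alpha = [[0.0, 1.0], [0.0, 0.0]]  # user 1 influences user 0
    events = [(1.0, 1)]               # user 1 acted at time 1.0
    print(hawkes_intensity(2.0, mu, alpha, events))
```

In this form, raising `mu[i]` corresponds to supplying more external drive to user `i`, while `alpha` carries the network-driven response.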
We consider the problem of determining the structural differences between different types of social networks and using these differences in applications that predict their structures. Much research on this problem has been conducted in the context of social media such as Facebook and Twitter, within which one would like to characterize and classify different types of individuals such as leaders, followers, and influencers. However, we consider the problem in the context of information gathered from law-enforcement agencies, financial institutions, and similar organizations, within which one would like to characterize and classify different types of persons of interest. The members of these networks tend to form special communities, and thus new techniques are required. We propose a new generative model for unweighted networks, called Cliqster, and we describe an interpretable and efficient algorithm for representing networks within this model. Our representation preserves the important underlying characteristics of the network and is both concise and discriminative. We demonstrate the discriminative power of our method by comparing it to a traditional SVD method as well as a state-of-the-art Graphlet algorithm. Our results are general in that they can be applied to “person of interest” networks as well as traditional social media networks.
Papers by Mehrdad Farajtabar