This document summarizes research on social media and Twitter. It analyzes data from over 41 million Twitter user profiles and 1.47 billion social connections to study the topological characteristics of Twitter's social network and how information spreads through retweets. Some key findings include:
- Twitter's follower network shows a non-power-law follower distribution, a short effective diameter, and low reciprocity compared to other social networks.
- Rankings by number of followers and by PageRank are similar, but ranking by number of retweets differs from both, indicating a gap between influence inferred from followers and influence inferred from retweet popularity.
- Any retweeted tweet reaches an average of 1,000 users, regardless of how many followers the original tweet's author has.
Social Media Part 2 (KAIST)
1. Social Networking and Social Media
May 2010
Sangki Han, Ph.D.
Professor / GSCT
KAIST
2. Social Media
• Media designed to be disseminated through
social interaction, created using highly
accessible and scalable publishing techniques
- Wikipedia, Nov. 22nd 2009
3. Basic Form of Social Media
Social Network: Facebook, MySpace, CyWorld, Bebo, Mixi, ...
Blogs: Naver, Tistory, Egloos, ...
Blog Aggregators: Technorati, AllBlog, Blog Korea, ...
Wikis: Wikipedia
Podcasts
Forum: Agora, Slashdot, ...
Content Communities: Flickr (Photo), YouTube (Video),
Del.icio.us (Bookmark), Digg (News)
Microblogging: Twitter, me2day
4. How Big a Deal Is Social Media
• Technorati is tracking 112.8 million blogs and over 250 million
pieces of tagged social media
• 100,000,000 views a day on YouTube (Oct. 2009)
• More than 300 million profiles created by users on social
network MySpace and Facebook
• 13,000,000 articles on Wikipedia
• 3,600,000,000 photos on Flickr.com as of June 2009
• 3,000,000 Tweets per day
• 1,000,000,000 pieces of content shared each week on Facebook
7. Why Is Social Media So Important?
From: Marta Kagan, Managing Director of Espresso, US
• 3 out of 4 Americans use social technology
- Forrester, The Growth of Social Technology Adoption, 2008
• ⅔ of the global internet population visit social networks
• Visiting social sites is now the 4th most popular online activity
• Time spent on social networks is growing at 3X the overall Internet rate, accounting for ~10% of all Internet time
• Social media is democratizing communications
- “Technology is shifting the power away from the editors, the publishers, the establishment, the media elite. Now it’s the people who are in control.” -- Rupert Murdoch
[Image: cover of "Global Faces and Networked Places: A Nielsen report on Social Networking’s New Global Footprint", March 2009]
INSIDE:
- Social networks/blogs now 4th most popular online category – ahead of personal e-mail
- These sites account for one in every 11 minutes online
- Orkut in Brazil has the largest domestic online reach (70%) of any social network anywhere in the world
- Facebook has the highest average time per visitor amongst the 75 most popular brands online worldwide
9. Expenditure and Revenue Source
[Chart: Media share of US advertising, 1959-2009]
[Chart: Change in ad revenue by medium, 2008 to 2009]
Source: Martin Langeveld at Nieman Journalism Lab; data from NAA, TVB, IAB, McCann
SPRING 2010 GCT784
11. Changes of News Access
Trends in news access
Source: Pew Research
12. Newspaper Economics
Revenue and costs

Revenue (%)                  Costs as % of revenue
Advertising        80%       Core               35%
  Retail           40%         Promotion        12%
  Classified       32%         Editorial        14%
  National          8%         Administrative    9%
Sales              20%       Prodn & Distn      52%
  Newsstand        17%         Production       20%
  Subscription      3%         Distribution     14%
Total             100%         Raw materials    18%
                             Total              87%

Internet distribution
could cut production
costs by at least half.
Source: Vogel, H, Entertainment Industry Economics, 7th edition, page 343
[Source: Hal Varian, March 9, 2010
(revised March 13, 2010)]
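
The claim that Internet distribution could cut production costs by at least half can be checked against the cost table with simple arithmetic. The sketch below is illustrative only. It assumes that "production costs" means the whole "Prodn & Distn" group (the slide does not say so explicitly), that "at least half" is taken as exactly half, and that revenue stays unchanged. The variable names are made up for this example.

```python
# Cost shares from the slide, as a percentage of revenue.
core = {"Promotion": 12, "Editorial": 14, "Administrative": 9}
prodn_distn = {"Production": 20, "Distribution": 14, "Raw materials": 18}

core_total = sum(core.values())           # matches the "Core" row
prodn_total = sum(prodn_distn.values())   # matches the "Prodn & Distn" row
total_costs = core_total + prodn_total    # matches the "Total" row

# ASSUMPTION: "production costs" = the whole Prodn & Distn group,
# and the saving is exactly one half of it.
saving = prodn_total / 2
new_total_costs = total_costs - saving

print(f"Costs today:        {total_costs}% of revenue")
print(f"Costs after saving: {new_total_costs:.0f}% of revenue")
print(f"Margin: {100 - total_costs}% -> {100 - new_total_costs:.0f}%")
```

Under that reading, halving production and distribution costs would lower total costs from 87% to 61% of revenue.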
13. “In media, we are moving from a
content economy to a link
economy”
-- BuzzMachine blog by Jeff Jarvis, journalist and associate
professor at CUNY’s Graduate School of Journalism
15. New Journalism
“the impact of social media was overestimated
in the short term and underestimated in long
term” - Richard Sambrook, the director of the
BBC Global News Division
18. Everyone is Author
• Denis G. Pelli, a professor of psychology and neural science at New York University and
co-inventor of the Pelli-Robson contrast sensitivity chart. Charles Bigelow, the Carey
Distinguished Professor of Graphic Arts at the Rochester Institute of Technology, a
MacArthur Foundation prize fellow, and co-designer of the widely used Lucida font.
20. Twitter User Statistics
• Twitter now has 105,779,710 registered users.
• New users are signing up at the rate of 300,000 per day.
• 180 million unique visitors come to the site every month.
• 75% of Twitter traffic comes from outside Twitter.com (i.e. via third
party applications.)
• Twitter gets a total of 3 billion requests a day via its API.
• Twitter users are, in total, tweeting an average of 55 million tweets a
day.
• Twitter's search engine receives around 600 million search queries
per day.
• Of Twitter's active users, 37 percent use their phone to tweet.
• Over half of all tweets (60 percent) come from third party
applications.
• Twitter itself has grown: in the past year alone, it has grown from 25
to 175 employees.
28. Twitter Data Analysis
• Twitter's user growth is no longer accelerating. The rate of new user acquisition has plateaued at
around 8 million per month.
• Over 14% of users don't have a single follower, and over 75% of users have 10 or fewer followers.
• 38% of users have never sent a single tweet, and over 75% of users have sent fewer than 10 tweets.
• 1 in 4 registered users tweets in any given month.
• Once a user has tweeted once, there is a 65% chance that they will tweet again. After that second
tweet, however, the chance of a third tweet goes up to 81%.
• If someone is still tweeting in their second week as a user, it is extremely likely that they will
remain on Twitter as a long-term user.
• Users who joined in more recent months are less likely to stop using the service and more likely to
tweet more often than users from the past.
Robert J. Moore, the CEO and co-founder of RJMetrics
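
The retention figures above can be chained to estimate how many registered users ever reach a third tweet. This is a rough sketch, not part of the RJMetrics analysis: it assumes the 65% and 81% figures are conditional probabilities that apply to the same population as the 38% who never tweeted, which the slide does not confirm.

```python
# Figures quoted on the slide (RJMetrics).
p_never_tweeted = 0.38        # share of users with zero tweets
p_second_given_first = 0.65   # chance of tweeting again after one tweet
p_third_given_second = 0.81   # chance of a third tweet after the second

# ASSUMPTION: the figures can be multiplied as conditional probabilities.
p_at_least_one = 1 - p_never_tweeted
p_at_least_two = p_at_least_one * p_second_given_first
p_at_least_three = p_at_least_two * p_third_given_second

for label, p in [("1+ tweets", p_at_least_one),
                 ("2+ tweets", p_at_least_two),
                 ("3+ tweets", p_at_least_three)]:
    print(f"{label}: {p:.1%}")
```

If the assumption holds, roughly a third of registered users would have sent three or more tweets.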
30. Twitter CEO on the Future of Twitter
• User-Generated Lists on a
Particular Subject
• Geographical Location Datelines
• Reputation System
• Searchability and Organization of
Tweets
31. The Future of Twitter’s Platform at Chirp
• Twitter-Built BlackBerry App
• Acquisition of Tweetie
• Launch of Twitter’s ad platform
• Places: developers will be able to attach location-
based metadata and use it to enhance their
products.
• User Streams: will make Twitter apps real-time.
• Annotations: allows developers to attach little
pieces of metadata to tweets
32. Social Web Technology in 2010
• Rise of Twitter and realtime
messaging and search
• Realtime feed aggregation
technology
- ActivityStreams
- PubSubHubbub - realtime
notification
- Salmon - comments and responses
on syndicated feed content
33. Researches on Twitter
What is Twitter, a Social Network or a News Media?
Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon
Department of Computer Science, KAIST
335 Gwahangno, Yuseong-gu, Daejeon, Korea
{haewoon, chlee, hosung}@an.kaist.ac.kr, sbmoon@kaist.edu

ABSTRACT
Twitter, a microblogging service less than three years old, commands more than 41 million users as of July 2009 and is growing fast. Twitter users tweet about any topic within the 140-character limit and follow others to receive their tweets. The goal of this paper is to study the topological characteristics of Twitter and its power as a new medium of information sharing.
We have crawled the entire Twitter site and obtained 41.7 million user profiles, 1.47 billion social relations, 4,262 trending topics, and 106 million tweets. In its follower-following topology analysis we have found a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks [28]. In order to identify influentials on Twitter, we have ranked users by the number of followers and by PageRank and found two rankings to be similar. Ranking by retweets differs from the previous two rankings, indicating a gap in influence inferred from the number of followers and that from the popularity of one’s tweets. We have analyzed the tweets of top trending topics and reported on their temporal behavior and user participation. We have classified the trending topics based on the active period and the tweets and show that the majority (over 85%) of topics are headline news or persistent news in nature. A closer look at retweets reveals that any retweeted tweet is to reach an average of 1,000 users no matter what the number of followers is of the original tweet. Once retweeted, a tweet gets retweeted almost instantly on next hops, signifying fast diffusion of information after the 1st retweet.
To the best of our knowledge this work is the first quantitative study on the entire Twittersphere and information diffusion on it.

Categories and Subject Descriptors
J.4 [Computer Applications]: Social and behavioral sciences

General Terms
Human Factors, Measurement

Keywords
Twitter, Online social network, Reciprocity, Homophily, Degree of separation, Retweet, Information diffusion, Influential, PageRank

1. INTRODUCTION
Twitter, a microblogging service, has emerged as a new medium in spotlight through recent happenings, such as an American student jailed in Egypt and the US Airways plane crash on the Hudson river. Twitter users follow others or are followed. Unlike on most online social networking sites, such as Facebook or MySpace, the relationship of following and being followed requires no reciprocation. A user can follow any other user, and the user being followed need not follow back. Being a follower on Twitter means that the user receives all the messages (called tweets) from those the user follows. Common practice of responding to a tweet has evolved into well-defined markup culture: RT stands for retweet, '@' followed by a user identifier address the user, and '#' followed by a word represents a hashtag. This well-defined markup vocabulary combined with a strict limit of 140 characters per posting conveniences users with brevity in expression. The retweet mechanism empowers users to spread information of their choice beyond the reach of the original tweet’s followers.
How are people connected on Twitter? Who are the most influential people? What do people talk about? How does information diffuse via retweet? The goal of this work is to study the topological characteristics of Twitter and its power as a new medium of information sharing. We have crawled 41.7 million user profiles, 1.47 billion social relations, and 106 million tweets [footnote 1]. We begin with the network analysis and study the distributions of followers and followings, the relation between followers and tweets, reciprocity, degrees of separation, and homophily. Next we rank users by the number of followers, PageRank, and the number of retweets and present quantitative comparison among them. The ranking by retweets pushes those with fewer than a million followers on top of those with more than a million followers. Through our trending topic analysis we show what categories trending topics are classified into, how long they last, and how many users participate. Finally, we study the information diffusion by retweet. We construct retweet trees and examine their temporal and spatial characteristics. To the best of our knowledge this work is the first quantitative study on the entire Twittersphere and information diffusion on it.
This paper is organized as follows. Section 2 describes our data crawling methodology on Twitter’s user profile, trending topics, and tweet messages. We conduct basic topological analysis of the Twitter network in Section 3. In Section 4 we apply the PageRank algorithm on the Twitter network and compare its outcome against ranking by retweets. In Section 5 we study how their popularity rises and falls among users over time. In Section 6 we focus information diffusion through retweet trees. Section 7 covers related work and puts our work in perspective. In Section 8 we conclude.

Footnote 1: We make our dataset publicly available online at: http://an.kaist.ac.kr/traces/WWW2010.html

Copyright is held by the International World Wide Web Conference Committee (IW3C2). Distribution of these papers is limited to classroom use, and personal use by others.
WWW 2010, April 26–30, 2010, Raleigh, North Carolina, USA.
ACM 978-1-60558-799-8/10/04.

Measuring User Influence in Twitter: The Million Follower Fallacy
Meeyoung Cha∗, Hamed Haddadi†, Fabrício Benevenuto‡, Krishna P. Gummadi∗
∗ Max Planck Institute for Software Systems (MPI-SWS), Germany
† Royal Veterinary College, University of London, United Kingdom
‡ CS Dept., Federal University of Minas Gerais (UFMG), Brazil

Abstract
Directed links in social media could represent anything from intimate friendships to common interests, or even a passion for breaking news or celebrity gossip. Such directed links determine the flow of information and hence indicate a user’s influence on others, a concept that is crucial in sociology and viral marketing. In this paper, using a large amount of data collected from Twitter, we present an in-depth comparison of three measures of influence: indegree, retweets, and mentions. Based on these measures, we investigate the dynamics of user influence across topics and time. We make several interesting observations. First, popular users who have high indegree are not necessarily influential in terms of spawning retweets or mentions. Second, most influential users can hold significant influence over a variety of topics. Third, influence is not gained spontaneously or accidentally, but through concerted effort such as limiting tweets to a single topic. We believe that these findings provide new insights for viral marketing and suggest that topological measures such as indegree alone reveals very little about the influence of a user.

Introduction
Influence has long been studied in the fields of sociology, communication, marketing, and political science (Rogers 1962; Katz and Lazarsfeld 1955). The notion of influence plays a vital role in how businesses operate and how a society functions; for instance, see observations on how fashion spreads (Gladwell 2002) and how people vote (Berry and Keller 2003). Studying influence patterns can help us better understand why certain trends or innovations are adopted faster than others and how we could help advertisers and marketers design more effective campaigns. Studying influence patterns, however, has been difficult. This is because such a study does not lend itself to readily available quantification, and essential components like human choices and the ways our societies function cannot be reproduced within the confines of the lab.
Nevertheless, there have been important theoretical studies on the diffusion of influence, albeit with radically different results. Traditional communication theory states that a minority of users, called influentials, excel in persuading others (Rogers 1962). This theory predicts that by targeting these influentials in the network, one may achieve a large-scale chain-reaction of influence driven by word-of-mouth, with a very small marketing cost (Katz and Lazarsfeld 1955). A more modern view, in contrast, de-emphasizes the role of influentials. Instead, it posits that the key factors determining influence are (i) the interpersonal relationship among ordinary users and (ii) the readiness of a society to adopt an innovation (Watts and Dodds 2007; Domingos and Richardson 2001). This modern view of influence leads to marketing strategies such as collaborative filtering. These theories, however, are still just theories, because there has been a lack of empirical data that could be used to validate either of them. The recent advent of social networking sites and the data within such sites now allow researchers to empirically validate these theories.
Moving from theory into practice, we find that there are many other unanswered questions about how influence diffuses through a population and whether it varies across topics and time. People have different levels of expertise on various subjects. When it comes to marketing, however, this fact is generally ignored. Marketing services actively search for potential influencers to promote various items. These influencers range from “cool” teenagers, local opinion leaders, all the way to popular public figures. However, the advertised items are often far outside the domain of expertise of these hired individuals. So how effective are these marketing strategies? Can a person’s influence in one area be transferred to other areas?
In this paper, we present an empirical analysis of influence patterns in a popular social medium. Using a large amount of data gathered from Twitter, we compare three different measures of influence: indegree, retweets, and mentions [footnote 1]. Focusing on different topics, we examine how the three types of influential users performed in spreading popular news topics. We also investigate the dynamics of an individual’s influence by topic and over time. Finally, we characterize the precise behaviors that make ordinary individuals gain high influence over a short period of time.

Footnote 1: Indegree is the number of people who follow a user; retweets mean the number of times others “forward” a user’s tweet; and mentions mean the number of times others mention a user’s name.

Copyright © 2010, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
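
Both papers compare ways of ranking users: by follower count (indegree), by PageRank, and by retweets. The sketch below illustrates that comparison on a toy graph. It is not the authors' code: the user names and retweet counts are hypothetical, the damping factor of 0.85 is just the conventional default, and the PageRank routine is a plain power iteration written for this example.

```python
def pagerank(follows, damping=0.85, iterations=50):
    """Power-iteration PageRank on a follower graph.

    follows maps each user to the list of users they follow, so an edge
    u -> v means "u follows v" and rank flows from follower to followee.
    """
    users = set(follows) | {v for vs in follows.values() for v in vs}
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in users}
        for u in users:
            targets = follows.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                # Dangling user (follows nobody): spread rank evenly.
                for v in users:
                    new_rank[v] += damping * rank[u] / n
        rank = new_rank
    return rank


# Hypothetical toy data, not taken from either paper.
follows = {
    "ann": ["star"],
    "bob": ["star"],
    "cat": ["star", "news"],
    "dan": ["star", "news"],
    "star": [],
    "news": [],
}
retweets = {"ann": 1, "bob": 0, "cat": 40, "dan": 3, "star": 5, "news": 12}

# Indegree: count how many users follow each account.
followers = {u: 0 for u in follows}
for u, targets in follows.items():
    for v in targets:
        followers[v] += 1

ranks = pagerank(follows)
by_followers = sorted(followers, key=followers.get, reverse=True)
by_pagerank = sorted(ranks, key=ranks.get, reverse=True)
by_retweets = sorted(retweets, key=retweets.get, reverse=True)

print("by followers:", by_followers)
print("by PageRank: ", by_pagerank)
print("by retweets: ", by_retweets)
```

In this toy graph the follower and PageRank rankings agree on the top two accounts, while the retweet ranking is led by a user with no followers at all. The example only mirrors the shape of the gap described in the abstracts and says nothing about real Twitter data.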
35. Cultural Difference?
• Top 14% of users account for 80% of total tweets
- 20% of tweets from Top 1% users
• Koreans use more @ and RT than international users
* Data for international users is from: danah boyd, Scott Golder, and Gilad Lotan (Forthcoming, 2010). "Tweet Tweet Retweet: Conversational Aspects of Retweeting on Twitter." Proceedings of HICSS-42, Persistent Conversation Track. Kauai, HI: IEEE Computer Society. January 5-8, 2010.
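
Comparing @ and RT usage between groups of users starts with counting the markup in each tweet. A minimal sketch of such a count is shown below. The patterns, function name, and sample tweets are assumptions made for illustration, not the method behind the slide's figures; for instance, the "@" pattern would also match the domain part of an e-mail address.

```python
import re

# Simple patterns for the Twitter markup vocabulary: RT, @user, #hashtag.
RT_PATTERN = re.compile(r"\bRT\b")
MENTION_PATTERN = re.compile(r"@\w+")
HASHTAG_PATTERN = re.compile(r"#\w+")


def markup_counts(tweets):
    """Count RT markers, @mentions and #hashtags over a list of tweets."""
    counts = {"RT": 0, "@": 0, "#": 0}
    for text in tweets:
        counts["RT"] += len(RT_PATTERN.findall(text))
        counts["@"] += len(MENTION_PATTERN.findall(text))
        counts["#"] += len(HASHTAG_PATTERN.findall(text))
    return counts


# Hypothetical sample tweets.
sample = [
    "RT @alice: slides from today's lecture #kaist",
    "@bob thanks for the link!",
    "Reading about social media",
]
print(markup_counts(sample))
```

Dividing each count by the number of tweets in a group would give a per-tweet rate that can be compared across groups.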