Mining the Social Web to Analyze the Impact of Social Media on Socialization

Md. Nazmus Sadat, Shibbir Ahmed, and Muhammad Tasnim Mohiuddin
Department of Computer Science and Engineering
Bangladesh University of Engineering and Technology
Dhaka, Bangladesh
Email: {nazmus.cse, shibbirahmedtanvin, tasnimcse08}@gmail.com

Abstract: Socialization refers to the lifelong process of inheriting and disseminating norms, customs and ideologies, providing an individual with the skills and habits necessary for participating within his or her own society. The process of socialization is operative not only in childhood but throughout life. Traditional agents of socialization include family, age mates, teachers and mass media. Nowadays, in a world dominated by technology, social media also play a vital role in socialization. Social media have drastically changed the Internet and public relations landscape for everyone with Internet access and a desire to friend, tweet and update a status. Billions of people are now creating and sharing massive amounts of data with one another. By removing barriers to producing information and interacting with other people, social media have started the Social Data Revolution. These data, too, contain in capsule form the premises of our culture, its attitudes and ideologies. The words are always written by someone, and these people join the teachers, the peers and the parents in the socialization process. In this paper, we investigate the role of social media in the socialization process by mining the social web.

Keywords: social media; socialization; Twitter; Facebook; Hashtag.

I. INTRODUCTION
Man is not only social but also cultural. It is culture that provides opportunities for man to develop a personality. Development of personality is not an automatic process. Every society prescribes its own ways and means of giving social training to its newborn members so that they may develop their own personality. This social training is called 'socialization'.

Traditional agents of socialization include family, age mates, teachers and mass media. In addition to these, social media also play a vital role in socialization. Billions of people are now creating and sharing massive amounts of data with one another. These data, too, contain in capsule form the premises of our culture, its attitudes and ideologies.

Even in developing countries, the use of social media is noteworthy. According to MIT Technology Review, Facebook is gaining many developing-world users [4]. For example, consider the Facebook statistics of some South Asian countries. In Bangladesh, 41.63% of Internet users use Facebook. For India, Pakistan, Nepal and Bhutan the figures are 45.78%, 27.41%, 72.15% and 54.49% respectively [5].

II. RELATED WORK
Barkhuus and Tashiro [1], in their paper entitled "Student Socialization in the Age of Facebook", focused on offline socializing structures around an online social network (exemplified by Facebook) and how this can facilitate in-person social life for students. Because students lead nomadic lives, they find Facebook a particularly useful tool for initiating and managing social gatherings, and as they adopt mobile technologies that can access online social networks, their ad-hoc social life is further enabled.

Kert [2], in his paper entitled "Online Social Network Sites for K-12 Students: Socialization or Loneliness", surveyed the Social Network Site (SNS) usage preferences of 170 high school students. He analyzed a few relations between these preferences and the students' loneliness level, which was measured by means of the UCLA loneliness scale developed by Russell, Peplau and Cutrona (1980). The results revealed a negative correlation between the number of a user's friends on SNS and the UCLA loneliness scale points.

Little research has been done on Facebook's growth in developing countries (and a lot would be needed to capture even some of the diversity included under the blanket term "developing world"). Two small, recent studies [3], [6] of Kenyan Facebook users in poor areas by Susan Wyche of Michigan State University are among the first to be published, and they provide some interesting insights.

III. SOCIAL MEDIA AND SOCIALIZATION
Social media appear to play a crucial role in re-socialization and anticipatory socialization.

A. Re-socialization
Re-socialization means the stripping away of learned patterns and the substitution of new ones for them. Such re-socialization takes place mostly when a social role is radically changed. It may also happen in periods of rapid social mobility.

1) Can social media help to re-socialize drug addicts?:
'Drug addicts on Facebook' is an experiment of ikgebruik.nl, an independent organization (no government funding) which dedicates itself to providing proper information for drug addicts [9]. In this experiment they want to find out if
social media can help to re-socialize drug addicts. In the long term it is also an experiment to find out how 'social' social media are.

Dutch branding agency Lemz is helping drug addicts such as Monica set up their own Facebook profile pages. The idea is that by making friends online, they will be able to gain self-confidence and new interests that will help inspire them to quit addiction. Monica, a 37-year-old Dutch woman, is addicted to heroin and cocaine. Now, with help from the Dutch charity site ikgebruik.nl (and branding agency Lemz), Monica is taking part in a social media experiment. She has set up a Facebook profile and is asking people to friend her, so that she can expand her social network beyond fellow drug users. By making friends online, the thinking goes, she might be able to gain self-confidence and find new interests that will help her quit addiction. In other words, social media could help re-socialize her.

B. Anticipatory socialization
Men not only learn the culture of the group of which they are immediate members. They may also learn the culture of groups to which they do not belong. Such a process, whereby men socialize themselves into the culture of a group with the anticipation of joining that group, is referred to by sociologists like Merton as anticipatory socialization.

In anticipatory socialization, non-group-members learn to take on the values and standards of groups that they aspire to join, so as to ease their entry into the group and help them interact competently once they have been accepted by it. It is the process of changing one's attitudes and behaviors in preparation for a shift in one's role.

How, then, do social media influence anticipatory socialization? For example, let us assume someone wants to join an organization. He can visit the organization's Facebook or Twitter page, communicate with the organization's present members, and take on the values and standards of the organization.

Facebook statistics show that the total number of Facebook pages is 50 million. The average number of monthly posts per Facebook page is 36 [7]. The average number of pages, groups and events a user is connected to is 80 [8].

C. Interaction with agemates through social media
Facebook users often hope that their "friends" will "like" or comment on their photos, posts, etc. It is almost as if we are starting to value ourselves differently depending on how many of our peers approve of our Facebook activities. Family is said to be the most influential agent of socialization, but with the development of Facebook and other social networking sites, peer groups are gaining more power.

It is interesting to see how social rules evolve in interactions over Facebook, Twitter and other social networking websites. When posting something on Facebook or Twitter, we need to consider the consequences for our social lives, especially in matters of violence and anarchy. A vulgar post on these social media can lead to an offensive situation every now and then.

IV. EXPERIMENTATION ENVIRONMENT
In this section, we discuss our experimentation environment.

A. Mining the Social Web
Twitter's accessible APIs, inherent openness and rampant worldwide popularity make it an ideal social website to zoom in on and find relevant patterns of trending topics related to socialization. For this, we followed the example code from Mining the Social Web (2nd Edition) [11] and modified it according to our needs. We installed the virtualization software VirtualBox [12] as a development environment instead of using an existing Python installation, because installing IPython Notebook [13] involves some non-trivial configuration management issues. IPython Notebook is a powerful, interactive Python interpreter that provides a notebook-like user experience from within the web browser and combines code execution, code output, text, mathematical typesetting, plots etc. for mining the Twitter data. We also required Vagrant [14], a tool for building a complete development environment. With an easy-to-use workflow and a focus on automation, Vagrant lowers development environment setup time and increases development-production parity as well. To obtain a simplified toolbox for mining rich social web data such as Twitter's, the following two steps were performed:

1) Virtual Machine setup and access to IPython Notebook:
• At first, we installed Oracle VM VirtualBox v.4.3.4 [12] as a development environment.
• We simplified configuration management by using Vagrant, a simplified virtualization technology. We installed Vagrant v.1.4.0 [14] for this purpose.
• Then we obtained the required contents for the IPython Notebook [13] from the GitHub repository of Mining the Social Web [11].
• After that, Vagrant was initialized; it synchronizes thousands of files, which are decompressed on the host machine, to prepare a portable toolkit for social web mining.
• Finally, we accessed the interactive IPython Notebook from the web browser through http://localhost:8888, the default port configured in the provided Vagrantfile.

2) Obtaining Twitter API access:
• At first, we created a demo app in order to get Open Authorization (OAuth) credentials from Twitter's developer website. Twitter implements OAuth 1.0A as its standard authentication mechanism, and a sample application has to be created in order to use it to make requests to Twitter's API.
• After that, we created OAuth access token credentials in order to get the access token, access token secret etc.
• Finally, we used the consumer key, access token and other necessary credentials to create a Twitter API connection from the interactive IPython Notebook for mining Twitter data.
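As a rough illustration of how the four OAuth 1.0A credentials from the steps above are used, the following sketch computes the HMAC-SHA1 signature that accompanies a signed request. It is a minimal sketch under stated assumptions, not the code used in the experiment: in practice a Twitter client library performs this step internally. The helper names `percent_encode` and `sign_request`, the placeholder credentials, the fixed nonce and timestamp, and the example search URL are all hypothetical values chosen for illustration.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value):
    """Percent-encode a value, leaving only unreserved characters unescaped."""
    return quote(str(value), safe="-._~")


def sign_request(method, url, params, consumer_secret, token_secret):
    """Return the base64 HMAC-SHA1 signature for an OAuth 1.0A request."""
    # 1. Encode every key and value, sort the pairs, and join them.
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    param_string = "&".join(f"{key}={value}" for key, value in encoded)
    # 2. Signature base string: METHOD & encoded URL & encoded parameter string.
    base_string = "&".join(
        [method.upper(), percent_encode(url), percent_encode(param_string)]
    )
    # 3. Signing key: consumer secret and access token secret joined by '&'.
    signing_key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    digest = hmac.new(signing_key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


# Placeholder values only; real credentials come from the demo app created
# on Twitter's developer website.
params = {
    "oauth_consumer_key": "CONSUMER_KEY_PLACEHOLDER",
    "oauth_token": "ACCESS_TOKEN_PLACEHOLDER",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1400000000",  # fixed here so the example is repeatable
    "oauth_nonce": "fixednonceforillustration",
    "oauth_version": "1.0",
    "q": "#socialization",
}
signature = sign_request(
    "GET",
    "https://api.twitter.com/1.1/search/tweets.json",  # example endpoint (assumption)
    params,
    "CONSUMER_SECRET_PLACEHOLDER",
    "ACCESS_TOKEN_SECRET_PLACEHOLDER",
)
print(signature)
```

A real request would use a fresh nonce and the current timestamp for every call, which is one reason delegating this step to a maintained client library is usually preferable.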
[Figure 1: flowchart with two stages. Setting up the virtual machine and accessing IPython Notebook: install VirtualBox, install Vagrant, clone the GitHub repository of Mining the Social Web, initialize Vagrant, access IPython Notebook. Obtaining Twitter API access: create a demo app to get OAuth credentials on Twitter's developer website, create OAuth access token credentials, use the consumer key, access token etc. to create the Twitter API connection.]

Fig. 1. Brief overview of the technological environment used for mining Twitter data related to socialization.

B. Hashtagify
Hashtagify [10] describes itself as the most advanced Twitter hashtag search engine. It is a free tool that promotes the best use of hashtags by finding and understanding them in a quick, intuitive, visual way.

Hashtags are one of the best ways to find and reach the right audience for a message on social media. Hashtagify.me allows users to search among 25,403,098 Twitter hashtags and quickly find their popularity, relationships, languages, influencers and other metrics.

In April 2011 hashtagify.me started collecting information about hashtag usage patterns on Twitter, and it has examined 1,717,202,262 tweets (from the 1% sample that Twitter distributes for free) since then. Based on this data the first visual hashtag explorer was created and published, which still constitutes the base on which the advanced hashtag search engine is built.

1) Hashtag popularity: Hashtagify provides a numerical popularity rating of the hashtag. This 0-100 rating is relative to the most popular hashtag on Twitter. The most popular hashtag will get 100, while a hashtag that is never used would get 0 (but actually, it will just be "not found").

In addition to the absolute popularity, it also provides the weekly and monthly variation, that is, whether and how much the popularity of the hashtag increased or decreased week on week (W) and month on month (M).

2) Related hashtags: Around the circle at the center, which represents the searched hashtag, its top related hashtags (up to 10) are displayed as other circles. The correlation is measured as the percentage of tweets using the searched hashtag which also use the related one. Every other circle also shows the actual percentage together with the popularity rating.

We can explore the field of related hashtags by clicking on one of them (blue circles); this will automatically search that hashtag and display its own related ones [10]. In this way we can easily find all the hashtags related to our interests, and learn which ones are the most and least popular and the most and least specific.

3) Hashtag trends: The mission of hashtagify.me is to provide all the most useful information to find the best hashtags. One important piece of information is hashtag trending data.

There is also a graph of the popularity of the searched hashtag during the last two months, with a projected value for the current week; the projection is computed as soon as there are enough data, usually on Tuesday evening. We assume that posts with hashtags influence socialization considerably. According to one set of statistics [15], posts with hashtags perform better for 26% of social properties.

[Figure 2: pie chart. Posts with hashtags outperform: 26%; no significant difference: 21%; posts with hashtags underperform: 53%.]

Fig. 2. Pie chart showing how effective Twitter posts with hashtags are.

C. Methodology
We set up our experimentation environment using a virtual machine, Vagrant and IPython Notebook. Then, based on a frequency analysis of the Twitter hashtags obtained from the experiment, we selected seven keywords related to socialization:
• society
• culture
• religion
• family
• politics
• history
• art
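The frequency analysis mentioned above, and the correlation measure that Hashtagify reports (the percentage of tweets using the searched hashtag which also use the related one), can be sketched as follows. This is an illustrative sketch rather than the code used in the experiment: the function names are hypothetical, the regular expression is a simplification of real hashtag rules, and the sample tweets are invented for the example.

```python
import re
from collections import Counter

# Simplified hashtag pattern (assumption): '#' followed by word characters.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the set of lowercase hashtags (without '#') in one tweet."""
    return {tag.lower() for tag in HASHTAG_PATTERN.findall(text)}


def hashtag_frequencies(tweets):
    """Count the number of tweets in which each hashtag appears."""
    counts = Counter()
    for text in tweets:
        counts.update(extract_hashtags(text))
    return counts


def correlation_percent(tweets, searched, related):
    """Percentage of tweets using `searched` that also use `related`."""
    with_searched = [
        tags for tags in map(extract_hashtags, tweets) if searched in tags
    ]
    if not with_searched:
        return 0.0
    both = sum(1 for tags in with_searched if related in tags)
    return 100.0 * both / len(with_searched)


# Hypothetical sample tweets, invented for illustration only.
sample_tweets = [
    "Sunday dinner with everyone #family #love",
    "New exhibition downtown #art #culture",
    "Old photo album #family #history",
    "Election night #politics #news",
    "Road trip with the kids #family #love #travel",
]

frequencies = hashtag_frequencies(sample_tweets)
print(frequencies.most_common(3))
print(correlation_percent(sample_tweets, "family", "love"))
```

Counting each hashtag at most once per tweet keeps the frequency consistent with the correlation measure, which is also defined per tweet.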
Then we used Hashtagify to analyze the popularity ratings of the hashtags corresponding to these keywords on Twitter. After that, we illustrated the weekly variation of each hashtag's popularity trend, including a statistical analysis of the selected hashtags.

In the following subsections we present the experimental results for each of the keywords.

A. Society
The keyword society has a popularity rating of 51.4. In the following figure, P represents popularity and C represents correlation with #society.

Fig. 3. Related hashtags of Society with Popularity Rating 51.4.

B. Culture
The keyword culture has a popularity rating of 60.8. In the following figure, P represents popularity and C represents correlation with #culture.

Fig. 4. Related hashtags of Culture with Popularity Rating 60.8.

C. Religion
The keyword religion has a popularity rating of 58.5. In the following figure, P represents popularity and C represents correlation with #religion.

Fig. 5. Related hashtags of Religion with Popularity Rating 58.5.

D. Family
The keyword family has a popularity rating of 75.2. In the following figure, P represents popularity and C represents correlation with #family.

Fig. 6. Related hashtags of Family with Popularity Rating 75.7.

E. Politics
The keyword politics has a popularity rating of 67.2. In the following figure, P represents popularity and C represents correlation with #politics.
Fig. 7. Related hashtags of Politics with Popularity Rating 67.2.

F. History
The keyword history has a popularity rating of 64.5. In the following figure, P represents popularity and C represents correlation with #history.

Fig. 8. Related hashtags of History with Popularity Rating 64.5.

G. Art
The keyword art has a popularity rating of 75.6. In the following figure, P represents popularity and C represents correlation with #art.

Fig. 9. Related hashtags of Art with Popularity Rating 75.6.

H. Analysis of hashtag trends
In the following graph we compare the popularity trends of our selected hashtags.

[Figure 10: line chart titled "Popularity Trend of Hashtags". Horizontal axis: weeks ago (-8 to 0). Vertical axis: popularity (on a scale of 100). Series: Art, Culture, Family, History, Politics, Religion, Society.]

Fig. 10. Comparison of Popularity Trend of Hashtags related to Socialization.

From the above data we calculate the mean and standard deviation of the selected hashtags.
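The two statistics can be computed with Python's standard library, as in the minimal sketch below. The weekly values behind Fig. 10 are not reproduced in the text, so the example applies the same calculation to the seven current popularity ratings quoted in the result subsections (for family, the value 75.2 from the body text is used). It therefore illustrates the calculation only and does not reproduce the per-hashtag weekly statistics. The helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev

# Popularity ratings quoted in the result subsections of this paper.
popularity = {
    "society": 51.4,
    "culture": 60.8,
    "religion": 58.5,
    "family": 75.2,
    "politics": 67.2,
    "history": 64.5,
    "art": 75.6,
}


def summarize(values):
    """Return the mean and the sample standard deviation of a series."""
    values = list(values)
    return mean(values), stdev(values)


average, spread = summarize(popularity.values())
print(f"mean = {average:.2f}, sample standard deviation = {spread:.2f}")
```

The same `summarize` helper could be applied to each hashtag's weekly series once those values are available; `statistics.pstdev` would be the alternative if the series is treated as a full population rather than a sample.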
