3D Based Visualization Tool To Analyze The Influential Topics Via Hashtags On Instagram Platform
Keywords
Information visualization, Instagram, interactive 3D visualization, social media, topic analysis
Abstract
This paper develops an interactive, comprehensive information visualization platform for Instagram hashtag
analysis. Instagram hashtags have developed into many kinds of groups and communities in which users share
hobbies and find like-minded friends. To analyze topic influence and user interest trends on Instagram, which
has billions of end-users and worldwide influence, hashtag analysis is needed to gather this information,
compare the proportion of people involved in each tag, and rank the tags for visualization. The visualization
is built in 3D space and consists of a time-varying data flow of tags, a tag comparison analysis, and event
research. The rest of the paper mainly discusses the design idea and the development process of the system.
An example of the system design is shown in the discussion: 4 popular hashtags discussed on Instagram are
displayed as a 3D histogram, together with a second histogram that compares the tags and an event view at the
back.
1. Introduction
Instagram is one of the most influential social media platforms in the world. Various types of content, such as
text, images and short videos, are uploaded to this platform. Users can upload photos or videos with a
description and give each post corresponding hashtags. People's reactions to a given event or topic can easily
be gathered from the posts under a hashtag. Analysing these reactions is especially important on such a large
social media platform, and information visualization can show the clout of a topic through the number of posts
along a timeline.
On Instagram, each hashtag under a topic shows a different perspective on that topic. Therefore, when
calculating the clout of a topic, all posts under hashtags related to it are considered in the visualization
model. To some extent, these hashtags explain why the clout of a topic suddenly increases. For example, when
analysing the clout of football, the hashtag #worldcup is considered: during the World Cup, the clout of
football is expected to rise significantly because of the clout of the World Cup itself. The related hashtags
of each topic should therefore also be included in the visualization system. Each topic also has a peak clout,
and these peak values reflect the influence of the topic to some degree, so they should be considered in the
visualization model as well.
Based on these data, this paper introduces a system for visualizing the clout of topics via hashtags on
Instagram and explains the details of the system design. Example cases applying the visualization system are
then presented to show directly how it visualizes these data.
Guo Xinyi, Liu Cuiting, Wang Ruisi / Proceedings of Science and Technology
2. Literature review
BrandMap [1] is a visualization platform that uses a novel approach to visualize complex data. The paper
presents a case study that uses BrandMap to measure the distribution of brands in the blogosphere. Because many
bloggers mention brands, products and services, a huge amount of data needs to be organized. The visualization
uses objects with different characteristics, such as colour, size and shape, to represent key brand dimensions
such as product attributes, features and themes. The objects are placed on circles at certain angles to each
other and at certain distances from the centre. The angle between terms around the centre is computed by a
hierarchical clustering technique according to their similarity: if two terms are close to each other, they are
likely to be mentioned together and related in the blogs. The distance between the centre and a term depends on
how frequently the term is cited in the blogs: the more often a term is mentioned, the closer it is to the
centre of the circles. This visualization method helps people quickly observe how a brand spreads over the
Internet.
Masahiko Itoh [2] observes that social media has become one of the most popular sources of information. The
goal of that paper is to analyse changes in people's ideas, experiences and interests through information
visualization. It introduces a 3D visualization system that visualizes time-varying topics in multiple media
and analyses their future trends. The system design lets people observe the start time of a topic, changes in
its trend, its bursting points and its lifetime. Images and events related to the topics are also part of the
visualized content. The system consists of two main parts, the Image Flow View and the Event View. To visualize
the image flow, a three-dimensional histogram of stacked images is created, with the images arranged according
to topic and publication time. For the event view, a TimeSlice, which is a 2D plane, is placed in the 3D space
to summarize events on the topic keywords. Once a user selects a time window, a tree representation is
displayed on the TimeSlice. Overall, this 3D visualization system can be used to explore trends and events in
social media.
Chen et al. [3] note that, as social media becomes more and more popular, a large number of messages spread
across media every day. Their paper aims to explore and analyse social behaviours during message diffusion and
propagation, and proposes D-Map, a comprehensive visualization system, for this purpose. In D-Map, social media
users are represented by hexagonal nodes whose colour and size indicate their behaviours and roles. Users are
grouped into communities according to their behaviours, forming a map metaphor that visually shows the social
influence of the centre user. Each community is represented by one colour, and the centre user is highlighted
with an outer hexagon. Each node also contains an inner hexagon indicating the number of the user's blogs. A
centre user with high inner-community influence is represented by a large node, and a user with a large number
of blogs is assigned a large dark hexagon inside the node. The paper collects data from a single Sina Weibo
user together with all the reposted blogs, traces these blogs back to their origin, and then uses D-Map to
visualize their diffusion process. In conclusion, D-Map visualizes users' social behaviours and their influence
in spreading information on social media during the diffusion process.
3. Critical review
Masahiko Itoh [2] provides a relatively reliable 3D visualization method that combines image-stacked histograms
from multiple events with corresponding line charts, as well as an interactive event view displayed alongside
in 3D space (Figure 1). Several essential attributes of this research need to be evaluated. The most
fundamental is the timeline, which is organized according to the development of a specific topic, stacked with
related images, and uses a suitable time interval such as a month, a week or a day. The stacked images
(Figure 2) represent the amount of discussion on social media about the topic, which makes it possible to find
the birth timing, bursting points, changes in popular content and lifetime of trends for each topic. Another
attribute is the topic, displayed as separate histograms that classify the topics being discussed and reveal
differences in bursting timing between topics, their chronological order, and events with the same timing on
different topics. Differences between mass media and social media reports are also evaluated, shown as the
difference between the histograms and the line charts layered on them. In addition to the image flow view, the
event view (Figure 3) is an indispensable attribute for evaluation. It is represented by TimeSlices and
TimeFluxes, which visualize, respectively, summarized events on the topic keyword during a selected time window
as a tree representation, and changes in the amount of information, such as the number of events, within a
given period of time.
Figure 2 Stacked images clustered on timeline
Figure 3 Event view shown via TimeSlice
Similar research on social media content analysis has developed multiple methodologies that aim to provide a
clear visual experience. The BrandMap method uses a set of layered circles with variously shaped markers and
encodes information in the angles and distances between them. This method is helpful only for observing topic
classification, not for analysing trends through time-varying data flow, because BrandMap displays no time
attribute, while the 3D visualization system shows no relations between topics. D-Map, on the other hand, is an
easily visualized form of reposting tree consisting of clustered nodes, with nodes in the same group sharing a
uniform colour. It summarizes the diffusion process to illustrate how messages spread across different groups
of people and to reveal the social impact of a central user, whereas the 3D visualization system focuses only
on the effect of message spreading, that is, clustered messages forming information trends. For the proposed
topic, the 3D model [2] is therefore the clearest to visualize.
The limitations of the cited system are nevertheless obvious. A stacked image flow is a convenient and
plausible way to build the histograms and distinguish topics, but the layered and clustered images clutter the
view as the number of topics increases. To improve the visual effect, the proposed visualization system
replaces the stacked images with stacked cubes in a uniform colour for each topic. Meanwhile, TimeFluxes, which
visualizes changes in the amount of information within a given period of time, seems less meaningful because
TimeSlices already shows the appearance of every subtopic at the selected time. The proposed system therefore
removes it to simplify the visualization.
The proposed 3D visualization system presents hashtag information from Instagram only. Since no other social
media platform is analysed for comparison, the line charts (Figure 1) are removed in order to focus on the data
flow of the histograms. As we intend to enhance the analysis of hashtags to illustrate the distribution of
users' interests, a new histogram on the y-z plane compares the topics by gathering the highest data flow point
of each topic and noting the exact time at which it occurred. The specific realization of the visualization is
discussed in the next section.
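The peak-gathering step for the comparison histogram can be sketched as follows. This is a minimal illustration, not the implementation used in the system: the function name peak_per_topic, the dictionary layout and all dates and counts are assumptions made for the example.

```python
from datetime import date

def peak_per_topic(daily_counts):
    """Return {topic: (peak_day, peak_count)} for the comparison histogram.

    daily_counts maps a topic to a {day: number_of_posts} dictionary.
    Ties are resolved in favour of the earliest day.
    """
    peaks = {}
    for topic, series in daily_counts.items():
        # Highest count first; among equal counts, the earliest day wins.
        peak_day = min(series, key=lambda d: (-series[d], d))
        peaks[topic] = (peak_day, series[peak_day])
    return peaks

# Hypothetical daily post counts for two hashtags (illustrative values only).
daily_counts = {
    "disney": {date(2019, 5, 1): 120, date(2019, 5, 2): 340, date(2019, 5, 3): 90},
    "donaldtrump": {date(2019, 5, 1): 200, date(2019, 5, 2): 150, date(2019, 5, 3): 200},
}

peaks = peak_per_topic(daily_counts)
# Rank topics by peak count, highest first, as input for the comparison view.
ranked = sorted(peaks.items(), key=lambda item: item[1][1], reverse=True)
```

Each entry of ranked carries both the peak value and the day on which it occurred, which are the two pieces of information the comparison histogram needs.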
4. Proposed method
The method we use is 3D visualization. In the visualization model, we want a timeline that records the clout
(number of posts) of each topic. On such a widely used social media platform there could be billions of such
timelines. If we simply used differently coloured timelines to represent different topics, the result would be
a mess. In this situation, a 3D visualization is a good choice for the visualization model. We construct a 3D
coordinate system with topics on the x-axis, the timeline on the y-axis, and the clout of each topic over time
on the z-axis. To differentiate the topics visually, we use a different colour for each topic. The projection
of the clout of each topic on the x-z plane is shown as a histogram, and the related hashtags are shown as a
tree structure on the y-z plane when a topic is clicked at a certain time. In a word, we make full use of the
3D system to visualize the clout of topics on Instagram from different perspectives.
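The coordinate layout described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the function build_clout_grid, the (hashtag, day) post records and the sample values are hypothetical, and the rendering of the bars is left to whichever 3D plotting library is used.

```python
from collections import Counter
from datetime import date, timedelta

def build_clout_grid(posts, topics, start, num_days):
    """Count posts per topic per day.

    posts is an iterable of (hashtag, day) pairs. The result is
    grid[i][j] = number of posts for topics[i] on day start + j.
    """
    counts = Counter((tag, day) for tag, day in posts)
    days = [start + timedelta(days=j) for j in range(num_days)]
    # Counter returns 0 for (topic, day) pairs that never occur.
    grid = [[counts[(topic, day)] for day in days] for topic in topics]
    return days, grid

# Hypothetical post records (illustrative values only).
posts = [
    ("disney", date(2019, 5, 1)),
    ("disney", date(2019, 5, 1)),
    ("disney", date(2019, 5, 3)),
    ("donaldtrump", date(2019, 5, 2)),
]
topics = ["disney", "donaldtrump"]
days, grid = build_clout_grid(posts, topics, date(2019, 5, 1), 3)

# One bar per (topic, day): x = topic index, y = day index, z = clout.
bars = [(x, y, grid[x][y]) for x in range(len(topics)) for y in range(len(days))]
```

Keeping the counts in a topic-by-day grid means the same structure can feed both the 3D histogram and any 2D projection of it.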
them, whether in alphabetical order or by category. This allows us to explore differences in bursting timing
between hashtags, their chronological order, and hashtags with the same timing on different topics. The z-axis
represents the amount of discussion of the selected hashtag at the corresponding time. A gradient colour makes
this clearer: the level of heat (discussion amount) of the selected hashtag is shown by the shade of the
colour, which makes it easy to compare the heat of different hashtags at different times.
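One simple way to derive such a shade is linear interpolation between white and the base colour of the hashtag. The sketch below assumes this linear mapping and an RGB base colour; the paper does not specify the exact colour scale, so heat_to_shade and its parameters are illustrative only.

```python
def heat_to_shade(count, max_count, base_rgb):
    """Blend from white (no posts) to base_rgb (max_count posts).

    The blend is linear in count / max_count, clamped to [0, 1].
    """
    if max_count <= 0:
        return (255, 255, 255)
    ratio = min(max(count / max_count, 0.0), 1.0)
    return tuple(round(255 + (channel - 255) * ratio) for channel in base_rgb)

# Hypothetical base colour for one hashtag.
base = (55, 105, 255)
coolest = heat_to_shade(0, 100, base)     # white: no discussion
halfway = heat_to_shade(50, 100, base)    # midway between white and base
hottest = heat_to_shade(100, 100, base)   # full base colour
```

A logarithmic mapping could be substituted for the linear one if a few very hot days would otherwise wash out the rest of the timeline.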
The system supports interaction for exploring the detailed information and exact content of the selected
hashtags. Users can access the original hashtag page by clicking the tag name; the page opens automatically in
the web browser and shows all the information it contains, together with related side topics.
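The click-through behaviour can be sketched with the Python standard library. The URL pattern used here is an assumption about Instagram's public tag pages, and hashtag_url and open_hashtag are hypothetical helper names rather than part of the described system.

```python
import webbrowser
from urllib.parse import quote

def hashtag_url(tag):
    """Build the address of a hashtag page (URL pattern is an assumption)."""
    name = tag.lstrip("#")
    return f"https://www.instagram.com/explore/tags/{quote(name)}/"

def open_hashtag(tag):
    """Open the hashtag page in the user's default web browser."""
    return webbrowser.open(hashtag_url(tag))
```

In the visualization, a click handler on the tag label would call open_hashtag with the name of the selected tag.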
selected hashtag and timing. The user chooses a hashtag in the hashtag time-varying view and selects a specific
time point; the Event View then retrieves the tags related to the main hashtag and automatically moves to that
point on the timeline to display the events belonging to the time window. Users can hide the view if it is not
needed. The visualization to be implemented is a word cloud chart. Compared with a tree chart or other methods,
it is a simple, clear way to display all the tags together. It shows the relations between tags as well as a
comparison of their discussion heat: the most frequently discussed hashtag is displayed in the largest font
size, and the size decreases according to the heat rank of the tags. The chosen hashtag is placed in the middle
with the most eye-catching size and colour, and the title of the chart (shown in Figure 8 as
“donaldtrump related tags”) is placed at the top of the view together with the exact number of its posts.
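The rank-based sizing of the word cloud can be sketched as follows. The linear step between a maximum and a minimum font size is an assumption, as are the function name font_sizes_by_rank, the default sizes, the placeholder tag "tag_c" and all post counts.

```python
def font_sizes_by_rank(tag_counts, max_size=48, min_size=12):
    """Assign a font size to each tag according to its heat rank.

    The hottest tag gets max_size, the coolest gets min_size, and the
    sizes in between decrease in equal steps. Ties are broken by name.
    """
    ranked = sorted(tag_counts, key=lambda tag: (-tag_counts[tag], tag))
    if len(ranked) <= 1:
        return {tag: max_size for tag in ranked}
    step = (max_size - min_size) / (len(ranked) - 1)
    return {tag: round(max_size - i * step) for i, tag in enumerate(ranked)}

# Hypothetical post counts for tags related to one main hashtag.
related = {"trumpamerica": 900, "gotrump": 400, "tag_c": 100}
sizes = font_sizes_by_rank(related)
```

Sizing by rank rather than by raw count keeps the smallest tags legible even when one tag dominates the discussion.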
The event view allows users to explore the real-time events associated with the corresponding tags. For
example, for the hashtag “donaldtrump”, the event view shows that all the related tags are about Donald Trump
and his recent actions, such as “trumpamerica”, which shows him taking part in the 2020 American presidential
campaign, as well as people’s attitudes towards him, such as “gotrump”, which shows people’s resistance to
Donald Trump and their wish for him to step down.
Compared with the first visualization, the bubble chart looks neater and is clearer to understand. Tags are
represented by bubbles in gradient colours placed in order, and the text size is fixed for each layer. In
addition, users can easily recognize relationships between tags, such as parent or grandparent. The dataset
collected on the Instagram platform for the event view of the “#disney” hashtag is shown in Figure 20.
6. Evaluation
In this section, we use a user evaluation method, the questionnaire, to study how the 3D visualization model
performs. We found 10 people who were not involved in the development of the visualization system to complete
the 2 questionnaires given in Appendix A and Appendix B, and the results are summarized as well. The first
questionnaire consists of 24 questions that we consider the most comprehensive and representative for
reflecting the problems of the amended system. The second questionnaire is designed to analyse people’s views
on the comparison between the initial model and the amended one, and consists of 7 questions intended to give
an objective analysis of whether the changes are worthwhile.
The degree of satisfaction obtained from the questionnaire experiment is summarized below. For each question,
the score is the average degree value over all respondents: the number of people choosing each degree is
multiplied by the value of that degree, and the sum is divided by the number of respondents. The values 0, 25%,
50%, 75% and 100% correspond to strongly disagree, disagree, neutral, agree and strongly agree respectively.
The following observations are drawn from the results.
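The score calculation can be written out as a short function. The response counts in the example are hypothetical and are not taken from the collected questionnaires; only the five degree values come from the description above.

```python
# Degree values for strongly disagree, disagree, neutral, agree, strongly agree.
WEIGHTS = (0.0, 0.25, 0.5, 0.75, 1.0)

def satisfaction(responses):
    """Return the weighted average score for one question.

    responses holds the number of respondents per degree, ordered from
    strongly disagree to strongly agree.
    """
    if len(responses) != len(WEIGHTS):
        raise ValueError("expected one count per degree")
    total = sum(responses)
    if total == 0:
        raise ValueError("no responses")
    return sum(n * w for n, w in zip(responses, WEIGHTS)) / total

# Hypothetical answers from 10 respondents to a single question.
score = satisfaction((0, 1, 1, 4, 4))  # (0.25 + 0.5 + 3.0 + 4.0) / 10 = 0.775
```

With 10 respondents, a question on which everyone answers "agree" scores 75%, so results above 80% require a majority of "strongly agree" answers or very few answers below "agree".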
The model performs generally well in organization, complexity and accuracy. The first 2 questions reflect this,
with results exceeding 80% in the feedback. For the hashtag time-varying view, the general performance is 85%,
which is considered fairly good for visualization in general. However, the result for telling the trend of each
tag is relatively low. Replacing the 3D histogram with a line chart would show the trend more clearly, but it
is not a better solution in this case. In the hashtag comparison view, the result is again pleasing in general
performance, especially in organization and in differentiating data trends, and the line chart is considered a
good representation method for this view. Nevertheless, it is still difficult to tell which topics people are
interested in, which may be resolved by providing more hints in the diagram. In the event view, the general
performance is good enough for the view to be considered an indispensable part of the model, and the bubble
chart makes the visualization much easier to read. The colour and size selection are helpful for visualizing
heat differences between tags.
Figure 165 Hashtag time varying view
Figure 176 bit-map from top view
Figure 187 Event view [2]
Figure 2819 Event view of our model
8. Conclusion
In summary, the visualization method proposed in this study is mainly used for evaluating the levels of heat
(discussion amount) of topics and related information. The visualization may still have some weaknesses, which
will be addressed in future work.
Appendix A. Questionnaire 1
Response options for each question: Strongly disagree, Disagree, Neutral, Agree, Strongly agree
No.  Question
1    I am able to tell what kind of topic that this model is analysing
2    I am able to tell that there are 3 significantly different views
3    I think the hashtag time-varying view is well organized in visualization and 3D histogram model is a good representation
4    I am able to tell that there are 4 hashtags showing in the visualization model
5    I am able to tell that the time in analysis is 15 days in the visualization model, and the time interval is 1 day
6    I am able to tell which tag is more popular-discussed and which is not
Appendix B. Questionnaire 2
Response options for each question: Strongly disagree, Disagree, Neutral, Agree, Strongly agree
No.  Question
1    I think model 2 is better in visualization in general than model 1
Acknowledgements
This research was supported by Xiamen University Malaysia. We thank our supervisor, Prof. Raja Majid Mehmood of
Xiamen University Malaysia, who provided insight and expertise that greatly assisted the research.
References
[1] Amadeu Sa de Campos Filho, F. F. (2012). Brandmap: an Information Visualization Platform for Brand Association in Blogosphere. 16th
International Conference on Information Visualisation, (pp. 316-320).
[3] Siming Chen, S. C. (2016). D-Map: Visual Analysis of Ego-centric Information Diffusion Patterns in Social Media. IEEE Conference on
Visual Analytics Science and Technology (VAST), (pp. 41-50). Baltimore, Maryland, USA.
[4] Flenner, J. L. (2016, 11). Using Data Visualization to Examine an Academic Library Collection. Retrieved from College Research Library:
https://crl.acrl.org/index.php/crl/article/viewFile/16555/18001
[5] Florian Windhager, P. F. (2019, 4 27). Visualization of Cultural Heritage Collection Data: State of the Art and Future Challenges. Retrieved
from IEEE Xplore Digital Library: https://ieeexplore.ieee.org/document/8352050
[6] Shiwen Hong, F. W. (2016). Design and Implementation of Data Visualization in Media Manuscripts Transmission System. Retrieved from
IEEE Xplore Digital Library: http://ieeexplore.ieee.org.sci-hub.tw/document/7778872