Frontend and Backend Web Technologies in Social Networking Sites: Facebook As An Example
Figure 2. Facebook page processing [9].
IV. FACEBOOK FRONTEND TECHNOLOGIES
To make up its infrastructure, Facebook uses various tools and programming languages. At the frontend, the servers are based on the LAMP (Linux, Apache, MySQL, and PHP) stack [10]. The following paragraphs explain what that means.
LAMP is a combination of free and open source software. The name refers to the first letters of the OS Linux, the HTTP server Apache, the DB software MySQL, and the programming languages PHP, Perl, and Python. Those are the main components used to build a viable general-purpose web server in the frontend of any scalable, large-scale website [11].
Although LAMP has the standard components mentioned above, variations are possible. The web scripting languages can vary, for instance by excluding Perl. The OS can vary as well, as in WAMP, which is based on Windows, and MAMP, which is based on Mac OS, although this does not apply to Facebook, where Linux is the OS [10].
This software combination has become popular among developers, although the original designers of these programs did not design them all to work specifically with each other. The development philosophy and tool sets are common among several platforms, and because the software is free of cost and open source, it is easy to adapt [10].
A. LINUX & APACHE
Linux is a Unix-like OS. In addition to being open source, what makes it favorable is its good security and customizability.
Facebook runs Apache HTTP Server on the Linux OS. Apache is also free, like the rest of the bundle, and is a very popular open source web server [10,11,12].
B. MYSQL
MySQL is DB software with high speed and high reliability, which is why Facebook uses it. It is used primarily as a key-value store because data is randomly distributed among a huge set of logical instances. These instances are spread out across physical nodes, and the load balancing is done at the physical node level [13].
For customization purposes, Facebook has developed a custom partitioning scheme in which all data has an assigned global ID. Another Facebook customization of MySQL is an archiving scheme that is based on how frequent the data is for each user [11].
C. PHP
Facebook uses PHP as its main programming language. PHP is a web programming language with extensive support and an active development community [14]. As a scripting language, PHP is appreciated for allowing rapid iteration and for being dynamically interpreted. As mentioned earlier, Facebook does not use only PHP, of course, but it is heavily invested in the language [14].
Many programmers have asked why Facebook does not migrate from PHP to another language. The best answer came from a Facebook engineer named Wong, who worked at the company in various roles between 2005 and 2010. He stated that PHP has incumbent inertia and that Facebook's engineers have managed to work around many of its flaws through a combination of patches at all levels of the stack and excellent internal discipline via code convention and style: the worst attributes of the language are avoided, and coding style is rigidly enforced through a fairly tight culture of code review. Engineering management has never had to take a strong hand here; this arose largely due to key internal technical leaders just sort of corralling everyone else along [10].
D. MEMCACHE
Memcache is a memory caching system that is used to speed up websites that are driven by DBs, such as Facebook, by caching data and objects in RAM in order to reduce the reading time. Facebook uses Memcache as its primary caching stage, which relieves the load the DB would carry if every request were given direct access to it. Having a caching system is what lets Facebook recall data as fast as it does [11].
Facebook has realized that there are downsides to using the LAMP stack. For example, PHP is not inherently optimized for large websites and is therefore hard to scale. In addition, it is not the fastest executing language, and its extension framework is difficult to use [14]. Note that Memcache is considered part of the middle tier, as mentioned before, although it is described here under the frontend section.
V. FACEBOOK BACKEND
Facebook uses different programming languages for its backend. A variety of Facebook services use languages such as Java, C++, Erlang, and Python. Facebook does not use different languages for the sake of variety; its engineers first think of a service that they need to implement and
then they create the framework or toolset for the service, and at the end they choose the right programming language for the function [13]. Facebook uses several different languages for its different services: PHP is used for the frontend, as explained earlier, Erlang is used for Chat, and Java and C++ are used in several places.
The next subsections briefly go through some of the software Facebook uses in its backend.
A. THRIFT
Thrift is a cross-language framework that was developed at Facebook. Its function is to tie together all of the different languages Facebook uses so that they can talk to each other. Facebook has made Thrift open source [13].
B. SCRIBE
Scribe is a server built on top of Thrift. It is designed for the log data streamed from all of the servers in real time: it aggregates the log data and is a scalable framework useful for logging a wide array of data [15].
C. CASSANDRA
Cassandra is another open source system used by Facebook. It is a distributed storage system that Facebook uses for its inbox search. What makes Cassandra special is that it has no single point of failure [16].
D. HIPHOP FOR PHP
Although PHP has many benefits that led Facebook and many other websites to use it, it still has drawbacks, such as optimization. HipHop is a source code transformer that converts PHP source code into optimized C++ code, which a g++ compiler then turns into machine code [15].
E. HAYSTACK
Haystack is an object store that Facebook uses for the storage of photos. It is a high-performance system that handles more than 20 billion uploaded photos, each of which is saved in four different resolutions. Haystack stores photo data inside 10 GB buckets, with 1 MB of metadata for every GB stored [15].
F. HADOOP AND HIVE
Hadoop is a map-reduce system that performs calculations on massive amounts of data. Facebook uses this open source system for data analysis. Hive originated within Facebook and makes it possible to run SQL queries against Hadoop, which makes it easier for non-programmers to use. Both Hadoop and Hive are open source [15].
G. VARNISH
Varnish is an HTTP accelerator that can also act as a load balancer. It caches content, which can then be served lightning-fast. Varnish serves photos and profile pictures and handles billions of requests every day [17].
There are many more technologies, such as BigPipe for serving dynamic pages, but they are beyond the scope of this report, which aims at a general overview [15].
VI. FUTURE CHALLENGES
More scaling challenges will come, and the programming languages and technologies Facebook uses will continue to be revised and to evolve. The pace at which social networking is growing, led by Facebook, is incredible. Its user base is increasing almost exponentially. Facebook is expected to run into different performance bottlenecks as it is challenged by more and more page views, searches, uploaded images, status messages, and all the other ways that Facebook users interact with the site and each other [15].
Here are a few facts to give an idea of the scaling challenge that Facebook has to deal with [15]:
• Facebook serves 570 billion page views per month (according to Google Ad Planner).
• There are more photos on Facebook than on all other photo sites combined (including sites like Flickr).
• More than 3 billion photos are uploaded every month.
• Facebook’s systems serve 1.2 million photos per second (this does not include the images served by Facebook’s CDN).
• More than 25 billion pieces of content (status updates, comments, etc.) are shared every month.
VII. CONCLUSION
Finally, behind every well maintained and operated website that the entire world is using are very robust programs and systems that all work in harmony. Facebook provides an easy way for friends to keep in touch and for anyone to have a presence on the Web without the need to build a website. People have been “Facebooking” each other for years now, making Facebook the most used SN worldwide. Another important outcome of this report is that Facebook has many self-invented technologies and languages, some of which have been made open source and public so that they can be customized. The reason behind this is that Facebook faced challenges that had not been looked into before, in terms of its extraordinary traffic and number of users.
REFERENCES
[1] M. Debusmann and K. Geihs, “Towards Dependable Web Services”, 10th IEEE Pacific Rim International Symposium on Dependable Computing, pp. 5-14, 2004.
[2] F. Schulz and W. Theilmann, “Towards Systematic Mobile Cloud Performance Analysis”, 6th Joint IFIP Wireless and Mobile Networking Conference (WMNC), pp. 1-4, 2013.
[3] A. Hashmi, F. Zaidi, A. Sallaberry, and T. Mehmood, “Are All Social Networks Structurally Similar?”, IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 310-314, 2012.
[4] L. Brown, “The Social Networking Handbook - Everything you Need to Know about Social Networking”, Emereo Publishing, 2011.
[5] “Top 15 Most Popular Social Networking Sites”, http://www.ebizmba.com/articles/social-networking-websites
[6] Z. Whittaker, “Facebook Hits 1 Billion Active User Milestone”, CNET, 2012, http://news.cnet.com/8301-1023_3-57525797-93/facebook-hits-1-billion-active-user-milestone
[7] L. Li, “Social Network Sites Comparison between the United States and China: Case Study on Facebook and Renren Network”, International Conference on Business Management and Electronic Information (BMEI), vol. 1, pp. 825-827, 2012.
[8] A. Zeichick, “How Facebook Works?”, http://www.technologyreview.com/featuredstory/410312/how-facebook-works
[9] W. Graham, “Reaching Users Through Facebook: A Guide to Implementing Facebook Athenaeum”, Code4lib Journal, Issue 5, 2008.
[10] D. Dougherty, “LAMP: The Open Source Web Platform”, 2001, http://www.onlamp.com/pub/a/onlamp/2001/01/25/lamp.html
[11] J. Lee and B. Ware, “Open Source Web Development with LAMP: Using Linux, Apache, MySQL, Perl, and PHP”, Addison Wesley, 2002.
[12] Apache Software Foundation, “Axis architecture guide”, 2003, Version 1.0, http://ws.apache.org/axis
[13] E. Protalinski, “Why Facebook Hasn't Ditched PHP?”, http://www.zdnet.com/blog/facebook/why-facebook-hasnt-ditched-php/9536
[14] S. Campbell, “How Does Facebook Work? The Nuts and Bolts: Technology Explained”, http://www.makeuseof.com/tag/facebook-work-nuts-bolts-technology-explained
[15] Pingdom, “Exploring the Software Behind Facebook, The World’s Largest Site”, http://royal.pingdom.com/2010/06/18/the-software-behind-facebook
[16] G. Wang and J. Tang, “The NoSQL Principles and Basic Application of Cassandra Model”, International Conference on Computer Science & Service System (CSSS), pp. 1332-1335, 2012.
[17] Varnish Cache, https://www.varnish-cache.org/