
Web Tech Class 4


Web Technology

CS 625/CO 423

Dr. Debojit Boro


Associate Professor, Tezpur University, Napaam, Tezpur-784028
Topic

Web Server Architecture

What is a Web Server?
“A computer system that processes requests via HTTP, the basic
network protocol used to distribute information on the World
Wide Web. The term can refer to the entire system, or specifically
to the software that accepts and supervises the HTTP requests.” [1]
Popular Web Servers
• Apache HTTP Web Server
• Microsoft IIS
• Apache Tomcat
• Nginx
• Lighttpd
• Oracle iPlanet Web Server
• XAMPP
• LiteSpeed Web Server
• Monkey HTTP Server
• Google Web Server
• AOLserver
• Zeus Web Server
• Xitami
Apache HTTP Web Server
• Apache grew out of the NCSA HTTPd server written by Robert
McCool; its first public release appeared in 1995.
• Since April 1996, Apache has been the most popular HTTP server
on the World Wide Web.
• Apache was the first web server architecture used by Netscape
Communications Corporation.
• Apache has evolved along with the internet. It serves both static
and dynamic pages and supports many programming languages,
including PHP, Perl and Python, alongside MySQL.
• As of April 2008, Apache served approximately 50% of websites.
Overview of the Apache HTTP Web Server
• Apache is an open-source HTTP web server. It handles the HTTP
requests sent to it and responds to them.
• Apache is developed and maintained by the Apache Software
Foundation (apache.org).
• Apache consists of two main building blocks: the Apache core and
the Apache modules. The modules, which are themselves made up of
many smaller building blocks, extend the core.
• It is easy to deploy and easy to extend by adding modules, which
is why this server has become so popular.
Apache Overview Diagram

[Figure: Apache's modular approach adds to the basic functionality
of the server without disturbing the basic core implementation.]
Apache HTTP Web Server Architecture
Web Server Types
Based on the way a web server handles requests from web clients,
web servers can be categorized into the following types:
• Single-Threaded Web Servers
• Multi-Processed Web Servers
• Multi-Threaded Web Servers
Single-Threaded Web Servers
Accept HTTP connection request → Parse HTTP connection request →
Retrieve and read the resource file → Respond by sending back the data

• Only one process sequentially handles all the client connections.
• Incapable of handling high web server traffic, but suitable for
websites that encounter low or moderate traffic.
• Scalability is an issue: only one client can be handled at a time.
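The four steps above can be sketched in a few lines of Python. This is only an illustration of the sequential model, not Apache code: the in-memory `DOCUMENTS` table and the function names are assumptions made for the example, and a real server would read from sockets and the file system.

```python
# Hypothetical in-memory "document root"; a real server reads files from disk.
DOCUMENTS = {"/index.html": b"<h1>Hello</h1>"}

def parse_request(raw):
    """Extract the path from a request line such as 'GET /index.html HTTP/1.0'."""
    request_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    _method, path, _version = request_line.split(" ")
    return path

def build_response(path):
    """Retrieve the resource and wrap it in a minimal HTTP response."""
    body = DOCUMENTS.get(path)
    if body is None:
        return b"HTTP/1.0 404 Not Found\r\n\r\n"
    header = "HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)
    return header.encode("ascii") + body

def serve_sequentially(pending):
    """Handle each request to completion before looking at the next one."""
    return [build_response(parse_request(raw)) for raw in pending]
```

Because `serve_sequentially` finishes one request before starting the next, a slow request delays every request queued behind it, which is the scalability issue noted above.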
Multi-Processed Web Servers
Process 1: Accept HTTP connection request → Parse HTTP connection
request → Retrieve and read the resource file → Respond by sending
back the data

Process N: Accept HTTP connection request → Parse HTTP connection
request → Retrieve and read the resource file → Respond by sending
back the data

• Can handle multiple connection requests simultaneously.
• Starts a new process for each connection request.
• Inter-process communication is expensive and difficult.
• Memory cost is high and context switching is expensive.
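A minimal sketch of the process-per-request model, assuming a POSIX system where `os.fork` is available. The function name and the reply text are made up for the example; the pipe stands in for the inter-process communication that the slide describes as expensive.

```python
import os

def serve_with_fork(paths):
    """Fork one child per request; each child reports back through a pipe."""
    children = []
    for path in paths:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child process: handle exactly one request, then exit.
            os.close(read_fd)
            os.write(write_fd, ("200 " + path).encode("ascii"))
            os._exit(0)
        # Parent process: keep only the read end and go on to the next request.
        os.close(write_fd)
        children.append((pid, read_fd))
    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as pipe:
            results.append(pipe.read().decode("ascii"))
        os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return results
```

Each child is a full copy of the parent process, which is where the memory cost comes from, and even this one-line reply needs a pipe to get back to the parent.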
Multi-Threaded Web Servers
Accept HTTP connection request → Parse HTTP connection request →
Retrieve and read the resource file → Respond by sending back the data

• Can handle multiple connection requests simultaneously.
• Starts a new thread within the same running process for each
connection request.
• Good performance.
• Thread policies are easier to change.
• Synchronization among threads is required to avoid data races.
• Deadlocks may arise in I/O scenarios.
• IIS (Internet Information Services) on the Windows platform is an
example of a multi-threaded web server.
• Apache on a Unix platform is a multi-processed web server.
• Windows lacks forking for handling client requests, which makes
Apache more efficient on a Unix platform.
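The thread-per-request model, and the need for synchronization, can be sketched as follows. The shared `stats` counter and the function names are assumptions made for the example; the lock is what prevents the data race mentioned above.

```python
import threading

stats = {"hits": 0}            # state shared by all threads
stats_lock = threading.Lock()  # guards every update to `stats`

def handle(path):
    # Without the lock, two threads could read and write the counter
    # at the same time and lose an update.
    with stats_lock:
        stats["hits"] += 1
    return "200 " + path

def serve_with_threads(paths):
    """Start one thread per request inside the same process."""
    results = [None] * len(paths)

    def worker(index, path):
        results[index] = handle(path)

    threads = [threading.Thread(target=worker, args=(i, p))
               for i, p in enumerate(paths)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

All threads share one address space, so passing results back needs no pipe, but every piece of shared state needs protection.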
Apache HTTP Web Server
• Apache takes a modular approach instead of having a single piece
of code handle everything.
• This allows for more robustness and better customization without
giving up the security implemented within the Apache core.
• To achieve this modular approach, the Apache designers broke the
server down into two main components:
  • The Apache core, which handles the basic functionality of the
  server, such as allocating requests and maintaining and pooling
  all the connections.
  • The Apache modules, which are in a sense extensions to the
  server; they handle much of the other processing the server must
  perform, such as user authentication.
Apache Core

Basic “brain” of the Apache Web Server.


Apache Core Components
• The Apache core is made up of many small components that handle
the basic implementation of what a web server should do.
• The core components are a series of source files that each handle
a specific task. They should not be confused with modules, which
are add-on implementations of the different things Apache can be
customized to do.
• The Apache core provides the main functionality of an HTTP web
server. Allowing changes to the core would undermine its
modularity and also some of its security. This is why modules are
needed to extend the core functionality of Apache.
Apache Core Components
The components that make up the Apache core are as follows:
• http_protocol.c: Contains all the routines that communicate
directly with the client using the HTTP protocol. It also knows
how to handle the socket connections through which the client
connects to the server. All data transfer is done through this
component.
• http_main.c: Responsible for starting up the server; it contains
the main server loop that waits for and accepts connections. It
is also in charge of managing timeouts.
• http_request.c: Handles the flow of request processing, passing
control to the modules as needed and in the right order. It is
also in charge of error handling.
Apache Core Components
• http_core.c: Implements the most basic functionality; it simply
serves documents.
• alloc.c: Takes care of allocating resource pools and keeping
track of them.
• http_config.c: Provides functions for other utilities, including
reading configuration files and managing the information gathered
from them, as well as support for virtual hosts. An important
function of http_config is that it builds the list of modules that
will be called to service the different phases of a request.
Apache Core Components
• As seen, Apache has many different components within the core.
Together they make the server more secure and more robust. The
architecture itself also improves security, since anyone who wants
to add functionality to the server must do so through modules.
Apache Request Phases
How does Apache know what to do with a request it has received
from a client, and where does the request go from there in order
to be handled?
• Modules are strangers to each other, and one module alone cannot
completely fulfill a request made to the Apache server.
• A request is processed by passing information from one module
back to the core and then on to another module, until the request
is completely handled and the response is sent back to the client.
This flow is organized into request phases, which are handled by
the HTTP_REQUEST component of the core.
Apache Request Phases
The phases controlled by the HTTP_REQUEST component of the Apache
core are as follows:
• URI to filename translation;
• Check access based on host address and other available
information;
• Get a user id from the HTTP request and validate it;
• Authorize the user;
• Determine the MIME type of the requested object (the content
type, the encoding and the language);
• Fix-ups (for example, replace aliases by the actual path);
• Send the actual data back to the client;
• Log the request.
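The phase sequence above can be pictured as a loop in the core that offers the request to each module in turn. The sketch below is a simplification under stated assumptions: the phase identifiers, the module names (`mod_translate`, `mod_types`, `mod_log`) and the "first module that returns OK wins" rule are invented for illustration and are not Apache's actual API.

```python
# Phase names follow the list above; the identifiers are illustrative only.
PHASES = ["translate", "check_access", "check_user_id", "authorize",
          "find_type", "fixups", "send_response", "log"]

def run_phases(modules, request):
    """Core loop: for each phase, offer the request to the modules in
    order until one of them handles it (returns "OK")."""
    trace = []
    for phase in PHASES:
        for name, handlers in modules:
            handler = handlers.get(phase)
            if handler is not None and handler(request) == "OK":
                trace.append((phase, name))
                break
    return trace

# Toy handlers: they never call each other and only share the request.
def translate(request):
    request["filename"] = "/var/www/html" + request["uri"]
    return "OK"

def find_type(request):
    if request["filename"].endswith(".html"):
        request["content_type"] = "text/html"
        return "OK"
    return "DECLINED"

def log(request):
    request["logged"] = True
    return "OK"

MODULES = [("mod_translate", {"translate": translate}),
           ("mod_types", {"find_type": find_type}),
           ("mod_log", {"log": log})]
```

Calling `run_phases(MODULES, {"uri": "/index.html"})` returns the list of (phase, module) pairs that handled the request, which makes the hand-off from core to module and back visible.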
Apache Modules
• Modules extend, override and implement the functionality of the
Apache web server. However, modules do not directly extend each
other or "know" each other, so every module is connected to the
Apache core in the same way.
• Since modules do not know about each other, they must pass all
information back to the core, which then sends it to another
appropriate module through the HTTP_REQUEST component of the
Apache core. This keeps the stable Apache core unchanged and also
adds a layer of security, because no processing can move on
without passing information to the core, and the core checks and
handles errors through the HTTP_REQUEST component.
Apache Modules
• The Apache web server has a modular architecture: a core
component defines the most basic functionality of a web server,
and a number of modules implement the steps of processing an HTTP
request, offering handlers for one or more of the phases. The core
accepts and manages HTTP connections and calls the handlers in the
modules in the appropriate order to service the current request,
using parent and child processes.
• Concurrency exists only between a number of persistent, identical
processes that service incoming HTTP requests on the same port.
Modules are not implemented as separate processes, although a
module may fork children or cooperate with other independent
processes to handle a phase of processing a request.
Apache Modules
• The functionality of Apache can easily be changed by writing new
modules that complement or replace the existing ones. The server
is also highly configurable at different levels, and modules can
define their own configuration commands.
• Apache can initialize modules dynamically, which makes it robust
and faster: not every module is started when the server starts up.
• Apache therefore initializes only the modules it needs at that
moment, which allows requests to be processed faster.
Apache Modules
• Modules contain handlers.
• Handlers: A handler performs some action for Apache in some phase
of servicing a request. For example, a handler that serves a file
must open the file, read it, and pass it to the Apache core, which
then sends it to the client. Handlers are defined by the modules
according to when they are needed to fulfill a request, and it is
the handlers that pass the result of the processing from the
Apache module back to the HTTP_REQUEST component of the Apache
core.
Apache Module Handlers
• A handler does what it needs to do to fulfill a request and then
sends the result back to the HTTP_REQUEST component of the Apache
core, to be passed to another module for processing or sent back
to the client.
Module Configuration
• Static configuration of Apache requires modules to be
incorporated with care.
• The more modules there are, the higher the memory consumption.
Forking in a multi-processing module can therefore have a
significant effect on memory.
• Modules can be explicitly enabled or disabled depending on the
requirement.
• Some third-party modules (e.g., authentication, PHP, or mod_perl)
also need to be included for the web service.
• Use configure --help to get a list of the available options.
Concurrency in Apache

Apache provides two levels of concurrency.

• Concurrent processes can execute simultaneously when they run on
separate processors or as separate processes on a multitasking
system.
• In multi-threading, up to 50 threads are allowed per process by
default.
• Each request that the server receives is handled by a copy of the
httpd program. Rather than creating a new copy when it is needed
and destroying it when the request is finished, Apache maintains
at least 5 and at most 10 idle children at any given time. The
parent process periodically checks a structure called the
scoreboard, which keeps track of all existing server processes and
their status.
• If the scoreboard lists fewer than the minimum number of idle
servers, the parent spawns more.
• If the scoreboard lists more than the maximum number of idle
servers, the parent kills off the extra children.
• When it receives a request, the parent process passes it along to
the next idle child on the scoreboard. The parent then goes back
to listening for the next request.
• The total number of requests that parent and children can serve
at one time is limited, by default to 256. The creators chose this
default to keep the scoreboard file small enough to be scanned by
the processes without causing overhead concerns.
• Since the number of requests that can be processed at any one
time is limited by the number of processes that can exist, a queue
is provided for waiting requests. Requests wait there while the
parent passes a request to an idle child and returns to receive
the next one. The maximum number of pending requests on the queue
can reach somewhere in the range of 400-600.
• Apache is designed to make the most of each connection. It uses
persistent connections so that multiple requests from a client can
be handled over one connection, rather than opening and closing a
connection for each request. The default maximum number of
requests allowed over one connection is 100. The connection is
closed by a timeout.
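The parent's periodic scoreboard check can be sketched as a single function. It uses the defaults quoted above (5 and 10 idle children, a limit of 256 processes); the function name and the list-of-strings scoreboard are assumptions for the example, not Apache's actual data structure.

```python
def spare_server_adjustment(scoreboard, min_spare=5, max_spare=10,
                            max_clients=256):
    """Return how many children to spawn (positive) or kill (negative).

    `scoreboard` holds one status string per child, "idle" or "busy".
    """
    idle = sum(1 for status in scoreboard if status == "idle")
    if idle < min_spare:
        room = max_clients - len(scoreboard)  # never exceed the process limit
        return max(0, min(min_spare - idle, room))
    if idle > max_spare:
        return max_spare - idle               # negative: surplus idle children
    return 0
```

For example, a scoreboard with three busy and two idle children yields 3 (spawn three more), while one with twelve idle children yields -2 (kill two).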
Apache HTTP Web Server Configuration
• On Linux, the Apache server runs as a daemon process, httpd.
• Configuration file: the configuration of the web server is kept in
/etc/httpd/conf/httpd.conf
• Directives: Apache configuration values are specified by variables
contained in configuration files. These variables are called
directives. Syntax: Directive Value, e.g. UserDir public_html
• Directives may be core or optional.
• Directives are associated with Apache modules.
• Core directives are enabled by default, and the corresponding
modules are compiled into the running version of the Apache server.
• If directives are specified but the server does not provide the
expected functionality, the required module may not be compiled
into the running version.
• httpd -l lists all compiled-in modules.
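The "Directive Value" form is simple enough to read with a few lines of code. The sketch below is a toy reader, not Apache's parser: it assumes one directive per line, treats lines starting with `#` as comments, and ignores container sections such as `<Directory>`.

```python
import shlex

def parse_directives(text):
    """Split simple 'Directive Value ...' lines into (name, [values]) pairs."""
    directives = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        name, *values = shlex.split(line)  # shlex keeps quoted values together
        directives.append((name, values))
    return directives
```

Given the text `UserDir public_html`, the function returns `[("UserDir", ["public_html"])]`.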

Note: Please see the user guides at
http://httpd.apache.org/docs/2.0/ and
http://httpd.apache.org/docs/1.3/mod/core.html
Scopes of Directives

• The scope of a directive can be limited in three ways.
• By <Directory>

<Directory /home/site1>
Directives
</Directory>
Scope of the directives is /home/site1 and its subdirectories, if any.

<DirectoryMatch Regex>
Directives
</DirectoryMatch>
Scope of the directives is all directories matching the regex, and
their subdirectories.

.htaccess
Special file containing directives. Scope is the directory that
contains the file. The file name can be changed with the
AccessFileName directive.
Scopes of Directives
• By URL
<Location sub-url>
Directives
</Location>

<LocationMatch Regex>
Directives
</LocationMatch>

Directory and Location are similar. The difference is that Directory
uses a physical directory of the file system, whereas Location uses
a component of a URL, which need not be a physical directory.
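The difference between the two scopes can be made concrete with two small predicates, one working on file-system paths and one on URL paths. This is a sketch of the matching idea only; the function names are invented, and Apache's real matching has more rules (wildcards, merging order) than shown here.

```python
from pathlib import PurePosixPath

def in_directory_scope(file_path, directory):
    """True if file_path lies in `directory` or one of its subdirectories."""
    path = PurePosixPath(file_path)
    base = PurePosixPath(directory)
    return path == base or base in path.parents

def in_location_scope(url_path, location):
    """True if url_path equals `location` or lies below it in URL space."""
    location = location.rstrip("/")
    return url_path == location or url_path.startswith(location + "/")
```

`in_directory_scope("/home/site1/docs/a.html", "/home/site1")` is true because the file physically lives under that directory, while `in_location_scope("/status/live", "/status")` is true purely because of the URL prefix, whether or not any such directory exists on disk.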
Scopes of Directives

• By file: the scope is one or more files
<Files filename>
Directives
</Files>

<FilesMatch Regex>
Directives
</FilesMatch>
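File scopes apply to the file name rather than to its directory. The sketch below imitates that with Python's `fnmatch` and `re` modules; this is an approximation, since Apache uses its own wildcard and regular-expression engines, and the function names are assumptions for the example.

```python
import fnmatch
import posixpath
import re

def files_scope_applies(pattern, path):
    """Approximate <Files pattern>: compare the base name, wildcards allowed."""
    return fnmatch.fnmatchcase(posixpath.basename(path), pattern)

def files_match_scope_applies(regex, path):
    """Approximate <FilesMatch regex>: search the base name with a regex."""
    return re.search(regex, posixpath.basename(path)) is not None
```

For instance, `files_match_scope_applies(r"\.(gif|jpe?g|png)$", "/var/www/html/img/logo.png")` is true regardless of which directory the image sits in.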
Modules
• Apache functionality is implemented as modules.
• httpd -l lists the modules already included in the Apache
executable.
• ClearModuleList clears the list of active modules.
• AddModule activates a module that is compiled in but not active:
AddModule mod_module.c
Dynamic Shared Objects
• DSOs make it possible to specify at runtime which portions of an
executable are to be included or excluded:
LoadModule perl_module libexec/libperl.so
Handlers
• Modules provide specific handlers to handle specific types of
requests/files.
• To invoke a particular handler:
SetHandler handler_name
• To associate a handler with a particular type of file:
AddHandler cgi-script .pl
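Conceptually, AddHandler builds a table from file extension to handler name. A minimal sketch of that lookup follows; the dictionary and function name are invented for the example, and the fallback name `default-handler` is assumed here as the handler for plain static files.

```python
import posixpath

# Mirrors the directive: AddHandler cgi-script .pl
HANDLERS_BY_EXTENSION = {".pl": "cgi-script"}

def handler_for(filename, default="default-handler"):
    """Pick the handler registered for the file's extension, if any."""
    _root, extension = posixpath.splitext(filename)
    return HANDLERS_BY_EXTENSION.get(extension, default)
```

With this table, `handler_for("/var/www/cgi-bin/report.pl")` selects `cgi-script`, while an `.html` file falls through to the default.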
Directives

ServerType standalone|inetd
• httpd either starts as a standalone background process or is
started by inetd as and when required.
Port 80 (default ports are listed in /etc/services)
Port 443 for SSL
User Apache
Group 506
ServerAdmin root@yoursite.org
• Email address to which users send mail in case of any problem.
BindAddress * | IP-Address
• The server listens on a specific IP address, or on all IP
addresses (*) if the local machine has multiple NICs.
Directives
ServerName www.example.com
• Specifies the alternative host name that is returned to the client
in place of the actual host name.
• Normally www. is prefixed to the actual host name.
• Must be known to the DNS.
DocumentRoot /var/www/html
• The website resides here.
UserDir public_html
• User-level individual websites reside here.
DirectoryIndex index.html index.htm default.htm welcome.htm
• Default home page; the names are tried in the order given.
AccessFileName .htaccess
Alias /icon/ /usr/local/etc/httpd/icons/
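How DocumentRoot, Alias and DirectoryIndex work together can be sketched as a URI-to-filename translation followed by an index lookup. The values are taken from the directives above; the function names and the `existing_files` set (standing in for the file system) are assumptions for the example.

```python
import posixpath

DOCUMENT_ROOT = "/var/www/html"
ALIASES = {"/icon/": "/usr/local/etc/httpd/icons/"}
DIRECTORY_INDEX = ["index.html", "index.htm", "default.htm", "welcome.htm"]

def translate(uri):
    """Map a URI to a file name: aliases first, then the document root."""
    for prefix, target in ALIASES.items():
        if uri.startswith(prefix):
            return target + uri[len(prefix):]
    return DOCUMENT_ROOT + uri

def pick_index(directory, existing_files):
    """Return the first DirectoryIndex candidate that exists, else None."""
    for name in DIRECTORY_INDEX:
        candidate = posixpath.join(directory, name)
        if candidate in existing_files:
            return candidate
    return None
```

Here `translate("/icon/logo.gif")` yields `/usr/local/etc/httpd/icons/logo.gif`, a file outside the document root, while `translate("/about.html")` yields `/var/www/html/about.html`.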
Directives for CGI
<Directory /var/www/cgi-bin>
AllowOverride None
Options +ExecCGI
Order allow,deny
Allow from all
</Directory>
ScriptAlias /cgi-bin/ /var/www/cgi-bin/
References
1. https://opensource.com/business/16/8/top-5-open-source-web-servers
