(Linux) Apache Web Server Administration
Apache_sw_1.3.14_9/10/01
Welcome
Course Objectives
Apache Web Server Administration will teach you:
After completing this course, you will be able to apply your Apache
administration knowledge to configure a fully functional and robust
Apache server and diagnose a variety of access and performance
problems.
Course Structure
This course is a three-day, lecture and lab intensive, fast track curriculum.
Lectures follow the structure of the class's text, with labs and question and
answer sessions woven in after each chapter.
Linux Fundamentals
For these courses, plus many more, please visit us on the Internet at
http://www.itsinc-us.com/.
Table of Contents
WELCOME
CHAPTER 1: INTRODUCTION
CHAPTER OVERVIEW
CHAPTER OBJECTIVES
OVERVIEW
APACHE'S STRENGTH WORLD-WIDE
APACHE'S OPERATING SYSTEMS
FEATURES
COMPARISON TO OTHER SERVERS
CHAPTER SUMMARY
CHAPTER OVERVIEW
CHAPTER OBJECTIVES
PLACING YOUR WEB SERVERS
UNTRUSTED USERS
OBTAINING APACHE
OBTAINING APACHE
COMPILING AND INSTALLING APACHE
COMPILING APACHE
APACHE BINARY INSTALLATION
EXECUTABLE AND CONFIGURATION FILE LOCATIONS
MODULES
STARTING AND TESTING APACHE
STARTING THE SERVER
TESTING THE SERVER
CHAPTER SUMMARY
CHAPTER OVERVIEW
CHAPTER OBJECTIVES
APACHE DIRECTIVES
SIMPLE DIRECTIVES
BLOCK DIRECTIVES
DIRECTORY LEVEL CONFIGURATION
SERVER CONFIGURATION
SELECTING A SERVER TYPE
CHOOSING THE HTTP PORT NUMBER
HOSTNAME LOOKUPS
CHAPTER INTRODUCTION
CHAPTER OBJECTIVES
CONTROLLING APACHE
APACHECTL
SYSTEM V SCRIPT
APACHE COMMAND-LINE PARAMETERS
WORKING WITH THE APACHE LOGS
THE ERROR LOG
THE ACCESS LOG
CHAPTER SUMMARY
CHAPTER OVERVIEW
CHAPTER OBJECTIVES
IP ADDRESS VIRTUAL HOSTS
HOW TO SET UP APACHE
SETTING UP MULTIPLE DAEMONS
SETTING UP A SINGLE DAEMON
NAME-BASED VIRTUAL HOSTS
DYNAMICALLY-NAMED VIRTUAL HOSTS
SETTING UP THE CONFIGURATION FILE
SIMPLE DYNAMIC VIRTUAL HOSTS
COMBINING VIRTUAL HOSTING METHODS
MORE EFFICIENT IP ADDRESS-BASED VIRTUAL HOSTING
SYSTEM LIMITATIONS
FILE DESCRIPTOR LIMITS
IP ADDRESS LIMITS
CHAPTER SUMMARY
CHAPTER OVERVIEW
CHAPTER OBJECTIVES
CONDITIONAL DIRECTIVES
TESTING FOR CONDITIONS
TESTING FOR MODULES
MODIFYING THE ENVIRONMENT
BROWSER MATCHING
PASSING THE ENVIRONMENT ON
APACHE HANDLERS
HANDLERS
ASSOCIATING WITH FILES
CREATING HANDLERS
REDIRECTING CONTENT
SIMPLE ALIASES
PATTERN ALIASES
REDIRECTS
FANCY INDEXING
ASSOCIATING ICONS WITH FILES
ASSOCIATING DESCRIPTIONS WITH FILES
SPECIAL DIRECTORY FILES
EXCLUDING FILES
DELIVERING BROWSER-SENSITIVE CONTENT
ENCODING
LANGUAGE
MEDIA TYPE
CHAPTER SUMMARY
CHAPTER OVERVIEW
CHAPTER OBJECTIVES
APACHE'S SECURITY AND PERFORMANCE GOALS
HARDWARE AND PLATFORM CONSIDERATIONS
PERFORMANCE TUNING
RUN-TIME TUNING
SECURITY
RESTRICTING ACCESS
SETTING ACCESS OPTIONS
ENABLING ACCESS TO LOCAL DOCUMENTS
SERVERROOT DIRECTORY PERMISSIONS
SAFE CGI
CHAPTER SUMMARY
CHAPTER OVERVIEW
CHAPTER OBJECTIVES
THE URL REWRITING ENGINE
REWRITING FUNDAMENTALS
COMMON REWRITING NEEDS
TRAILING SLASHES
USERS ON ANOTHER SERVER
REDIRECT INVALID URLS
TIME IS IMPORTANT
FAKING STATIC PAGES
CHAPTER SUMMARY
APPENDICES
LAB 1: INTRODUCTION
PART A (5 MINUTES)
LAB 2: APACHE INSTALLATION
PART A (10 MINUTES)
PART B (30-45 MINUTES)
LAB 3: APACHE CONFIGURATION
PART A (5 MINUTES)
PART B (40 MINUTES)
LAB 4: EFFECTIVELY WORKING WITH APACHE
PART A (5 MINUTES)
PART B (15 MINUTES)
PART C (30 MINUTES)
LAB 5: VIRTUAL HOSTS
PART A (10 MINUTES)
PART B (45 MINUTES)
PART C (15 MINUTES)
LAB 6: ADVANCED CONFIGURATION
PART A (5 MINUTES)
PART B (15 MINUTES)
PART C (15 MINUTES)
LAB 7: PERFORMANCE AND SECURITY
PART A (5 MINUTES)
PART B (45 MINUTES)
PART C (30 MINUTES)
LAB 8: URL REWRITING AND CUMULATIVE LAB
PART A (5 MINUTES)
PART B (90 MINUTES)
CHALLENGE 1 (90 MINUTES)
REFERENCES
Chapter 1: Introduction
Chapter Overview
Before using Apache, it is sensible to review the features it offers and how
it compares to other servers. In this chapter, you'll see the benefits Apache
gives administrators, and you'll see how Apache compares to other web
servers.
Chapter Objectives
After completing this chapter, you will be able to:
Overview
The Apache web server began with a simple goal: to provide an open-source
Web server for Linux and other open-source operating systems. Originally
developed by the Apache Group, the Apache web server met that goal.
Today, Apache has grown far beyond its original scope. Currently funded
by the Apache Software Foundation (http://www.apache.org/),
the Apache web server is just one piece of a larger suite of many
Internet-oriented, open-source projects.
HP-UX
AIX
IRIX
Digital UNIX
Netware 5.x
OS/2
Macintosh
BeOS
SCO
Features
There are numerous reasons to use Apache. Apache is:
powerful as it implements:
o DBM databases for authentication.
o customized error messages.
o different directory index views.
o unlimited and flexible URL rewriting and aliasing.
o content negotiation.
o virtual hosts.
o reliable logging.
CGI execution.
configuration capability.
security.
Chapter Summary
Apache is a widely used, stable, and robust Web server. After five years
of development, Apache has evolved a rich set of configuration and
performance features that make it a top choice for high-volume web sites
around the world.
Apache excels in CGI script execution and security, but lacks some
performance because of its process-oriented model. Because volunteer
developers worldwide care about Apache's success on a daily basis, these
performance barriers are rapidly being removed in favor of better models.
Chapter 2: Apache Installation
Chapter Overview
Installing Apache can be very simple or extremely complex. The range of
configuration possibilities that Apache offers is staggering, but the default
Apache installation is sufficient for many sites. This chapter will illustrate
the installation procedure and point out many of the configuration
parameters you can use to change the standard behavior.
Chapter Objectives
After completing this chapter, you will be able to:
Untrusted users
When you serve pages to untrusted users, you'll need to take
several precautions to prevent unauthorized access to your server.
The general architecture for sites with untrusted users is:
Obtaining Apache
Obtaining Apache
You can download Apache from the World Wide Web, or you can find it
on your Linux operating system CD. For Red Hat Linux users, Apache is
automatically installed with the "server" install, but you can add it
manually by selecting the "Web Server" option during a custom install.
Apache's web site, http://httpd.apache.org/, holds the latest
version of the Apache web server. This site provides the current release,
more recent beta-test releases (if available), and anonymous ftp sites.
Compiling Apache
The Apache web site distributes the Apache source code in a compressed
"tarball" format. After unpacking the archive, you must configure and
build the software for your system. The example below shows the
recommended procedure; it requires no intervention because the server
software is highly portable:
$
$
$
$
$
The distribution will put the binary (httpd) and the standard
configuration files in your system-specific directories.
Description
/home/httpd
/home/httpd/html
/home/httpd/cgi-bin
/home/httpd/html/manual
Configuration files
Directory
Description
.htaccess    Directory-based configuration files. A .htaccess file holds
             directives to control access to files within the directory in
             which it is located
/etc/httpd/conf
/etc/httpd/conf/httpd.conf
Application files
Directory
Description
/usr/sbin
/usr/doc
/var/log/http
Modules
You can have particular "modules," which are simply extensions to
Apache's base code, dynamically linked at run-time. These modules have
already been compiled, but they're not actually part of the Apache
executable. Instead, you must explicitly load them into a running server
with the LoadModule directive, as shown below:
LoadModule mod_name modules/mod_name.so
The listing below (httpd.conf) shows the default modules that will be
loaded. Lines starting with a "#" are comments and are ignored:
# LoadModule foo_module modules/mod_foo.so
#LoadModule mmap_static_module modules/mod_mmap_static.so
LoadModule vhost_alias_module modules/mod_vhost_alias.so
LoadModule env_module modules/mod_env.so
LoadModule config_log_module modules/mod_log_config.so
LoadModule agent_log_module modules/mod_log_agent.so
LoadModule referer_log_module modules/mod_log_referer.so
#LoadModule mime_magic_module modules/mod_mime_magic.so
LoadModule mime_module modules/mod_mime.so
LoadModule negotiation_module modules/mod_negotiation.so
LoadModule status_module modules/mod_status.so
LoadModule info_module modules/mod_info.so
LoadModule includes_module modules/mod_include.so
LoadModule autoindex_module modules/mod_autoindex.so
LoadModule dir_module modules/mod_dir.so
LoadModule cgi_module modules/mod_cgi.so
LoadModule asis_module modules/mod_asis.so
LoadModule imap_module modules/mod_imap.so
LoadModule action_module modules/mod_actions.so
#LoadModule speling_module modules/mod_speling.so
LoadModule userdir_module modules/mod_userdir.so
LoadModule alias_module modules/mod_alias.so
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule access_module modules/mod_access.so
LoadModule auth_module modules/mod_auth.so
LoadModule anon_auth_module modules/mod_auth_anon.so
LoadModule db_auth_module modules/mod_auth_db.so
LoadModule digest_module modules/mod_digest.so
LoadModule proxy_module modules/libproxy.so
#LoadModule cern_meta_module modules/mod_cern_meta.so
LoadModule expires_module modules/mod_expires.so
LoadModule headers_module modules/mod_headers.so
LoadModule usertrack_module modules/mod_usertrack.so
#LoadModule example_module modules/mod_example.so
#LoadModule unique_id_module modules/mod_unique_id.so
LoadModule setenvif_module modules/mod_setenvif.so
#LoadModule bandwidth_module modules/mod_bandwidth.so
#LoadModule put_module modules/mod_put.so
# Extra Modules
#LoadModule perl_module modules/libperl.so
#LoadModule php_module modules/mod_php.so
#LoadModule php3_module modules/libphp3.so
The server can have modules compiled in but not in use. To actually use
these modules, specify them with the AddModule directive. The
defaults, shown below, are acceptable for many sites.
#AddModule mod_mmap_static.c
AddModule mod_vhost_alias.c
AddModule mod_env.c
AddModule mod_log_config.c
AddModule mod_log_agent.c
AddModule mod_log_referer.c
#AddModule mod_mime_magic.c
AddModule mod_mime.c
AddModule mod_negotiation.c
AddModule mod_status.c
AddModule mod_info.c
AddModule mod_include.c
AddModule mod_autoindex.c
AddModule mod_dir.c
AddModule mod_cgi.c
AddModule mod_asis.c
AddModule mod_imap.c
AddModule mod_actions.c
#AddModule mod_speling.c
AddModule mod_userdir.c
AddModule mod_alias.c
AddModule mod_rewrite.c
AddModule mod_access.c
AddModule mod_auth.c
AddModule mod_auth_anon.c
AddModule mod_auth_db.c
AddModule mod_digest.c
AddModule mod_proxy.c
#AddModule mod_cern_meta.c
AddModule mod_expires.c
AddModule mod_headers.c
AddModule mod_usertrack.c
#AddModule mod_example.c
#AddModule mod_unique_id.c
AddModule mod_so.c
AddModule mod_setenvif.c
#AddModule mod_bandwidth.c
#AddModule mod_put.c
# Extra Modules
#AddModule mod_perl.c
#AddModule mod_php.c
#AddModule mod_php3.c
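As a hedged illustration of how the two lists work together, the fragment
below enables mod_speling, which is commented out in both default listings.
It assumes the module was built as modules/mod_speling.so; the
CheckSpelling setting is an illustrative choice, not something this
course's configuration specifies.

```apache
# Load the shared object (assumed to exist at modules/mod_speling.so)
LoadModule speling_module modules/mod_speling.so

# Activate the module; the argument is the module's source file name
AddModule mod_speling.c

# Apply the module's directive only if the module is really active
<IfModule mod_speling.c>
    CheckSpelling On
</IfModule>
```

In practice you would uncomment the matching lines in each list and leave
their position unchanged, because the order of the AddModule lines affects
the order in which modules process a request.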
Standard modules
The table below describes each of the standard modules:
Module
Description
http_core
mod_access
mod_actions
mod_alias
mod_asis
mod_auth
mod_auth_anon
mod_auth_db
mod_auth_dbm
mod_autoindex
mod_cern_meta
mod_cgi
mod_digest
mod_dir
mod_env
mod_example
mod_expires
mod_headers
mod_imap     Controls inline image map files, which have an x-httpd-imap
             MIME type or are parsed by the imap handler
mod_include
mod_info
mod_log_agent
mod_log_config
mod_log_referer
mod_mime
mod_mime_magic
mod_negotiation
mod_proxy
mod_rewrite
mod_setenvif
mod_so
mod_speling
mod_status
mod_userdir
mod_usertrack
mod_unique_id
BSD style
With other distributions, such as Slackware, you'll need to manually add
the Apache server to the system start-up scripts. For example, if you
installed the server as /usr/sbin/httpd, you'd put the following
at the bottom of /etc/rc.d/rc.local:
# /etc/rc.d/rc.local
/usr/sbin/httpd &
Chapter Summary
In this chapter, you learned how to obtain, compile, install, start, and test
the Apache distribution. These steps only get the standard server running;
additional configuration is possible through the run-time extensions
provided by modules. The LoadModule and AddModule directives,
held in Apache's configuration file httpd.conf, allow you to alter the
run-time capabilities of the Apache server easily.
Chapter 3: Apache Configuration
Chapter Overview
In this chapter, you will see a large collection of Apache's more popular
configuration parameters, and how they affect the operation of an
Apache-served web site. Understanding these parameters will allow you to
tune your Apache configuration to your site's specific requirements.
Chapter Objectives
After completing this chapter, you will be able to:
Apache Directives
The Apache configuration file, httpd.conf, is composed of directives
that hold the Apache configuration operations. Directives allow you to
enter basic configuration information, such as your server name, or
perform more complex operations, such as implementing virtual hosts.
Directive names are not case sensitive, but many of their options are, so
it is best to always use the exact format given to reduce syntax errors.
A "#" at the beginning of a line denotes a comment, and you may continue a
directive on the next line by ending the line with a "\".
Simple directives
Simple directives have global scope in Apache's httpd.conf file and
take the form of the directive name followed by options. The syntax for a
simple directive is:
Directive Option Option . . .
For example, to set the server administrator's email address, you would
set the ServerAdmin directive as shown below:
ServerAdmin webmaster@company.com
Block directives
Block directives hold configuration parameters that apply to specific
components. Block directives are entered in pairs: a beginning directive
and a terminating directive.
The beginning block directive takes an argument that specifies the
particular component to which the directives apply, and the terminating
directive consists of a slash and the directive name, designating the
block's end. This structure, which closely resembles HTML containers, has
the following syntax:
<BlockDirective Argument . . .>
Directive Option . .
Directive Option . .
</BlockDirective>
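For instance, a block directive that scopes a few access settings to one
directory tree might look like the sketch below. The path and the option
values are illustrative assumptions (the path follows the Red Hat layout
mentioned earlier), not settings this course requires.

```apache
# Everything between the opening and closing lines applies only to
# /home/httpd/html and its subdirectories
<Directory /home/httpd/html>
    # Allow generated indexes and symbolic links
    Options Indexes FollowSymLinks
    # Ignore .htaccess files in this tree
    AllowOverride None
    # Evaluate Allow rules before Deny rules, then permit every client
    Order allow,deny
    Allow from all
</Directory>
```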
Description
<Directory myDir>
<VirtualHost hostaddress>
<Files file(s)>
TIP:
You can change the directory access control filename from
.htaccess with the AccessFileName directive. For example,
AccessFileName .access sets the filename to .access.
Server Configuration
The httpd.conf file holds most of Apache's configuration, and for a
typical Apache installation, many of the directives' defaults can be left
as-is.
Older versions of Apache separated configuration into three files:
access.conf, httpd.conf, and srm.conf. Apache no longer
recommends this separation; instead, it recommends keeping all
configuration information within httpd.conf.
You can use any port number up to 65535, as long as no other server is
using that port. The /etc/services file lists the ports normally associated
with particular servers, and you should check this file before choosing
a new port.
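A minimal sketch of selecting a non-default port follows. The value 8080 is
only an example; on Unix-like systems, ports below 1024 can be bound only
when the server is started as root.

```apache
# Accept HTTP requests on port 8080 instead of the default, 80
Port 8080
```

Browsers must then include the port in the URL, for example
http://www.company.com:8080/.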
Hostname lookups
The HostnameLookups directive allows you to log clients by either IP
address or hostname. If you enable this directive, every incoming
connection will generate a DNS lookup to translate the IP address into the
corresponding hostname. For example, 204.62.129.132 will be
changed into www.apache.org before writing information into the log
files.
Enabling this feature noticeably slows the server's response time, so
unless certain analysis or statistical programs require hostnames and you
have no other way to resolve them, you should leave it set to the
default of Off:
HostnameLookups Off
# Set to On to enable
Should you decide to modify this directive, you must specify the parent
directory that holds the configuration, log, and module files. Within this
parent directory, there should be a directory named conf that holds
configuration information, logs that holds log information, and
modules that holds module files. On most systems, the logs and
modules directories don't reside in the parent directory; instead, they're
symbolic links to other directories in the filesystem.
Apache looks for these files when a browser requests a directory and not a
specific file. The first file found in the directory that matches an entry in
the DirectoryIndex list is used. If none of the files exists and the
Indexes option is in effect for the directory, Apache generates a
directory file index; otherwise, an error message is shown.
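The lookup described above could be configured as in the following sketch.
The file names and the directory path are assumptions chosen for
illustration.

```apache
# Try these files, in order, when a URL names a directory
DirectoryIndex index.html index.htm index.cgi

# If none of them exists, generate a listing for this tree only
<Directory /home/httpd/html/pub>
    Options +Indexes
</Directory>
```

With this fragment, a request for a directory under /home/httpd/html/pub
that contains none of the three files would receive a generated index.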
USE_FCNTL_SERIALIZED_ACCEPT
USE_FLOCK_SERIALIZED_ACCEPT
Normally, the configure script doesn't set these compilation flags for
Linux. Unless you manually forced these compilation flags for your
Apache server, you can ignore this directive. If you compiled with these
flags, then the default directory is safe to leave unmodified.
LockFile /var/lock/httpd.lock
TIP:
The lock file must reside on a local disk;
it can't be on a remote (e.g., NFS) filesystem.
Defining hostnames
Apache can send browsers a different hostname than the one they
requested.
Returning a different hostname
The ServerName directive specifies the hostname to return to all
browsers. You cannot just invent host names; you must have a valid DNS
name. In the case where your server doesn't have a registered DNS name,
you should set the ServerName directive to your server's IP address.
ServerName localhost
Canonical hostnames
The UseCanonicalName directive (shown below) allows your server
to enforce name consistency. When set to On, Apache will always use the
ServerName and Port directives to create an explicit URL that uniquely
refers back to your server. This name, known as the canonical name,
enforces consistent naming, which might be important for CGI scripts
that validate by hostname.
UseCanonicalName On
Cache configuration
By default, Apache sends a Pragma: no-cache header with each
content-negotiated document. This header asks proxy servers to not cache
the document, so that future requests to the document will force content
renegotiation.
Un-commenting the CacheNegotiatedDocs directive line disables
this behavior, which will allow proxies to cache documents:
#CacheNegotiatedDocs # uncomment to enable
At startup, and when operating in standalone mode, Apache will start one
master server, then start more servers as given by the StartServers
directive. Again, for average sites, the default is reasonable:
StartServers 8
Using the values specified above, when the daemon is started, the server
processes will run, waiting for connections. As more requests arrive,
Apache will ensure that at least 5 servers are ready to accept connections.
When requests have been fulfilled and no new connections arrive, Apache
will begin killing processes until the number of idle Web server processes
is no more than 20.
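The behavior described here corresponds to a process-pool configuration
along the lines of the sketch below. The values 5 and 20 are inferred from
the description in this section, and 8 is the StartServers value shown
earlier; treat all three as examples to tune, not as recommendations.

```apache
# Server processes created at startup
StartServers 8
# Keep at least this many idle processes waiting for connections
MinSpareServers 5
# Kill idle processes beyond this number
MaxSpareServers 20
```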
Safety nets
Apache can limit the total number of simultaneous server processes with
the MaxClients directive. The MaxClients value should be
sufficiently high for your site's normal load. The default of 150 is
large enough for most sites:
MaxClients 150
If you have disabled all users, you can use the enabled keyword
followed by a space-delimited username list to allow these users access.
These usernames will have directory translation performed even if a global
disable is in effect, but not if they also appear in a disabled clause.
The following directives disable all users except "john"; "mike" is
enabled on the second line but disabled again on the third:
UserDir disabled
UserDir enabled john mike
UserDir disabled mike
Directory specification
If neither the enabled nor the disabled keyword appears in the
UserDir directive, the argument is treated as a filename pattern. This
filename specifies the directory within a user's home directory to find web
content.
There are two ways that the UserDir directive can handle incoming
requests that include a tilde expansion:
1. Identify the physical pathname of the individual users' publicly
accessible directories.
2. Specify a URL to which the request is redirected.
Example
Suppose a browser requests the URL:
http://www.company.com/~john/
The UserDir directive affects how this URL is expanded, as shown in
the following table. The table assumes that user directories exist under
/home in the local filesystem.

Directive                              Location
UserDir www                            /home/john/www/
UserDir /usr/web                       /usr/web/john/
UserDir /home/*/www                    /home/john/www/
UserDir http://www.home.com/           http://www.home.com/john/
UserDir http://www.home.com/users/     http://www.home.com/users/john/
UserDir http://www.home.com/~*/        http://www.home.com/~john/
CGI Programs
Common Gateway Interface (CGI) files are programs that browsers can
request the server to execute.
CGI by directory
Traditionally, these files were placed in the cgi-bin directory and could
only be executed if they resided in that special directory. Typically, a
Web site will only have one CGI directory.
Red Hat Linux sets the CGI directory, by default, to
/home/httpd/cgi-bin. You can set the ScriptAlias directive to
alter this default, as shown below:
ScriptAlias /cgi-bin/ /home/httpd/cgi-bin/
CGI by file
It is also possible to configure Apache to consider any files ending in a
particular extension as CGI programs. The AddHandler directive
allows you to map a filename extension to some behavior within Apache.
For example, the directive below maps all files that end in .cgi as CGI
programs:
AddHandler cgi-script .cgi
Chapter Summary
Configuring Apache to meet your site's specific requirements is a critical
piece of a high-quality web site. In addition to understanding the syntax
of the Apache configuration file, httpd.conf, you'll need to understand
how the directives affect Apache's behavior. Of key importance to many
administrators are Apache's performance and security features, and to
adequately address these issues, an administrator must understand the
directives available in the Apache configuration file.
Chapter 4: Effectively Working with Apache
Chapter Introduction
When you installed Apache, you configured it to start at system boot.
Though this is the usual way of starting Apache, you might encounter
situations where you need to restart or even stop Apache. At other times,
you might need to start Apache with a different set of start-up flags. Once
you've started Apache, you'll need to routinely monitor the error and
access logs for odd behavior.
This chapter will explain the various ways to start Apache, the meanings
of Apache's command-line flags, and how to examine the Apache logs.
Chapter Objectives
After completing this chapter, you'll be able to:
Controlling Apache
Normally, you'll configure Apache to start at system boot and run until the
system is shut down. However, if you are testing or modifying Apache's
configuration, you will probably want to stop, start, or restart Apache
without rebooting the system.
There are a couple of ways to control Apache: the command-line
approach using the apachectl command, or the System V script.
apachectl
Apache (version 1.3 and later) comes with a command to control the Apache
server. In the source distribution, this file is found in
src/support/apachectl, but binary distributions will install the file
in /usr/sbin/apachectl.
Configuring apachectl
At the top of the apachectl script is a configuration section, shown
below:
# the path to your PID file
PIDFILE=/usr/local/apache/logs/httpd.pid
#
# the path to your httpd binary, including
# options if necessary
HTTPD='/usr/local/apache/src/httpd'
#
# a command that outputs a formatted text
# version of the HTML at the url given on the
# command-line. Designed for lynx, however
# other programs may work.
LYNX="lynx -dump"
#
# the URL to your server's mod_status status
# page. If you do not have one, then status
# and fullstatus will not work.
STATUSURL="http://localhost/server-status"
If you built Apache from the source code and modified the default Apache
installation directories, you'll need to update this configuration section to
reflect your changes.
Using apachectl
The apachectl script accepts one of several parameters that control
Apache's behavior. The table below summarizes the parameters:
Parameter
Description
start
stop
restart
graceful
status
fullstatus
configtest
System V script
Some systems, such as Red Hat Linux, provide an Apache System V-like
control script at /etc/rc.d/init.d/httpd. This script is similar to
the Apache control script, though not as configurable.
The following table describes the five parameters that the
/etc/rc.d/init.d/httpd script accepts:
Parameter
Description
start        Start the Apache server. The Red Hat Linux version turns
             off core dumps, which will prevent you from performing
             adequate debugging should Apache have a major startup
             problem
stop
restart
reload
status
Description
-c directive
-C directive
-d directory
-D parameter
-f file
-h
-l
-L
-S
-t
The first field, held within brackets ([]), is the date and time of
the error, as reported by the system clock. The second field, also
within brackets, shows the severity of the error. The remainder is error
specific, but usually provides clues as to the error's nature.
Example error
Administrators will often see the following error:
[Fri Jun 16 09:54:37 2000] [error] [client 192.168.0.1] File does not exist: /home/httpd/htdocs/favicon.ico
Formats
The access log, and in fact all logs within Apache, are governed by a
format. The format specifies what each entry in the log file should look
like. For example, the format might state whether the log entry should
contain the timestamp, and if so, where it should be placed relative to
the other information.
When you configure Apache, you can specify a different log format with
the LogFormat directive. The LogFormat directive has the following
syntax:
LogFormat format handle
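As a concrete instance of this syntax, the line below defines the Common
Log Format under the handle common, essentially as it appears in stock
Apache 1.3 configurations. Note that %>s, which logs the status of the
final request after any internal redirect, is a variant of the plain %s
directive.

```apache
# host, ident, user, time, request line, final status, bytes sent
LogFormat "%h %l %u %t \"%r\" %>s %b" common
```

The inner double quotes around %r are escaped with backslashes because the
whole format string is itself quoted.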
Description
%b
%f
%{Var}e
%h
%{Head}i
%l
%{Head}o
%p
%P
%r
%s
%t
%T
%u
%U
%v
Multiple logs
The CustomLog directive allows you to bind a log filename with a format
that applies to the log file. For example, CustomLog
logs/standard_log common would log information to
standard_log using the "common" format seen above.
Whenever log information is available, Apache scans the custom log
formats. If any of the formats contain the information that's available,
those entries are written immediately. If some of the information is
available, but not all, Apache writes as much as possible, filling in the
non-available fields with a hyphen (-).
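A sketch of several logs written side by side is shown below. The handles
and file names mirror those commonly found in Apache 1.3 configurations but
are assumptions here; relative paths are resolved against the ServerRoot.

```apache
# Define three formats, each with its own handle
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

# Bind each log file to one of the formats
CustomLog logs/access_log common
CustomLog logs/referer_log referer
CustomLog logs/agent_log agent
```

Each request then produces one entry in every file, formatted according to
the handle named on that file's CustomLog line.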
Chapter Summary
Occasionally, you'll need to stop or restart the Apache server, perhaps
for diagnostic purposes or configuration changes. Rather than rebooting
your entire system to restart Apache, you can use the Apache-supplied
apachectl script or a script provided by your operating system. These
scripts make it easy for you to control and retrieve status information on
your Apache server.
Commonly, though, you'll look through Apache's logs. Monitoring
security and access statistics is vital for a healthy server, so
understanding the Apache log files is a necessary administrative duty.
Apache allows you to specify a custom log format with the CustomLog
and LogFormat directives. Setting these allows you to fine-tune your
logs to meet your precise requirements.
Chapter 5: Virtual Hosts
Chapter Overview
Virtual hosting refers to maintaining more than one server on a machine,
differentiated by host name or IP address. For example, companies
sharing a web server want to have their own domains and allow web
server accessibility by www.company1.com and www.company2.com,
without requiring any extra path information from the user. Apache
supports several types of virtual hosting: IP address-based, name-based,
and dynamically-named.
Chapter Objectives
After completing this chapter, you will be able to:
At system boot, start one http server using the configuration file for
company1 and another using the configuration file for company2,
and you've achieved IP address virtual hosting.
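For illustration, the two per-company configuration files might differ as
in the sketch below. The file names, addresses, and the PidFile and
LockFile paths are hypothetical; the important points are that each daemon
listens on its own address and keeps its own run-time files. Each daemon
would then be started with the -f flag naming its file.

```apache
# /etc/httpd/conf/company1.conf (hypothetical file name)
Listen 192.168.0.1:80
ServerName www.company1.com
DocumentRoot /home/httpd/htdocs/company1/
PidFile /var/run/httpd-company1.pid
LockFile /var/lock/httpd-company1.lock

# /etc/httpd/conf/company2.conf (hypothetical file name)
Listen 192.168.0.2:80
ServerName www.company2.com
DocumentRoot /home/httpd/htdocs/company2/
PidFile /var/run/httpd-company2.pid
LockFile /var/lock/httpd-company2.lock
```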
You can set these up on a single Apache server using IP address-based
virtual hosts:
<VirtualHost 192.168.0.1>
ServerName www.company1.com
User www
Group company1
DocumentRoot /home/httpd/htdocs/company1/
ErrorLog company1/logs/error_log
CustomLog company1/logs/access_log common
</VirtualHost>
<VirtualHost 192.168.0.2>
ServerName www.company2.com
User www
Group company2
DocumentRoot /home/httpd/htdocs/company2/
ErrorLog company2/logs/error_log
CustomLog company2/logs/access_log common
</VirtualHost>
TIP:
Though you could specify the DNS name instead of the IP address in
the VirtualHost block, doing so isn't recommended. Apache has
to perform a DNS lookup before allowing access, which slows down
response time.
TIP:
Apache looks up the server to access from the HTTP headers. If this
information isn't available (such as with very old browsers), Apache
will use the first defined virtual host.
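A name-based configuration consistent with this tip could look like the
sketch below. The address and host names reuse values from the earlier
examples and are assumptions; because www.company1.com is defined first, it
would also serve requests that arrive without a Host header.

```apache
# Tell Apache that this address is shared by name-based virtual hosts
NameVirtualHost 192.168.0.1

<VirtualHost 192.168.0.1>
    ServerName www.company1.com
    DocumentRoot /home/httpd/htdocs/company1/
</VirtualHost>

<VirtualHost 192.168.0.1>
    ServerName www.company2.com
    DocumentRoot /home/httpd/htdocs/company2/
</VirtualHost>
```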
System Limitations
File Descriptor Limits
When using a large number of virtual hosts, Apache may run out of
available file descriptors if each VirtualHost block specifies different
log files. The total number of file descriptors used by Apache is one for
each distinct error log file, one for every other log file directive, plus 10 or
20 for internal use.
Most multi-tasking, multi-user operating systems, including Linux, limit
the number of file descriptors that a process may use. The limit is
typically 64, and usually may be increased up to a large hard limit.
Although Apache attempts to increase the limit as required, this may not
work if:
1. Your system does not provide the setrlimit() system call.
2. The setrlimit(RLIMIT_NOFILE) call does not function on
your system.
3. The number of file descriptors required exceeds the hard limit.
4. Your system imposes other file descriptor limits, such as a limit on
stdio streams only using file descriptors below 256.
In the event of problems you can:
Reduce the number of log files by specifying log files only server-wide,
not in the VirtualHost blocks.
Increase the file descriptor limit (if your system falls under 1 or 2
above) before starting Apache, using a script like:
#!/bin/sh
ulimit -S -n 100
exec /usr/sbin/httpd
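The first option, logging only server-wide, can be sketched as follows. This assumes you still want to tell the virtual hosts apart in the shared log; the format nickname vcommon is a made-up name, and %v is the log format code for the name of the server handling the request:

```apache
# one access log and one error log for every virtual host:
# two file descriptors in total instead of two per host
LogFormat "%v %h %l %u %t \"%r\" %>s %b" vcommon
CustomLog logs/access_log vcommon
ErrorLog logs/error_log
```

With the host name as the first field, the shared log can be split per site afterwards if needed.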
IP address limits
If your system has only one IP address, then implementing virtual hosts
prevents access to your main server using that address. You can no longer
use your main server as a Web server directly, only indirectly to manage
your virtual hosts.
You could configure a virtual host to manage your main server's Web
pages. Then you could use your main server to support virtual hosts that
function as Web sites, rather than the main server operating as one site
directly.
If your machine has two or more IP addresses, one can be used for the
main server and the other for the virtual hosts. Mixing IP-based and
name-based virtual hosts is also allowed, and so is using separate IP
addresses to support different sets of virtual hosts.
Several domain names can access the same virtual host if you place a
ServerAlias directive listing the domain names within the selected
VirtualHost block:
ServerAlias www.company1.com www.alias.com
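For context, a sketch of where the directive sits, assuming a name-based virtual host on 192.168.0.1 (ServerAlias only matters for name-based hosts). The extra aliases shown here are illustrative:

```apache
NameVirtualHost 192.168.0.1

<VirtualHost 192.168.0.1>
ServerName www.company1.com
# wildcards are allowed, so this also covers any other *.company1.com name
ServerAlias company1.com *.company1.com www.alias.com
DocumentRoot /home/httpd/htdocs/company1/
</VirtualHost>
```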
Chapter Summary
Virtual hosting provides a method for maintaining more than one server
on a computer by differentiating between servers by host name. The
virtual hosting method you choose depends on your system's and users'
needs. With several IP addresses, virtual hosting by IP address is efficient
and sensible.
With a single IP address, however, it makes sense to use name-based
virtual hosting. Finally, if you have a large number of hosts or would like
to reap additional performance benefits, dynamically named virtual
hosts are the best solution.
Chapter 6:
Advanced Configuration
Chapter Overview
Apache supports an extensive set of configuration directives. We have
previously only touched on the major ones. In this chapter, you'll see
that Apache can have conditional configuration, attach handlers to
particular types of files, and change how it renders information.
Chapter Objectives
After completing this chapter, you will be able to:
redirect content.
Conditional Directives
Apache provides two block directives, IfDefine and IfModule, that
allow you to alter Apache's configuration conditionally. These directives
let you section off configuration that should only be included when special
conditions exist.
TIP:
Parameter names are case-sensitive.
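A minimal sketch of both directives, assuming a parameter named DEBUG (any name works) that is passed on the command line with httpd -D DEBUG, and assuming mod_status as the module being tested:

```apache
# applied only when the server is started as: httpd -D DEBUG
<IfDefine DEBUG>
LogLevel debug
</IfDefine>

# applied only when mod_status is compiled in or loaded;
# the test uses the module's source file name
<IfModule mod_status.c>
ExtendedStatus On
</IfModule>
```

Starting the server without -D DEBUG skips the first block entirely, so one configuration file can serve both normal and debugging runs.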
Reversing the condition
If you want to include configuration when a conditional is not defined,
you can still use IfDefine. Simply prefix the parameter name with an
exclamation mark, as shown below:
# include proxying only when not debugging the
# server
<IfDefine !DEBUG>
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule proxy_module modules/libproxy.so
</IfDefine>
TIP:
You can nest IfDefine directives for simple multi-parameter tests.
TIP:
SetEnvIf is case-sensitive; SetEnvIfNoCase is not.
Browser matching
A special case of the SetEnvIf directive is the BrowserMatch (and
BrowserMatchNoCase) directive. This directive only checks the
browser's type, so you can use this as a quick way to set environment
variables describing the client's browser:
# unset the javascript variable if the client is
# Internet Explorer (IE uses jscript)
BrowserMatch MSIE !javascript
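To make the "special case" relationship concrete, here is a sketch in which the variable names is_msie and text_only are made up for illustration. The first two lines are meant to be equivalent, since BrowserMatch tests the User-Agent request header:

```apache
# these two lines set the same variable under the same condition
BrowserMatch MSIE is_msie
SetEnvIf User-Agent MSIE is_msie

# case-insensitive variant: matches "lynx", "Lynx" and "LYNX"
BrowserMatchNoCase lynx text_only
```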
Apache Handlers
Browsers instruct Apache to load files via URLs. Most often, these are
HTML files that Apache simply sends back to the browser.
Sometimes, however, a file needs more processing than that.
For example, Apache needs to execute CGI scripts and send the results
back to the browser; sending the CGI script itself could cause a security
compromise.
Handlers
Many handlers are compiled into Apache or are available in a module.
The table below lists the handlers available either by Apache directly or
through a module:
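Handler          Description                                     Module
default-handler  Send the file as static content                 core
send-as-is       Send the file as-is, with its own HTTP headers  mod_asis
cgi-script       Run the file as a CGI script                    mod_cgi
imap-file        Process the file as an imagemap rule file       mod_imap
server-info      Report the server's configuration               mod_info
server-parsed    Parse the file for server-side includes         mod_include
server-status    Report the server's status                      mod_status
type-map         Process the file as a negotiation type map      mod_negotiation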
By file extension
You can add a handler based on a file's extension with the AddHandler
directive:
AddHandler cgi-script .cgi
TIP:
You can specify more than one extension and you do not need the
leading dot with the AddHandler directive. For example,
AddHandler cgi-script .cgi pl causes Apache to treat all
files ending in ".cgi" and ".pl" as CGI scripts.
By file location
You can instruct Apache to use the same handler for all files in a certain
location with the SetHandler directive:
# all files in the users' cgi-bin directories
# are treated as CGI files
<Directory /home/*/public_html/cgi-bin/>
SetHandler cgi-script
</Directory>
# the /status URL reports server status
<Location /status>
SetHandler server-status
</Location>
Creating handlers
You can create new handlers with the Action directive:
#      handler name  script
Action add-footer    /cgi-bin/footer.pl
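A handler created this way does nothing until it is attached to some files. A minimal sketch, assuming you want the footer script to process every .html file (the choice of extension is an assumption):

```apache
# define the handler, then attach it by extension
Action add-footer /cgi-bin/footer.pl
AddHandler add-footer .html
```

When a matching file is requested, Apache runs the script instead of sending the file, and passes the requested document's URL and file path to the script in the PATH_INFO and PATH_TRANSLATED environment variables.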
Redirecting Content
You can have Apache redirect users from one location to another with an
alias.
Simple aliases
The Alias directive allows you to associate an alias with a real file's
name, as shown below:
#     alias    real location
Alias /icons/  /home/httpd/icons/
Whenever a browser requests the alias, Apache actually goes to the real
location, which must be in the local filesystem, and retrieves the content
from there.
When using aliases, the alias must match exactly. This means that a
trailing slash on an alias, as above with /icons/, must be present in the
request for the alias to work. In the example above, a browser requesting
/icons/ would go to the aliased location, but one requesting /icons
would not.
TIP:
You can't apply a Directory block to an alias, because Directory blocks
match filesystem paths; use the real location there. Location blocks
match URLs, so they do apply to the alias.
Pattern aliases
Rather than specifying an exact alias, you can specify an alias by regular
expression. The AliasMatch directive, shown below, is more powerful
than the Alias directive alone:
# map any request under an icons/ directory
# to the particular corporation's icon directory
AliasMatch ^/(.*)/icons/(.*) /home/httpd/corp/$1/icons/$2
Redirects
You can redirect one URL to another with the Redirect directive:
# redirect all requests for the help.html file
# to the web site at www.help.org
Redirect /help.html http://www.help.org/
Redirection with the Redirect directive has several benefits over the
Alias directive:
1. The redirected location doesn't have to be in the local filesystem; it
can be anywhere on the web.
2. You can send a status indicator along with the redirection.
Browsers conforming to HTTP 1.1 will use this status code as an
indication of the redirection's status.
Sending a status
You can send a status along with the redirection by supplying an
additional parameter:
Redirect permanent /help.html http://www.help.org/
Redirect gone /intranet.html
TIP:
Use the RedirectMatch directive to exert more control over the
resource redirection.
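A short sketch of both ideas, with made-up paths on the www.help.org site from the examples above. The status keyword may be permanent (301), temp (302, the default), seeother (303) or gone (410):

```apache
# temporary redirect: the page is expected to come back
Redirect temp /news.html http://www.help.org/news/

# pattern redirect: every .gif request goes to the .jpg of the
# same name on the other server
RedirectMatch permanent (.*)\.gif$ http://www.help.org$1.jpg
```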
Fancy Indexing
When a browser requests a directory for which there is no index file,
Apache will display the directory's contents (assuming Options Indexes
is enabled for the directory). By default, Apache shows the directory's
contents as a simple list, from which you can click on an item within the
directory to view the contents.
However, Apache can display an icon beside the file's name that visually
describes the file's type; Apache can also show a text description for the
file's type. To enable this, turn on fancy indexing:
IndexOptions FancyIndexing
Default icon
The DefaultIcon directive specifies the image to display when no
previous directive has associated an icon with a particular file:
DefaultIcon /icons/unknown.gif
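Icons and descriptions are associated with files by further directives. The following is a sketch only; the icon file names are assumptions and must exist under your own /icons/ location:

```apache
IndexOptions FancyIndexing

# icon chosen by file extension
AddIcon /icons/tar.gif .tar
# icon chosen by MIME type; IMG is the text shown by non-graphical browsers
AddIconByType (IMG,/icons/image2.gif) image/*
# text description shown beside matching files
AddDescription "tar archive" .tar

# fallback when nothing above matches
DefaultIcon /icons/unknown.gif
```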
Excluding files
The IndexIgnore directive specifies files that shouldn't be included in an
auto-generated index:
IndexIgnore .??* *~ *# HEADER* README* RCS
The pattern .??* matches all files that start with a dot followed by at
least two more characters.
Encoding.
Media type.
Encoding
The AddEncoding directive maps a file extension to a MIME (Multipurpose Internet Mail Extensions) encoding type:
#           MIME encoding  extension
AddEncoding x-zip          .zip
Language
The AddLanguage directive associates a file extension with a
language code:
#           code  extension
AddLanguage en    .en
AddLanguage de    .de
AddLanguage it    .it
AddLanguage ja    .ja
Language priorities
The LanguagePriority directive allows you to give precedence to
some languages in case the browser doesn't ask for a particular language.
For example, a mostly-English site might specify:
# assume English first, then German and Italian
LanguagePriority en de it
Character sets
With different languages also comes the possibility of different character
sets. For example, a Japanese encoding will not use the Western character
set (ISO-8859-1), because it doesn't contain the Japanese characters.
You can map a document extension to a character set using the
AddCharset directive:
# support three Japanese character sets
AddCharset EUC-JP .euc
AddCharset ISO-2022-JP .jis
AddCharset SHIFT_JIS .sjis
Then, when a browser requests a particular character set, Apache will look
for a file ending in the mapped extension.
For example, suppose a browser wanted index.html in the Japanese
language (language code ja) and in the ISO standard Japanese character
set (character set code ISO-2022-JP). Apache, configured as shown
above, would look for the following files in order:
1. index.html.ja.jis
2. index.html.jis.ja
3. index.html.ja
4. index.html.jis
5. index.html
TIP:
The Apache documentation for the mod_negotiation module
explains the content negotiation algorithm in more detail.
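The file search described above depends on MultiViews being enabled for the directory. A minimal sketch of the pieces working together, assuming the documents live under /home/httpd/htdocs (a path borrowed from other examples in this course):

```apache
# MultiViews must be named explicitly; Options All does not include it
<Directory /home/httpd/htdocs>
Options +MultiViews
</Directory>

AddLanguage ja .ja
AddCharset ISO-2022-JP .jis
# used only when the browser states no language preference
LanguagePriority en de it
```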
Media type
The AddType directive maps a file extension to a MIME (Multi-purpose
Internet Mail Extensions) type:
#       MIME type   extension
AddType image/gif   .gif
AddType image/jpeg  .jpg
AddType audio/mpeg  mpga mp2 mp3
Chapter Summary
Apache is a highly configurable web server. In addition to the everyday,
mundane characteristics, you can configure Apache conditionally or have
it take different action based on incoming requests.
Apache also supports handling, which makes it easy for an administrator
to define his or her own special processing for documents. Combined
with Apache's content negotiation features, which allow it to deliver
different content based on browsers' requests, handling can grow to cover
every conceivable configuration a site would need.
Finally, Apache supports two other useful features. First, fancy indexing
allows Apache to dynamically build a listing of files within a directory,
complete with descriptive text and icons. Second, aliases and redirection
allow Apache to send clients to resources even if they move.
Chapter 7:
Performance and Security
Chapter Overview
Administrators typically want to increase their servers' performance and
maximize their servers' security. Apache caters to both these needs, as
well as to more fundamental needs, including correct operation.
Chapter Objectives
After completing this chapter, you will be able to:
See http://httpd.apache.org/docs/misc/perf-tuning.html.
Platform
The operating system you choose to run Apache is a site-specific issue.
Generally speaking, you should choose an operating system that fits in
with the rest of your network. For example, if your network is largely
Windows-based, don't run Apache under Linux; your administration staff
will be taxed more with learning the operating system than administering
Apache.
Regardless of the operating system you choose, make certain that you've
applied the latest patches, especially network patches.
TIP:
Apache is not yet seasoned for the Windows NT environment. The
programming model NT employs differs from the one Apache uses.
Hence, the performance of Apache under NT is significantly different
than that for a Unix-like system.
Performance Tuning
You can tune Apache both at run-time and at compile-time. The
configuration script supplied with Apache chooses the best compile-time
configuration for your system, so you won't need to modify these settings.
Run-time tuning
Tuning Apache at run-time is a matter of configuring several key
directives appropriately.
AllowOverride
When you allow directories to have overrides (those provided by the
.htaccess file), you impose a significant search burden on Apache.
Consider the following configuration:
DocumentRoot /home/httpd/htdocs
<Directory />
AllowOverride All
</Directory>
A request to the site's homepage will cause Apache to check for and
process each of the following files:
/.htaccess
/home/.htaccess
/home/httpd/.htaccess
/home/httpd/htdocs/.htaccess
These files' contents aren't cached, so every request will cause this
processing.
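One way to avoid this search, sketched under the assumption that only one subtree (the hypothetical members directory below) really needs .htaccess files, is to switch overrides off at the root and back on only where required:

```apache
DocumentRoot /home/httpd/htdocs

# no .htaccess lookups anywhere by default
<Directory />
AllowOverride None
</Directory>

# lookups only from this directory downward
<Directory /home/httpd/htdocs/members>
AllowOverride AuthConfig
</Directory>
```

A request for the homepage then opens no .htaccess files at all; only requests inside the members subtree pay the cost.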
FollowSymLinks and SymLinksIfOwnerMatch
As with the AllowOverride directive, leaving FollowSymLinks off or
turning SymLinksIfOwnerMatch on causes Apache to check extra file
information. Specifically, Apache has to check each component of the
requested pathname to see if:
1. the component is a symbolic link, and if so, whether it may be followed.
2. the link's owner matches the owner of the file it points to.
For maximum performance, engage FollowSymLinks everywhere and
disable SymLinksIfOwnerMatch. This reduces security by allowing
symbolic links created by other users, but increases Apache's performance.
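A sketch of that trade-off, assuming you want the fast setting as the default and the safer, slower setting only for users' directories (the public_html path follows the convention used elsewhere in this course):

```apache
# default: follow links without extra checks
<Directory />
Options FollowSymLinks
</Directory>

# exception: check link ownership where untrusted users create links
<Directory /home/*/public_html>
Options -FollowSymLinks +SymLinksIfOwnerMatch
</Directory>
```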
HostnameLookups
With HostnameLookups On, every request requires a DNS lookup to
complete the request. If the DNS is slow or, worse, down, the time to
complete requests will lag. For maximum performance, set
HostnameLookups Off.
It's possible to scope DNS lookups, so that only when Apache accesses
certain files will the lookup commence:
# disable DNS lookup
HostnameLookups Off
# lookup hostnames only when CGI programs
# requested
<Files ~ "\.cgi">
HostnameLookups On
</Files>
Keepalives
You should keep connections open with the KeepAlive directive and
limit how long they stay open with the KeepAliveTimeout directive. This
lets the client reuse its connection and reduces the penalty of accepting
new connections. However, keeping connections open too long ties up
Apache processes that sit idle waiting for the next request, which just
wastes resources.
The default KeepAliveTimeout of 15 seconds attempts to minimize
this effect. However, the tradeoff between network bandwidth and server
resources remains regardless of the value. You should never raise this
value above 60 seconds, as the benefits are lost.
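A minimal sketch of the related directives. MaxKeepAliveRequests is not discussed in this section and the value shown for it is illustrative:

```apache
# allow several requests per connection
KeepAlive On
# cap on requests served over one connection
MaxKeepAliveRequests 100
# seconds to wait for the next request before closing
KeepAliveTimeout 15
```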
Negotiation
Don't disable content negotiation. The benefits far outweigh the
performance hit. The one scenario where it makes sense to restrict content
negotiation comes with directory indexing. Rather than using a directory
index wildcard, explicitly name the allowed index files:
# using index as a wildcard, like below which
# matches index*, is a performance no-no
# DirectoryIndex index
# explicitly name all valid index files
DirectoryIndex index.cgi index.pl \
index.shtml index.html
Server creation
When Apache experiences an increased load, it starts enough servers to
meet the load and maintain the MinSpareServers setting. When the
load spikes (in other words, increases rapidly), Apache has to start servers
to meet the load. Unfortunately, this tends to swamp servers, making
them swap to disk.
To minimize swapping, Apache starts one server, waits a second, starts
two, waits another second, starts four; this continues exponentially until it
is spawning 32 servers per second and the MinSpareServers setting is
reached.
Experimental results show that it is usually unnecessary to adjust the
MinSpareServers, MaxSpareServers, and StartServers
settings. However, when Apache spawns more than 4 children per second,
a diagnostic message goes to the error log. Lots of these errors indicate
you should tune the values.
Server death
The MaxRequestsPerChild directive restricts the number of requests
a child server will handle. Usually this value is 0, which means that there
is no limit to the number of requests handled per child. Your
configuration should not set this to a low number, like 50, as that's far too
few, and it typically causes swapping.
For operating systems where this parameter is important, like SunOS or an
old version of Solaris, limit this to approximately 10,000. This allows the
child to process enough requests to prevent swapping and also limits the
absorption of system memory through memory leaks.
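The process-management directives from the last two sections can be sketched together. The first three values are illustrative starting points rather than recommendations; the last follows the figure suggested above for systems with leaky libraries:

```apache
# children created at startup
StartServers 5
# Apache spawns children when fewer than this many are idle
MinSpareServers 5
# Apache kills children when more than this many are idle
MaxSpareServers 10
# retire a child after this many requests (0 means never)
MaxRequestsPerChild 10000
```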
Security
Apache has several security features that every administrator should know.
Restricting access
Apache allows you to specify hosts that can and hosts that cannot access
your web sites. The Allow and Deny directives specify the hosts that
can and cannot, respectively, access your sites. These directives typically
appear within a Directory block directive.
<Directory />
Allow from company.com friend.org
Deny from foe.com
Allow from ally.foe.com
</Directory>
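Apache does not apply these directives in the order they appear in the
configuration file. Without an Order directive, Apache evaluates all the
Deny directives first and then all the Allow directives, and a host that
matches neither is allowed. In the example above, ally.foe.com matches
both Deny from foe.com and Allow from ally.foe.com, so the Allow wins
and the host gets access.
To enforce a particular order, use the Order directive. Revisiting the
previous example:
<Directory />
Order allow,deny
Allow from company.com friend.org
Deny from foe.com
Allow from ally.foe.com
</Directory>
Because the order is declared Allow first, all the Allow directives are
checked before the Deny directives. Now ally.foe.com is denied, because
the Deny from foe.com match is applied last, and any host that matches
no directive is denied as well. Separate the two keywords with a comma
only; Apache rejects a space between them.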
Pedantic access
You can allow and deny from a special class of hosts: All. When you
use All in either the Allow or Deny directive, Apache will match all
hosts. For example:
# access only from company.com & friend.org
<Directory />
Order deny,allow
Allow from company.com friend.org
Deny from All
</Directory>
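Hosts can also be named by address, which avoids a DNS lookup on each check. A sketch assuming a hypothetical intranet directory and private address ranges:

```apache
# access only from two internal networks
<Directory /home/httpd/htdocs/intranet>
Order deny,allow
Deny from all
# partial address: matches 192.168.0.anything
Allow from 192.168.0
# network and netmask form
Allow from 10.1.0.0/255.255.0.0
</Directory>
```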
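Option          Description
All             Enable all features except MultiViews
ExecCGI         Allow execution of CGI scripts
FollowSymLinks  Follow symbolic links
Includes        Allow server-side includes
IncludesNOEXEC  Allow server-side includes, but not #exec or #include of CGI scripts
Indexes         Generate a directory listing when no index file exists
MultiViews      Allow content-negotiated MultiViews
None            Enable no features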
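Override    Description
All         Allow .htaccess files to override everything
AuthConfig  Allow authorization directives (AuthName, AuthType, AuthUserFile, require)
FileInfo    Allow document-type directives (AddType, AddEncoding, ErrorDocument)
Indexes     Allow directory-indexing directives (IndexOptions, DirectoryIndex, AddIcon)
Limit       Allow host access directives (Allow, Deny, Order)
None        Ignore .htaccess files
Options     Allow the Options directive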
Example
Considering performance and maximum security, a typical site will have
the following configuration for user directories (notoriously the most
insecure area):
# set the security on users' directories
<Directory /home/*/public_html>
Options FollowSymLinks IncludesNOEXEC \
-SymLinksIfOwnerMatch
AllowOverride AuthConfig Limit
</Directory>
TIP:
You can remove a previously set option by placing a hyphen (-) in
front of the option parameter.
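With AllowOverride AuthConfig Limit in place, a user can protect a directory with an .htaccess file of their own. The following is a sketch only; the user name alice, the paths and the realm name are all hypothetical:

```apache
# /home/alice/public_html/private/.htaccess

# permitted by AuthConfig: require a password
AuthType Basic
AuthName "Private Area"
AuthUserFile /home/alice/.htpasswd
require valid-user

# permitted by Limit: restrict by host as well
Order deny,allow
Deny from all
Allow from company.com
```

Directives outside those two groups, such as Options, would be rejected in this file because the server configuration does not allow them to be overridden.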
mkdir
mkdir
chown
chmod
cp httpd /etc/httpd/bin
chown root /etc/httpd/bin/httpd
chgrp 0 /etc/httpd/bin/httpd
chmod 511 /etc/httpd/bin/httpd
Safe CGI
CGI programs are a major security concern. When you give your users
the ability to execute CGI scripts, you're giving them the ability to execute
programs as the same user as the Apache server. Because all the users'
scripts run as the same user (that of the Apache server), one user's CGI
script can overwrite another user's. This is problematic, but fortunately
Apache has a solution.
suEXEC
suEXEC allows the Web server to run CGI scripts as a different user than
the Apache server. suEXEC is not configured by default, so you will have
to go back to the Apache source to enable this.
When you configure Apache, pass the --enable-suexec flag and the
--suexec-safepath flag to the Apache configuration script, configure.
These will apply safe settings for suEXEC for most installations.
TIP:
After configuring but before compiling, check the suEXEC setup by
typing configure --layout and inspecting the values.
When Apache starts up, a properly configured suEXEC system will log
the following message:
[notice] suEXEC mechanism enabled
the User and Group defined for a particular virtual host, as long
as these differ between the virtual host and the main server.
Chapter Summary
Administrators are rightly concerned about performance and security.
Apache is written to be a generally powerful server, and in all cases,
Apache sacrifices speed for correct and secure operation.
Generally speaking, Apache is fast enough for most sites. Those sites that
have very large bandwidth connections demonstrate a need to increase the
number of Apache servers and computers, but moderate sites do not.
Administrators should maximize their systems' performance by increasing
the amount of system RAM. Also, administrators need to set appropriate
values in the Apache configuration file, keeping in mind the goal: reduce
the amount of time the server swaps.
Finally, Apache has a large array of security facilities that make it ideal for
sites with a large, untrusted user base. With Apache, you can fine-tune the
configuration for directories and files, tailoring them to your exact needs.
With the suEXEC mechanism, you can even make CGI scripts secure by
reducing their execution scope.
Chapter 8:
URL Rewriting
Chapter Overview
The most complicated Apache module, mod_rewrite, is also the most
powerful. With it, you can translate any URL into another, incorporating
a wide array of conditions, variables, and patterns. Administrators
wishing to take their site to the "next level" will need to make heavy use
of this module.
Chapter Objectives
After completing this chapter, you will be able to:
Rewriting fundamentals
The mod_rewrite module can operate in two contexts:
Apache then processes the URL further, eventually finding the correct
data directory. Apache calls the mod_rewrite module again, this time
with a directory path, and not a URL. At this point, the mod_rewrite
module applies any per-directory rules by converting the directory back
into the URL (via the RewriteBase directive) and restarting the phases
with the URL.
Note: There is no obvious distinction between a URL and a directory; both
are ways of expressing data's location, though URLs have a more general
and robust syntax.
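A per-directory rule set can be sketched as follows, assuming a user whose pages are served under /~user/ and two made-up file names. It also assumes the server configuration permits it: per-directory rewriting needs AllowOverride FileInfo and Options FollowSymLinks for the directory.

```apache
# /home/user/public_html/.htaccess

RewriteEngine on
# the URL prefix that corresponds to this directory
RewriteBase /~user/
# patterns here see the path relative to this directory,
# so there is no leading slash; the result becomes /~user/new.html
RewriteRule ^old\.html$ new.html
```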
Options
a hyphen (-).
For example, suppose your site's main page had several versions based on
a browser's capability. You could select the appropriate page with the
following rules and conditions:
# enable the engine (this is required, because
# the engine is NOT on by default)
RewriteEngine on
# condition: does the browser (in the variable
# %{HTTP_USER_AGENT}) begin with "Mozilla"?
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*
# if the URL is / (ie, the main page), and the
# Netscape condition above applies, replace the
# URL with index.NS.html. Finally, stop the
# rewriting process (indicated by the [L]
# option).
RewriteRule ^/$ /index.NS.html [L]
# condition: is this the text-based Lynx
# browser?
RewriteCond %{HTTP_USER_AGENT} ^Lynx.*
# if so, use the text-only version page and
# stop rewriting.
RewriteRule ^/$ /index.TO.html [L]
# otherwise, rewrite the URL to use the
# standard page.
RewriteRule ^/$ /index.html
Trailing slashes
When you request a URL that's a directory, you mustn't forget the trailing
slash. For example, suppose you try to access:
http://www.mycompany.com/~user/subdir.
Apache tries to locate the file "subdir", which probably doesn't exist.
What you wanted the URL to say was:
http://www.mycompany.com/~user/subdir/.
To accomplish this, use the following configuration:
RewriteEngine  on
RewriteRule    ^/~user/subdir$  /~user/subdir/  [R]
Generally speaking, Apache actually tries to fix this trailing slash problem
on its own. Sometimes it fails, though (for example, when you've already
done a lot of complicated rewritings). Hence, the method above is hardcoded for those particular cases.
Users could, of course, put a general configuration (shown below) in their
top-level .htaccess files:
RewriteEngine  on
RewriteBase    /~user/
RewriteCond    %{REQUEST_FILENAME}  -d
RewriteRule    ^(.+[^/])$           $1/  [R]
Time is important
Suppose you want to redirect traffic based on the time of day. This might
be important for pages that service time information, where they might
want to show one page for AM and another for PM. This technique might
also apply for web sites that go down for maintenance routinely.
Again, the mod_rewrite module makes this trivial:
RewriteEngine on
# condition:
# variables)
# sequential
# together.
RewriteCond
RewriteCond
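A minimal sketch of time-dependent rewriting, assuming a hypothetical page clock.html with separate daytime and nighttime versions. TIME_HOUR and TIME_MIN are server variables provided by mod_rewrite; sequential RewriteCond lines are ANDed together, and the comparison is a string comparison on the four-digit time:

```apache
RewriteEngine on

# between 07:00 and 19:00, serve the daytime page and stop
RewriteCond %{TIME_HOUR}%{TIME_MIN} >0700
RewriteCond %{TIME_HOUR}%{TIME_MIN} <1900
RewriteRule ^/clock\.html$ /clock.day.html [L]

# at any other time, serve the nighttime page
RewriteRule ^/clock\.html$ /clock.night.html
```

The conditions apply only to the rule that immediately follows them, which is why the second rule needs no conditions of its own.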
Chapter Summary
The mod_rewrite module is perhaps the most powerful module
Apache has to offer. Without a doubt, it's also the most complicated.
Once you've mastered its syntax, you'll be able to rewrite URLs according
to your own specifications, taking into account server and environment
variables and other conditions.
The Rewriter works by hooking into Apache's multi-phase API. In either
per-server or per-directory context, the mod_rewrite module takes a
URL and transforms it into another URL using conditions and rules. You
specify conditions with the RewriteCond directive and rules with the
RewriteRule directive.
Appendices
LAB 1: INTRODUCTION
LAB 2: APACHE INSTALLATION
LAB 3: APACHE CONFIGURATION
LAB 4: EFFECTIVELY WORKING WITH APACHE
LAB 5: VIRTUAL HOSTS
LAB 6: ADVANCED CONFIGURATION
LAB 7: PERFORMANCE AND SECURITY
LAB 8: URL REWRITING AND CUMULATIVE LAB
REFERENCES
Lab 1: Introduction
Part A (5 minutes)
Answer the following questions:
1. What is Apache, and who provides enhancements and fixes?
3. What are two ways Apache can provide "safety nets" for runaway
or buggy Apache servers?
the server doesn't allow you to view the root and www user's
web pages.
the server allows you to view the contents of users' home pages
under their web directory.
the server only shows indexes when the files are named either
index.html or index.shtml, and always index.html
first.
company1
company2
company3
www.company1.com
www.company2.com
www.company3.com
PARANOID is defined
2. Why does Apache not start spare servers immediately, but instead
starts some, waits a second, starts more, etc?
10. Have your partner change the entries in his or her /etc/hosts
file to your IP address. For example, if your server's IP address is
192.168.0.1, your partner's /etc/hosts file would contain:
192.168.0.1 company1 www.company1.com
192.168.0.1 company2 www.company2.com
192.168.0.1 company3 www.company3.com
Paranoid error logging that logs all events. You should only
enable this type of logging when the PARANOID variable is
set.
References
The materials below provide valuable Apache administrative help and
should be close at hand for Apache administrators:
http://www.zdnet.com/pcmag/stories/reviews/0,6755,2551188,00.html.
ZDNet: Benchmark Tests: Web Platforms. A benchmark of Apache on
several operating systems.