The Apache Web Server
Table of Contents
1. Webserver Basics
    Discussion
        Web Servers
        Installing the Apache Web Server
        Web Server Layout
        The Document Root: /var/www/html/
        Content Types
        Directories
        Web Server Logging: /var/log/httpd/{access,error}_log
        The Anatomy of a Web Request: the HTTP Protocol (Optional, but Interesting)
        The Hyper Text Markup Language (HTML) (Optional)
    Exercises
        Specification
        Deliverables
        Clean Up
    Questions
2. Apache Configuration
    Discussion
        Apache Configuration: /etc/httpd/conf/httpd.conf
        The Global Section
        The Main Section
        The Answer Book: http://localhost/manual
    Exercises
        Specification
        Deliverables
    Questions
3. Apache Configuration: Containers
    Discussion
        Tailoring Customization to Particular Content: Containers
        Common Container Configuration
        Red Hat Enterprise Linux Default Configuration
        Location Containers: server-status and server-info
    Exercises
        Specification
        Deliverables
    Questions
4. Virtual Hosts
    Discussion
        Virtual Hosts
        IP Based Virtual Hosting
        Name Based Virtual Hosts
    Exercises
        Specification
        Deliverables
    Questions
5. The Squid Proxy Server
    Discussion
        Proxy Servers
        The squid Proxy Server
        Squid Configuration: /etc/squid/squid.conf
        The server's identity: http_port
        Squid Access Control Lists: acl and http_access
        Configuring Proxies for Web Clients
        Squid Logging: /var/log/squid/access.log
        Finding Out More
    Exercises
        Specification
        Deliverables
        Challenge Exercises
    Questions
rha230-5.0-1-en-2008-01-21T07:12:18-0500 Copyright (c) 2003-2007 Red Hat, Inc. All rights reserved. For use only by a student enrolled in a Red Hat Academy course taught at a Red Hat Academy. Any other use is a violation of U.S. and international copyrights. No part of this publication may be photocopied, duplicated, stored in a retrieval system, or otherwise duplicated whether in electronic or print format without prior written consent of Red Hat, Inc. If you believe Red Hat course materials are being used, copied, or otherwise improperly distributed please email training@redhat.com or phone toll-free (USA) +1 866 626 2994 or +1 (919) 754 3700.
The web server that ships with Red Hat Enterprise Linux is the Apache web server. In general terms, web servers map URL requests onto files within the local filesystem, using the Document Root (/var/www/html/) as the base of the translation. The web server associates metadata, such as content types, with requested files. When a client requests a directory instead of a file, Apache serves the file index.html (if it exists), generates a directory listing (if it's allowed to), or returns an access denied error. Web servers and web clients communicate using the HTTP protocol. Often, the information served from a web server is structured using the HTML markup language.
Table 1-1. The Apache Web Server
Packages        httpd (with apr and httpd-suexec dependencies), plus other modules (usually named mod_...), and httpd-manual
Service         httpd
Daemon          /usr/sbin/httpd
Config Files    /etc/httpd/conf/httpd.conf, /etc/httpd/conf.d/*
Logging         /var/log/httpd/{access,error}_log
Ports           80 (http)
Discussion
Web Servers
This lesson focuses on installing and starting the Apache web server, and publishing information using the default configuration. We also introduce some of the basics of the HTTP protocol and the HTML markup language, for those who are interested.
Installing the Apache Web Server
Installing the Apache web server follows the usual pattern for a network service: yum install ...; service ... start; chkconfig ... on.
[root@station ~]# yum install httpd
...
Dependencies Resolved
=============================================================================
 Package          Arch       Version            Repository        Size
=============================================================================
Installing:
 httpd            i386       2.2.3-6.el5        rha-rhel          1.1 M
...
Installed: httpd.i386 0:2.2.3-6.el5
Complete!
[root@station ~]# service httpd start
Starting httpd:                                            [  OK  ]
[root@station ~]# chkconfig httpd on
The availability of the web server can be confirmed by using any web browser to reference http://localhost. The following example uses elinks, but the firefox browser could have been used just as easily.
[root@station ~]# elinks -dump http://localhost
Red Hat Enterprise Linux Test Page
This page is used to test the proper operation of the Apache HTTP server after it has been installed. If you can read this page, it means that the Apache HTTP server installed at this site is working properly.
...
Web Server Layout
Skimming the output, the following relevant files and directories can be seen.
Table 1-2. Web Server Filesystem Layout
/var/log/httpd/    Log files, including access_log and error_log.
/var/www/html/     The Web Server Document Root (more on this in a moment).
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0
...
Instead of a single file, entire directory trees can be copied into the /var/www/html directory.
[root@station ~]# cp -a /etc/sysconfig /var/www/html/
Now, by accessing http://localhost/sysconfig with a web browser, the contents of the directory should be visible, with "clickable" file and subdirectory links.
Notice the shift in perspective. What we would call the directory /var/www/html/sysconfig, the web server refers to as just /sysconfig. This translation is the essence of the term "Document Root". Web browsers request information using "Uniform Resource Locators", or more commonly just "URLs". Web-related URLs are usually composed of a protocol, a hostname, and a file path.
http://hostname/dir1/dir2/filename
The hostname is simply the hostname or IP address of the host running the server, while dir1/dir2/filename is thought of as a path to a particular file on the server. When locating the file, the web server assumes that the root of the "URL Namespace" is the document root directory (/var/www/html). The http portion of the URL is the protocol, which tells the web browser both which port to connect to and what "language" to speak to whatever is listening on that port. For web servers, the port is 80 by default, and the language is known as the Hypertext Transfer Protocol, or HTTP.
Of course, it's not a machine's configuration files that one usually chooses to publish to the world. We'll move on to more interesting content.
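The translation from URL to file can be sketched as a small shell function. This is only an illustration of the mapping described above, not how Apache implements it; the function name url_to_path is made up for the example, and the document root is hard-coded to the default /var/www/html.

```shell
#!/bin/sh
# Sketch: translate a URL into the file the web server would look for.
# DOCROOT is the default document root named in the text.
DOCROOT=/var/www/html

url_to_path() {
    url=$1
    rest=${url#*://}               # drop the protocol, e.g. "http://"
    case $rest in
        */*) path=/${rest#*/} ;;   # everything after the hostname
        *)   path=/ ;;             # bare hostname: root of the URL namespace
    esac
    printf '%s%s\n' "$DOCROOT" "$path"
}

url_to_path http://localhost/sysconfig
url_to_path http://hostname/dir1/dir2/filename
```

The hostname plays no part in the result: it was only needed to find the server, and the path alone selects the file underneath the document root.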
Content Types
The purpose of the web server is to serve the content of files, but web clients seem to learn not just the content of a file but also how to interpret that content. As an example, consider a text file such as /etc/hosts, an HTML file such as /usr/share/doc/samba-version/htmldocs/manpages/net.8.html, and an image file such as /usr/share/backgrounds/tiles/neurons.png, each of which is copied to a web server's document root.
[root@station ~]# mkdir /var/www/html/example
[root@station ~]# cd /var/www/html/example
[root@station example]# cp /etc/hosts .
[root@station example]# cp /usr/share/doc/samba-*/htmldocs/manpages/net.8.html .
[root@station example]# cp /usr/share/backgrounds/tiles/neurons.png .
[Figure: the three example files hosts, net.8.html, and neurons.png]
How does a web client handle each of these? If you're sitting at a student workstation, try for yourself. (Of course, you will first need to perform the above commands to put the files in place.)
http://localhost/example/hosts
http://localhost/example/net.8.html
http://localhost/example/neurons.png
Note: Make sure to create or copy files underneath the /var/www/html directory as the root user. Do not move already existing files into the directory. If you're having trouble, give it a pass for now, until you read the section "But What Could Go Wrong?" below.
All of the files should have been treated reasonably by the client: the hosts file as a simple text file, the net.8.html file as a marked up man page, complete with bolded titles, italics, and hyperlinks, and neurons.png as a picture of blue blobs. Now let's shake things up a bit.
[root@station example]# cp hosts hosts.html
[root@station example]# cp net.8.html net.8.txt
[root@station example]# cp hosts hosts.png
[root@station example]# cp neurons.png neurons.txt
Again, if at a student workstation, try the following.
http://localhost/example/hosts.html
http://localhost/example/net.8.txt
http://localhost/example/hosts.png
http://localhost/example/neurons.txt
For those not able to follow along, hosts.html lost all of its formatting, net.8.txt dumped what you would see if you ran cat on the file directly, hosts.png caused the browser to complain about a malformed image, and neurons.txt showed a bunch of glyphs representing binary data.
There are obviously some expectations on the part of the browser about how to interpret the data it receives: text to dump, marked up text (HTML) to format, or an image to render. The expectation about what type of data the client is receiving is known as the data's content type. Apparently, the content type is determined by the file's filename extension. We still don't know if the extension is being interpreted into a content type by the server (before the file's content is transmitted) or by the client (after the content is received). The answer is the server, and the server communicates that content type, as well as a lot of other metadata about the transfer, using the HTTP protocol.
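The experiment suggests a lookup keyed on nothing but the filename extension. The following is a minimal sketch of such a lookup, not Apache's own code: the function name content_type_for is hypothetical, only three extensions are covered, and the text/plain fallback for unrecognized names is an assumption chosen to match how the extensionless hosts file was displayed.

```shell
#!/bin/sh
# Sketch: choose a content type from the filename extension alone.
# The file's actual content is never examined.
content_type_for() {
    case $1 in
        *.html|*.htm) echo text/html ;;
        *.txt)        echo text/plain ;;
        *.png)        echo image/png ;;
        *)            echo text/plain ;;   # assumed default for other names
    esac
}

for f in hosts hosts.html net.8.txt hosts.png neurons.txt; do
    printf '%-12s %s\n' "$f" "$(content_type_for "$f")"
done
```

Because only the name is consulted, hosts.png is labelled image/png even though it holds plain text, which is why the browser complained about a malformed image.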
Directories
We've seen how the web server responds when a web client requests a file: it returns the contents of the file to the client. How does the web server handle directories? In general, a web server responds in one of three ways.
First, the web server checks to see if an index file (a file named index.html) exists in the directory. If so, the web server returns the contents of the file, as if the request for http://localhost/example were for http://localhost/example/index.html. Second, if no index file exists, the web server checks to see if the Indexes option is enabled. If so, the web server returns a dynamically generated directory listing. Otherwise, the web server returns an error to the client. (How the Indexes option is set or not set will be covered in a following lesson. In Red Hat Enterprise Linux, the option is set by default.)
Table 1-3. Web Server Responses to Directory Requests
Configuration                              Response
index.html exists                          Return the contents of index.html
No index.html, Indexes option enabled      Return a dynamically generated directory listing
No index.html, Indexes option disabled     Return error 403 ("Access Denied")
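The three-way decision can be written out as a short shell function. This is a sketch of the logic only; the function name directory_response and its "on"/"off" argument standing in for the Indexes option are inventions for the example, and it is exercised against a scratch directory rather than the real document root.

```shell
#!/bin/sh
# Sketch of the decision made for a directory request.
# $1 = directory on disk, $2 = "on" if the Indexes option is enabled.
directory_response() {
    dir=$1
    indexes=$2
    if [ -f "$dir/index.html" ]; then
        echo "serve $dir/index.html"
    elif [ "$indexes" = on ]; then
        echo "generate directory listing"
    else
        echo "403 Access Denied"
    fi
}

demo=$(mktemp -d)
directory_response "$demo" on     # no index file, Indexes enabled
directory_response "$demo" off    # no index file, Indexes disabled
: > "$demo/index.html"            # create an empty index file
directory_response "$demo" off    # the index file wins either way
rm -rf "$demo"
```

Note the ordering: the index file is checked first, so the Indexes option only matters when no index.html is present.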
Assuming you followed along above, create the file /var/www/html/example/index.html with the following content (you should be able to cut and paste directly from the browser).
<h1>Examples</h1>
[<a href="hosts">hosts</a>]
[<a href="net.8.html">net man page</a>]
[<a href="neurons.png">picture of neurons</a>]
What happens when you now view http://localhost/example? You should see the marked up contents of the index file. Is the effect any different if you view http://localhost/example/index.html directly? (It shouldn't be.)
Figure 1-2. Contents of http://localhost/example
What about the file /var/www/html/example/hosts.html? Is it still available? You should be able to access it by manually entering the URL http://localhost/example/hosts.html, but there is no way to click to it directly (except from this page, of course). Content behind an index file, which is not referenced directly, is obscured, but still available if someone knows it's there.
Web Server Logging: /var/log/httpd/{access,error}_log
127.0.0.1 - - [13/Jul/2005:06:34:24 -0400] "GET /example/net.8.html HTTP/1.1" 200 26196 "http://localhost/rhasb/curr/rha230/html-instructor-classroom/rha230_httpd_http.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050720 Fedora/1.0.6-1.1.fc4 Firefox/1.0.6"
127.0.0.1 - - [13/Jul/2005:06:34:24 -0400] "GET /example/samba.css HTTP/1.1" 404 290 "http://localhost/example/net.8.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050720 Fedora/1.0.6-1.1.fc4 Firefox/1.0.6"
127.0.0.1 - - [13/Jul/2005:06:34:25 -0400] "GET /favicon.ico HTTP/1.1" 404 284 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050720 Fedora/1.0.6-1.1.fc4 Firefox/1.0.6"
In each line, we find the following information.
- The IP address of the client who made the request.
- A timestamp of when the request occurred.
- The response code associated with the request. A response code of 200 implies success; anything else is usually some type of failure.
- The length of the content returned, not to be confused with the response code which precedes it.
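These fields can be pulled out of a log line with awk. The sketch below is one way to do it, under a stated assumption: the quoted request holds exactly three words (method, path, protocol), so the response code and length land in whitespace-separated fields 9 and 10. The function name summarize_access_log is hypothetical, and the User-Agent in the sample line has been shortened.

```shell
#!/bin/sh
# Sketch: print client IP, timestamp, requested path, response code,
# and content length for each access_log line read from standard input.
summarize_access_log() {
    awk '{
        timestamp = substr($4, 2)     # strip the leading "["
        printf "%s %s %s %s %s\n", $1, timestamp, $7, $9, $10
    }'
}

summarize_access_log <<'EOF'
127.0.0.1 - - [13/Jul/2005:06:34:24 -0400] "GET /example/samba.css HTTP/1.1" 404 290 "http://localhost/example/net.8.html" "Mozilla/5.0"
EOF
```

On a live server the same function could be fed the real log, for example summarize_access_log < /var/log/httpd/access_log, run as a user permitted to read it.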
Requests that fail (such as those answered with a 404 response code) also generate entries in the error_log.
[root@station ~]# tail -3 /var/log/httpd/error_log
[Tue Jul 13 06:34:24 2005] [error] [client 127.0.0.1] File does not exist: /var/www/html/example/samba.css, referer: http://localhost/example/net.8.html
[Tue Jul 13 06:34:25 2005] [error] [client 127.0.0.1] File does not exist: /var/www/html/favicon.ico
The access_log and the error_log are among the first places an administrator should look when trying to figure out why something doesn't seem to be working. The following table itemizes some of the return codes associated with various errors (or successes).
Table 1-4. HTTP return codes
Code    Meaning
200     Success
401     Authorization Required
403     Access Denied
404     File Not Found
500     Internal Server Error
There are many others, but these tend to be the most common. (In general, the HTTP protocol follows a response code convention used by many network services: informational or partial-success responses are in the 100s, successes in the 200s, incomplete transactions (for HTTP, redirections) in the 300s, client errors in the 400s, and server errors in the 500s. Watch closely the output the next time you use the simple ftp client, for example.)
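Since the class of a response is carried entirely by its first digit, classifying a code takes only a case statement. This is a small sketch using HTTP's own names for the five classes; the function name status_class is made up for the example.

```shell
#!/bin/sh
# Sketch: name the class of a three-digit HTTP response code
# by looking at its first digit.
status_class() {
    case $1 in
        1[0-9][0-9]) echo "informational" ;;
        2[0-9][0-9]) echo "success" ;;
        3[0-9][0-9]) echo "redirection" ;;
        4[0-9][0-9]) echo "client error" ;;
        5[0-9][0-9]) echo "server error" ;;
        *)           echo "unknown" ;;
    esac
}

for code in 200 403 404 500; do
    printf '%s %s\n' "$code" "$(status_class "$code")"
done
```

When reading an unfamiliar code in the access_log, the first digit alone tells you whether to look at the client's request (400s) or at the server itself (500s).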
or
[root@station ~]# restorecon /var/www/html/filename
The Anatomy of a Web Request: the HTTP Protocol (Optional, but Interesting)
This section introduces the HTTP protocol. The intent is not to be thorough, but instead to give students an impression of what is meant when people use terms such as HTTP headers, GET, and Response Code. For those who don't get enough, all of the details can be found at the World Wide Web Consortium's (http://www.w3.org) website (http://www.w3.org/Protocols).
In order to introduce the HTTP protocol, it's easiest to start with an example. The entire conversation between a web client and a web server can be captured using the wireshark network analyzer. If not already installed, yum install wireshark-gnome should do the trick. A capture is started by opening wireshark, choosing Capture:Start... from the menu, specifying a capture filter of (in this case) port 80, and "OK"ing. (Enabling "Update list of packets in real time" and "Automatic scroll in live capture" tends to make things more interesting for small captures, as well.)
Once Wireshark is capturing packets, any conversations between a web client and a web server which occur on the local machine should be captured. For example, the following displays a conversation between a web client requesting http://station53.rosemont.wlan/example/hosts and a web server providing the answer. Once wireshark has been stopped, the individual IP packets can be browsed from a list.
More interestingly for our purposes, wireshark can easily assemble the payload from each of the individual packets which compose a TCP/IP conversation by right clicking on any packet, and choosing Follow TCP Stream.
Figure 1-5. Viewing a TCP Conversation with Wireshark
The web client, in red, is making a request of the web server, in blue. The "language" the client and server use is the HTTP protocol.
GET /example/hosts HTTP/1.1
Host: station53.rosemont.wlan
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Geck...
Accept: text/xml,...text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
The entire first line is known as the Request-Line, and contains exactly three pieces of information in a specified order.
1. The request method, which for our purposes can be thought of as being either a GET or a POST. With a GET, the client is requesting information. With a POST, the client is submitting information.
2. The URI, or "Uniform Resource Identifier". Think of this as the path portion of a URL. (The server portion has already been used to open the TCP/IP connection.)
3. The exact protocol that the client is speaking. Only two protocols are generally considered, HTTP/1.0 and HTTP/1.1, and any modern client should be using the latter.
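Because the three parts are separated by single spaces and always arrive in the same order, the Request-Line can be taken apart with the shell's read builtin. The following is a sketch for illustration, with a hypothetical function name (parse_request_line); a real server does considerably more validation.

```shell
#!/bin/sh
# Sketch: split a Request-Line into method, URI, and protocol.
# Anything other than exactly three words is rejected.
parse_request_line() {
    read -r method uri protocol extra <<EOF
$1
EOF
    if [ -z "$protocol" ] || [ -n "$extra" ]; then
        echo "malformed request line" >&2
        return 1
    fi
    printf 'method=%s\nuri=%s\nprotocol=%s\n' "$method" "$uri" "$protocol"
}

parse_request_line "GET /example/hosts HTTP/1.1"
```

The spaces matter: a line such as GET/example/hostsHTTP/1.1, with the separators missing, is a single word and is reported as malformed.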
The next series of lines, which all have the form header: data, are known as the HTTP headers. These are used to associate any metadata with the request. Some HTTP request headers relevant to our discussion are the following.
Host: The content of the host portion of the URL requested by the client.
User-Agent: The User Agent is the client software. In this case, the client is the Firefox web browser, which identifies itself as a variant of Mozilla.
Accept: A list of the content types that the browser is willing to accept. This browser prefers to receive text/xml or text/html, but will also handle text/plain. For images, the browser prefers image/png, but in the end, the browser will accept */*, or anything the server will throw at it.
After a blank line, indicating the end of the HTTP headers, the content of the request would follow. For GET requests, such as this one, there is no content.
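Putting the pieces together, a complete GET request is just a Request-Line, some headers, and a blank line. The sketch below composes the smallest useful one with printf. The function name build_get_request is hypothetical, and two details go beyond the text above: each line on the wire ends with a carriage return and line feed, and the Connection: close header (used here in place of the browser's keep-alive) asks the server to close the connection after answering.

```shell
#!/bin/sh
# Sketch: compose a minimal HTTP/1.1 GET request.
# $1 = host portion of the URL, $2 = URI (path portion).
build_get_request() {
    host=$1
    uri=$2
    printf 'GET %s HTTP/1.1\r\n' "$uri"    # the Request-Line
    printf 'Host: %s\r\n' "$host"          # the one header HTTP/1.1 requires
    printf 'Connection: close\r\n'
    printf '\r\n'                          # blank line: end of headers, no content
}

build_get_request localhost /example/hosts
```

If a tool such as nc happens to be installed, the output can be piped to the server (for example, build_get_request localhost /example/hosts | nc localhost 80) to see the raw response without a browser.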
The Response-Line, like the Request-Line, is composed of three ordered parts. In the case of the response, however, the latter two fields are redundant.
1. The exact protocol the server is using.
2. The response code of the transaction, which is used to imply success, or qualify a type of failure. In this case, the response code 200 implies success. (More on these later.)
3. A text representation of the response code. This is supplied only for diagnostic (debugging) purposes, as the response code is what's really important. The text OK is associated with the response code of 200.
Again, the next series of lines, which all have the form header: data, are known as the HTTP headers. We will only focus on one of the HTTP response headers. Content-Type: The server is providing the client with the type of the content, so the browser can render the data appropriately. For this response, the content type is text/plain, so the browser will display the content "as is", preserving whitespace. Other content types could include image/png, text/html, or application/msword.
After a blank line, indicating the end of the HTTP headers, the content of the response follows. For this response, the content is a simple text file. (In the output above, tabs have been replaced with periods, an artifact of how Wireshark displays non-printing characters.)
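The same structure can be taken apart programmatically. The Python sketch below splits a response into its response line, headers, and content. The function name parse_response and the SAMPLE bytes are assumptions made for illustration; the sample is written by hand, not captured from a real server.

```python
def parse_response(raw):
    """Split a raw HTTP response into (protocol, code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")   # blank line separates headers from content
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    protocol, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # header names are case-insensitive
    return protocol, int(code), reason, headers, body


# A hand-written sample response (hypothetical, for illustration only).
SAMPLE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 6\r\n"
    b"\r\n"
    b"hello\n"
)
```

A client would consult the numeric code (200 here) to decide what happened and the content-type header to decide how to render the body; the reason text is informational only.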
Rather than provide a full introduction to HTML in the text, a sample document is provided at http://rha-server/pub/rha/rha230/sample.html. Students are encouraged to examine this document, both as it is rendered by a web browser and as the underlying text (which can usually be viewed in a browser by right-clicking and choosing View Page Source).
Exercises
Lab Exercise
Objective: Install, start, and contribute content to an Apache web site. Estimated Time: 45 mins.
This exercise has you download and install material for your web server, using the web server's default configuration. The material consists of three texts which are not optimally organized for the Apache web server. The lab has you perform some simple renaming and repositioning of the material so that it is more naturally viewed using a web browser.
Specification
1. If the httpd package is not already installed on your machine, install it now.
2. Start the httpd service (if it is not already started), and configure the service to be started by default upon reboots.
3. Download a copy of the file http://rha-server/pub/rha/rha230/readings.tgz, and extract its contents into your web server's document root directory (/var/www/html/). Properly extracting the contents should result in a new /var/www/html/readings directory. [1]
4. Using a web browser, browse the http://localhost/readings directory. You should be able to view the HTML files the_god_of_mars.html and war_of_the_worlds appropriately.
5. Correct a misnamed index file.
   a. Again using a web browser, examine the contents of the http://localhost/readings/relat10h/ subdirectory. You should discover the file index.htm. Try examining this file through the web browser: http://localhost/readings/relat10h/index.htm.
   b. Apparently, the intent of the author was that this page should serve as an index page, but the file is named incorrectly for Apache's default configuration. In the /var/www/html/readings/relat10h/ directory, create a link of index.htm named index.html (either hard or soft).
   c. Using a browser, again view the URL http://localhost/readings/relat10h/. You should now see the contents of the index page.
   d. To make life a little easier for anyone browsing your site, in the /var/www/html/readings directory, create a symlink to the relat10h directory called relativity.
   e. Confirm that you may now access the content of the file index.htm using http://localhost/readings/relativity/.
6. Correct a misnamed directory.
   a. If you can stomach the physics (and, in fact, even if you cannot), skim the first appendix to Einstein's theory of relativity, either by following the link from the main page, or by referencing http://localhost/readings/relativity/ap01.htm directly.
   b. You might notice that many of the equations, such as equation 29, equation 30, etc., are missing. Examine the end of /var/log/httpd/access_log, and note the many requested image files which received a 404 ("File Not Found") response code.
   c. Examine the end of the file /var/log/httpd/error_log, and you will discover more helpful messages.
[root@station ~]# tail /var/log/httpd/error_log
[Tue Jul 20 16:53:14 2005] [error] [client 127.0.0.1] File does not exist: /var/www/html/readings/relat10h/pics, referer: http://localhost/readings/relat10h/ap01.htm
...
   d. Examining the log messages closely, you may discover the problem. All of the web pages are expecting images to be in a directory named pics, but this directory does not exist.
   e. Through a simple directory renaming, or perhaps another symlink, solve the problem, so that all of the images of equations are properly displayed.
7. Now that you have completed the hard work, relax a little by deriving the equation for the Lorentz transformation, following the steps in chapter 11. Place your results in a file titled that_was_easy in your academy user's home directory. (Just kidding.)
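The two links requested in step 5 can be rehearsed safely before touching the real document root. The Python sketch below performs the equivalent of ln -s inside a scratch directory that stands in for /var/www/html/readings; the function name make_links and the scratch layout are assumptions for illustration, and step 6 is deliberately left as an exercise.

```python
import os
import tempfile


def make_links(readings_dir):
    """Create the two symbolic links described in step 5 inside readings_dir."""
    relat = os.path.join(readings_dir, "relat10h")
    # Step 5b: index.html -> index.htm (relative link within the same directory).
    os.symlink("index.htm", os.path.join(relat, "index.html"))
    # Step 5d: relativity -> relat10h (relative link within the same directory).
    os.symlink("relat10h", os.path.join(readings_dir, "relativity"))


# Build a stand-in for /var/www/html/readings in a temporary directory.
scratch = tempfile.mkdtemp()
readings = os.path.join(scratch, "readings")
os.makedirs(os.path.join(readings, "relat10h"))
with open(os.path.join(readings, "relat10h", "index.htm"), "w") as f:
    f.write("<html><body>contents</body></html>\n")

make_links(readings)
```

Because both links are relative, they keep working if the readings directory is later moved as a whole.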
Deliverables
1. An installed and running httpd service, configured to start by default on bootup.
2. The text of three books, browsable from the URL http://localhost/readings.
3. The table of contents of Einstein's theory of relativity at http://localhost/readings/relat10h.
4. The table of contents of Einstein's theory of relativity, also at http://localhost/readings/relativity.
5. The images of equations in appendix 1 (found at http://localhost/readings/relativity/ap01.htm) are displayed properly.
Clean Up
You will want to leave the /var/www/html/readings directory in place, as you will need it in the next section.
Questions
1. In Red Hat Enterprise Linux 5, which of the following packages provides the Apache web server?
( ) a. httpd
( ) b. apache
( ) c. webserver
( ) d. apr
( ) e. None of the above

2. Which of the following directories serves as the web server's document root?
( ) a. /opt/docroot
( ) b. /var/pub/
( ) c. /var/www/html/
( ) d. /etc/httpd
( ) e. None of the above

After migrating the contents of a web site from one operating system to another, web clients, when viewing the URL http://localhost/zsh.txt, are displaying raw HTML instead of a formatted page:
3. What is the simplest solution to the problem?
( ) a. Install the mod_html package.
( ) b. Create an index.html file to reference this page.
( ) c. Use the txt2html utility to assign the file the HTML file type.
( ) d. Rename the file to zsh.html.
( ) e. Use chcon to assign the file the appropriate SELinux context.

Use the output of the following command to answer the next question, assuming the default Red Hat Enterprise Linux configuration of the Apache web server.
[root@station1 ~]# ls /usr/share/backgrounds/*
/usr/share/backgrounds/images:
default.png      ladybugs.jpg   riverstreet_rail.jpg
dewdop_leaf.jpg  leafdrops.jpg  sneaking_branch.jpg
...

/usr/share/backgrounds/tiles:
3dgreen.png            dunes.png   Planning-And-Probing-1.jpg
All-Good-People-1.jpg  fibers.png  plasma.png
...
4. What would you expect to see if you pointed the Firefox web browser to the URL http://localhost/backgrounds/images/?
( ) a. A dynamically generated web page which displays the images as pictures.
( ) b. A "404: File not Found" error page.
( ) c. A "403: Forbidden" error page.
( ) d. A page containing binary data, because the web server tries to interpret the directory as if it were a file.
( ) e. A dynamically generated web page which lists the contents of the directory by filename.

5. If, when the directory above is referenced, you would prefer web clients to see the contents of a file, what should the relevant file be named?
( ) a. README.html
( ) b. HEADER.html
( ) c. index.htm
( ) d. DIR.htm
( ) e. None of the above

6. In what file are all web requests from clients ("hits") logged?
( ) a. /var/log/secure
( ) b. /var/log/httpd/error_log
( ) c. /var/log/messages
( ) d. /var/log/httpd/access_log
( ) e. Both C and D

7. If, when running service httpd start, the webserver fails to start, what file might contain helpful debugging messages?
( ) a. /var/log/secure
( ) b. /var/log/xferlog
( ) c. /var/log/httpd/error_log
( ) d. /var/log/httpd/access_log
( ) e. Both B and D

8. In what file are web requests that generate errors logged?
( ) a. /var/log/secure
( ) b. /var/log/httpd/error_log
( ) c. /var/log/messages
( ) d. /var/log/httpd/access_log
( ) e. Both B and D
9. Which is the web server's "well known" port?
( ) a. 8080
( ) b. 22
( ) c. 25
( ) d. 80

10. Apache's dynamically loaded modules are conventionally found in what directory?
( ) a. /usr/lib/httpd/modules
( ) b. /usr/lib/apache
( ) c. /usr/libexec/apache
( ) d. /usr/share/httpd/modules
( ) e. None of the above
Notes
1. An excellent source for public domain texts is Project Gutenberg (http://www.gutenberg.org).
The Apache server is configured using the /etc/httpd/conf/httpd.conf and /etc/httpd/conf.d/*.conf configuration files. The configuration file is informally divided into the Global, Main, and Virtual Server sections. The Global section defines aspects which pertain to the server as a whole, including client connection dynamics, server pool parameters, binding address, and which modules to load. The Main section defines aspects which may be redefined by any virtual server, such as the document root, logging behavior, and URL namespace remappings. Comprehensive documentation is provided by the httpd-manual package, which, when installed, can be accessed at http://localhost/manual.
Discussion
Apache Configuration: /etc/httpd/conf/httpd.conf
The Apache web server is configured with text configuration files which are read upon startup. The primary configuration file is /etc/httpd/conf/httpd.conf, but the files /etc/httpd/conf.d/*.conf are "slurped up" into the configuration, as well.
[root@station ~]# ls /etc/httpd/conf /etc/httpd/conf.d/
The Apache configuration file syntax is straightforward, and tends to be well documented (both as comments in the default configuration file, and in a separate manual to be discussed later). A sample of the configuration file's syntax follows.
#
# DocumentRoot: The directory out of which you will serve your
# documents. By default, all requests are taken from this directory, but
# symbolic links and aliases may be used to point to other locations.
#
DocumentRoot "/var/www/html"

#
# Each directory to which Apache has access can be configured with respect
# to which services and features are allowed and/or disabled in that
Any empty line, or line which begins with a hash ("#"), is considered a comment. Any line which is not a comment generally starts with a keyword referred to as a directive. Directive names are not case sensitive, but of course spelling is important. The syntax of the remainder of the line depends on the directive, but all of a directive's arguments must occur on a single line (a line ending in a backslash is continued onto the next). The only other way a line can begin is with an XML-like tag, which begins a container. Containers end with an XML-like closing tag. Generally, all directives found within a container only take effect within the scope of the container. We will discuss the effects of different types of containers in a later lesson.
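The line-by-line rules just described can be summarized in a few lines of Python. This is only a rough sketch of the classification, not Apache's actual parser: the function name classify_line is an assumption, and backslash continuations and quoting are ignored.

```python
def classify_line(line):
    """Classify one configuration line by how it begins."""
    text = line.strip()
    if not text or text.startswith("#"):
        return ("comment", None)
    if text.startswith("</"):
        # Closing tag of a container, e.g. </IfModule>
        return ("container-close", text[2:].rstrip(">").strip().lower())
    if text.startswith("<"):
        # Opening tag: the first word inside the tag names the container.
        return ("container-open", text[1:].rstrip(">").split()[0].lower())
    # Anything else begins with a directive keyword (names are case-insensitive).
    return ("directive", text.split()[0].lower())
```

Lower-casing the keyword reflects the fact that DocumentRoot and documentroot name the same directive.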
The file is thought of as occurring in three sections, although the syntax does not formally enforce them.
1. The Global Section: This section contains configuration which applies to the web server as a whole, including any virtual servers.
2. The Main Section: Configuration which applies to the main server (as opposed to any virtual servers) belongs in this section. Any configuration in this section can be overridden by a virtual server.
3. Virtual Servers: The Apache web server can take on the appearance of being multiple distinct servers. Virtual servers will be discussed in more detail in the next lesson.
We begin by examining configuration relevant to the server as a whole. You might want to open the file /etc/httpd/conf/httpd.conf in a pager or text editor and follow along as you read the following sections. (You should consider setting the editor into a "read only" mode, or making a backup of the file and browsing the backup.)
The ServerRoot directive establishes context for future file references within the configuration file. Any relative file reference (one that does not begin with a "/") will be relative to the ServerRoot, which in Red Hat Enterprise Linux is /etc/httpd.
In Unix, daemons traditionally record the fact that they are running by creating a file in the filesystem which contains their process id, called a pid file. The PidFile directive specifies where this file should be located.
Examining the /etc/httpd directory, we find it's populated with several symbolic links.
[root@station ~]$ ls -l /etc/httpd
4 root 4096 Jul 25 06:33 conf
2 root 4096 Jul 25 06:33 conf.d
1 root   19 Jul 25 06:33 logs -> ../../var/log/httpd
1 root   27 Jul 25 06:33 modules -> ../../usr/lib/httpd/modules
1 root   13 Jul 25 06:33 run -> ../../var/run
In the httpd.conf configuration file, file references that begin logs/, modules/, or run/ are mapped to the relevant directories. Can you convince yourself that the daemon's pid file would be found at /var/run/httpd.pid? It's important to understand the role of the ServerRoot directive, and the use of the symbolic links in the /etc/httpd directory, but there's seldom any reason to change these values.
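The pid file question can be checked mechanically. The Python sketch below recreates the run symbolic link under a scratch directory and resolves a relative reference against a stand-in ServerRoot. It assumes a PidFile value of run/httpd.pid (the text implies this but does not show it), and the function name resolve is hypothetical.

```python
import os
import tempfile


def resolve(server_root, reference):
    """Anchor a relative file reference at server_root, then follow symlinks."""
    if not os.path.isabs(reference):
        reference = os.path.join(server_root, reference)
    return os.path.realpath(reference)   # follows links such as run -> ../../var/run


# Recreate the relevant part of the layout under a scratch root directory.
root = os.path.realpath(tempfile.mkdtemp())
os.makedirs(os.path.join(root, "etc", "httpd"))
os.makedirs(os.path.join(root, "var", "run"))
os.symlink("../../var/run", os.path.join(root, "etc", "httpd", "run"))

# Assumed PidFile value: run/httpd.pid
pid_path = resolve(os.path.join(root, "etc", "httpd"), "run/httpd.pid")
```

Relative to the scratch root, pid_path ends in var/run/httpd.pid, which mirrors the /var/run/httpd.pid location on a real system.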
#
# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive Off

#
# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 100

#
# KeepAliveTimeout: Number of seconds to wait for the next request from the
# same client on the same connection.
#
KeepAliveTimeout 15
A particular httpd process can only communicate with one client at a time. A badly behaved client, which opens a TCP/IP connection but never uses it, could therefore tie up a server indefinitely. The Timeout directive specifies how long, in seconds, a server waits before terminating a connection with a badly behaved client.
The KeepAlive, MaxKeepAliveRequests, and KeepAliveTimeout directives decide if the server honors "Keep Alive" requests from a client, how many requests can be made over a "Keep Alive" connection, and how long before an inactive connection should time out. The HTTP protocol is termed a "stateless" protocol, meaning that the server doesn't record any information about the client between one request and the next. In the original HTTP/1.0 protocol, clients are required to open a new socket for every request. Downloading a web page with 10 images, therefore, would require the client to open 11 sockets (one for the page, and one for each referenced image). The HTTP/1.1 protocol tried to improve efficiency by allowing a client to leave a single socket open for "follow up" requests. Such a persistent socket is called a "Keep Alive" socket. Clients are more likely to abuse such persistent connections, however, by leaving them open but not making any followup requests, so stricter timeout values are usually assigned to such connections.
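The socket arithmetic in the example can be written out as a small Python function. It is a simplified model under stated assumptions: requests are issued one after another, no connection times out, and the client never opens parallel connections. The function name connections_needed is hypothetical.

```python
import math


def connections_needed(requests, keep_alive, max_keepalive_requests=100):
    """Count the TCP connections needed to issue `requests` sequential requests."""
    if not keep_alive:
        return requests                      # one socket per request
    if max_keepalive_requests == 0:
        return 1 if requests else 0          # 0 means unlimited requests per connection
    return math.ceil(requests / max_keepalive_requests)
```

For the page with 10 images, connections_needed(11, False) gives the 11 sockets described above, while connections_needed(11, True) gives 1; lowering the limit to 5 requests per connection raises the count to 3.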
The following directives manage the dynamics of the server pool.
Figure 2-4. /etc/httpd/conf/httpd.conf
# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# ServerLimit: maximum value for MaxClients for the lifetime of the server
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule prefork.c>
StartServers       8
MinSpareServers    5
MaxSpareServers   20
ServerLimit      256
MaxClients       256
MaxRequestsPerChild  4000
</IfModule>
StartServers: The initial size of the server pool (in number of processes).
{Min,Max}SpareServers: The server pool scales dynamically. If a web server gets blitzed with many requests, more child daemons will be started. If things go quiet, unused child daemons will be killed. These directives place bounds on the number of idle (spare) processes kept in the pool.
ServerLimit, MaxClients: The number of concurrent requests can be limited. Connection requests above this limit are not handled right away; they normally wait in a queue (sized by the ListenBacklog directive) until a child process becomes free. The distinction between the ServerLimit and MaxClients directives is subtle, and in practice they are set together to the same value.
MaxRequestsPerChild: In order to improve stability, a given child daemon will only serve so many requests before it exits and a new daemon is started. (This periodic restart helps curtail memory leaks in poorly written libraries and CGI executables.)
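The interplay of the spare-server bounds and MaxClients can be modeled in a few lines of Python. This is a deliberately simplified sketch, not the server's real algorithm (which adjusts the pool gradually over time); the function name adjust_pool is an assumption, and the defaults are the values from the figure.

```python
def adjust_pool(busy, idle, min_spare=5, max_spare=20, max_clients=256):
    """Return the change in process count for one maintenance pass.

    Positive: start that many children. Negative: stop that many idle ones.
    """
    if idle < min_spare:
        wanted = min_spare - idle
        room = max_clients - (busy + idle)    # never exceed MaxClients in total
        return max(0, min(wanted, room))
    if idle > max_spare:
        return -(idle - max_spare)
    return 0
```

With 10 busy and 2 idle processes the model asks for 3 more children; with 30 idle processes and no load it retires 10 of them; once 256 processes are busy it can start no more.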
The Listen directive controls which address the server binds to. In the default configuration (above), the server binds to the wildcard IP address 0.0.0.0 (implying every active interface), port 80. Multiple Listen lines can be used to specify that the daemon should bind to multiple ports and/or addresses.
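How a Listen argument breaks down into an address and a port can be sketched in Python as follows. The function name parse_listen and the sample addresses in the usage note are assumptions for illustration; the sketch handles only the plain port and address:port forms.

```python
def parse_listen(argument):
    """Return (address, port) for a Listen argument such as "80" or "10.0.0.1:8080"."""
    address, sep, port = argument.rpartition(":")
    if not sep:
        address = "0.0.0.0"               # no address given: every active interface
    return address.strip("[]"), int(port)  # brackets surround IPv6 literals
```

For example, parse_listen("80") yields the wildcard address with port 80, while parse_listen("192.168.0.1:8080") restricts the binding to a single address.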
The various modules tend to introduce new configuration directives to modify their behavior. For example, the log_config_module provides the LogFormat directive, which we will encounter later. In the configuration file, the module must be loaded (with LoadModule) before any directives it provides are encountered.
In order to ease the distribution of modules using a package managed system (such as RPM), the Include directive specifies external configuration files to include, either directly or by using
The ServerAdmin directive is mainly cosmetic. The email address is listed in the footer of the default error pages.
For simple hosts, with a single external interface and therefore a clear concept of a hostname, the ServerName can be automatically determined. If in doubt, however, it should be specified manually. (For example, if the server is bound to multiple interfaces, the preferred name should be configured explicitly.)
Notice that if multiple file names are specified, each will be searched for in sequence. Specifying too many alternatives, however, could lead to poor performance. For example, if migrating content from a Microsoft based server, setting DirectoryIndex to the following would be easier than renaming every file named index.htm to index.html.
DirectoryIndex index.html index.htm
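The "searched for in sequence" behavior amounts to a first-match lookup, sketched below in Python. The function name pick_index is hypothetical, and the directory listing is passed in as a plain list of file names rather than read from disk.

```python
def pick_index(directory_listing, directory_index=("index.html", "index.htm")):
    """Return the first DirectoryIndex candidate present in the listing, else None."""
    present = set(directory_listing)
    for candidate in directory_index:       # candidates are tried in the order given
        if candidate in present:
            return candidate
    return None
```

Every additional candidate is one more lookup per directory request when the earlier names are absent, which is the performance cost mentioned above.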
The web server can easily determine the IP address of any client which is making a web request: it's part of the request's IP protocol header. In order to determine the hostname of the client, however, the web server must work harder: it must perform a reverse DNS lookup on the client's IP address. This reverse lookup increases both time and network traffic on the part of the server, so by default, it's disabled. As a result, all logging and access control lists are implemented by IP address, not by hostname. If you desire logs and access control lists to use client hostnames instead of IP addresses, and are willing to pay the price in performance, HostnameLookups can be set to On.
By default, the web server logs errors to the file /var/log/httpd/error_log (recall the role of the ServerRoot directive, and the /etc/httpd/logs symlink). For the main server, it's hard to think of a reason to ever change it, though virtual hosts often override it. More interesting is the LogLevel, which determines how much information is logged. The vocabulary draws directly from the syslog service. When troubleshooting, an administrator often ratchets up the logging by setting the LogLevel to debug, for example. Of course, more copious logging slows down overall performance, so once a problem has been resolved, logging is returned to a more suitable default.
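The effect of LogLevel is a threshold over an ordered list of severities, which the following Python sketch makes explicit. The level names are the standard syslog vocabulary; the function name is_logged and the levels chosen in the usage note are illustrative assumptions.

```python
# Severity names shared with syslog, from most to least severe.
LEVELS = ["emerg", "alert", "crit", "error", "warn", "notice", "info", "debug"]


def is_logged(message_level, configured_level):
    """True if a message of message_level passes a LogLevel of configured_level."""
    return LEVELS.index(message_level) <= LEVELS.index(configured_level)
```

With the threshold at warn, an error message is recorded but a debug message is dropped; raising the threshold to debug records everything.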
Transaction Logging: LogFormat and CustomLog
For every web request, there is a large amount of information that an administrator can choose to log (or not). Such transaction logs are often referred to as "access logs". The LogFormat directive allows administrators to assign names to collections of information, so that they are easy to refer to later. This is all LogFormat does, however. In order to use one of the formats, it must be associated with a log file using CustomLog.
Figure 2-13. /etc/httpd/conf/httpd.conf
#
# The following directives define some format nicknames for use with
# a CustomLog directive (see below).
#
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

# "combinedio" includes actual counts of actual bytes received (%I) and sent (%O); this
# requires the mod_logio module to be loaded.
#LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedi
The following table illustrates some of the parameters most commonly used in access logs.
Table 2-1. Apache Log Parameters
Parameter  References                             Example
%h         Remote host (IP or hostname)           127.0.0.1
%u         Remote user (for HTTP authentication)  elvis
%t         Timestamp                              [15/Jul/2005:06:55:44 -0400]
%r         Request line (from HTTP protocol)      GET /icons/compressed.gif HTTP/1.1
%s         HTTP response status code              200
%b         Response size (in bytes)               1079
                                                  (depends on name)
Many more exist as well. As usual, with all of this flexibility comes the need for convention. Two commonly used conventions are the common format and the combined format, which are the first two formats defined above. The common format records IP address, username (if any), timestamp, request line, response status, and number of bytes transferred. [1] The combined format adds the identity of the client application, and the referring page (if any).
While the combined format is used by default in Red Hat Enterprise Linux, administrators could well choose to drop back to the common format to save space and improve performance. Many external log analysis utilities (such as webalizer) rely on logs being in a standard format, so an administrator should consider the consequences before changing the log format arbitrarily.
Finally, once a format has been decided, it can be associated with a log file using the CustomLog directive.
Figure 2-14. /etc/httpd/conf/httpd.conf
# The location and format of the access logfile (Common Logfile Format).
# If you do not define any access logfiles within a <VirtualHost>
# container, they will be logged here. Contrariwise, if you *do*
# define per-<VirtualHost> access logfiles, transactions will be
# logged therein and *not* in this file.
#
#CustomLog logs/access_log common

#
# If you would like to have separate agent and referer logfiles, uncomment
# the following directives.
#
#CustomLog logs/referer_log referer
#CustomLog logs/agent_log agent

#
# For a single logfile with access, agent, and referer information
# (Combined Logfile Format), use the following directive:
#
CustomLog logs/access_log combined
As the above configuration suggests, multiple log files, each containing different information, could be updated with each hit, though of course performance is a consideration. By default, Red Hat Enterprise Linux only updates the single file /var/log/httpd/access_log, using the combined format.
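As a rough illustration of logging each hit to more than one file, the following sketch writes every request twice: once in the combined format to the default access log, and once in the common format to a second file. The LogFormat strings are the conventional definitions of the two formats. The second file name, logs/access_common_log, is a made-up name for illustration and, like the first, is interpreted relative to the ServerRoot.

```apache
# Conventional definitions of the two standard formats.
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

# Default access log, in the combined format.
CustomLog logs/access_log combined

# Hypothetical second log, in the more compact common format.
CustomLog logs/access_common_log common
```

Each additional CustomLog line costs one more write per request, which is the performance consideration mentioned above.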
As an example, the default Red Hat Enterprise Linux configuration aliases http://localhost/icons/ to the directory /var/www/icons/, which is not underneath the document root, but a sibling of it. The remapping should be easy enough to confirm by following the above link and running ls on the icons directory. For better or for worse, we now have a way to expose portions of our filesystem which are not under the document root. Another option is the use of symbolic links, which will be discussed in more detail shortly.

Also, notice the comments about trailing slashes, which have often been a source of confusion. The Apache webserver automatically redirects clients which refer to directories without the trailing slash to an equivalent URL which does (watch closely as you access http://localhost/example, and note that the browser ends up showing the omitted trailing slash). This causes some directory related configuration which doesn't specify the omitted slash to be interpreted twice, which can cause confusion.
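A minimal sketch of the Alias directive may make the trailing slash issue more concrete. The first line is the icons mapping described above. The second alias and both of its names (/downloads/ and /srv/downloads/) are assumptions invented for illustration.

```apache
# Mapping described above: /icons/ in the URL namespace refers
# to a directory which is not underneath the document root.
Alias /icons/ "/var/www/icons/"

# Hypothetical second alias. Keep the trailing slashes consistent:
# either both arguments end in "/", or neither does.
Alias /downloads/ "/srv/downloads/"
```

When the URL side of an alias ends in a slash, a request without the trailing slash (such as http://localhost/downloads) is not matched by the alias.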
The manual provides comprehensive documentation, organized by directive name, module name, or by topic (such as "Log Files" or "Virtual Hosts"). Anyone wishing to quickly refresh memories, or learn more about Apache configuration, should definitely load the manual as well.
Exercises
Lab Exercise
Objective: Configure the Apache web server.
Estimated Time: 45 mins.
Specification
You will probably want to make a backup of the main Apache configuration file (/etc/httpd/conf/httpd.conf) before starting this exercise, so that you can later restore the default configuration.

If you have not already downloaded http://rha-server/pub/rha/rha230/readings.tgz and extracted its contents into the /var/www/html directory (as specified in the previous exercise), do so now.

Edit your Apache configuration so that the server meets the following specifications. The suggested technique is to duplicate the relevant lines of your configuration file, comment out the original configuration, and edit the new line to make your changes. You will probably want to make incremental changes, checking your configuration as you go.

1. Configure the Apache webserver so that it accepts HTTP/1.1 KeepAlive requests, but will only wait 3 seconds for a followup request before closing the connection. Hint: you can confirm this configuration by capturing a transaction between the Firefox browser and your webserver with ethereal, and examining the HTTP headers of both the request and response.
2. Manage the bounds of the server pool, such that there are always between 2 and 4 (inclusive) child daemons present.
3. The Apache server should be bound to port 8888 (of at least the loopback address), in addition to port 80 (on all interfaces). (Note: you will need to drop SELinux into permissive mode in order to allow Apache to bind to a port other than 80 and 443).
4. Configure the web server such that index.htm is recognized as an index file, as well as index.html. Confirm your configuration by removing the file
referencing http://localhost/readings/relat10h/.
5. Configure the server so that clients are logged by hostname (when available) as opposed to IP address. (Hint: You are not expected to need to edit any LogFormat directives).
6. Set the log level for the error log to debug.
7. In addition to the default logging, have every web request logged to the file /var/log/httpd/common_log, using what is commonly referred to as the common format.
8. In the separate configuration file /etc/httpd/conf.d/rha.conf, establish an Alias, so that the URL http://localhost/images/ refers to the directory /var/www/html/readings/relat10h/pics. (If the relevant directory is still named picts, rename it or symlink it to pics).
Deliverables
1. A running Apache webserver that accepts Keep-Alive requests, but will close connections after 3 seconds of inactivity.
2. The server should maintain a server pool of between 2 and 4 pre-forked child daemons.
3. The server should be bound to the loopback address's port 8888, in addition to the normal port 80.
4. The server should treat files named index.htm as index files, in addition to the standard index.html.
5. Transaction logging should log clients by hostname, if available.
6. The error log should log all messages with debug and higher priority.
7. In addition to the standard access_log, a transaction log named /var/log/httpd/common_log should be kept, logging in the common format.
8. The URL http://localhost/images/ should resolve to /var/www/html/readings/relat10h/pics, due to an alias established in the /etc/httpd/conf.d/rha.conf configuration file.
Questions
For all of the following questions, assume the default Red Hat Enterprise Linux configuration of the Apache webserver, unless the question states otherwise.

1. Which directory serves as the ServerRoot directory (i.e., the directory used as the base for all relative file references in the configuration file)?
   ( ) a. /var/www/html
2. Which file(s) is (are) used to configure the Apache web server upon startup?
   ( ) a. /etc/httpd/conf/httpd.conf
   ( ) b. /etc/apache.conf
   ( ) c. /etc/httpd/conf.d/*.conf
   ( ) d. /etc/sysconfig/apache
   ( ) e. Both A and C

3. Which of the following directives could be used to improve the performance of a heavily loaded web server?
   ( ) a. KeepAlive
   ( ) b. MaxClients
   ( ) c. MaxSpareServers
   ( ) d. Timeout
   ( ) e. All of the above

4. Which of the following directives can be used to defend against memory leaks and other instabilities in poorly written libraries and CGI scripts?
   ( ) a. MaxClients
   ( ) b. MaxRequestsPerChild
   ( ) c. ServerLimit
   ( ) d. KeepAlive
   ( ) e. Listen

5. Which of the following best describes the default Apache server model?
   ( ) a. The server uses a traditional Unix forking model, where a new daemon is forked to handle connections for a particular client.
   ( ) b. The server uses a pre-forking model, whereby clients are distributed amongst a dynamic pool of pre-existing daemons.
   ( ) c. The server uses a multi-threaded model, whereby a single process clones multiple threads, each handling a distinct client.
   ( ) d. The server uses a single process polling model, whereby the single process polls a collection of active connections for activity.
6. Which of the following lines would cause the web server to bind to port 8080 on the loopback address?
   ( ) a. Bind 127.0.0.1:8080
   ( ) b. Bind 127.0.0.1 8080
   ( ) c. Listen 127.0.0.1:8080
   ( ) d. Listen 127.0.0.1 8080
   ( ) e. None of the above

7. The Apache manual states that %h is used to log the remote hostname or IP address. Yet, even using this parameter, an administrator finds that a log file logs using IP addresses instead. Which of the following configurations would allow client hostnames to be logged?
   ( ) a. DNS /etc/resolv.conf
   ( ) b. HostnameLookups On
   ( ) c. LogNames On
   ( ) d. LogLevel info
   ( ) e. None of the above

8. Which of the following directives would have the same end effect as cd /var/www/html/data; ln -s ../images images ?
   ( ) a. Alias /data/images/ /var/www/html/images/
   ( ) b. Symlink /images/ /data/images/
   ( ) c. Alias /images/ /data/images/
   ( ) d. View /var/www/html/images/ /data/images/
   ( ) e. None of the above

9. Assuming the httpd-manual package is installed, where can Apache documentation be found?
   ( ) a. http://localhost/help
   ( ) b. http://localhost/guide
   ( ) c. http://localhost/apache
   ( ) d. http://localhost/man
   ( ) e. None of the above
10. After editing an Apache configuration file, what should be done for changes to take effect?
   ( ) a. chkconfig httpd on
   ( ) b. service httpd restart
   ( ) c. chkconfig httpd reload
   ( ) d. service httpd status
   ( ) e. No action is required, because the apache daemon actively monitors its configuration file.
Notes
1. The observant might notice the omission of the second field, inevitably a hyphen ("-"). This field used to refer to the username as returned by the legacy identd service, which is seldom implemented today.
The Apache web server allows context dependent configuration through the use of Directory, Location, Files, and VirtualHost containers.
Often, the Options directive is used within containers to allow or disallow symbolic link resolution (with FollowSymLinks) and dynamic directory generation (with Indexes), among other parameters.
Often, the Order, Allow from, and Deny from directives are used within containers to implement access control based on the client's IP address or hostname.
The default Red Hat Enterprise Linux configuration allows the resolution of symbolic links almost everywhere, but limits the generation of dynamic indexes to the intended document root directory.
Dynamic information about the Apache webserver can be obtained using custom handlers which are conventionally associated with the /server-status and /server-info locations.
Discussion
Tailoring Customization to Particular Content: Containers
The Apache webserver allows configuration to be customized to particular files or directories using containers. Containers start with an XMLish opening tag, such as <Directory ...>, and end with an XMLish closing tag, such as </Directory>. Directives found within the container only affect files which fall under the container's scope. There are essentially four types of scoping containers, which are exemplified below and itemized in the following table.

Figure 3-1. Sample Apache Containers
<Directory "/var/www/icons">
    Options Indexes MultiViews
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from .example.com
</Location>

<Files "*.hide">
    Order allow,deny
    Deny from all
</Files>

<VirtualHost *:80>
    ServerAdmin webmaster@dummy-host.example.com
    DocumentRoot /www/docs/dummy-host.example.com
    ServerName dummy-host.example.com
    ErrorLog logs/dummy-host.example.com-error_log
    CustomLog logs/dummy-host.example.com-access_log common
</VirtualHost>
Container    Scope
Directory    All files which exist in or underneath the specified directory in the filesystem, after URL to filename translation occurs.
Location     All files which exist in or underneath the specified location in the URL namespace, before URL to filename translation occurs.
Files        All files which match the specified pattern, no matter where they exist in the filesystem or URL namespace.
VirtualHost  All files served by a particular virtual server. Virtual hosts will be covered in detail in a later lesson.
The argument to the opening tag specifies the relevant file or directory (or, in the case of VirtualHost, IP address). The filename may either be explicit, or shell-like pathname expansion (file globbing) can be used.
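As a small sketch of pathname expansion in container arguments, consider the following two containers. The /home/*/public_html layout and the .bak suffix are assumed examples, not part of the default configuration.

```apache
# Wildcard in a Directory path: "*" matches within a single path
# component (it does not match "/"), so this container applies to
# each user's public_html directory.
<Directory /home/*/public_html>
    Options Indexes
</Directory>

# Wildcard in a Files pattern: matches any file whose name ends
# in .bak, wherever it is found.
<Files "*.bak">
    Order allow,deny
    Deny from all
</Files>
```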
Option                    Effect
Indexes                   When a URL references a directory (as opposed to a regular file), and no index.html file is present (more on this in a bit), and this option is enabled, the web server will return an automatically generated directory listing. If Indexes is disabled, a 403 error page will be returned to the client (Access Forbidden).
FollowSymLinks            This option must be enabled in order for the webserver to resolve (follow) a symbolic link.
SymLinksIfOwnerMatch      A qualification of the FollowSymLinks option, where the symlink will only be followed if the file owner of the resulting file is the same as the file owner of the link itself.
ExecCGI                   Allow CGI executables to be executed from within this scope. (More on these later).
Includes, IncludesNOEXEC  Server side includes are allowed (or, in the latter case, mostly allowed) from within this scope. Server side includes are beyond the scope of this course.
MultiViews                If enabled, content negotiation between the client and the server is supported. This allows a server to serve a document in the most appropriate of multiple languages, for example. Further discussion of MultiViews is beyond the scope of this course.
All                       This option refers to all of the previous options collectively, with the exception of MultiViews. Unless otherwise specified, this is the default configuration. (Recall that in Red Hat Enterprise Linux, however, a different policy applies to the root directory, effectively establishing a different default.)

Why not Indexes?
The decision to allow the web server to automatically generate indexes or not is really a matter of control. If indexes are automatically generated, then merely locating a file underneath the document root allows anyone to view it or copy it (often with automated command line clients such as wget), unless an index.html file is created to hide files within a particular directory. In contrast, if indexes are not allowed, files must be explicitly linked from other files (index.html or otherwise) to be easily discovered. Many low maintenance, public web sites leave indexes on (such as the official Linux kernel repository (http://www.kernel.org/pub/linux)). Other web sites, hoping for a more professional look or more refined control of information, do not.
Why not resolve Symbolic Links?
Again, the decision to allow symlink resolution is basically one of control. If symlinks are not allowed, an administrator has a clear concept of what portions of the file system are exposed through the web server (only files underneath the document root). If symlinks are resolved, however, a symlink underneath the document root could expose any other part of the filesystem.

More subtly, the decision to not resolve symlinks can degrade performance. When resolving a path to reference a file, the kernel automatically resolves symlinks. (If you were to cat the file /foo/biz/baz/buzz, you do not need to worry if the directory biz or baz is actually a symlink). If symlinks are disabled, however, the web server must make a system call on each of the nodes within a file path, asking "is it a symlink? is it a symlink? is it a symlink?" This degradation is one of the reasons why the default Red Hat Enterprise Linux configuration leaves FollowSymLinks enabled.
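Between the two extremes discussed above sits the SymLinksIfOwnerMatch option from the options table. The following is only a sketch of how it might be applied to the document root; it is not the default configuration, and the extra ownership check has a performance cost of its own.

```apache
<Directory /var/www/html>
    # Follow a symlink only when the link and the file it
    # resolves to are owned by the same user.
    Options Indexes SymLinksIfOwnerMatch
</Directory>
```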
Options Syntax
The Options directive takes effect for the scope specified by its enclosing container. For example, the following container would enable indexes and symlink resolution for all files underneath the directory /var/www/html.

<Directory /var/www/html>
    Options FollowSymLinks Indexes
</Directory>
The following container, however, would enable indexes and server side includes underneath /var/www/html/widgets.
<Directory /var/www/html/widgets>
    Options Indexes Includes
</Directory>
The directory /var/www/html/widgets does not inherit its options from /var/www/html, but instead gets its configuration entirely from the new Options line. Because FollowSymLinks is not mentioned, symlinks underneath /var/www/html/widgets will not be resolved.

In contrast, options can be preceded by a "+" or "-", implying that options should be inherited from the enclosing scope, with the simple addition or stripping of a particular option. Consider rewriting the above container as follows.

<Directory /var/www/html/widgets>
    Options +Includes
</Directory>
In this case, the /var/www/html/widgets directory would have Includes, Indexes, and FollowSymLinks enabled (the latter two inherited from /var/www/html). Similarly, the following container would leave /var/www/html/widgets with only the FollowSymLinks option enabled.

<Directory /var/www/html/widgets>
    Options -Indexes
</Directory>
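The "+" and "-" forms can also be combined on one line. As a sketch, assuming the enclosing /var/www/html container still sets Options FollowSymLinks Indexes as in the first example of this section, the following would leave /var/www/html/widgets with FollowSymLinks and Includes enabled, and Indexes disabled.

```apache
<Directory /var/www/html/widgets>
    # Inherit from /var/www/html, then add Includes and strip Indexes.
    Options +Includes -Indexes
</Directory>
```

It is safest to give every option on such a line a sign; mixing signed and unsigned options on the same Options line is best avoided.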
The client_specification is composed of a whitespace separated list of any of the following elements.

Table 3-3. Apache ACL client specification Syntax

Example                        Meaning
ALL                            All clients
192.168.0.3                    The specified client
172.63.                        All clients whose IP address begins as specified
192.168.1.64/255.255.255.192   All clients who belong to the specified subnet
192.168.1.64/26                All clients who belong to the specified subnet (this example is completely equivalent to the preceding example)
.example.com                   All clients whose reverse lookup domain name ends as specified (Apache performs the reverse lookup itself for such rules, regardless of the HostnameLookups setting)
The Deny Directive
The Deny directive uses an identical syntax to specify which clients are not allowed to connect to a given resource.
Deny from client_specification
The client_specification is composed of the same elements as for the Allow directive.
The Order directive
Here's where things get interesting. Whenever client ACLs are specified with the Allow and Deny directives, the order of precedence must be specified with the Order directive. The Order directive usually comes in one of two forms.
Order Allow,Deny
In this case, any clients which are unspecified (not matching any rule) or overspecified (they match both an allow and deny rule) are denied.
Order Deny,Allow
In this case, any clients which are unspecified or overspecified are allowed. Surprisingly, no spaces are allowed around the comma in either case. Some examples are in order.

Example 1

<Directory /some/sensitive/content>
    Order Deny,Allow
    Deny from All
    Allow from 192.168.0.
</Directory>
In this case, only clients from within the 192.168.0.0/255.255.255.0 subnet are allowed to access files underneath /some/sensitive/content.

Example 2

<Directory /keep/them/out>
    Order Allow,Deny
    Allow from 192.168.0.
    Deny from 192.168.0.4
</Directory>
In this case, clients from within the 192.168.0.0/255.255.255.0 subnet are allowed to access files underneath /keep/them/out, except for client 192.168.0.4. All clients outside of the subnet are not allowed access.

Example 3

<Directory /only/for/example>
    HostNameLookups on
    Order Allow,Deny
    Allow from .example.com
</Directory>
In this case, clients from within the example.com domain are allowed to access files underneath /only/for/example.

If you are having trouble figuring out how the term "order" applies to the effect of the Order directive, your author sympathizes. However, with a little experience, a certain sense of the syntax can be made. Until then, make sure that you confirm any ACLs by actually trying to access the material from the appropriate clients.
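One way to get a feel for the Order directive is to take the rules from Example 2 and flip only the Order line. The following sketch does exactly that; nothing else is changed.

```apache
# Same Allow and Deny rules as Example 2, but with the opposite Order.
<Directory /keep/them/out>
    Order Deny,Allow
    Allow from 192.168.0.
    Deny from 192.168.0.4
</Directory>
```

With Order Deny,Allow, the overspecified client 192.168.0.4 (which matches both rules) is allowed, and the unspecified clients outside of the subnet are allowed as well. The net effect is that everyone is allowed, which is almost certainly not what was intended.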
In this case, the "/" in the opening tag is not syntax, but a reference to the root directory. So from the root directory on down (i.e., everywhere), the specified policies apply. Specifically, the only allowed Option is FollowSymLinks, and no overrides are allowed.

The next container loosens things up a bit for the directory /var/www/html. (Why was this directory picked for special attention?)

Figure 3-3. /etc/httpd/conf/httpd.conf
<Directory "/var/www/html">
#
# Possible values for the Options directive are "None", "All",
# or any combination of:
#   Indexes Includes FollowSymLinks SymLinksifOwnerMatch ExecCGI MultiViews
#
# Note that "MultiViews" must be named *explicitly* --- "Options All"
# doesn't give it to you.
#
# The Options directive is both complicated and important. Please see
# http://httpd.apache.org/docs-2.0/mod/core.html#options
# for more information.
#
    Options Indexes FollowSymLinks
#
# AllowOverride controls what directives may be placed in .htaccess files.
# It can be "All", "None", or any combination of the keywords:
#   Options FileInfo AuthConfig Limit
#
    AllowOverride None
In answer to the above question, access to content beneath /var/www/html is loosened a bit because that directory contains the expected content to be served from the webserver. The container also contains some client access control configuration, but only as an example, as the effect of the configuration is to allow everyone.
Both of these provide examples of virtual locations, in that, if enabled (and customized a bit), the server would respond to requests for http://localhost/server-info and http://localhost/server-status. The URLs do not map to any particular directory on the filesystem, however, so a Directory container would have been inappropriate.

Each of these containers implements a custom handler using the SetHandler directive. A thorough discussion of the concept of a handler is beyond the scope of the current class, but essentially a handler determines how the server responds to a request. The default handler, which returns the contents of the referenced file to the client, is the only handler we've encountered so far. Other handlers allow the web server to respond differently to requests.
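As a sketch of what an enabled and customized pair of containers might look like, the following restricts both virtual locations to the loopback address. The choice of 127.0.0.1 is an assumption made for illustration, and the mod_status and mod_info modules which provide the two handlers are assumed to be loaded.

```apache
# Optional: report more detail on the status page.
ExtendedStatus On

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>

<Location /server-info>
    SetHandler server-info
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```

Because both pages reveal details about the server, it is usually wise to keep the Allow list as narrow as possible.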
The Apache web server responds to http://localhost/server-status with a page of status information similar to the following.

Figure 3-5. Apache Web Server Status Page
With this configuration active, the Apache web server responds to http://localhost/server-info with a page of configuration information similar to the following.

Figure 3-6. Apache Web Server Status Page
Exercises
Lab Exercise
Objective: Configure the Apache web server using containers.
Estimated Time: 45 mins.
Specification

If you have not already downloaded http://rha-server/pub/rha/rha230/readings.tgz and extracted its contents into the /var/www/html directory (as specified in the previous exercise), do so now. Also, apply the image directory fix by renaming /var/www/html/readings/relat10h/picts to /var/www/html/readings/relat10h/pics if you have not already somehow resolved the problem.

Edit your Apache configuration so that the server meets the following specifications. Place all of your configuration in the file /etc/httpd/conf.d/rha.conf. You should be starting with a directory structure similar to the following.
[root@station ~]# tree /var/www/html/readings/
/var/www/html/readings/
|-- relat10h
|   |-- ap01.htm
|   |-- ap02.htm
|   |-- ...
|   |-- index.htm
|   |-- index.html -> index.htm
|   |-- pics -> picts/
|   |-- picts
|   |   |-- arrow.gif
|   |   |-- eq01.gif
|   |   |-- ...
|   |-- preface.htm
|   `-- works-blue.css
|-- relativity -> relat10h/
|-- the_god_of_mars.html
`-- war_of_the_worlds.html
1. You decide that the use of symbolic links makes it too difficult to maintain control over a web site. Set options such that symbolic links are disabled everywhere underneath the /var/www/html/readings directory. (Notice that if you solved the image directory name problem with a symbolic link, you will now need to rename the directory instead.)
2. You are willing to allow people to read Einstein's relativity starting from the table of contents, but do not want people browsing the directory structure directly. Disable directory indexes underneath the /var/www/html/readings/relat10h directory.
3. You decide that you would like to restrict access to all of the readings to local clients only. Implement a policy whereby the content underneath the /var/www/html/readings directory is only available to clients whose IP address starts with 127.0.
4. However, you would like your graphics consultant to be able to review your images. For the directory /var/www/html/readings/relat10h/pics, allow access to all clients whose IP address starts with 127.0., and to the special IP address 127.1.1.1. Also, enable directory indexes for this directory.
5. Because symbolic links are now disabled, you will no longer be able to make use of the relativity symbolic link to access the relat10h directory. Instead, establish an alias such that http://localhost/readings/relativity references the /var/www/html/readings/relat10h directory.
6. Again, because symbolic links are now disabled, you will no longer be able to solve the index.htm problem with a symbolic link. Make sure that index.htm is considered a directory index as well. (Note, you might just need to make sure that your implementation from the previous lesson's exercise is still in place.)
7. You would like to monitor the performance of your web server. In the main /etc/httpd/conf/httpd.conf configuration file, enable the http://localhost/server-status and http://localhost/server-info location containers, so that you may view dynamically generated performance and configuration information.
8. You would like your graphics consultant to be able to monitor the performance as well, so allow both 127.0.0.1 and 127.1.1.1 to access these locations, but only these IP addresses.
Deliverables
1. The web server will not resolve symbolic links underneath the /var/www/html/readings directory.
2. The web server will not generate directory indexes underneath the /var/www/html/readings/relat10h directory.
3. Only clients whose IP address begins with 127.0 may access content under the /var/www/html/readings directory.
4. However, the /var/www/html/readings/relat10h/pics directory allows access to 127.1.1.1 in addition to the 127.0 clients. Dynamically generated indexes are also allowed for this directory.
5. The URL http://localhost/readings/relativity resolves to /var/www/html/readings/relat10h.
6. The URL http://localhost/server-status presents dynamically generated status information, but is only available to 127.0.0.1 and 127.1.1.1.
7. The URL http://localhost/server-info presents dynamically generated configuration information, but is only available to 127.0.0.1 and 127.1.1.1.
Questions
1. Which of the following is not a legitimate keyword for opening an Apache scoping container?
( ) a. Files
( ) b. Directory
( ) c. Location
( ) d. Virtual Host
( ) e. All of these keywords are legitimate.
Use the following excerpt from an Apache configuration file, and the following directory structure, to answer the next 7 questions. You may assume there are no relevant URL aliases, and that all ownerships, permissions, and SELinux contexts are correct.
<Directory /var/www/html/pics>
    Options -Indexes -FollowSymLinks
    Order deny,allow
    deny from 192.168.1.
</Directory>
<Location /ogg>
    Options +Indexes
    Order allow,deny
    allow from 192.168.0.
</Location>
[root@station ~]# tree /var/www/html
/var/www/html/
|-- ogg/
|   |-- 01_track_1.ogg
|   |-- 02_track_2.ogg
|   |-- 03_track_3.ogg
|   `-- _hidden/
|       |-- 04_track_4.ogg
|       `-- 05_track_5.ogg
`-- pics/
    |-- demo/
    |   |-- 00001.jpg
    |   |-- 00004.jpg
    |   |-- 00010.vga.jpg
    |   `-- index.html
    |-- feb/
    |   |-- 15479.vga.jpg
    |   `-- 15491.jpg
    |-- index.html
    |-- mar/
    |   |-- 15651.jpg
    |   `-- 15659.vga.jpg
    `-- spring -> mar
(Note that /var/www/html/pics/spring is a symbolic link to mar.)
2. What would be the result of the client 192.168.0.4 trying to access the URL http://server.example.com/pics/mar/?
( ) a. A dynamically generated index.
( ) b. A 403 "Forbidden" error.
( ) c. A 404 "File Not Found" error.
( ) d. The contents of the file /var/www/html/pics/demo/index.html
( ) e. None of the above
3. What would be the result of the client 192.168.0.4 trying to access the URL http://server.example.com/pics/demo/?
( ) a. A 404 "File Not Found" error.
( ) b. The contents of the file /var/www/html/pics/demo/index.html
( ) c. A 403 "Forbidden" error.
( ) d. A dynamically generated index.
( ) e. None of the above
4. What would be the result of the client 192.168.0.4 trying to access the URL http://server.example.com/pics/spring/?
( ) a. The contents of the file /var/www/html/pics/index.html
( ) b. A dynamically generated index.
( ) c. A 404 "File Not Found" error.
( ) d. A 403 "Forbidden" error.
5. What would be the result of the client 192.168.1.1 trying to access the URL http://server.example.com/ogg/02_track_2.ogg?
( ) a. A 403 "Forbidden" error.
( ) b. The contents of the file /var/www/html/ogg/02_track_2.ogg
( ) c. A dynamically generated index.
( ) d. A 404 "File Not Found" error.
( ) e. None of the above
6. What would be the result of the client 192.168.0.4 trying to access the URL http://server.example.com/ogg/_hidden/05_track_5.ogg?
( ) a. A 404 "File Not Found" error.
( ) b. A 403 "Forbidden" error.
( ) c. The contents of the file /var/www/html/ogg/_hidden/05_track_5.ogg
( ) d. A dynamically generated index.
( ) e. None of the above
7. What would be the result of the client 192.168.0.4 trying to access the URL http://server.example.com/ogg/*.ogg?
( ) a. The contents of all files matched by the glob.
( ) b. A 403 "Forbidden" error.
( ) c. A 404 "File Not Found" error.
( ) d. A dynamically generated index including all files matched by the glob.
( ) e. None of the above
8. What would be the result of the client 192.168.1.1 trying to access the URL http://server.example.com/ogg/i_dont_exist.ogg?
( ) a. A 403 "Forbidden" error.
( ) b. A 404 "File Not Found" error.
( ) c. The contents of the file /var/www/html/ogg/i_dont_exist.ogg.
( ) d. A dynamically generated index.
( ) e. None of the above
Use the following excerpt from an Apache configuration file to answer the next 2 questions. You may assume there are no other relevant URL aliases, and that all ownerships, permissions, and SELinux contexts are correct.
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
9. What would be the result of the client 192.168.0.4 trying to access the URL http://server.example.com/server-status?
( ) a. A dynamically generated summary of the state of each process in the Web Server Pool.
( ) b. A 404 "File Not Found" error.
( ) c. A 403 "Forbidden" error.
( ) d. The contents of the file /var/www/html/server-status.
( ) e. None of the above
10. What would be the result of the client 127.0.0.1 trying to access the URL http://localhost/server-status?
( ) a. A dynamically generated summary of the contents of the /var/www/html/server-status/ directory.
( ) b. A 404 "File Not Found" error.
( ) c. A 403 "Forbidden" error.
( ) d. The contents of the file /var/www/html/server-status.
( ) e. None of the above
Now, requests for http://www.republican.pol/propaganda.html would be mapped to the file /var/www/republican.pol/propaganda.html, and similarly, requests for
sites, but the client has no way of knowing. To the client, they seem to be completely independent sites. What configuration can be found within a VirtualHost container? Anything found within the Main section of the configuration file. The example above has the two hosts using distinct document roots and logs. Just as easily, they could add distinct Aliases, Options, and ACLs, and a host of other configuration.
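As a rough illustration of this point, the following sketch shows a virtual host that carries its own alias, options, and access control list in addition to a document root and logs. Only the server name and document root come from the discussion above; the IP address, the log file names, the /press/ alias, and the /var/www/shared/press/ directory are hypothetical, chosen purely for illustration. The access control uses the Order, Deny, and Allow directives seen elsewhere in this workbook.

```apache
# Sketch only: the IP address, log names, alias, and shared directory
# are hypothetical and not taken from the workbook's example.
<VirtualHost 192.168.0.1>
    ServerName www.republican.pol
    DocumentRoot /var/www/republican.pol
    ErrorLog logs/republican-error_log
    CustomLog logs/republican-access_log combined

    # An alias that applies to this virtual host only
    Alias /press/ /var/www/shared/press/

    # Options and an ACL that apply to this virtual host's content only
    <Directory /var/www/republican.pol>
        Options -Indexes
        Order deny,allow
        Deny from all
        Allow from 192.168.0.
    </Directory>
</VirtualHost>
```

Because these directives sit inside the VirtualHost container, they would not affect the main server or any other virtual host.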
NameVirtualHost: The IP address 192.168.0.2 has now been identified as an address for which the server is implementing name based virtual hosting. Any request received over this IP address will now have its HTTP headers examined for the name of the server.
ServerName: The hostname supplied by the HTTP headers will be matched against the ServerName directive of all virtual hosts which share the relevant IP address. The ServerName directive now takes on new importance. What if the same virtual host should answer to more than one hostname (such as www.democrat.pol and just democrat.pol)? The ServerAlias directive can be used to add multiple names to consider when attempting to find a matching virtual host, as in the following example, where the relevant line has been highlighted.
<VirtualHost 192.168.0.2>
    ServerAdmin webmaster@democrat.pol
    ServerName www.democrat.pol
    ServerAlias democrat.pol democrat www.donkey.pol donkey.pol donkey
What if, probably due to a misconfiguration, a match is not found amongst the various 192.168.0.2 virtual hosts? The answer is that Apache defaults to the first defined server on that IP address, in this case, www.democrat.pol. Once a virtual host has been defined for a NameVirtualHost IP address, requests over that IP address will never fall through to the main server. Notice that, in the example above, the server is really simultaneously implementing IP based virtual hosting (over IP address 192.168.0.1) and name based virtual hosting (over IP address 192.168.0.2).
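The default-host behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the workbook's full example: the document roots and the second host, www.independent.pol, are hypothetical.

```apache
# Sketch only: the document roots and www.independent.pol are hypothetical.
NameVirtualHost 192.168.0.2

# First defined virtual host for 192.168.0.2. It also answers any request
# on this address whose Host: header matches no ServerName or ServerAlias.
<VirtualHost 192.168.0.2>
    ServerName www.democrat.pol
    ServerAlias democrat.pol
    DocumentRoot /var/www/democrat.pol
</VirtualHost>

# Selected only when the Host: header names www.independent.pol.
<VirtualHost 192.168.0.2>
    ServerName www.independent.pol
    DocumentRoot /var/www/independent.pol
</VirtualHost>
```

With a configuration along these lines, a request to 192.168.0.2 carrying an unrecognized hostname would be served from the first container, which is why the order of the definitions matters.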
Exercises
Lab Exercise
Objective: Configure Apache virtual hosts
Estimated Time: 45 mins.
Specification
This lab will consist of setting up virtual hosts for four distinct trade organizations which are all sharing a common web server. The various virtual hosts will be bound to variants of the loopback address, so all configuration will be local to your machine. The skills required to configure a "real world" external web server would be nearly identical, however; only the IP addresses would need to change.
1. Create appropriate DNS entries. As a prerequisite, DNS should be configured to resolve all relevant hostnames appropriately. For our purposes, simply adding the following entries to your local /etc/hosts file will suffice.
127.1.1.1    www.peanutbutterisgood.rha
127.1.1.2    www.jellyisgood.rha
127.1.1.2    www.jamisgood.rha
127.1.1.2    www.marmaladeisgood.rha
If you have configured the file correctly, you should be able to individually ping each of the hostnames, and confirm that they resolve correctly. (Don't be concerned that there's not really a top level domain called rha. We'll fix that in an upcoming workbook.)
2. Four advocacy organizations, one each promoting peanut butter, jelly, jam, and marmalade, want to use common infrastructure to support what looks like four independent sites. You are to configure your web server so that it serves four virtual hosts, with the following parameters. In the following table, all document roots are relative to the directory /var/www/vhostlab, represented by "...". You will probably have to create this directory.

Hostname                      IP Address   Type         Document Root
www.peanutbutterisgood.rha    127.1.1.1    IP based     .../pb_root
www.jellyisgood.rha           127.1.1.2    Name based   .../namevhost/jelly_root
www.jamisgood.rha             127.1.1.2    Name based   .../namevhost/jam_root
www.marmaladeisgood.rha       127.1.1.2    Name based   .../namevhost/marmalade_root
The content for the various websites can be found at http://rha-server/pub/rha/rha230/pbandj_website.tgz. Each site consists of a single index.html file found in the relevantly named directory. Each index.html file also references a background image referenced as /images/some_name.jpg.
a. Extract the tar archive, and position the index.html files so that they are located within the appropriate document roots.
b. Within the tar archive, all four images are found in a single images directory. Install this directory on your web server as the directory /var/www/vhostlab/images. Configure your web server so each virtual host can reference images in this directory using a URL of the form http://vhostname/images/some_name.jpg. You may use whatever method you like, as long as the images are not moved (or copied) from the images directory, and you do not modify the index.html files.
If installed correctly, your site should have the following minimum structure. (You may have added some additional links or whatnot to solve the image directory problem.)
/var/www/vhostlab/
|-- images
|   |-- jam.jpg
|   |-- jelly.jpg
|   |-- marmalade.jpg
|   `-- peanutbutter.jpg
|-- namevhost
|   |-- jam_root
|   |   `-- index.html
|   |-- jelly_root
|   |   `-- index.html
|   `-- marmalade_root
|       `-- index.html
`-- pb_root
    `-- index.html
3. Set options such that clients accessing http://www.peanutbutterisgood.rha/images receive a dynamically generated index, but dynamically generated indexes for http://www.jamisgood.rha/images, http://www.jellyisgood.rha/images, and http://www.marmaladeisgood.rha/images are prohibited.
4. The site http://www.peanutbutterisgood.rha should log hits (client access) to the file /var/log/httpd/pb_access_log, using the common format. The three name based virtual hosts should all log hits to the file /var/log/httpd/fruity_access_log, again using the common format.
5. Older web clients use the HTTP/1.0 protocol, instead of the HTTP/1.1 protocol, and do not always provide the HTTP Host: header required to resolve name based virtual hosts. As a result, when accessing a site which uses name based virtual hosting, they are always bound to the default (first defined) virtual host. In order to accommodate these older clients, create a new name based virtual host, with a ServerName of DummyPlaceholder, and assign it a document root of /var/www/vhostlab/namevhost. Make sure that its definition occurs before any other virtual host definitions for IP address 127.1.1.2. Create the file /var/www/vhostlab/namevhost/index.html, with the following content.
<p>Which of the following high quality sites are you trying to access?</p>
<ul>
  <li><a href="/jelly_root">www.jellyisgood.rha</a></li>
  <li><a href="/jam_root">www.jamisgood.rha</a></li>
  <li><a href="/marmalade_root">www.marmaladeisgood.rha</a></li>
</ul>
You may confirm your configuration by accessing the web server by IP address, instead of hostname: http://127.1.1.2. Make sure that pages accessed through this new (unnamed) virtual host resolve images correctly.
Deliverables
1. A local DNS configuration which resolves www.peanutbutterisgood.rha to 127.1.1.1, and each of www.jellyisgood.rha, www.jamisgood.rha, and www.marmaladeisgood.rha to 127.1.1.2.
2. An IP based virtual host on 127.1.1.1, with a document root of /var/www/vhostlab/pb_root, with the specified content, which logs hits to /var/log/httpd/pb_access_log using the common format.
3. Three name based virtual hosts (www.jellyisgood.rha, www.jamisgood.rha, and www.marmaladeisgood.rha) which all share the IP address 127.1.1.2, mapped to the document roots /var/www/vhostlab/namevhost/jelly_root, /var/www/vhostlab/namevhost/jam_root, and /var/www/vhostlab/namevhost/marmalade_root, respectively, with the specified content.
4. Each name based host logs hits to the shared log file /var/log/httpd/fruity_access_log using the common format.
5. Requests for all four virtual hosts should resolve the URL /images to the directory /var/www/vhostlab/images.
6. For the IP based virtual host 127.1.1.1, requests to the URL /images should result in a dynamically generated index. For all name based virtual hosts, dynamic index generation of /images should be disabled.
7. In order to support legacy clients, all requests which resolve to the host 127.1.1.2 but do not reference one of the specified name based virtual hosts by name should resolve to the document root /var/www/vhostlab/namevhost, which contains the file index.html with the specified content.
Questions
1. Which of the following protocols does the Apache webserver use to associate an IP-based virtual host with a client request?
( ) a. TCP/IP
( ) b. DNS
( ) c. ARP
( ) d. HTTP
( ) e. None of the above
2. Which of the following protocols does the Apache webserver use to associate a Name-based virtual host with a client request?
( ) a. TCP/IP
( ) b. DNS
( ) c. ARP
( ) d. HTTP
( ) e. None of the above
3. Which of the following directives would you not be able to override using an Apache virtual host?
( ) a. DocumentRoot
( ) b. ServerName
( ) c. KeepAliveTimeout
( ) d. ErrorLog
( ) e. DirectoryIndex
Use the following excerpt from an Apache web server's main configuration file to answer the following 7 questions.
...
DocumentRoot
...
ErrorLog
CustomLog                combined
DirectoryIndex
...
<VirtualHost 192.168.24.32>
    DocumentRoot    /var/www/virtual/chipmunk.edu
    ServerName      www.chipmunk.edu
    ErrorLog        logs/chipmunk-error-log
    CustomLog       logs/chipmunk-access-log combined
</VirtualHost>
NameVirtualHost 192.168.24.33
<VirtualHost 192.168.24.33>
    DocumentRoot    /var/www/virtual/hamster.edu
    ServerName      www.hamster.edu
    ErrorLog        logs/hamster-error-log
    CustomLog       logs/hamster-access-log custom
    DirectoryIndex  nuts.html
    Alias /seeds/   /usr/share/seeds/
</VirtualHost>
<VirtualHost 192.168.24.33>
    DocumentRoot    /var/www/virtual/gerbil.edu
    ServerName      www.gerbil.edu
    ErrorLog        /var/www/virtual/gerbil.edu/.hterrors
    CustomLog       logs/gerbil-access-log combined
</VirtualHost>
You may assume that no omitted configuration affects URL to filename translation, and that an external DNS server appropriately maps the following hostnames.

Hostname            IP Address
www.chipmunk.edu    192.168.24.32
www.rat.edu         192.168.24.32
www.hamster.edu     192.168.24.33
www.gerbil.edu      192.168.24.33
www.lemming.edu     192.168.24.33
4. To what file does the URL http://www.chipmunk.edu/seeds/sunflower.html resolve?
( ) a. /var/www/html/seeds/sunflower.html
( ) b. /var/www/virtual/hamster.edu/seeds/sunflower.html
( ) c. /usr/share/seeds/sunflower.html
( ) d. /var/www/html/sunflower.html
( ) e. /var/www/virtual/chipmunk.edu/seeds/sunflower.html
5. To what file does the URL http://www.hamster.edu/seeds/sunflower.html resolve?
( ) a. /var/www/virtual/hamster.edu/seeds/sunflower.html
( ) b. /var/www/html/sunflower.html
( ) c. /var/www/html/seeds/sunflower.html
( ) d. /var/www/virtual/chipmunk.edu/seeds/sunflower.html
( ) e. /usr/share/seeds/sunflower.html
6. To what file does the URL http://www.lemming.edu/seeds/sunflower.html resolve?
( ) a. /var/www/virtual/chipmunk.edu/seeds/sunflower.html
( ) b. /var/www/html/seeds/sunflower.html
( ) c. /var/www/html/sunflower.html
( ) d. /var/www/virtual/hamster.edu/seeds/sunflower.html
( ) e. /usr/share/seeds/sunflower.html
7. To what file does the URL http://www.rat.edu/seeds/sunflower.html resolve?
( ) a. /var/www/virtual/chipmunk.edu/seeds/sunflower.html
( ) b. /var/www/html/sunflower.html
( ) c. /usr/share/nuts/sunflower.html
( ) d. /var/www/virtual/hamster.edu/seeds/sunflower.html
( ) e. /var/www/html/seeds/sunflower.html
8. When accessing the URL http://www.gerbil.edu/seeds/acorns/, a 403 Access Denied error is generated. Assuming all filesystem ownerships, permissions, and SELinux contexts are correct, which of the following would allow access to the URL?
( ) a. Commenting out the Location directive from the appropriate container.
( ) b. Creating the file /var/www/virtual/gerbil.edu/seeds/acorns/nuts.html
( ) c. Creating the file /var/www/virtual/gerbil.edu/seeds/acorns/index.html
( ) d. Any of the above
( ) e. Either A or C
9. To what file(s) would information about the above (403 Access Denied) transaction be logged?
( ) a. /var/log/httpd/log/gerbil-error-log
( ) b. /var/log/httpd/log/gerbil-access-log
( ) c. /var/www/virtual/gerbil-edu/.hterrors
( ) d. A and B
( ) e. A and C
10. In the standard Red Hat Enterprise Linux configuration, which of the following files could also be used to provide virtual host configuration?
( ) a. /etc/httpd/gerbil.virtual
( ) b. /etc/httpd/conf.d/gerbil.conf
( ) c. /etc/httpd/conf.d/gerbil
( ) d. /var/www/html/.htgerbil
( ) e. B or C
[Figure: proxy server diagram. Surviving labels: Proxy Server; 80; Web Server; httpd; 2.2.2.2]
1. A client is configured to use the proxy server. This is a one time configuration, which usually requires the IP address and port of the proxy server.
2. When asked to connect to a service, instead of connecting directly, the client instead connects to the proxy server.
3. The proxy server accepts the request as if it were the server, but sends nothing back to the client immediately. Instead, the proxy server initiates the request to the real service, as if it were the client.
4. The true service receives the connection, and returns a response to the proxy server.
5. The proxy server then resends the response it received from the server to the client, as if it were the server.
Why would anyone want to use such a convoluted scheme? The answer usually involves one of the following.
Access. The client may be on a machine that does not have a direct connection to the Internet, so it needs the services of a proxy server which does. In the scenario diagrammed above, the client is on a 192.168.0.0/24 private subnet, which by convention should not be routed directly to the Internet.
Caching. The proxy server may store the response of the server, as well as returning it to the client. If the client (or another client) asks for the same information again, the proxy server merely needs to ask the real server "has your information changed?" If not, the proxy server can return the local copy, reducing traffic between the proxy server and the true service.
Filtering. The proxy server becomes a single control point for all clients which it serves. Therefore, traffic can be filtered or logged for later auditing at the proxy server.
Although our figure diagrams a web proxy server, our discussion has been intentionally vague about what client and what service we're talking about, because the idea of a proxy server is a general concept. The service in question could be a web server, an FTP server, or even an LDAP server, and the same concepts would apply.
...
=============================================================================
 Package        Arch       Version                  Repository        Size
=============================================================================
Installing:
 squid          i386       7:2.6.STABLE6-3.el5      rha-rhel          1.2 M
...
Installed: squid.i386 7:2.6.STABLE6-3.el5
Complete!
[root@station1 ~]# service squid start
OK
The out-of-the-box configuration is not useful directly, however, as the default access control lists do not let any useful clients connect.
All white lines (lines which are empty or contain only white space) are ignored, as are all comment lines that begin with a "#". All other lines begin with a keyword, referred to as a "TAG". The syntax for arguments after the tag depends on the tag, but the arguments must all occur on the same line.
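The rules above are simple enough to sketch in a few lines of Python. The function name active_lines and the sample text are our own inventions for illustration, and the sketch assumes that leading white space before a "#" or a tag can be ignored; it is not a description of how squid itself parses the file.

```python
def active_lines(config_text):
    """Return (tag, arguments) pairs for every line that is not ignored."""
    result = []
    for line in config_text.splitlines():
        stripped = line.strip()
        # White lines and comment lines are ignored.
        if not stripped or stripped.startswith("#"):
            continue
        # The first word is the tag; the rest of the same line is its arguments.
        parts = stripped.split(None, 1)
        tag = parts[0]
        arguments = parts[1] if len(parts) > 1 else ""
        result.append((tag, arguments))
    return result


# Hypothetical sample text in the style of squid.conf.
sample = """\
# TAG: http_port
#Default:
# http_port 3128

http_port 8080
acl all src 0.0.0.0/0.0.0.0
"""

for tag, arguments in active_lines(sample):
    print(tag, "->", arguments)
```

Run against a real configuration file, a filter like this would leave only the handful of lines that actually shape squid's behavior.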
Like many Red Hat Enterprise Linux default configuration files, the file attempts to be self-documenting and provides copious comments with default configuration values commented out. Usually, changing a value to something other than the default involves uncommenting the default line and changing its value (perhaps first duplicating the line to preserve documentation of the default value). While the default configuration file is intimidating, weighing in at over 4300 lines, the relevant configuration is a mere 25 lines, as illustrated below.
[root@station ~]# wc /etc/squid/squid.conf
4325 25
For our purposes, we are only going to examine three relevant tags: http_port, acl, and http_access.
By default, squid binds to port 3128, although by convention HTTP proxy servers usually use port 8000 or 8080. An administrator could well want to add a line akin to the following.
http_port 8080
Note that, as the comment says, multiple http_port lines can be added, causing squid to bind to more than one port or interface, if necessary.
rha230-5.0-1-en-2008-01-21T07:12:18-0500
The acl tag assigns a name to a specification. The tag itself has no observable effect, but may instead be referenced by other tags (such as http_access, below). Skimming the comments here and in the file itself, we find that acl specifications can involve a wide range of parameters, including the following.
Table 5-1. Squid acl Specifications
Keyword       Parameter
src           Requesting client's IP address
dst           Real server's IP address
port          Real server's port
myip          squid's IP address
srcdomain     Requesting client's domain name
dstdomain     Real server's domain name
time          Time of day
              Regular expression matched against the requested URL
              Proxied protocol (HTTP, FTP, etc.)
              Regular expression matched against HTTP request headers
              Regular expression matched against HTTP response headers
And these are only some of the parameters that can be specified. Obviously, squid is highly configurable in terms of who it will let connect, and what content it is willing to proxy. We now turn our attention to the default configuration, which consists of the uncommented values found a few lines below.
Figure 5-4. /etc/squid/squid.conf: acl
#Recommended minimum configuration:
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8
acl SSL_ports port 443
acl Safe_ports port 80          # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443         # https
acl Safe_ports port 70          # gopher
acl Safe_ports port 210         # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280         # http-mgmt
acl Safe_ports port 488         # gss-http
acl Safe_ports port 591         # filemaker
acl Safe_ports port 777         # multiling http
acl CONNECT method CONNECT
These lines define the following names, which can be referred to later.
Table 5-2. Default squid acl Definitions
Name          Members
all           All requests
manager       squid internal cache management requests
localhost     All requests originating from the loopback address
to_localhost  All requests to the loopback address
Safe_ports    All requests to the well known ports of services squid is willing to proxy
CONNECT       All requests to initiate an SSL encapsulated connection
As the Safe_ports acl illustrates, a name may be assigned multiple times, resulting in the values being "or"ed together (i.e., a match on any of the individual values is considered a match on the acl as a whole).
Lastly, an access control policy is defined using multiple http_access tags which reference the acls defined above. On any client request, squid will use a "stop on first match" policy while searching the following list of http_access controls. Order is important. Once squid finds a specification that matches the client request, it stops searching and immediately implements the specified allow or deny policy.
Figure 5-5. /etc/squid/squid.conf: http_access
# TAG: http_access
# Allowing or Denying access based on defined access lists
#
# Access to the HTTP port:
# http_access allow|deny [!]aclname ...
#
...
#Recommended minimum configuration:
#
# Only allow cachemgr access from localhost
http_access allow manager localhost
http_access deny manager
# Deny requests to unknown ports
http_access deny !Safe_ports
...
# Example rule allowing access from your local networks. Adapt
# to list your (internal) IP networks from where browsing should
# be allowed
#acl our_networks src 192.168.1.0/24 192.168.2.0/24
#http_access allow our_networks

# And finally deny all other access to this proxy
http_access allow localhost
http_access deny all
To the experienced eye, the comments leave little more to add, but we'll walk through these lines just in case. The first argument to the http_access tag is either the keyword allow or deny, followed by one or more acl names, each possibly preceded by a "!". The acl names are effectively "and"ed: all must apply to the client request for the http_access policy to apply. The presence of a "!" inverts the meaning of the acl.
http_access allow manager localhost: The first line allows management requests, but only from the loopback address (i.e., from processes running on the proxy server). Notice that both the manager and localhost acls must apply for the policy to take effect.
http_access deny manager: The second line denies management requests from all other sources.
http_access deny !Safe_ports: Any request for a port other than one for which squid is willing to proxy is denied. (Notice the convenient use of "!" to invert the meaning of the Safe_ports acl.)
The commented out our_networks lines: This is where the good guys are defined. More on this in a second.
http_access allow localhost: Any requests from the loopback interface are considered good.
http_access deny all: Any request not meeting the above policies is prohibited by deny all.
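For readers who prefer code to prose, the "stop on first match" search can be modeled in a few lines of Python. This is only a sketch of the logic described above, not squid's implementation: the names term_applies, evaluate, DEFAULT_RULES, and LAB_RULES are ours, the rule list contains only the uncommented lines visible in Figure 5-5 (the elided lines are not guessed at), and each request is reduced to the set of acl names it is assumed to match.

```python
def term_applies(term, matching_acls):
    """A term is an acl name, optionally preceded by "!" to invert it."""
    if term.startswith("!"):
        return term[1:] not in matching_acls
    return term in matching_acls


def evaluate(rules, matching_acls):
    """Return the action of the first rule whose terms all apply."""
    for action, terms in rules:
        # The acl names on one http_access line are "and"ed together.
        if all(term_applies(term, matching_acls) for term in terms):
            return action
    # Not reached with the lists below, since "deny all" matches everything.
    return None


# The uncommented http_access lines shown in Figure 5-5, in file order.
DEFAULT_RULES = [
    ("allow", ["manager", "localhost"]),
    ("deny", ["manager"]),
    ("deny", ["!Safe_ports"]),
    ("allow", ["localhost"]),
    ("deny", ["all"]),
]

# The same list with an our_networks rule added after the port filter
# and before "deny all".
LAB_RULES = DEFAULT_RULES[:3] + [("allow", ["our_networks"])] + DEFAULT_RULES[3:]

# Each request is modeled as the set of acl names it matches (assumed).
local_browser = {"all", "localhost", "Safe_ports"}
lan_client = {"all", "our_networks", "Safe_ports"}
remote_manager = {"all", "manager", "Safe_ports"}

print(evaluate(DEFAULT_RULES, local_browser))   # allow
print(evaluate(DEFAULT_RULES, lan_client))      # deny
print(evaluate(LAB_RULES, lan_client))          # allow
print(evaluate(DEFAULT_RULES, remote_manager))  # deny
```

The second and third calls differ only in the rule list, which makes the effect of the single added our_networks rule easy to see.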
Once we work our way through the default configuration, we realize that it only allows connections from the loopback address! If the proxy server is to be useful, the identities of the intended clients need to be specified. How should be evident from the comments. First, define the our_networks acl to match requests from the clients for whom squid should be willing to proxy. Second, add an http_access rule allowing connections that match the our_networks acl. (Of course, some name other than our_networks could have been used.)
Order is important. The matching rule should occur after requests for bad ports are filtered out, but before the deny all sledgehammer. For example, to allow clients to connect from the 192.168.0.0/24 subnet, we could add the following lines just beneath the our_networks comments.
acl our_networks src 192.168.0.0/255.255.255.0
http_access allow our_networks
Equivalently, the CIDR notation 192.168.0.0/24 could have been used for the subnet. Of course, after modifying the configuration file, the squid service should be restarted.
[root@station ~]# service squid restart
[  OK  ]
[  OK  ]
It took a while to understand why, but in the end, configuring squid to allow clients only involves a two-line edit, and both lines can be easily deduced from existing comments: one to define who the good guys are, and another to modify the access control list chain to let them in.
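If the equivalence between the netmask and CIDR spellings of the our_networks subnet is not obvious, Python's standard ipaddress module can be used to check it. This is merely a side calculation under our own variable names, and the client addresses are hypothetical; squid does not use Python to interpret its acls.

```python
import ipaddress

# The two spellings of the subnet used for the our_networks acl.
dotted = ipaddress.ip_network("192.168.0.0/255.255.255.0")
cidr = ipaddress.ip_network("192.168.0.0/24")
print(dotted == cidr)

# Hypothetical clients: one inside the subnet, one outside.
insider = ipaddress.ip_address("192.168.0.25")
outsider = ipaddress.ip_address("192.168.1.25")
print(insider in cidr, outsider in cidr)

# A mask of 255.255.255.255 names exactly one address.
single = ipaddress.ip_network("192.168.0.5/255.255.255.255")
print(single.num_addresses)
```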
Configuring Firefox
The Firefox web browser's proxy configuration is found by choosing the Connection Settings... button from the Preferences dialog, which is opened by choosing the Edit:Preferences menu item.
Figure 5-6. Firefox Proxy Configuration
Once open, the dialog allows you to specify an independent proxy server for each of several protocols, or, conveniently, to set all protocols to use the same server. A list of domains and IP addresses for which the client should not proxy can also be specified, which is very useful for maintaining access to servers the proxy server might not be aware of (such as localhost or rha-server).
Configuring curl
Command line web clients are often configured to use proxy servers through command line switches or environment variables. Opening the curl man page, for example, and searching for proxy, one can (eventually) find the following.
-x/--proxy <proxyhost[:port]>
       Use specified HTTP proxy. If the port number is not specified,
       it is assumed at port 1080.
       This option overrides existing environment variables that sets
       proxy to use. If there's an environment variable setting a
       proxy, you can set proxy to "" to override it.
For example, to download the Red Hat home page using the proxy server defined above, either of the following techniques could be used.
[root@station ~]# curl -x http://station:8080 http://www.redhat.com
[root@station ~]# export http_proxy=http://station:8080
[root@station ~]# curl http://www.redhat.com
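The same two techniques, an explicit proxy setting and the http_proxy environment variable, are available to scripts as well. As a rough sketch using only Python's standard urllib.request module, the following mirrors the two curl invocations; the proxy address is taken from the example above, and the helper name fetch is our own. Nothing is downloaded unless fetch is actually called.

```python
import os
import urllib.request

PROXY = "http://station:8080"  # proxy host and port from the curl example

# Technique 1: name the proxy explicitly, comparable to curl -x.
explicit = urllib.request.ProxyHandler({"http": PROXY})
opener = urllib.request.build_opener(explicit)

# Technique 2: rely on the http_proxy environment variable.
os.environ["http_proxy"] = PROXY
from_environment = urllib.request.getproxies_environment()
print(from_environment.get("http"))


def fetch(url):
    """Download a URL through the explicitly configured proxy."""
    with opener.open(url, timeout=10) as response:
        return response.read()
```

Calling fetch("http://www.redhat.com") would only succeed on a network where a proxy is really listening at the assumed address.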
Notes:
a. The Unix world (including Linux) conventionally records timestamps internally using "seconds since the epoch", with the epoch being January 1st, 1970. Using a signed 32-bit integer, this conveniently records times from late 1901 until early 2038. The Unix world was not concerned about "Y2K" problems, but instead worries about "Y2038" problems. Your author feels this would be the perfect time to come out of retirement and consult for legacy Linux systems.
Three sample log messages are found below.
1124596032.120
60355 192.168.0.25 TCP_MISS/200 1381 GET http://www.redhat.com/ - DIRECT/209
1 192.168.0.25 TCP_HIT/200 12115 GET http://www.redhat.com/ - NONE/- tex
The first is from a client which was not accepted by the client access control configuration, and so received a TCP_DENIED. The second is a request from a client for data not already in the cache, a TCP_MISS. The third is a followup request (perhaps from a reload of the same page), whose data was already cached locally, generating a TCP_HIT. Notice that the only request which took a significant amount of time to fulfill was the cache miss, which consumed around 60000 milliseconds of cache time, as opposed to 1 or 2.
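The leading field of a squid log entry is a timestamp in "seconds since the epoch", as described in the note above, which is not very friendly to human eyes. The following Python sketch converts such a value to a calendar date in UTC, and also shows where the signed 32-bit range begins and ends. The helper name from_epoch is our own.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch(seconds):
    """Convert seconds since the epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


# The timestamp that survives in the sample log entries.
print(from_epoch(1124596032.120))

# The limits of a signed 32-bit count of seconds.
print(from_epoch(-2**31))     # 1901-12-13 20:45:52+00:00
print(from_epoch(2**31 - 1))  # 2038-01-19 03:14:07+00:00
```

The sample timestamp falls on August 21st, 2005.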
Exercises
Lab Exercise
Objective: Configure the Squid Proxy Server
Estimated Time: 10 mins.
Specification
This lab will have you install, configure, and use the squid proxy server. A "real world" use of squid would require 3 machines: one to host the web server, one to host the proxy server, and of course the client machine running a web browser.
Figure 5-7. Standard Squid Proxy Server Configuration
[Figure labels: httpd, 80, www.widgets.org, 118.23.53.1; 43523, firefox, station1.example.com, 192.168.0.1]
The machine hosting the web server would need a publicly accessible IP address, as would the proxy server. It could well be the case, however, that the client machine does not, with squid running on a multi-homed host. The squid application would receive requests from the client over a private IP address, and forward them to the Internet through its public IP address.
For our lab, we will instead run the web server, the client, and the squid proxy server on the same machine. The concepts map directly to the real world scenario. In the following diagram, 192.168.0.1 should be replaced with your eth0 IP address.
Figure 5-8. Lab Squid Proxy Server Configuration
[Figure labels: student station; httpd 127.1.1.2:80; squid 192.168.0.1:8080; firefox 127.1.1.2:35476]
1. Configure the squid proxy server.
   a. Ensure that the squid package is installed.
   b. As a precaution, make a backup of the file /etc/squid/squid.conf, copying it to /etc/squid/squid.conf.orig, for example.
   c. In the file /etc/squid/squid.conf, search for the http_port option, around line 54. Set the http_port to 8080.
   d. In the file /etc/squid/squid.conf, search for the term our_network, around line 1860(!). Administrators are expected to set local access control policies at this location. Following the commented out examples, define an acl our_networks, which matches all requests sourced from your eth0 interface. For example, if ifconfig eth0 reports your IP address as 192.168.0.5 and your network mask as 255.255.255.0, then the following line would be appropriate. (If in doubt, you can specify your IP address directly, with a mask of 255.255.255.255.) Once defined, add an http_access directive which allows the acl.
acl our_networks src 192.168.0.0/255.255.255.0
http_access allow our_networks
   e. Use the standard service and chkconfig commands to start the squid service, and enable the service to start automatically on reboots. You might want to use the netstat command to confirm that squid is LISTENing for connections on port 8080.
2. Monitor squid and httpd requests. In two separate windows (or two separate virtual consoles), use less to open the files /var/log/httpd/access_log and /var/log/squid/access_log, respectively. Within less, hit SHIFT-F to enter "follow" mode. As new requests are made for each service, you should see a log line generated within the respective file. (Pressing CTRL-C will return less to normal browsing mode.)
3. Configure Firefox to use the proxy server. Using the Firefox browser, open the Edit:Preferences dialog, and follow the path to General and Connection Settings.... In the resulting dialog, choose Manual proxy configuration, and set the HTTP Proxy to be your eth0 IP address, port 8080. Also, remove any text from the No Proxy For text entry. OK your way out of the various dialogs.
4. Browse your webserver. Now use Firefox to browse the content of your webserver. If some of your previous labs are still in place, you may try http://localhost/relativity, http://www.peanutbutterisgood.rha, or http://www.jamisgood.rha. Otherwise, simply create a file in your document root directory, and reference it. With each request, you should see a line similar to the following in your /var/log/squid/access_log file.
1132416737.269 699 192.168.0.1 TCP_MISS/304 200 GET http://localhost/readings/the_god_of_mars.html - DIRECT/127.0.0.1 -
If not, make sure you reload a page from within the browser. If the page is in the browser's cache, then it will not actually generate a request.
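When watching the squid log scroll by, it can help to know which field is which. The following Python sketch splits the lab entry above (with its wrapped URL rejoined) into named fields. The field names are our own labels for the ten white space separated columns of this log format, chosen to match the discussion of elapsed milliseconds and TCP_MISS/TCP_HIT codes earlier in the chapter; they are assumptions for illustration rather than official names.

```python
from collections import namedtuple

# Our own labels for the ten columns of an entry.
LogEntry = namedtuple(
    "LogEntry",
    "timestamp elapsed_ms client result status size "
    "method url ident hierarchy content_type",
)


def parse_entry(line):
    """Split one squid access log line into named fields."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError("expected 10 fields, found %d" % len(fields))
    # The fourth column combines the cache result and the HTTP status.
    result, status = fields[3].split("/")
    return LogEntry(
        float(fields[0]), int(fields[1]), fields[2], result, int(status),
        int(fields[4]), fields[5], fields[6], fields[7], fields[8], fields[9],
    )


line = ("1132416737.269 699 192.168.0.1 TCP_MISS/304 200 GET "
        "http://localhost/readings/the_god_of_mars.html - DIRECT/127.0.0.1 -")
entry = parse_entry(line)
print(entry.client, entry.result, entry.status, entry.elapsed_ms, entry.url)
```

A truncated or otherwise malformed line raises a ValueError instead of silently producing misaligned fields.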
Deliverables
1. A running squid server, bound to port 8080, which allows requests over the IP address assigned to the eth0 interface.
2. The squid service is configured to start automatically upon reboot.
Challenge Exercises
1. Assuming your neighbors have set access control configuration appropriately, you should be able to use your proxy server to browse a neighbor's website, or a neighbor's proxy server to browse your website, or a neighbor's proxy server to browse another neighbor's website. Explore.
2. Configure your access control specifications so that one particular neighbor may access your squid proxy server, but another may not.
3. Notice the following line in the /etc/squid/squid.conf configuration file.
# We strongly recommend the following be uncommented to protect innocent
# web applications running on the proxy server who think the only
# one who can access services on "localhost" is a local user
#http_access deny to_localhost
What concern is this addressing? In order to convince yourself that denying the to_localhost acl is a good idea, enable the /server-status location within your Apache web server, but take the precaution of only allowing requests from the loopback address 127.0.0.1. Then have a neighbor use your proxy server to access http://localhost/server-status from their machine. (Realize, of course, that fixing this security hole by denying requests matching the to_localhost acl would break the original lab.)
Questions
Use the /etc/squid/squid.conf excerpts below to answer the next 6 questions:
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl public_terminal src 192.168.0.100/255.255.255.255
acl public_hours time M-F 09:00-17:00
acl intranet src 192.168.0.0/24
acl vpn src 10.0.1.0/24
acl media_files url_regex \.mp3$
acl media_files url_regex \.avi$
acl media_files url_regex \.mpeg$
acl media_files url_regex \.wma$
acl media_files url_regex \.wmv$
acl hostile dstdomain cracker.org
acl to_localhost dst 127.0.0.0/8
acl SSL_ports port 443 563
acl Safe_ports port 80
acl Safe_ports port 21
acl Safe_ports port 443 563
acl Safe_ports port 70
acl Safe_ports port 210
acl Safe_ports port 1025-65535
acl Safe_ports port 280
acl Safe_ports port 488
acl Safe_ports port 591
acl Safe_ports port 777
acl CONNECT method CONNECT
...
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost
http_access deny media_files
http_access deny public_terminal !public_hours
http_access allow intranet
http_access deny all
http_access allow vpn
1. What is the likely purpose of the "media_files" acl and associated http_access rule?
( ) a. To speed up access to music and video files by caching them
( ) b. To make it impossible to download music and video through the proxy
( ) c. To make downloading music and video through the proxy more difficult by blocking common file extensions
( ) d. To stop external systems from retrieving audio and video files from internal systems
( ) e. None of the above
2. What would happen if a request for http://www.somesite.com/files/hit_song.mp3 was sent to the proxy server from 127.0.0.1?
( ) a. Not enough information to tell
( ) b. Access would be granted
( ) c. Access would be denied, but other URLs might work
( ) d. Access would be denied unless the file was already cached
( ) e. Access would be denied for any destination
3. What would happen if a request for http://www.somesite.com/files/hit_song.mp3 was sent to the proxy server from 192.168.0.5?
( ) a. Not enough information to tell
( ) b. Access would be granted
( ) c. Access would be denied, but other URLs might work
( ) d. Access would be denied unless the file was already cached
( ) e. Access would be denied for any destination
4. What would happen if a request for http://www.somesite.com/files/hit_song.mp3 was sent to the proxy server from 209.132.177.60?
( ) a. Not enough information to tell
( ) b. Access would be granted
( ) c. Access would be denied, but other URLs might work
( ) d. Access would be denied unless the file was already cached
( ) e. Access would be denied for any destination
5. To what extent could the system with IP address 10.0.1.5 use this proxy server?
( ) a. It would be able to access any URL
( ) b. It would be able to access some URLs
( ) c. It would not be able to use the proxy at all
( ) d. Not enough information to tell
( ) e. None of the above
6. To what extent could the system with IP address 192.168.0.100 use this proxy server?
( ) a. It would be able to access any URL
( ) b. It would be able to access some URLs
( ) c. It would not be able to use the proxy at all
( ) d. Not enough information to tell
( ) e. None of the above
7. What port does squid listen on by default?
( ) a. 8080
( ) b. 443
( ) c. 4400
( ) d. 8139
( ) e. None of the above
8. What is another common port for proxy servers to use?
( ) a. 8080
( ) b. 443
( ) c. 4400
( ) d. 8139
( ) e. None of the above
9. How does one configure proxy settings in the Firefox web browser?
( ) a. Tools:Proxies
( ) b. Edit:Preferences:Web Features:Proxies
( ) c. Tools:Settings:Connection Settings
( ) d. File:Use Proxy
( ) e. Edit:Preferences:General:Connection Settings
Use the following excerpt from a squid log to answer the next question:
1137789430.068 50405 192.168.0.50 TCP_MISS/200 1381 GET http://academy.redhat.com/ - DIRECT/209.132
10. What does this log message indicate?
( ) a. The client was denied access to the requested URL
( ) b. The client's request is being "held" pending approval by an administrator
( ) c. The client was granted access to the URL, which was found in cache
( ) d. The client was granted access to the URL, which was not in cache and will be retrieved by the proxy
( ) e. None of the above