Apache Basics and simple CGI scripts

From docwiki
Revision as of 18:48, 31 March 2020 by Mond (talk | contribs) (.htaccess files)


Motivation

The web is ubiquitous today. Many devices and applications ship with a built-in web server, and many free/open source web servers are available. For a long time Apache was the most widely used web server, and it is still prominent today. It has recently been surpassed by nginx[1] (pronounced: engine-x). Still, Apache's versatility and time-tested security make it a good choice for many applications.

Here you will learn the basics needed to get started with running an Apache web server; many of the concepts apply to any web server.

Apache Configuration

Global Configuration

Apache has one main configuration file. Its location depends on how Apache was started, and the default depends on your Linux distribution. E.g. in Debian it is /etc/apache2/apache2.conf and in Red Hat it is /etc/httpd/conf/httpd.conf.

Within the main configuration there are usually Include statements that include other files. E.g.

IncludeOptional sites-enabled/*.conf

This includes every .conf file from the sub-directory sites-enabled (relative paths are resolved against the ServerRoot). Usually the configuration is split up into many files, e.g. one for each module that is loaded and one for each virtual web server that is hosted.

Regardless of whether they appear in the main config file or in an included file, some directives are global: they change parameters of the server itself. E.g.
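As a rough sketch of how such a split can look, the following excerpt assumes a Debian-style layout with /etc/apache2 as the ServerRoot. The directory names are assumptions based on that layout and will differ on other distributions. It also shows the difference between Include and IncludeOptional.

```apache
# Sketch of a main configuration file, assuming a Debian-style layout.
# Relative paths below are resolved against ServerRoot.
ServerRoot "/etc/apache2"

# Include aborts startup with an error if the file does not exist.
Include ports.conf

# IncludeOptional ignores wildcards that match no file, which is handy
# for directories that may be empty.
IncludeOptional mods-enabled/*.load
IncludeOptional mods-enabled/*.conf
IncludeOptional conf-enabled/*.conf
IncludeOptional sites-enabled/*.conf
```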

Listen 80
Listen 443
Listen 127.0.0.1:9980

The above tells Apache to create listening sockets on ports 80 and 443 on all IP addresses, plus one additional socket on port 9980 that is only reachable from localhost.


Virtual Web Servers

A web server without encryption usually answers on port 80; with https encryption it usually answers on port 443. If you have more than one IP address you can choose which address the socket binds to.

When a web client connects, it asks for the URI path (the part after the host name). In HTTP/1.1 requests the client also sends the host name it wants to reach, in the Host header that follows the request line. Thus the server can present different web pages depending on the host name.

So the server can discern what the client wants by the IP address the connection arrived on and/or by the host name that the client requested. Accordingly we speak of IP-based and name-based virtual hosts.

With https-protected services there is a small chicken-and-egg problem: when the SSL/TLS connection is established, the server needs to present the certificate for the requested host. If it hosts several virtual servers, it does not know which certificate to use, since the host name is only sent inside the established session. To solve this and to allow more than one https-protected virtual server on the same IP address, SNI (Server Name Indication) was invented. With SNI the client already sends the server name within the SSL/TLS handshake. SNI is supported by all modern browsers.

<VirtualHost 10.11.12.13:80>

ServerName www.test.example.org
ServerAlias test.example.org


ServerAdmin admin@example.org
DocumentRoot /var/www/testsever/


ErrorLog /var/log/apache/test.error_log
TransferLog /var/log/apache/test.access_log

RedirectPermanent /wuwien http://www.wu.ac.at

Alias /projectdata/ /home/anna/projectx/data/

</VirtualHost>

The above example defines a virtual host (you might want to place it in its own config file, but it works in the main file as well). The virtual host is bound to the private IP address 10.11.12.13 and accepts requests on port 80. It is selected when the host name sent by the client matches ServerName or ServerAlias; if no virtual host on that address and port matches, Apache falls back to the first one defined for it. Documents are served from the DocumentRoot, and the config also specifies the location of the log files.

If someone browses to http://www.test.example.org/projectdata/ they will actually see what is in the /home/anna/projectx/data/ directory - but only if the user that the web-server uses has permissions on that directory.
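Besides file system permissions, Apache itself must be allowed to serve the aliased directory. Many default configurations for Apache 2.4 deny access to everything outside the document root, so an Alias target usually needs its own Directory block. The following is a minimal sketch under that assumption; the Options and AllowOverride lines are illustrative choices, not part of the original example.

```apache
# Map the URL prefix to a directory outside the DocumentRoot.
Alias "/projectdata/" "/home/anna/projectx/data/"

# Explicitly allow Apache to serve files from the Alias target.
<Directory "/home/anna/projectx/data/">
    # Illustrative choice: no automatic directory listings.
    Options -Indexes
    # Illustrative choice: ignore .htaccess files in this tree.
    AllowOverride None
    # Apache 2.4 syntax: grant access to all clients.
    Require all granted
</Directory>
```

Both parts can be placed inside the VirtualHost block so that they only apply to that virtual host.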

Many other directives can be used within a VirtualHost block. One example here is RedirectPermanent: if a user goes to http://www.test.example.org/wuwien they will be redirected to another server.
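The name-based selection and the SNI mechanism described in this section can be sketched with two https virtual hosts that share one IP address. This assumes that mod_ssl is loaded and that Listen 443 is set globally. The second host name, the second document root and all certificate paths are hypothetical placeholders.

```apache
# Two name-based https virtual hosts on the same address and port.
# Apache picks the block (and its certificate) from the server name
# that the client sends via SNI during the handshake.

<VirtualHost 10.11.12.13:443>
    ServerName www.test.example.org
    DocumentRoot /var/www/testsever/

    SSLEngine on
    # Hypothetical certificate and key locations.
    SSLCertificateFile    /etc/ssl/certs/www.test.example.org.pem
    SSLCertificateKeyFile /etc/ssl/private/www.test.example.org.key
</VirtualHost>

<VirtualHost 10.11.12.13:443>
    # Hypothetical second site on the same IP address.
    ServerName shop.example.org
    DocumentRoot /var/www/shop/

    SSLEngine on
    SSLCertificateFile    /etc/ssl/certs/shop.example.org.pem
    SSLCertificateKeyFile /etc/ssl/private/shop.example.org.key
</VirtualHost>
```

A client that does not send SNI would be served by the first block, since the first virtual host for an address and port acts as the default.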

Directory and Location Configuration

Often we want special settings that only apply to one directory (where the files are on the server) or one location (the part specified in the URL).

For this you can use Directory and Location blocks, whose settings are only valid for the given directory or location. These blocks can also be nested within VirtualHost blocks. E.g.

<Location /server-status>
 SetHandler server-status
</Location>

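This tells Apache to serve server-status pages (if the module is enabled) under the URI /server-status.

In most cases it is better to use Directory, because several URLs can map to the same place in the file system and a Directory block applies no matter which URL was used to reach the files. E.g.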
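Since the status page reveals details about the server, it is commonly restricted. A minimal sketch, assuming Apache 2.4 with mod_status enabled, that only allows requests coming from the server itself:

```apache
<Location "/server-status">
    SetHandler server-status
    # Only allow requests that originate from the local machine.
    Require local
</Location>
```

To allow a management network instead, Require ip with an address range of your choice could be used in place of Require local.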

<Directory "/opt/some/data/">
    Options -Indexes 
    AllowOverride AuthConfig
</Directory>

The above example turns off the indexing of directories. (That is: if you browse to a directory instead of a file, Apache can create a listing of its content. This is turned off here.) It also says that directives of the class AuthConfig may be specified in a different place, in so-called .htaccess files.

The AuthConfig class covers the directives that control whether a password is needed to access the web page (such as AuthName, AuthType, AuthUserFile and Require).

.htaccess files

When you place a file with the name .htaccess in a directory you can change some settings of the configuration just for that directory (and its sub-directories). This only works if the class of settings that you want to change has been allowed there with AllowOverride, as in the example above. Most of the time this is used to password-protect access:

AuthName Streng-Vertraulich
AuthType Basic
AuthUserFile /opt/mypp/webusers
require valid-user

The above Apache directives tell the server that it should ask for a password before granting access. The password dialog shows the text "Streng-Vertraulich" (German for "strictly confidential") to the user. The users and passwords are checked against the given file, and any user in that file has access. The password file can be created with htpasswd:

$ touch webusers
$ htpasswd  -B webusers anna
New password:
Re-type new password:
Updating password for user anna
$ cat webusers
anna:$2y$05$amEPdHfhgbggHblFGUx2ZeuVGNFKbSZoc1kamltBZJrj.YoX1YEwW
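Once the password file contains more than one user, access can be narrowed to specific names. The following .htaccess variant is a sketch that assumes the same password file as above and that AllowOverride AuthConfig is in effect for the directory. It admits only the user anna instead of every user in the file.

```apache
AuthType Basic
# Quotes are needed as soon as the realm text contains spaces.
AuthName "Streng-Vertraulich"
AuthUserFile "/opt/mypp/webusers"
# Only this user may log in; other users in the file are rejected.
Require user anna
```

Note that Basic authentication sends the password essentially unprotected, so it is best combined with https.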

References