Understanding the HTTP Transaction

HTTP stands for Hypertext Transfer Protocol, a rather misleading name, since it is used all the time for much more than text. It is also known as an "application layer protocol", which means that its purpose is to allow applications to communicate and pass information to each other over the network. For this reason HTTP is best understood as a conversation between a client and a server. That conversation is governed by certain rules: things that can be said and things that can't. How well a client or server conforms to those rules determines how HTTP compliant it is said to be. The complete HTTP exchange is called a transaction, because the point is for the client and the server to exchange a resource.

Successful transactions have two parts, headers and bodies. The header is the conversation held between the client and server, while the body is the resource itself.

In simplified form, the headers of an HTTP transaction might comprise multiple lines like this:

Open: 216.103.122.37 Thu 08 Jul 1999 02:23:41
Request: GET /myscript.cgi?ti=&au=king HTTP/1.0
Status: HTTP/1.0 200 OK
Close: Thu 08 Jul 1999 02:23:48

The first two header lines are sent by the client, the last two by the server.

All parts of these conversations are sent over the Internet in plain text, just like you see above. The conversation begins with the client identifying itself and giving the date of its request, followed by the actual request on a second line, which contains the method the client wants to use for requesting the resource, the URL of the resource it wants, and the version of HTTP it's using. HTTP supports many methods for transactions, but we'll only discuss two: GET and POST.
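
As a rough illustration of how the request line breaks down, the sketch below splits the example request into its method, path, query parameters and HTTP version. It is written in Python purely for illustration, and the function name `parse_request_line` is a hypothetical helper, not part of any server.

```python
from urllib.parse import urlsplit, parse_qs

def parse_request_line(line):
    """Split an HTTP request line into method, path, query parameters and version."""
    # The three parts are separated by single spaces.
    method, target, version = line.strip().split(" ")
    parts = urlsplit(target)
    # keep_blank_values=True keeps empty fields such as "ti=" in the result.
    params = parse_qs(parts.query, keep_blank_values=True)
    return {"method": method, "path": parts.path,
            "params": params, "version": version}

request = parse_request_line("GET /myscript.cgi?ti=&au=king HTTP/1.0")
print(request["method"], request["path"], request["version"])
print(request["params"])
```

Each query parameter maps to a list because a name may legitimately appear more than once in a URL.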

A port number can also be specified in the URL of the request. If no port is specified, then port 80 is assumed.
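
The default-port rule can be sketched in a few lines of Python. The host name `www.example.com` and the helper `port_for` are assumptions made up for this example.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80

def port_for(url):
    """Return the port named in the URL, or 80 when none is given."""
    port = urlsplit(url).port  # None when the URL carries no explicit port
    return port if port is not None else DEFAULT_HTTP_PORT

print(port_for("http://www.example.com/myscript.cgi"))
print(port_for("http://www.example.com:8080/myscript.cgi"))
```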

The server replies with its answer in the status line: first the version of HTTP it will use, then a three-digit numeric code identifying the type of answer it is sending, followed by a text string explaining that code. If everything goes well and the server has the requested resource available, the resource is sent as the body of the message. The server puts a blank line between the last header line and the body of the transaction. The resource itself is always a stream of bytes, conforming to an Internet MIME type. Finally the server closes the connection with a close line.
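
To make the shape of the reply concrete, here is a minimal Python sketch that takes a raw response apart: the status line, the header lines, and the body after the blank line. The function `parse_response` and the sample bytes are illustrative assumptions, and a real client would need to handle many more cases.

```python
def parse_response(raw):
    """Split a raw HTTP response into status parts, headers and body."""
    # The first blank line separates the header lines from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: HTTP version, three-digit code, then the explanatory text.
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body

raw = (b"HTTP/1.0 200 OK\r\n"
       b"Content-type: text/html\r\n"
       b"\r\n"
       b"<html><head><title>My page</title></head></html>")
version, code, reason, headers, body = parse_response(raw)
print(version, code, reason)
print(headers["content-type"])
```

Header names are lowercased in this sketch because HTTP treats them as case-insensitive.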

Of course, in reality it's a little more complicated than this. Other headers are sent in this transaction, and elaborate rules govern everything from what headers can be sent and when, to the exact number of spaces in a date string. Unless you are coding your own web server (or client) these details are not very important.

You can read the entire HTTP 1.0 specification at http://www.w3.org/Protocols/HTTP/HTTP2.html, though be aware that this specification is out of date, having originally been implemented in 1992. Most clients and servers today are switching to the HTTP 1.1 specification at http://www.rfc-editor.org/rfc/rfc2616.txt. HTTP 1.0 is a lot easier to understand than HTTP 1.1, so we recommend that you read that document first as an introduction if this is all new to you.

Meta Information Headers

In a sense all header lines describe the nature of the body of an HTTP transaction, and are therefore 'meta' information. But usually people refer to meta headers as those headers included in the transaction which are not a part of the basic mechanics of making the request, but rather provide additional information about the content sent in the body of the message.

Meta information headers can contain any information at all, and the exact collection you will see depends on your particular server and client pair. These header lines are placed into environment variables (as we'll explore throughout this article); some of the more important ones are detailed here.

The content-type Header

One of the most important meta headers a server sends to a client is the content-type. This header identifies the kind of content the server is sending in the response, and without this header some clients will be unable to display the content at all. In other words the content-type header describes what kind of content the stream of bytes that makes up the body of the transaction is supposed to be, whether an image, formatted text or whatever.

All content on the Internet is identified by MIME type. When a server sends HTML documents it is able to identify the MIME type all by itself by various means, usually the file extension. But when you use a CGI script to output HTML, you must tell the server explicitly the MIME type of the data you are about to send by printing this header line.

Thus simple output from a server will typically look something like this:

HTTP/1.0 200 OK
Content-type: text/html

<html><head><title>My page</title>
....
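
A script that produces output of this kind only has to print the content-type line, a blank line, and then the body; the server supplies the status line. The sketch below shows the idea in Python for illustration. The function `write_page` and the page text are made up for this example.

```python
import sys

def write_page(out, title):
    """Write a CGI response: the content-type header, a blank line, then the body."""
    out.write("Content-type: text/html\n")
    out.write("\n")  # the blank line that ends the headers
    out.write("<html><head><title>%s</title></head>\n" % title)
    out.write("<body><p>Hello from a script.</p></body></html>\n")

if __name__ == "__main__":
    # Standard output is where the web server collects the script's response.
    write_page(sys.stdout, "My page")
```

Passing the output stream in as a parameter keeps the function easy to try out without a web server.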

Why are Headers Useful for CGI scripts?

Every time a script is run in Linux (or Unix) it executes within something called a process, which is really just a running instance of a program, managed by the operating system. A process runs within an environment, which is the name given to the set of properties of that process. Environment variables are handed to the process by whatever started it (for a CGI script, the web server) and contain all kinds of information made available to your process.

In addition, a Perl process has three important handles set automatically that are known as STDIN, STDOUT and STDERR ('standard in', 'standard out' and 'standard error'). Handles are just a way of describing an input/output (I/O) stream of bytes to or from a process. These handles refer to the place a Perl process looks for data, the default place it sends its data, and the default place it sends any errors.

When a Perl process is invoked by a web server, STDOUT is automatically connected to the server. In other words, to output back to the web server you just print the information you want sent, and Perl sends it to the server by default. Of course, you can send output to other places and override these defaults, as you might need to when saving data to a file, for example.

When a server sees that the requested resource is a script to run, it feeds information from the HTTP headers into that script's process. The server puts most of that information into environment variables, except in the case of a POST request, when the data sent with the request is also fed into the program's STDIN handle.
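
The two routes by which request data reaches a script can be sketched as follows, in Python for illustration. `REQUEST_METHOD`, `QUERY_STRING` and `CONTENT_LENGTH` are standard CGI environment variables. The function `read_form` is a hypothetical helper, and the sketch assumes the POST body is ordinary URL-encoded form data.

```python
import io
from urllib.parse import parse_qs

def read_form(environ, stdin):
    """Return form fields from the query string (GET) or the request body (POST)."""
    if environ.get("REQUEST_METHOD", "GET") == "POST":
        # The server reports the body size in CONTENT_LENGTH;
        # read exactly that many bytes from standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        data = stdin.read(length).decode("ascii")
    else:
        data = environ.get("QUERY_STRING", "")
    return parse_qs(data, keep_blank_values=True)

# Simulated environments stand in for a real web server here.
get_fields = read_form(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": "ti=&au=king"}, io.BytesIO(b""))
post_fields = read_form(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "7"}, io.BytesIO(b"au=king"))
print(get_fields)
print(post_fields)
```

In a script actually run by a server, the call would be `read_form(os.environ, sys.stdin.buffer)`.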

And that, in a nutshell, is the mechanism of CGI - how a client is able to pass information to a server, and how a server is able to pass that information along to a script and get a response back.

Non-parsed Headers

Unlike static pages, CGI scripts always print out some HTTP header lines - at a minimum the content-type line. If your server supports it you also have the option to print out the entire HTTP header from a CGI script. The principal advantage to doing this is to speed up your web site - if the server doesn't have to scan all the output and add headers, the resource gets returned to the client that much faster. In addition, different servers have different buffers and caching mechanisms, so if your script does server push countdowns or animations you may need to override that scanning of headers and send data to a client directly.

Returning your own complete headers with non-parsed header scripts is a complex business which requires a fairly complete understanding of the HTTP protocol, and is a task made all the more difficult by the rigorous requirements of HTTP 1.1.

To find out more about non-parsed headers consult your server documentation.
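
As a very rough sketch of what a non-parsed header script has to produce, the Python below builds a complete HTTP/1.0 response, status line included. The function `build_nph_response` is hypothetical, and the choice of headers is an assumption. A real script would have to follow your server's rules for enabling non-parsed headers and would usually send more header lines than this.

```python
def build_nph_response(body, content_type="text/html"):
    """Build a complete HTTP/1.0 response, status line included, as bytes."""
    payload = body.encode("utf-8")
    header_lines = [
        "HTTP/1.0 200 OK",                     # the script supplies the status line itself
        "Content-type: %s" % content_type,
        "Content-length: %d" % len(payload),   # size of the body in bytes
    ]
    # Header lines end with CRLF, and an empty line closes the header block.
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + payload

response = build_nph_response("<html><head><title>My page</title></head></html>")
print(response.decode("utf-8"))
```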