
CGI Scripts


CGI stands for Common Gateway Interface. It provides a standard for Web servers to interact with programs or scripts which are not part of the server but may produce output which you wish to serve.

16.1 Do You Need a CGI Script?

Many functions performed by CGI scripts on other servers are built-in features of WN. If these features meet your needs, you save considerable effort in creating, setting up, and maintaining scripts or programs, and the built-in feature will run much more efficiently and much more securely than a CGI script.

These features include the ability to respond with different text or entirely different documents based on the client request, the client's hostname, IP address, user-agent, or the "referer" (the document containing the link the client followed). For information about this, see the chapter on parsed text. Support for "imagemaps", or clickable images, is also built in, so there is no need to use CGI for this; see the chapter on imagemaps. Finally, WN supports a variety of methods of searching your data, including by title, keyword, or full text; see the chapter on searches.

16.2 How Does the Server Recognize a CGI Script?

It would be nice if one could simply indicate in the appropriate index file that a particular file is a CGI program which should be executed rather than served. Unfortunately, the CGI protocol makes it impossible to implement this in an efficient way.

There are two mechanisms in fairly common use with other servers for indicating that a file is a CGI script and WN supports them both. The first is to give the file name a special extension (by default it is ".cgi") which indicates that it is a CGI script. Thus any file you serve with the name "something.cgi" will be treated as a CGI script. The special extension ".cgi" can be redefined by editing the file config.h and recompiling.

The second mechanism is to have specially named directories with the property that any file in that directory will be assumed to be a CGI script. The default for this special name is "cgi-bin". Thus, if you have a directory /cgi-bin in your hierarchy the server will assume that any file served from that directory is a CGI script. Of course, as always, only files listed in that directory's index file will be servable. Only the files in the directory "cgi-bin" will be assumed to be CGI scripts, not files in subdirectories. Thus the file "/cgi-bin/dir/foo" will not be treated as a CGI script.

There is no need for cgi-bin/ to be at the top of your hierarchy. It could be anywhere in the hierarchy. And, in fact, you can have as many directories named "cgi-bin" as you like. They will all be treated the same. The name "cgi-bin" can be changed by editing config.h and recompiling.
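The two recognition rules above can be summarized in a short sketch. It is written in Python purely for illustration and is not WN's actual code; the function name looks_like_cgi and the two constants are hypothetical, and the values ".cgi" and "cgi-bin" are simply the defaults stated above. The sketch ignores the separate requirement that a file be listed in its directory's index file.

```python
import posixpath

# Defaults described in the text; in WN both are set in config.h.
CGI_EXTENSION = ".cgi"
CGI_DIRNAME = "cgi-bin"


def looks_like_cgi(url_path: str) -> bool:
    """Return True if url_path matches either recognition rule.

    Illustrative only: checks the file extension first, then the name
    of the directory that directly contains the file.
    """
    directory, filename = posixpath.split(url_path)
    if not filename:
        # A path ending in "/" names a directory, not a script.
        return False
    if filename.endswith(CGI_EXTENSION):
        return True
    # Only the immediate parent directory counts, so files in
    # subdirectories of cgi-bin are not matched.
    return posixpath.basename(directory) == CGI_DIRNAME
```

With these rules "/a/b/cgi-bin/foo" and "/docs/something.cgi" are matched, while "/cgi-bin/dir/foo" is not.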

16.3 How Does a CGI Script Work?

It is beyond the scope of this document to provide an extensive tutorial in writing CGI scripts. There is an online tutorial at www.charm.net and another available from NCSA. A collection of links to CGI information is available at www.stars.com.

We will provide only a simple example of a CGI script written in Perl. More examples can be found in the /docs/examples directory of the WN distribution.


#!/usr/local/bin/perl
# Simple example of CGI script.

print "Content-type: text/html\r\n";
# The first line must specify content type. Other
# optional headers might go here.

print "\r\n";
# A blank line ends the headers. All header lines should
# end with CRLF ("\r\n"), but other lines don't need to.

# From now on everything goes to the client

print "<body>\n";
print "<h2>A few CGI environment variables:</h2>\n\n";

print "REMOTE_HOST = $ENV{REMOTE_HOST}<br>\n";
print "HTTP_REFERER = $ENV{HTTP_REFERER}<br>\n";
print "HTTP_USER_AGENT = $ENV{HTTP_USER_AGENT}<br>\n";
print "QUERY_STRING = $ENV{QUERY_STRING}<br>\n";
print "<p>\n";

print "</body>\n";

Notice that the first thing the script does is provide the HTTP "Content-type:" header line. It may be followed by any other optional headers you want the server to send, and a blank line marks the end of the headers. The server will add further headers of its own.

By default the WN server assumes that the output of any CGI script which does not have a QUERY_STRING or POST data is "dynamic", that is, different each time the script is run. Hence the server behaves as if the Attributes=dynamic directive had been used and attempts to dissuade clients and proxies from caching the output. If the output of your script is in fact always the same and you wish it to be cached, you can use the Attributes=non-dynamic directive.

If the output depends on a QUERY_STRING or POST data, the server does not make this assumption. If you wish to discourage caching of the output of such a CGI script, add the Attributes=dynamic directive to its entry in the index file.
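The default caching decision described above can be written as a small decision function. This is a sketch of the rule as stated in this section, not code taken from WN; the function name and its parameters are hypothetical, and it assumes that an explicit Attributes= directive always overrides the default.

```python
from typing import Optional


def treated_as_dynamic(has_query_or_post: bool,
                       attribute: Optional[str] = None) -> bool:
    """Return True if caching of the CGI output would be discouraged.

    attribute is the value of an Attributes= directive on the script's
    index entry ("dynamic" or "non-dynamic"), or None if none is given.
    """
    # Assumption of this sketch: an explicit directive always wins.
    if attribute == "dynamic":
        return True
    if attribute == "non-dynamic":
        return False
    # No directive: the output is assumed dynamic only when the request
    # carried neither a QUERY_STRING nor POST data.
    return not has_query_or_post
```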

The script above should be marked dynamic, as it prints out the client's hostname, the user agent, and the URL of the document which contains the link to this CGI script. The CGI script gets this information about the client from environment variables set by the server. A complete list of the standard CGI environment variables, with a description of what they contain and of some additional non-standard ones supplied by the WN server, can be found in Appendix D: Environment Variables.
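Because these values come from the client, a script may want to allow for variables that are unset (for instance, when no Referer header was sent) and to escape values before placing them in HTML. The following is a minimal sketch of that idea in Python, offered as an illustration rather than as part of the WN distribution; the function name build_response and the VARIABLES tuple are hypothetical.

```python
import html
import os
import sys
from typing import Mapping

VARIABLES = ("REMOTE_HOST", "HTTP_REFERER", "HTTP_USER_AGENT", "QUERY_STRING")


def build_response(environ: Mapping[str, str]) -> str:
    """Build the full CGI output: headers, a blank line, then the body."""
    lines = ["<body>", "<h2>A few CGI environment variables:</h2>"]
    for name in VARIABLES:
        # A variable may be unset, and its value is supplied by the
        # client, so default to "" and escape it before printing.
        value = html.escape(environ.get(name, ""))
        lines.append(f"{name} = {value}<br>")
    lines.append("</body>")
    # Header lines end with CRLF; an empty line ends the headers.
    return "Content-type: text/html\r\n\r\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    sys.stdout.write(build_response(os.environ))
```

Keeping the output construction in a function that takes the environment as an argument makes the script easy to check without running it under a server.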

In addition to setting these environment variables appropriately the server will change the current working directory of the CGI process to the directory in which the CGI script is located.

Note: In general a CGI script has complete control over its output, so it is responsible for doing things which the server might do for a static document. This means that you cannot use many of the WN features with CGI output. In particular, the server will not apply a filter to the output or parse it for the markers described in the chapter on parsed text. The CGI script must do these things for itself. Also, the server will not provide ranges specified in the Range: header. Instead the contents of this header are passed to the script in the environment variable HTTP_RANGE, so the script can do the range processing.
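As an illustration of what range processing in a script might involve, here is a minimal Python sketch that handles a single byte range. It rests on several assumptions not stated in this chapter: that HTTP_RANGE holds the raw header value (for example "bytes=0-99"), that the server honors the standard CGI "Status:" header, and that the content type shown is appropriate. The function name select_range is hypothetical, and multiple ranges are not handled.

```python
import re
from typing import Optional, Tuple

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")
_CTYPE = "Content-type: application/octet-stream\r\n"  # assumed type


def select_range(body: bytes, http_range: Optional[str]) -> Tuple[str, bytes]:
    """Return (header block, payload) for an optional single byte range.

    The whole body is returned when HTTP_RANGE is absent or is not a
    single range that this sketch understands.
    """
    total = len(body)
    full = (_CTYPE + "\r\n", body)
    match = _RANGE_RE.match(http_range.strip()) if http_range else None
    if match is None:
        return full
    first, last = match.groups()
    if first == "" and last == "":
        return full
    if first == "":
        # Suffix form "bytes=-N": the final N bytes of the body.
        length = int(last)
        start = max(total - length, 0) if length > 0 else total
        end = total - 1
    else:
        start = int(first)
        if last and int(last) < start:
            # Malformed range (end before start): ignore the header.
            return full
        end = min(int(last), total - 1) if last else total - 1
    if start >= total:
        headers = ("Status: 416 Range Not Satisfiable\r\n"
                   f"Content-Range: bytes */{total}\r\n\r\n")
        return headers, b""
    headers = ("Status: 206 Partial Content\r\n" + _CTYPE +
               f"Content-Range: bytes {start}-{end}/{total}\r\n\r\n")
    return headers, body[start:end + 1]
```

A script would call this with os.environ.get("HTTP_RANGE") and write the header block followed by the payload to standard output.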

16.4 How Can CGI Scripts Be Made Safe?

This is an extremely important issue, but one which is beyond the scope of this document. I highly recommend the CGI security FAQ maintained by Paul Phillips and the WWW Security FAQ maintained by Lincoln Stein.


John Franks <john@math.nwu.edu>