Footnotes for "CGI Made Really Easy"



  1. Sample CGI Programs
  2. CGI Mailer Script
  3. Security with CGI Scripts
  4. Placing Your Script on the Server
  5. Sending an Existing File Back as a Response
  6. Other Useful CGI Environment Variables
  7. Returning an Image or Other Non-HTML Response from a CGI Script
  8. What is the difference between GET and POST?
  9. Gaining More Control, with Non-Parsed Header Scripts


Sample CGI Programs

By request, here are some "hello, world" CGI programs to get you started. The simple version demonstrates CGI output only, and the longer (such as it is) version will echo back any input fields you pass to it. Both Perl and C versions are provided, with source.



CGI Mailer Script

One of the most common uses of a CGI script is to mail form data to an email address. So here's a simple script that does just that, written in Perl, called mailer.pl.

Make these changes to the script before putting it in place:

Mailing Form Data Without CGI

There's a poor man's way of mailing form data that uses just HTML: in the <form> tag, set the action to a "mailto:" URL, and the enctype attribute to "text/plain". Most browsers handle it correctly, i.e. send the form data in a mail message. For example,

<form action="mailto:me@myhost.com" enctype="text/plain">

There are disadvantages: you can't control the format of the mailed text, and you can't send a response back to the user. Also, not all browsers support this style of the <form> tag.



Security with CGI Scripts

Think about it: a CGI script is a program that anyone in the world can run on your machine. Accordingly, look out for security holes as you write your script.

Mostly, don't trust the user input. In particular, don't put user data in a shell command without verifying the data carefully, lest a hacker drive a virtual truck through this security hole.

Let's say you have a CGI program that lets users run "finger" on your host. Such a Perl script might have a line like

system "finger $username" ;

But if a malicious user enters "james; rm -rf /" as the username, your program runs

system "finger james; rm -rf /" ;

which erases as many of your files as possible, probably not what you intended. So verify that the username is valid, with something like

$username !~ /[^\w.-]/ || die "Whoa!  Nice try, buddy." ;

or use the list form of the system command, which runs finger directly without passing the command line through a shell:

system("finger", $username) ;

or come up with another way to solve the problem.
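
The same two defenses can be sketched in Python rather than Perl. This is a minimal illustration, not the script discussed above: the helper names is_safe_username, finger_command, and run_finger are made up, and rejecting an empty username is an added assumption. The whitelist mirrors the character class in the Perl check, and the command is built as an argument list so that no shell ever parses the user's input.

```python
import re
import subprocess

# Whitelist taken from the Perl check above: letters, digits, underscore,
# dot, and hyphen. Requiring at least one character is an added assumption.
SAFE_USERNAME = re.compile(r"[\w.-]+", re.ASCII)


def is_safe_username(username):
    """Return True only if every character is on the whitelist."""
    return SAFE_USERNAME.fullmatch(username) is not None


def finger_command(username):
    """Build the argument list for finger, refusing suspicious input."""
    if not is_safe_username(username):
        raise ValueError("Whoa!  Nice try, buddy.")
    # A list (not a single string) means no shell is involved, so
    # characters like ";" have no special meaning.
    return ["finger", username]


def run_finger(username):
    """Run finger and return its output (assumes finger is installed)."""
    result = subprocess.run(finger_command(username),
                            capture_output=True, text=True, check=False)
    return result.stdout
```

With this in place, a username like "james; rm -rf /" is rejected before any command is run, and even if the check were skipped, the list form would hand the whole string to finger as a single argument.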

It's easy for a hacker to send any form variables to your script, with any values (even non-printable characters). Your security shouldn't rely on fields having certain values, or even existing or not existing.



Placing Your Script on the Server

Different Web servers are configured differently. Some let you put your CGI scripts in the same directory as your Web pages, with filenames ending in ".cgi". Other servers make you put all CGI scripts in a specific directory, usually called "cgi-bin". Your webmaster has the final answers.

You need to set the right permissions on the program file. In Unix, the Web server (like any process) runs under some username. Your CGI program must be executable by that username, plus readable if it's a Perl or shell script. Set the correct permissions with "chmod 750 *.cgi" (or "chmod 755 *.cgi" if your server doesn't have group access to your files; try both, or ask your webmaster).
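
To see what those two modes actually grant, here is a small Python sketch (the helper name describe_mode is made up for illustration). It only formats the permission bits the way "ls -l" would; it doesn't touch any files.

```python
import stat


def describe_mode(mode):
    """Format permission bits for a regular file, in the style of ls -l."""
    return stat.filemode(stat.S_IFREG | mode)


print(describe_mode(0o750))  # owner: rwx, group: r-x, others: nothing
print(describe_mode(0o755))  # owner: rwx, group: r-x, others: r-x
```

So with 750 the Web server can run your script only if its username belongs to the file's group, while 755 lets any username on the machine read and run it.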

If your script doesn't run:



Sending an Existing File Back as a Response

If your HTML response is always the same, or if you want to respond with one of several existing files, you may find the "Location:" response header useful. Use it to redirect the browser to another URL.

By way of example, if your CGI script prints the following line to STDOUT:

Location: response.html

followed by a blank line, then the remote browser will retrieve response.html and treat it as the response from your CGI script. You can redirect the browser to either a relative or absolute URL.

In this situation, do not print the "Content-type:" response header.
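
As a rough sketch, here is what such a script's entire output could look like, written in Python rather than Perl. The function name redirect_response is hypothetical; "response.html" is the example target from the text above.

```python
import sys


def redirect_response(url):
    """Return the CGI output for a redirect: one header, then a blank line."""
    return "Location: " + url + "\n\n"


if __name__ == "__main__":
    # No "Content-type:" header and no body: the Location line is everything.
    sys.stdout.write(redirect_response("response.html"))
```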



Other Useful CGI Environment Variables

CGI scripts have access to 20 or so environment variables, such as QUERY_STRING and CONTENT_LENGTH mentioned on the main page. Here's the complete list at NCSA.

A few others you may find handy:

REQUEST_METHOD
The HTTP method this script was called with. Generally "GET", "POST", or "HEAD".

HTTP_REFERER
The URL of the form that was submitted. This isn't always set, so don't rely on it. Don't go invading people's privacy with it, neither.

PATH_INFO
Extra "path" information. It's possible to pass extra info to your script in the URL, after the filename of the CGI script. For example, calling the URL
http://www.myhost.com/mypath/myscript.cgi/path/info/here
will set PATH_INFO to "/path/info/here". Commonly used for path-like data, but you can use it for anything.

SERVER_NAME
Your Web server's hostname or IP address (at least for this request).

SERVER_PORT
Your Web server's port (at least for this request).

SCRIPT_NAME
The path part of the URL that points to the script being executed. It should include the leading slash, but certain older Web servers leave the slash out. You can guarantee the leading slash with this line of Perl:
$ENV{'SCRIPT_NAME'} =~ s#^/?#/# ;

So the URL of the script that's being executed is, in Perl,

"http://$ENV{'SERVER_NAME'}:$ENV{'SERVER_PORT'}$ENV{'SCRIPT_NAME'}"

The complete URL the script was invoked with may also have PATH_INFO and QUERY_STRING at the end.
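
Here is one way the pieces might fit together, sketched in Python instead of Perl. The function names script_url and full_url are made up, and the environment is passed in as a plain dictionary so the sketch can be tried outside a Web server (in a real script you'd pass os.environ).

```python
def script_url(environ):
    """Rebuild the script's own URL from the CGI environment variables."""
    script_name = environ["SCRIPT_NAME"]
    # Guarantee the leading slash, as the Perl substitution above does.
    if not script_name.startswith("/"):
        script_name = "/" + script_name
    return "http://{}:{}{}".format(
        environ["SERVER_NAME"], environ["SERVER_PORT"], script_name)


def full_url(environ):
    """Append PATH_INFO and QUERY_STRING when the request included them."""
    url = script_url(environ) + environ.get("PATH_INFO", "")
    query = environ.get("QUERY_STRING", "")
    if query:
        url += "?" + query
    return url


# Sample values only, modeled on the PATH_INFO example above.
sample = {
    "SERVER_NAME": "www.myhost.com",
    "SERVER_PORT": "80",
    "SCRIPT_NAME": "mypath/myscript.cgi",  # an older server: no leading slash
    "PATH_INFO": "/path/info/here",
}
print(full_url(sample))
```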

Once again, see them all at NCSA's complete list.



Returning an Image or Other Non-HTML Response from a CGI Script

Most CGI scripts return HTML data, but you can return whatever kind of data you want. Just use the right MIME type in the "Content-type:" line, followed by the required blank line, followed by the raw data of the resource you're sending back. In the case of HTML files, that raw data is the HTML text. In the case of images, audio, or video, it's raw binary data. For example, to respond with a GIF file, use:

Content-type: image/gif

GIF89a&%*$@#--- binary contents of GIF file here ---$(*&%(*@#......

Your HTML page can load a script-generated image with

<img src="gifmaker.cgi?param1=value1&param2=value2">
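
A script like that might be laid out roughly as follows, sketched in Python rather than Perl. The names image_response and main are hypothetical, and the six bytes used here are only the start of a GIF, standing in for real image data your script would read or generate.

```python
import sys


def image_response(mime_type, data):
    """Return the bytes of a CGI response: header, blank line, raw data."""
    header = "Content-type: {}\n\n".format(mime_type).encode("ascii")
    return header + data


def main():
    # Placeholder bytes only; a real script would supply a complete GIF.
    fake_gif = b"GIF89a"
    # Write through the binary buffer so the data isn't re-encoded as text.
    sys.stdout.buffer.write(image_response("image/gif", fake_gif))


if __name__ == "__main__":
    main()
```

The point to notice is that the header is ordinary text while everything after the blank line is written as untouched bytes.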

One of my favorite examples of this was the Interactive Graphics Renderer, which rendered 3-D icons according to the colors, shape, texture, lighting, etc. that you define. You could use the resulting icons on your Web pages, as custom list bullets and horizontal rules. Note: This site lost its home long ago; this is supposed to be a mirror, but it doesn't work for me. If you find a working mirror, please send me the URL.

MIME Types

MIME types are standard, case-insensitive strings that identify a data type, used throughout the Internet for many purposes. They start with the general type of data (like text, image, or audio), followed by a slash, and end with the specific type of data (like html, gif, or jpeg). HTML files are identified with text/html, and GIFs and JPEGs are identified with image/gif and image/jpeg. Here's a pretty good list of commonly-used MIME types.
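
The structure is simple enough to take apart by hand. This Python sketch (both function names are made up) splits a bare MIME type at the slash and compares two types without regard to case; it assumes there are no extra parameters after the type.

```python
def parse_mime_type(value):
    """Split a MIME type into (general type, specific type), lowercased."""
    general, _, specific = value.strip().lower().partition("/")
    if not general or not specific:
        raise ValueError("not a MIME type: " + repr(value))
    return general, specific


def same_mime_type(a, b):
    """Compare two MIME types, ignoring differences in case."""
    return parse_mime_type(a) == parse_mime_type(b)
```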



What is the difference between GET and POST?

GET and POST are two different methods defined in HTTP that do very different things, but both happen to be able to send form submissions to the server.

Normally, GET is used to get a file or other resource, possibly with parameters specifying more exactly what is needed. In the case of form input, GET includes all of it in the URL itself, as the query string.

GET is how your browser downloads most files, like HTML files and images. It can also be used for most form submissions, if there's not too much data (the limit varies from browser to browser).

The GET method is idempotent, meaning the side effects of several identical GET requests are the same as for one GET request. In particular, browsers and proxies can cache GET responses, so two identical form submissions may not both make it to your CGI script. So don't use GET if you want to log each request, or store data or otherwise take an action for each request.

Normally, POST is used to send a chunk of data to the server to be processed, whatever that may entail. (The name POST might have come from the idea of posting a note to a discussion group or newsgroup.) When an HTML form is submitted using POST, your form data is attached to the end of the POST request, in its own object (specifically, in the message body). This is not as simple as using GET, but is more versatile. For example, you can send entire files using POST. Also, data size is not limited like it is with GET.

All this is behind the scenes, however. To the CGI programmer, GET and POST work almost identically, and are equally easy to use. Some advantages of POST are that you're unlimited in the data you can submit, and you can count on your script being called every time the form is submitted. One advantage of GET is that your entire form submission can be encapsulated in one URL, like for a hyperlink or bookmark (though see the AutoPOST utility to do this with POST).
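
To show how similar the two look from inside a script, here is a minimal Python sketch. The function name read_form_data and the field names are made up, and it assumes the form uses the default URL-encoded format. The only real difference is where the raw data comes from: QUERY_STRING for GET, or CONTENT_LENGTH bytes of standard input for POST.

```python
import io
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Return the submitted fields as a dict mapping names to lists of values."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the data is in the message body; read CONTENT_LENGTH bytes.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii", errors="replace")
    else:
        # GET: the data arrived in the URL and is passed in QUERY_STRING.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)


# Demo with made-up fields: the same submission sent both ways.
body = b"name=James&color=blue"
via_get = read_form_data(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": body.decode("ascii")},
    io.BytesIO(b""))
via_post = read_form_data(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))},
    io.BytesIO(body))
print(via_get == via_post)
```

In a real script you would pass os.environ and sys.stdin.buffer instead of the stand-ins used in the demo.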



Gaining More Control, with Non-Parsed Header Scripts

Normally, when your CGI script prints the "Content-type:", "Location:", or other headers, the server parses those headers and generates an appropriate HTTP response for the user. Occasionally, you may want finer control over the HTTP response. Most Web servers support non-parsed header (or "NPH") scripts, which generate a complete HTTP response and bypass the normal parsing by the server.

To use these, you need to know some HTTP, specifically the formats of the status line and header lines.

In your non-parsed header script, just print the complete HTTP status and header lines where a normal script would print the "Content-type:" line. Include the trailing blank line. Whatever your script prints is sent to the user verbatim, as the complete HTTP response, with no modification by the server.

Name your NPH scripts something starting with "nph-", like "nph-myscript.cgi"; every script whose name starts with "nph-" will be handled as an NPH script. This works on most servers, including Apache and NCSA. Other servers may use different schemes to identify NPH scripts; read the server docs or ask your webmaster.
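
For a feel of what "complete HTTP response" means, here is a minimal Python sketch of what a script saved as, say, "nph-myscript.cgi" might print. The function name nph_response, the choice of HTTP/1.0, and the particular headers are assumptions made for illustration; your server or application may call for others.

```python
import sys


def nph_response(body, status="200 OK", content_type="text/html"):
    """Return a complete HTTP response: status line, headers, blank line, body."""
    headers = [
        "HTTP/1.0 " + status,            # the status line comes first
        "Content-type: " + content_type,
        "Content-length: " + str(len(body)),
    ]
    # Each line ends with CRLF; one extra CRLF is the trailing blank line.
    head = "".join(line + "\r\n" for line in headers) + "\r\n"
    return head.encode("ascii") + body


def main():
    sys.stdout.buffer.write(nph_response(b"<p>hi</p>"))


if __name__ == "__main__":
    main()
```

Compare this with a normal script, which would print only the "Content-type:" line and leave the status line to the server.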

For an example of an NPH script, see CGIProxy.

If this is confusing, don't worry. In the unlikely event you ever need an NPH script, this will all make sense.





© 1996-1998, 2002 James Marshall
(comments welcome; for questions, please scan the FAQ first)

Last Modified: April 12, 2002 http://www.jmarshall.com/easy/cgi/cgi_footnotes.html