The Fool's Guide to CGI.pm,
the Perl module for CGI scripting

No, this guide is not for fools but by a fool, who kept wondering how CGI.pm is supposed to work and finally got a clue. So here's how (I think) it works in the simplest cases. I hope this makes the learning curve a little gentler for some other people.

Prerequisites

You need

Hello world

OK, first do this: Check that you have the above-mentioned local information. Double-check that you have understood it. You should now know where to put your Perl programs in order to run them as CGI scripts and what you might need to do with them (setting file protections, perhaps). Fine, then test this with the following script:

print <<END;
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<title>Hello</title>
<h1>Hello world!!</h1>
END

That is, write that into a file, perhaps adding something to it as needed according to local instructions (such as a "magic" #! line that tells the path name of the Perl interpreter on the server), store the file where you need to put your CGI scripts, and try accessing it via a link or by typing the URL directly into a browser. It should work basically the same way as the following link:
http://jkorpela.fi/cgi-bin/hw.pl

You will probably need some help to make the script work. There are several practical things that need to be set properly, and they vary from one server to another. There's a useful (though Unix-oriented) general resource: the checklist The Idiot's Guide to Solving Perl CGI Problems.

You haven't used CGI.pm yet. The trivial program above just writes some fixed text. When it is executed as a CGI script, the first lines of the output (up to the first empty line) are taken as HTTP headers and sent to the browser, and the rest is sent to the browser as the actual data (an HTML document in this case, as it quite often is).
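
To make the split between headers and data concrete, here is a small sketch that holds the same output in a string and separates it at the first empty line, roughly as the server does. The helper name split_response is invented for this illustration; it is not part of Perl or CGI.pm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI.pm): split CGI output into the
# header block and the body at the first empty line.
sub split_response {
    my ($output) = @_;
    my ($headers, $body) = split /\r?\n\r?\n/, $output, 2;
    return ($headers, $body);
}

# The same text that the hello-world script prints.
my $output = <<'END';
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<title>Hello</title>
<h1>Hello world!!</h1>
END

my ($headers, $body) = split_response($output);
print "Headers:\n$headers\n\n";
print "Body:\n$body";
```

If the empty line is missing, everything ends up in the header part and the body stays undefined, which is one common reason why a first CGI script fails.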

Just echoing data

Next we'll actually use CGI.pm but for a very simple purpose: we have a form with a text input field and we echo back the data that the user has typed. Trivial, but it's the start.

Our form is the following:

<form action="http://jkorpela.fi/cgi-bin/echo1.cgi">
<div>Please type some text: <input name="sample" size="20"></div>
<div><input type="submit"></div>
</form>

The Perl program referred to in the action attribute is the following:

use CGI qw(:standard);
$data = param('sample') || '<i>(No input)</i>';

print <<END;
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<title>Echoing user input</title>
<h1>Echoing user input</h1>
<p>You typed: $data</p>
END

So how does it work? The line
use CGI qw(:standard);
is a prelude needed for using CGI.pm (in a particular mode). And the line after it,
$data = param('sample') || '<i>(No input)</i>';
is in this case the only part of the program that uses CGI.pm, and it uses it only by calling the param function. That function returns the value of the form field whose name is passed as its argument. Note that the value might be undefined or empty, even if a field with that name is present in the form markup: the script may be called without any form data, and a text field that the user leaves blank yields at most an empty string. (In Perl we can handle such values conveniently: both are false, so the || operator in the code above substitutes a fixed string. A side effect is that the single character 0, which Perl also treats as false, is replaced as well.)

This works irrespective of the method ("get" or "post") of the form. More generally, we need not worry about the mechanisms of form data transmission, such as "URL encoding".

You might have tried submitting text like <b>foo</b> and noticed that the HTML markup "works". This is simply because the script writes the data as such into a document that will be sent to the browser and treated as an HTML document. To prevent characters from being treated as part of HTML markup, you could add $data = escapeHTML($data) into the script (before printing anything out).
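
As a sketch of that change, the echo script could look like the following. The helper safe_echo is a made-up name, and the test for missing input is written with defined and length (an assumption that departs from the original ||), so that an input of 0 is echoed too.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard);

# Hypothetical helper: return the text to echo with HTML-significant
# characters escaped, or a fixed placeholder when nothing was typed.
sub safe_echo {
    my ($raw) = @_;
    return '<i>(No input)</i>' unless defined $raw && length $raw;
    return escapeHTML($raw);
}

# scalar() asks param for a single value rather than a list.
my $data = safe_echo(scalar param('sample'));

print header(-type => 'text/html', -charset => 'iso-8859-1');
print <<END;
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<title>Echoing user input</title>
<h1>Echoing user input</h1>
<p>You typed: $data</p>
END
```

With this version, submitting <b>foo</b> shows the tags as literal text instead of making the word bold. Only the user's data is escaped; the placeholder is markup written by the script itself and is left alone.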

Processing the data

If we'd like to store the data, we could do it using Perl's normal file operations. So this isn't interesting from the CGI.pm point of view, though of course it can be very relevant for practical purposes. The same applies to other processing of data, like sending it by E-mail.

But for illustration, here's how to set up a simple form that collects data and saves it to a file:

#!/usr/bin/perl
use CGI qw(:standard);

print <<END;
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
END

$regfile = '../perl/registrations.tsv';

$name = param('name');
$email = param('email');
$food = param('food');

open(REG,">>$regfile") or fail();
print REG "$name\t$email\t$food\n";
close(REG);

($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
    gmtime(time);
$now = sprintf "%4d-%02d-%02dT%02d:%02dZ",
    1900+$year,$mon+1,$mday,$hour,$min;

print <<END;
<title>Thank you!</title>
<h1>Thank you!</h1>
<p>Your fake registration to Virtual Nonsense Party
has been recorded at $now as follows:</p>
<p>Name: $name</p>
<p>E-mail: $email</p>
<p>Food preference: $food</p>
END

sub fail {
   print "<title>Error</title>",
   "<p>Error: cannot record your registration!</p>";
   exit; }
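
The script above writes the values exactly as received, so a tab or line break typed by a user would shift columns or split a record in the file. The following is a minimal sketch of one way to guard against that, assuming that replacing such characters with a space is acceptable. The helper names clean_field and append_record and the file name are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: make one value safe for a tab-separated line by
# turning runs of tabs and line breaks into a single space.
sub clean_field {
    my ($value) = @_;
    $value = '' unless defined $value;
    $value =~ s/[\t\r\n]+/ /g;
    return $value;
}

# Hypothetical helper: append one record; returns true on success.
sub append_record {
    my ($file, @fields) = @_;
    open(my $fh, '>>', $file) or return 0;
    flock($fh, LOCK_EX) or return 0;   # keep simultaneous requests from interleaving
    print {$fh} join("\t", map { clean_field($_) } @fields), "\n";
    return close($fh);
}

# Example call with made-up data and a made-up file name.
append_record('registrations-test.tsv',
              "Ann\tExample", 'ann@example.org', 'Falafel')
    or print "Error: cannot record your registration!\n";
```

The lock matters only when several users submit at the same moment, but on a public form that is exactly the situation to expect.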

The form is simple:

<form action="http://jkorpela.fi/cgi-bin/collect.pl"
 method="post">
<div>Fake registration to virtual party:</div>
<div>Name: <input name="name" size="30"></div>
<div>E-mail: <input name="email" size="25"></div>
<div><input type="radio" name="food" value="" checked>
Food preference (select one):<br>
<input type="radio" name="food" value="Swedish meatballs">
 Swedish meatballs<br>
<input type="radio" name="food" value="Fish sticks">
 Fish sticks<br>
<input type="radio" name="food" value="Falafel">
 Falafel (this is vegetarian food)<br>
<input type="radio" name="food" value="no food">
 None (i.e., will not eat)</div>
<div><input type="submit"></div>
</form>

Checking data

One of the great advantages of using CGI.pm is that it provides very simple tools for checking data acceptability and requesting the user to fix the data. For example, we might wish to require that some name be given (i.e. the name field is not empty or blank), that an E-mail address satisfying some formal requirements is given, and that some food preference is selected. In real-life cases, you would probably want to do some real checking, like verifying a user name and password combination. But the basic techniques are perhaps best illustrated with an almost trivial example. And let's say that the E-mail is checked only against a missing "@" sign; this at least filters out some (accidental) erroneous submissions.

(Since you are about to ask anyway: No, you can't really check an E-mail address except by trying to send E-mail to it. See entry Can I verify the email addresses people enter in my Form? in the CGI Programming FAQ.)
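
The checks described above can be sketched as a function that collects error messages instead of printing them, which makes them easy to try out without a server. The function name check_registration is made up, and the rules are the deliberately weak ones just described.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return a list of error messages for one submission;
# an empty list means the data passed these (deliberately weak) checks.
sub check_registration {
    my ($name, $email, $food) = @_;
    my @errors;
    push @errors, 'Your name is required!'
        unless defined $name && $name =~ /\S/;
    if (!defined $email || $email !~ /\S/) {
        push @errors, 'Your E-mail address is required!';
    }
    elsif ($email !~ /\@/) {
        push @errors, 'An E-mail address must contain the @ character!';
    }
    push @errors, 'A food preference (even if none) is required!'
        unless defined $food && length $food;
    return @errors;
}

# A blank name and an address without "@" give two messages.
print "$_\n" for check_registration('  ', 'someone.example.org', 'Falafel');
```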

The fundamental idea is to generate a form dynamically by a CGI script and do this so that the same script also processes the submitted data. Confused? Well, let's take an example (you could also see the example in action):

#!/usr/bin/perl
use CGI qw(:standard);

$regfile = '../perl/registrations.tsv';

print header;

if(param()) {
    $name = param('name');
    $email = param('email');
    $food = param('food');
    if(ok()) {
        open(REG,">>$regfile") or fail();
        print REG "$name\t$email\t$food\n";
        close(REG);
        print <<END;
<title>Thank you!</title>
<h1>Thank you!</h1>
<p>Your fake registration to Virtual Nonsense Party
has been recorded as follows:</p>
<p>Name: $name</p>
<p>E-mail: $email</p>
<p>Food preference: $food</p>
END
        exit;
    }
}

%labels = (
  '' => 'Food preference (select one):',
  'Fish sticks' => 'Fish sticks',
  'Falafel' => 'Falafel',
  'no food' => 'None (i.e., will not eat)' );

print start_form, 'Fake registration to virtual party',br,
  'Name: ', textfield('name'), br,
  'E-mail: ', textfield('email'), br,
  radio_group(-name=>'food', -values=>\%labels, -linebreak=>'true',
              -default=>''),
  submit, end_form;

sub fail {
   print "<title>Error</title>",
   "<p>Error: cannot record your registration!</p>";
   exit; }

sub ok {
    $fine = 1;
    if(!$name) { print 'Your name is required!', br; $fine = 0; }
    if(!$email) { print 'Your E-mail address is required!', br; $fine = 0; }
    elsif(!($email =~ m/\@/))
       { print 'An E-mail address must contain the @ character!', br;
         $fine = 0; }
    if(!$food) { print 'A food preference (even if none) is required!',
       br; $fine = 0; }
    if(!$fine) { print 'Please fix the data and resubmit', hr; }
    return $fine; }

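Instead of writing a form which is statically part of an HTML document, we have a script which generates an HTML document (which contains a form). We can refer to the script using a normal link; the URL is in this case http://jkorpela.fi/cgi-bin/coll.cgi and what happens when that URL is referred to depends on the context, as can be read from the code: when the script is called without form data (e.g. via a plain link), it just writes the empty form; when it is called with form data that passes the checks, it records the data and writes a thank-you page; and when the data fails a check, it writes the error messages followed by the form, with the user's previous input already filled in.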

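Some key ingredients in the script: print header writes the HTTP headers; param() without arguments returns the names of the submitted fields, so the test if(param()) is true only when form data has been sent; and start_form, textfield, radio_group, submit and end_form generate the markup for the form.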

The details will not be explained here. This was basically to give an idea of how things work with CGI.pm. Note that if you wish to have a form embedded into a static HTML page, you need something a little more complicated. In effect, you would need to duplicate things by writing a static form element and the Perl code that generates a corresponding element dynamically in the form handler when an error in the data has been detected and the form needs to be presented to the user again. The reason is that dynamic generation is the only way to use previous user input as default values for fields so that the user does not need to type everything anew. (Well, technically, you could save the data on the server, but that would be more complicated.)
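
The reuse of previous input as default values can be observed without a web server. The sketch below assumes CGI.pm's object-oriented interface, in which CGI->new accepts a query string, and feeds it a made-up submission. The exact markup that CGI.pm generates varies between versions, so no output is shown here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI;

# Simulate a submission in which the user typed a name but no address.
my $q = CGI->new('name=Ann+Example&email=');

# textfield() looks up the submitted value for the field and uses it as
# the initial content, so the user need not retype it.
my $name_field  = $q->textfield(-name => 'name');
my $email_field = $q->textfield(-name => 'email');

print "$name_field\n$email_field\n";
```

The generated element for name carries the submitted text as its value, while the one for email has nothing to prefill. A static form in an HTML file has no way of doing this.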

Quick reference to CGI.pm functions

The following table lists some basic CGI.pm functions for generating form fields and related elements. Note that the statement use CGI qw(:standard); is needed to make these work and that the invocations just generate HTML constructs as strings; you need to use the print function to make them actually appear in a document generated by your CGI script.

Basic CGI.pm functions

| function invocation | meaning |
|---|---|
| start_form($method,$action,$encoding) | starts a form |
| start_multipart_form($method,$action) | starts a form that may contain a file input field |
| end_form() | ends a form |
| textfield(-name=>'field_name', -default=>initial value, -size=>number, -maxlength=>number) | single-line text input field |
| textarea(-name=>'field_name', -default=>initial value, -rows=>number, -cols=>number) | multi-line text input field |
| password_field(-name=>'field_name', -size=>number, -maxlength=>number) | "password" input field |
| filefield(-name=>'field_name', -default=>initial filename) | file input field |
| popup_menu(-name=>'menu_name', -values=>array or hash reference, -default=>initial selection) | select element |
| scrolling_list(-name=>'menu_name', -values=>array or hash reference, -default=>initial selection(s), -size=>number, -multiple=>'true') | select element with size larger than 1 |
| checkbox_group(-name=>'group_name', -values=>array or hash reference, -default=>initial selection(s), -linebreak=>'true') | group of checkboxes with the same name |
| checkbox(-name=>'checkbox_name', -checked=>'checked', -value=>internal value, -label=>visible label) | a standalone checkbox |
| radio_group(-name=>'group_name', -values=>array or hash reference, -default=>initial selection, -linebreak=>'true') | group of radio buttons that work together |
| submit(-name=>'button_name', -value=>button text) | submit button |
| reset(-value=>button text) | reset button |
| defaults(-value=>button text) | reset to original defaults |
| hidden(-name=>'field_name', -default=>array reference) | hidden field |
| image_button(-name=>'button_name', -src=>image URL, -align=>alignment, -alt=>text, -value=>text) | image submit button |
| button(-name=>'button_name', -value=>button text, -onClick=>script) | button element, for client-side scripting |

Note: The parameters -multiple=>'true' and -checked=>'checked' are optional and used to change the default (from allowing a single choice only and from being initially unchecked, respectively).

For resetting fields, the reset() function is seldom useful. The defaults() function can be used to set all fields to their very initial values as specified in the script itself; the script will in fact be called as if it were called for the first time. In contrast, the button created with reset() will clear all changes that the user has made since the last invocation of the script. Thus, if your code contains textfield(-name=>'foo',-value=>'42'), then defaults() will always set foo to 42; but if the user had changed the input box content to 100, submitted the form, and got, in addition to other data, the same form in response, then that form will (normally) contain the latest value 100, and reset() would set it back to 100, not the original 42.

An array reference could consist just of an "anonymous array", like ['none','apples','oranges','kiwis'], or a reference to a named array, such as \@foo. In order to make the texts seen by users different from the strings used internally, which is often a good idea, you would use a hash reference instead, e.g. -values=>\%val with %val defined by
%val = ( '0' => 'none', 'ap' => 'apples', 'or' => 'oranges', 'ki' => 'kiwis' );
But beware that in this case the order of the fields in the generated HTML markup may not correspond to the order of the hash constructor (since hashes are essentially unordered). Thus, to control the order, use a -values array and, when desired, a separate -labels hash.
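
A minimal sketch of that combination, using the fruit data from above with a made-up field name: the array fixes the order of the options, and the hash supplies the visible texts.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw(:standard);

# Internal values in the order they should appear in the menu.
my @values = ('0', 'ap', 'or', 'ki');

# Texts shown to the user, keyed by internal value.
my %labels = ('0' => 'none', 'ap' => 'apples',
              'or' => 'oranges', 'ki' => 'kiwis');

my $menu = popup_menu(
    -name    => 'fruit',     # made-up field name
    -values  => \@values,    # array reference: order is preserved
    -labels  => \%labels,    # hash reference: lookup only, order irrelevant
    -default => '0',
);

print $menu, "\n";
```

The same pair of parameters works with scrolling_list, checkbox_group and radio_group.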

Further reading