A tutorial on the Perl programming language.
For sample programs, there are links (named e.g. "this program") to plain text documents containing the programs, to let you download and try them more easily.
Perl is a programming language which can be used for a large variety of tasks. Both Perl interpreters and Perl documentation are freely available for Windows, Unix/Linux, and Macintosh e.g. from the Perl.org site.
A typical simple use of Perl would be for extracting information from a text file and printing out a report or for converting a text file into another form. In fact, the name Perl was formed from the expression Practical Extraction and Report Language. But Perl provides a large number of tools for quite complicated problems, including system administration tasks, which can often be programmed rather portably in Perl.
Perl has powerful string-manipulation functions. On the other hand, it eclectically combines features and purposes of the C language and many command or scripting languages. For such reasons, Perl looks rather odd on first sight. But once you have learned Perl, you will be able to write programs in Perl much faster than in most other languages. This makes Perl especially suitable for writing programs which are used once only.
The following simple Perl program reads a text file consisting of people's names like "John Smith", each on a line of its own, and it prints them out so that the first and second name have been swapped and separated with a comma (e.g. "Smith, John").
    while (<>) { @_ = split; print "$_[1], $_[0]\n"; }
As you can see, Perl is compact - and somewhat cryptic, until you learn some basic notations. (There will be a rather similar example, which is annotated in some detail.)
Perl has become popular for programming handlers for World Wide Web forms and generally as glue and gateway between systems, databases, and users.
Perl is typically implemented as an interpreted (not compiled) language. Thus, the execution of a Perl program tends to use more CPU time than a corresponding C program, for instance. On the other hand, computers tend to get faster and faster, and writing something in Perl instead of C tends to save your time.
Programs written in Perl are often called Perl scripts, especially in the context of CGI programming, whereas the term "the perl program" refers to the system program named perl for executing Perl scripts. (What, confused already?)
If you have used Unix shell scripts or awk
or sed
or similar utilities
for various purposes,
you will find that you can normally use Perl for those and many other purposes,
and the code tends to be more compact.
And if you haven't used such utilities but have started thinking you might
have need for them, then perlhaps what you really need to learn is Perl
instead of all kinds of futilities.
Perl is relatively stable; the current major version, Perl 5, was released in 1994. Extensions to the language capabilities are mainly added through the construction of modules, written by different people and made available to other Perl programmers. For major general-purpose Perl applications, particularly CGI scripts and client or server applications, see CPAN documentation.
This course presumes that you have access to a system with Perl 5 installed. You are also assumed to know or to find out how to invoke the Perl interpreter on the system you use. (Typically, this means giving, on a command prompt, a command of the form perl filename.)
By completing this course and its homework, you should be able to:
To keep this a short course, we won't explain object-oriented concepts and some other facilities appropriate for large projects.
Aiming at brevity, this course is written so that you need to use your intuition at times, guessing what some constructs might mean. The reason is that learning a programming language systematically, rigorously starting from all the basic concepts and constructs in detail, would take a much longer course. Naturally, you need to learn to consult references; actual programming shouldn't be based on guesswork.
Perl, perhaps more than any other computer language, is full of alternative ways to do the same thing; we tend to show only one or two. We will try to stimulate by examples of useful bits of code, results, and questions. Turn to the reference materials for further explanation.
When studying a new programming language, the crucial point is to learn to write and execute a program which prints Hello world! (or some other fixed string, but that's the traditional one). The reason is that in order to be able to do this, you need to know quite a lot of simple things about the language itself, and language manuals might not present them compactly enough. Moreover, you need to find out how to work with the language in your particular installation, e.g. how to start the language compiler or interpreter.
So let us get started:
    lk-hp-23 perl 195 % cat >hello
    print "Hello world!\n";
    lk-hp-23 perl 196 % perl hello
    Hello world!
    lk-hp-23 perl 197 %
Explanations:
I used the cat command to create a file named hello and containing a very simple Perl script. Normally one uses one's favorite editor (such as Emacs or jEdit or Notepad) to create Perl scripts, of course. (You might wish to use names ending with .pl for your Perl programs, partly in order to be able to see which of your files are such programs. If you use Emacs, there will be the additional benefit that Emacs automatically enters Perl mode (if installed) when a file with suffix .pl is edited.)
The string in the script ends with the notation \n, which stands for newline. In Perl strings, many control characters can be represented in this way, using the backslash character \ and a letter. (This is the same convention which is used in the C programming language.)
I then used the perl command, which invokes the Perl interpreter, and I gave the name of the file as a command argument. (In your system, the command name might be different from perl, but usually it isn't.)
In Perl (as in C), a simple statement is usually terminated by a semicolon. See rules for simple statements in the manual.
The following Perl script illustrates the use of simple (scalar) variables. It also introduces some other basic features of Perl.
The script prints out its input so that each line begins with a running line number:
    $line = 1;
    while (<>) {
      print $line, " ", $_;
      $line = $line + 1;
    }
The scalar variable $line
is the line counter.
It is initialized to 1 at
the beginning, and it is incremented by 1 within
a loop which processes each input line at a time.
The loop construct is of the form
while (<>) {
process one line of input }
and although it looks cryptic at first sight, it is really very
convenient to use. You need not worry about actual input operations;
just use the construct shown above, and use the predefined variable
$_
to refer to the input line.
The print
statement in our example contains three arguments,
one for getting the line number printed, one for getting a blank printed,
and one for getting the input line printed. We do not have an argument
for getting a newline printed, since the value of
$_
is the entire input line,
including the trailing newline.
In fact, you could make your code even shorter: you could write the script as
    $line = 1;
    while (<>) {
      print $line++, " ", $_; }
Here the statement contains $line++ instead of just $line, since in Perl (as in C) you can increment a variable (after its old value has been used) by appending the operator ++ to it.
You might wish to have the line numbers right-adjusted, e.g. each in a
fixed field of five characters, padded with blanks on the left.
This would be pretty easy, if you know the C language output formatting tools.
You could just replace the print
statement with
printf "%5d %s", $line++, $_;
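The effect of the format is easier to inspect with the related built-in function sprintf, which builds the same formatted text but returns it as a string instead of printing it. The following is only a small sketch; the variable names $n and $formatted are made up for the illustration.

```perl
# sprintf formats like printf but returns the string instead of printing it.
$n = 7;
$formatted = sprintf("%5d %s", $n, "some text\n");

# %5d right-adjusts the number in a field of five characters,
# so four blanks precede the single digit here.
print $formatted;    # prints "    7 some text" followed by a newline
```

With a three-digit line number the same format would leave only two blanks on the left.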
Where does a Perl script read its input from?
By default, i.e. in the absence of any specification of input source, the input comes from the so-called standard input stream (often called stdin). Unless redirected, this means the user's keyboard.
Normally you want your script to read input from a file.
Simply write the name of the file as a command-line argument, i.e.
append the name to the command you use to start the script.
Thus, for example, if you had written
our simple script (the shorter version of it) into a file named
lines
, you could test it by using it as its own test data
(do you find this confusing?) as follows:
    lk-hp-23 perl 251 % perl lines lines
    1 $line = 1;
    2 while (<>) {
    3   print $line++, " ", $_; }
    lk-hp-23 perl 252 %
You can also write several file names as command-line arguments, e.g.
    perl lines foo bar zap
which would mean that the script lines takes as input the contents of files foo, bar, and zap as if you had concatenated the contents into a single file and given its name as argument.
Quite useful Perl programs can be short. Suppose we want to
change the same text in many files. Instead of editing each
possible file or constructing some cryptic
find
, awk
, or sed
commands, you could issue a single command on Unix:
perl -e 's/gopher/World Wide Web/gi' -p -i.bak *.html
This command, issued at the Unix prompt, executes the short Perl program specified in single quotes. This program consists of one Perl operation: it substitutes the phrase "World Wide Web" for the original word "gopher" (globally, ignoring case).
The
command-line options
imply
that the Perl program should run for each file ending in
.html
in the current directory. If any file blah.html
needs changing, a backup of the original is made as file
blah.html.bak
.
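To see what the substitution operation does on its own, here is a minimal sketch that applies it to a single string instead of to files. The names $text and $count are illustrative only; the =~ operator makes the substitution work on the named variable instead of on the current input line.

```perl
$text = "Gopher servers and gopher clients.\n";

# g = replace every occurrence, i = ignore letter case.
# The operation returns the number of replacements it made.
$count = ($text =~ s/gopher/World Wide Web/gi);

print $text;                      # both occurrences are replaced
print "$count replacements\n";    # prints "2 replacements"
```

Without the g modifier only the first occurrence would be replaced, and without i the capitalized "Gopher" would be left alone.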
The book Programming Perl
lists additional handy one-liners.
The amazing one-liner relies on the behavior
of Unix command language processors (shells), which expand
a
wildcard
notation like *.html
into a list of file names,
before invoking the script. Thus, on other systems, you need to handle
such expansion in the Perl script, as in
the following example.
The following
script is a more universal variant of the amazing one-liner
discussed above. It works fine in Windows environments, too, because
it internally loops through all files with names ending with
.html
. (As a minor modification, this script
uses the name blah.bak
rather than
blah.html.bak
for the backup files.)
    while (<*.html>) {
        $oldname = $_;
        open(OLD,"<$oldname") || die "Can't open input file $oldname $!";
        s/\.html$/\.new/;
        $newname = $_;
        open(NEW,">$newname") || die "Can't open output file $newname $!";
        while (<OLD>) {
            s/gopher/World Wide Web/gi;
            print NEW;
        }
        close(OLD);
        close(NEW);
        $backupname = $oldname;
        $backupname =~ s/\.html$/\.bak/;
        # Parentheses are needed below: without them, || would apply to the
        # last argument instead of to the result of unlink or rename.
        # The old backup is deleted only if it exists.
        if (-e $backupname) {
            unlink($backupname) || die "Can't delete old backup file $backupname $!";
        }
        rename($oldname, $backupname) || die "Can't rename $oldname to $backupname $!";
        rename($newname, $oldname) || die "Can't rename $newname to $oldname $!";
    }
The following script is yet another variation of our theme. It may look structurally more familiar to those accustomed to "classic" procedural programming (in e.g. C or Pascal). Notice that this, as well as the original one-liner, does not handle any wildcard expansion in the script itself.
Note: Anything following a
number sign (#
) on a line in a
Perl program is a comment: ignored by a Perl
interpreter, hopefully useful to a human reader of the code.
    # File: go2www
    # This Perl program in classic programming style changes
    # the string "gopher" to "World Wide Web" in all files
    # specified on the command line.
    $original='gopher';
    $replacement="World Wide Web";
    $nchanges = 0;
    # The input record separator is defined by Perl global
    # variable $/. It can be anything, including multiple
    # characters. Normally it is "\n", newline. Here, we
    # say there is no record separator, so the whole file
    # is read as one long record, newlines included.
    undef $/;
    # Suppose this program was invoked with the command
    #     go2www ax.html big.basket.html candle.html
    # Then builtin list @ARGV would contain three elements
    #     ('ax.html', 'big.basket.html', 'candle.html')
    # These could be accessed as $ARGV[0] $ARGV[1] $ARGV[2]
    foreach $file (@ARGV) {
        if (! open(INPUT,"<$file") ) {
            print STDERR "Can't open input file $file\n";
            next;
        }
        # Read input file as one long record.
        $data=<INPUT>;
        close INPUT;
        if ($data =~ s/$original/$replacement/gi) {
            $bakfile = "$file.bak";
            # Delete old backup file if existent
            unlink $bakfile;
            # Abort if can't backup original or output.
            if (! rename($file,$bakfile)) {
                die "Can't rename $file $!";
            }
            if (! open(OUTPUT,">$file") ) {
                die "Can't open output file $file\n";
            }
            print OUTPUT $data;
            close OUTPUT;
            print STDERR "$file changed\n";
            $nchanges++;
        } else {
            print STDERR "$file not changed\n";
        }
    }
    print STDERR "$nchanges files changed.\n";
    exit(0);
Some questions to think about:
What do you think ! means, as in:
    if (! open(OUTPUT,">$file") ) { die "Can't open output file $file\n"; }
What does > probably mean here? Compare with open(INPUT...).
What does die do?
What does $nchanges++ do?
The Perl Creed is, "There is more than one way!" This noble freedom of expression however results in the first of the four Perl Paradoxes: Perl programs are easy to write but not always easy to read. For example, the following lines are equivalent!
    if ($x == 0) {$y = 10;} else {$y = 20;}
    $y = $x==0 ? 10 : 20;
    $y = 20; $y = 10 if $x==0;
    unless ($x == 0) {$y=20} else {$y=10}
    if ($x) {$y=20} else {$y=10}
    $y = (10,20)[$x != 0];
    if ($#ARGV >= 0) {
        $who = join(' ', @ARGV);
    } else {
        $who = 'World';
    }
    print "Hello, $who!\n";
First you need to make sure that the directory containing the Perl interpreter is in your search path as set up at the command level. Please consult operating system specific information and local documents for details on this.
Let us assume that the above lines are stored in a
Unix file ~/bin/hello
. (That's in your home
directory, subdirectory bin
, file
hello
.)
You can then run the program by entering a command like
one of the following:
    perl ~/bin/hello
    perl ~/bin/hello Citizens of Earth
    perl hello
(The last one works only if you're in the ~/bin directory.)
If you expect to use this program a lot and want to execute it as a command on Unix, then you need to do five things.
(1) Make the first line of the file consist of #! followed by the full pathname of the Perl interpreter (the perl command), typically something like #!/usr/local/bin/perl or #!/usr/bin/perl. You may append command-line options like -w (warn about possible inconsistencies) to that line.
(2) Make the file executable. To make it readable, writable, and executable by you only, enter a Unix command like the following:
    chmod 700 ~/bin/hello
To make it executable and readable by all, enter a Unix command like the following:
    chmod a+rx ~/bin/hello
You may also need to use chmod a+x on the directories ~ and ~/bin. See man chmod for details and the security implications.
(3) Edit your ~/.cshrc or ~/.login to make directory ~/bin part of the path Unix searches for executables, with a line like this:
    set path = ($path ~/bin)
(4) Make the new path take effect. It is read when you next start a shell (.cshrc file) or log in (.login file). If you want it to take effect immediately, enter the above set path command at the Unix prompt or execute your .cshrc or .login file with the source command. If you are using sh, bash, ksh or some other shell, alter ~/.profile or some other file to set the path at login.
(5) Enter the command rehash to rescan the path.
If you perform (1)-(5), then you can execute your program via a command like this:
hello
The rest of these notes will refer to the Perl manual, highlighting and expanding on important points. In this section, the relevant part of the manual is section perldata.
The value of a variable in Perl can be a number or a string, among other things. Variables are not typed. You can, for example, assign a string to a variable and later a number to the same variable. Variables are not declared at all. You can simply start using a variable.
An attempt to use an
uninitialized
variable causes a zero or an
empty string or the truth value false (depending on the context)
to be used.
However, using the command-line
switch -w
you can ask the Perl interpreter to issue
warnings,
such as
reporting uses of undefined values.
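The behavior of uninitialized variables described above can be checked with a few lines. This is only a sketch with made-up variable names ($counter, $label, $total, $text); it uses the built-in function defined, which tells whether a variable currently has a value.

```perl
# $counter and $label have never been assigned, so they are undefined.
$total = $counter + 5;          # undefined acts as 0 in numeric context
$text  = "[" . $label . "]";    # undefined acts as "" in string context

print "total is $total, text is $text\n";   # prints "total is 5, text is []"

# Merely reading an undefined variable does not give it a value.
if (!defined($counter)) {
    print "\$counter is still undefined\n";
}
```

Running the same lines with perl -w would additionally produce warnings about the two uses of uninitialized values.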
In addition to simple (scalar) variables and constants
Perl has two kinds of data structures: arrays (lists), and
associative arrays ("hashes").
Scalar variable names always begin with the dollar sign, e.g.
$myvar
.
(The simple program listing a file with
line numbers used a scalar variable for keeping
track of the line number.)
Names for arrays (and array slices) always begin with the commercial at sign,
e.g. @myarray
.
Names for hashes always begin with the percent sign,
e.g. %myhash
.
Perl allows combinations of these, such as lists of lists and associative arrays of lists. (See The Perl Data Structures Cookbook for illustrations of such advanced topics.)
Scalars can be numeric or character as determined by context:
    123 12.4 5E-10 0xff (hex) 0377 (octal)
    'What you $see is (almost) what \n you get' 'Don\'t Walk'
    "How are you?" "Substitute values of $x and \n in \" quotes."
    `date` `uptime -u` `du -sk $filespec | sort -n`
    $x $list_of_things[5] $lookup{'key'}
Different delimiters around a string have different effects:
Within '' (single quotes), no substitutions are made except for \\ and \', which denote \ and '.
Within "" (double quotes), variables like $x and control codes like \n (newline) are evaluated and replaced by their values.
`` ("backquotes") also allow substitution, then try to execute the result as a system command, returning as the final value whatever the system command outputs.
Arrays (also called lists) consist of sequentially-arranged scalars:
    ('Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday')
    (13,14,15,16,17,18,19) equivalent to (13..19)
    (13,14,15,16,17,18,19)[2..4] equivalent to (15,16,17)
    @whole_list
A notation like (2, 3, 5, 7) can be called an array constructor. It can be assigned to an array variable in order to initialize it:
    @primes = (2, 3, 5, 7);
Associative arrays (also called hashes) resemble arrays but can be indexed with strings:
    $DaysInMonth{'January'} = 31;
    $enrolled{'Joe College'} = 1;
    $StudentName{654321} = 'Joe College';
    $score{$studentno,$examno} = 89;
    %whole_hash
If you have an array,
e.g. @myarr
, you can form
indexed variables
by appending an index in brackets (as in the C language)
but changing @
into $
.
The reason for the change is that the indexed
variable is a scalar.
Indexes are integer numbers, starting with 0 (as in C, but
unlike in many other languages).
For example, $myarr[5] denotes the 6th element of the array @myarr. And if you have assigned
    @wday = ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');
then $wday[1] equals 'Mon'.
(Negative subscripts count from the end so that e.g.
$wday[-1]
would be 'Sat'
.)
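The indexing rules can be tried out with a short sketch like the following, which reuses the @wday array from the text.

```perl
@wday = ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');

print $wday[0], "\n";     # first element: Sun
print $wday[6], "\n";     # seventh (last) element: Sat
print $wday[-2], "\n";    # second from the end: Fri

$wday[1] = 'Monday';      # an indexed variable can be assigned like any scalar
print "$wday[1]\n";       # prints Monday
```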
Associative arrays are indexed with curly braces enclosing a string. $whatever, @whatever, and %whatever are three different variables.
You can also form
array slices,
for example
@myvar[5..10]
, which is an array (therefore, denoted
using @
) consisting of those components of
@myvar
which have an index between 5 and 10, inclusively.
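A small sketch of slicing, using an illustrative array @primes and the built-in function join (which glues list elements together with a separator) to make the results visible:

```perl
@primes = (2, 3, 5, 7, 11, 13);

@middle = @primes[1..3];      # elements with indexes 1, 2, 3
@picked = @primes[0, 5];      # indexes need not be consecutive

print join(', ', @middle), "\n";   # prints "3, 5, 7"
print join(', ', @picked), "\n";   # prints "2, 13"
```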
Hashes, on the other hand, can be indexed e.g. with strings, since the indexing method is different.
Conceptually, a Perl interpreter performs a search
from a hash, using the index as the search key.
For hashes, the index is in braces,
e.g. $myhash{'foobar'}
. Notice that in this case, too,
the indexed variable is a scalar and therefore begins with $
.
For example, the predefined hash name %ENV
denotes
the collection of so-called
environment variables.
Thus, you could refer e.g. to the value of the environment
variable HOST
with the expression
$ENV{"HOST"}
.
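As a sketch of how a hash of your own might be used, the following lines build a small hash and look up values by key. The hash %age and its contents are invented for the illustration; keys and sort are built-in functions not otherwise covered in this course.

```perl
%age = ('Alice', 31, 'Bob', 27);    # a hash built from a key/value list
$age{'Carol'} = 45;                 # add one more entry

print "Bob is $age{'Bob'}\n";       # prints "Bob is 27"

# keys returns the keys in no particular order, so sort them first.
foreach $name (sort keys %age) {
    print "$name: $age{$name}\n";
}
```

Looking up a key that has never been stored, such as $age{'Dave'}, simply yields an undefined value.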
In Perl, you can easily split data into fields without coding the
details. You simply specify what you want, using the built-in
split
function, optionally with some arguments.
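For instance, the statement
    @_ = split;
first splits the current input line into blank-separated fields and then assigns the fields to components of the predefined array variable @_. (Old versions of Perl made this assignment implicitly for a bare split; statement. Perl 5.12 and later no longer do, so the assignment must be written out.)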
You can then access the fields using indexed variables.
The special variable $#_ contains information about the number of fields: the value of that variable is the number of fields minus one. (More generally, for any array variable @a, the variable $#a contains the last index of the array.)
Assume, for example, that you have some data where each line consists of blank-separated items (which might be strings or numbers) and you want to write a Perl script which picks up the second item from each line. (Such filtering is often needed to extract useful information from a large data file.) This is simple:
    while (<>) { @_ = split; print $_[1], "\n"; }
Notice that you must use an index value of 1 to get the 2nd field, since array indexing begins at 0 in Perl.
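When the fields are separated by something other than blanks, split can be given arguments. The following sketch assumes colon-separated data; the names $record and @fields are illustrative.

```perl
$record = "Smith:John:42";

# First argument: the separator pattern; second: the string to split.
@fields = split(/:/, $record);

print "last name is $fields[0]\n";               # prints "last name is Smith"
print "there are ", $#fields + 1, " fields\n";   # prints "there are 3 fields"
```

Assigning the result to an array of your own, as here, also avoids overwriting @_.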
Notation | Kind of entity |
---|---|
$ identifier | simple (scalar) variable |
@ identifier | list (normal array) |
% identifier | associative array |
& identifier | subroutine or function |
IDENTIFIER | filehandle |
Every variable kind (scalar, array, hash)
has its own namespace.
This
means that $foo
and @foo
are two different variables. It also means that
$foo[1]
is a part of
@foo
,
not a part of $foo
. This may seem a bit weird,
but that's okay, because it is weird.
Notice, in particular, that there are two important
predefined variables
$_
and @_
which are quite distinct from each other,
and e.g. $_[2]
is a component of @_
.
The case of letters is significant in variable names (as in Unix
commands and in the C language), e.g. $foo
and
$Foo
are distinct variables.
    @days = (31,28,31,30,31,30,31,31,30,31,30,31);
                      # A list with 12 elements.
    $#days            # Last index of @days; 11 for above list
    $#days = 7;       # shortens or lengthens list @days to 8 elements
    @days             # ($days[0], $days[1],... )
    @days[3,4,5]      # = (30,31,30)
    @days{'a','c'}    # same as ($days{'a'},$days{'c'})
    %days             # (key1, value1, key2, value2, ...)
If a letter or underscore is the
first character after the $
,
@
, or %
, the rest of the name
may also contain digits and underscores. If this character
is a digit, the rest must be digits. Perl has several dozen
special
(predefined)
variables, recognized from their
second character being non-alphanumeric.
For example, $/
is the input record separator, newline
"\n"
by default.
See section
Special Variables
in
Perl 5 Reference Guide
for a handy list.
The variable $_
is presumed (defaulted)
by Perl in many contexts when needed variables
are not specified. Thus:
    $_ = <STDIN>;      # reads a record from filehandle STDIN into $_
                       # (while (<STDIN>) does this assignment implicitly)
    print;             # prints the current value of $_
    chomp;             # removes the trailing newline from $_
    @things = split;   # parses $_ into white-space delimited
                       # words, which become successive
                       # elements of list @things.
$_
, $/
, $1
,
and other implicit variables contribute to
Perl Paradox Number Two:
What you don't see can help you or hurt you.
The words "subroutine" and "function" are used interchangeably when discussing Perl. There really is no difference, but often a subprogram is called "function" if it returns a value and "subroutine" if it does not. On the other hand, quite often "function" means a builtin subprogram whereas "subroutine" means a subprogram which is defined in a Perl program.
Subroutines/functions are referenced with names containing an initial &, which is optional if the reference is obviously to a subroutine/function, such as after the sub, do, and sort directives.
An example of a simple function (which returns the square of
its argument), and a sample invocation:
    sub square {
        return $_[0] ** 2;
    }
    print "5 squared is ", &square(5);
Inside a function, the special variable @_
contains the list of actual arguments, so $_[0]
refers to the first argument (which is the only one in the
example above).
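Functions with several arguments work the same way. The two functions below, hypotenuse and total, are hypothetical examples written for this sketch: the first picks its arguments out of @_ by index, the second loops over however many arguments it was given.

```perl
# A function with two arguments, available as $_[0] and $_[1].
sub hypotenuse {
    return sqrt($_[0] ** 2 + $_[1] ** 2);
}

# A function accepting any number of arguments by looping over @_.
sub total {
    $sum = 0;
    foreach $value (@_) {
        $sum = $sum + $value;
    }
    return $sum;
}

print &hypotenuse(3, 4), "\n";     # prints 5
print &total(1, 2, 3, 4), "\n";    # prints 10
```

Note that $sum and $value here are ordinary global variables, in keeping with the rest of this course; larger programs would normally make them local to the function.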
Filehandles
don't start with a special character, and so
as to not conflict with reserved words are most reliably
specified as uppercase names:
INPUT
, OUTPUT
, STDIN
,
STDOUT
,
STDERR
, etc.
    print '007',' has been portrayed by at least ', 004, ' actors. ';
    print 7+3, ' ', 7*3, ' ', 7/3, ' ', 7%3, ' ', 7**3, ' ';
    $x = 7;
    print $x;
    print ' Doesn\'t resolve variables like $x and backslashes \n. ';
    print "Does resolve $x and backslash\n";
    $y = "A line containing $x and ending with line feed.\n";
    print $y;
    $y = "Con" . "cat" . "enation!\n";
    print $y;
This produces:
    007 has been portrayed by at least 4 actors. 10 21 2.33333333333333 1 343 7
    Doesn't resolve variables like $x and backslashes \n. Does resolve 7 and backslash
    A line containing 7 and ending with line feed.
    Concatenation!
In fact, most of the output "runs together", into one line. (The long line has been split above to keep the width of this document reasonable.) Can you guess why?
The following
example
illustrates, in addition to comparisons, the
<<
mechanism which is very useful
when a program has to write out a multi-line string
(e.g. in conjunction with CGI scripts).
    $x = 'operator';
    print <<THATSALL;
    A common mistake: Confusing the assignment $x =
    and the numeric comparison $x ==, and the character
    comparison $x eq.
    THATSALL
    $x = 7;
    if ($x == 7) { print "x is $x\n"; }
    if ($x = 5) {
        print "x is now $x, ",
              "the assignment is successful.\n";
    }
    $x = 'stuff';
    if ($x eq 'stuff') {
        print "Use eq, ne, lt, gt, etc for strings.\n";
    }
This produces:
    A common mistake: Confusing the assignment operator =
    and the numeric comparison operator ==, and the character
    comparison operator eq.
    x is 7
    x is now 5, the assignment is successful.
    Use eq, ne, lt, gt, etc for strings.
    @stuff = ('This', 'is', 'a', 'list.');
    print "Lists and strings are indexed from 0.\n";
    print "So \$stuff[1] = $stuff[1], ",
          "and \$#stuff = $#stuff.\n";
    print @stuff,"\n";
    print join('...',@stuff),"\n";
    splice(@stuff, 3, 0, ('fine', 'little'));
    print join('...',@stuff),"\n";
This produces:
    Lists and strings are indexed from 0.
    So $stuff[1] = is, and $#stuff = 3.
    Thisisalist.
    This...is...a...list.
    This...is...a...fine...little...list.
The following program prompts for a date in a numeric (ISO 8601) format and reads and parses it.
    print "Enter a date numerically: year-month-day\n";
    $_ = <STDIN>;
    chomp;
    ($year,$month,$day) = split('-');
Complete this program
so that it checks whether the date is valid.
Print an error message if the month is
not valid. Print an error message if the day is not valid for
the given month (31 is ok for January but not for February).
See if you can avoid using conditionals (if
,
unless
, ?
,...)
statements but instead use data structures.
Approach this incrementally. On the first draft, assume that the user enters three numbers separated by hyphens and that February has 28 days. Subsequent refinements should account for bad input and leap year. Finally, find a Perl builtin function that converts a date to system time, and see how to use that to validate time data generally.
Start with a few assignments like:
    $name{12345} = 'John Doe';
    $name{24680} = 'Jane Smith';
Print these scalars. What is the value of an associative array element that has never been assigned? What happens if you assign an associative array to a scalar? What happens if you assign an associative array to a normal array?
    $blunk = %name;
    @persons = %name;
    print '$blunk=',$blunk,', @persons=', join(', ',@persons),"\n";
What happens if you assign a normal array to an associative array?
Perl has a rich set of control structures.
See section perlsyn
in the manual
for the full list.
Theoretically, and very
often practically too, you can use just
if
statements for branching
and
while
statements for looping.
Within control structures you specify the actions to be conditionally
or repeatedly executed as
blocks.
A block is simply
a sequence of statements surrounded by braces. Notice that
braces {}
are always required
(unlike in C).
The simplest if statement is of the form
    if(expression) block
which means that the expression is evaluated, and if the result is true, the block is executed.
For example, the statement if($i < 10) {$j = 100;}
sets the value of $j
to 100 if the value of
$i
is less than 10. As mentioned above, braces are required
(even if there is a single statement within them), and the parentheses
around the condition expression are obligatory, too.
A two-branch if statement is of the form
    if(expression) block1 else block2
which means that the expression is evaluated, and if the result is true, block1 is executed, otherwise block2 is executed.
The while statement is of the form
    while(expression) block
which means that the expression is evaluated, and if the result is true, the block is executed, then the expression is re-evaluated and the process is repeated until the expression evaluates to false.
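The if, else, and while forms described above can be combined as in the following sketch. The variables $n and $log are illustrative; the text is collected into $log with the concatenation operator . and printed at the end.

```perl
# Count down from 3, choosing the wording with a two-branch if.
$n = 3;
$log = '';
while ($n > 0) {
    if ($n == 1) {
        $log = $log . "1 step left\n";
    } else {
        $log = $log . "$n steps left\n";
    }
    $n = $n - 1;
}
print $log;
```

The loop body runs three times; when $n reaches 0 the condition $n > 0 is false and the loop ends.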
A simple example of using the while statement is the following script, which splits input lines into fields (in a manner described above) and prints out the fields in reverse order.
    while (<>) {
        @_ = split;
        $i = $#_;
        while ($i >= 0) {
            print $_[$i--], " ";
        }
        print "\n";
    }
The control in the (inner) while loop is based on using an auxiliary variable $i, which is initialized to the index of the last field and decremented (using the C-style embedded decrement operator --) within the loop until it drops below zero, i.e. all fields have been processed.
The operator >=
has the obvious meaning
'is greater than or equal to'.
The for statement is of the form
    for(initialization; condition; updating) block
If you are familiar with the for
statement in C,
you probably want to use for
in Perl too, but
you might as well use just while
as the loop
construct. However, in Perl there is also a foreach
statement, which will be illustrated by the next example
(and was already used in
a previous example).
    print "$#ARGV is the subscript of the ",
          "last command argument.\n";
    # Iterate on numeric subscript 0 to $#ARGV:
    for ($i=0; $i <= $#ARGV; $i++) {
        print "Argument $i is $ARGV[$i].\n";
    }
    # A variation on the preceding loop:
    foreach $item (@ARGV) {
        print "The word is: $item.\n";
    }
    # A similar variation, using the
    # "Default Scalar Variable" $_ :
    foreach (@ARGV) {
        print "Say: $_.\n";
    }
Demonstration run of this program:
    > perl example5.pl Gooood morning, Columbia!
    2 is the subscript of the last command argument.
    Argument 0 is Gooood.
    Argument 1 is morning,.
    Argument 2 is Columbia!.
    The word is: Gooood.
    The word is: morning,.
    The word is: Columbia!.
    Say: Gooood.
    Say: morning,.
    Say: Columbia!.
The following program illustrates simple interaction with user.
    print STDOUT "Tell me something: ";
    while ($input = <STDIN>) {
        print STDOUT "You said, quote: $input endquote\n";
        chomp $input;
        print STDOUT "Without the newline: $input endquote\n";
        if ($input eq '') { print STDERR "Null input!\n"; }
        print STDOUT "Tell me more:\n";
    }
    print STDOUT "That's all!\n";
Note 1: The while
statement's condition is an
assignment statement: assign the next record from standard
input to the variable $input
. On end of file, this will assign
not a null value but an "undefined" value.
On keyboard input, end of file can be simulated in different ways on
different systems; for example, on Unix the method is traditionally
control-D, while on DOS it is control-Z followed by a newline (enter).
An undefined value
in the context of a condition evaluates to "false". So the
while ($input = <STDIN>)
does three things:
gets a record, assigns it to $input
, and tests whether
$input
is undefined. In other contexts, Perl treats an undefined
variable as null or zero. Thus, if $i
is not
initialized, $i++
sets $i
to 1.
Perl Paradox Number Three:
Side effects can yield an elegant face or a pain in the rear.
Note 2:
Data records are by default terminated by a newline
character "\n" which in the above example is included as the last
character of variable $input. The
chomp
function removes the trailing
end-of-line (newline) indicator (if present), which is defined in
the special variable $/
.
(The chomp function was introduced in Perl 5. Old programs often use the less safe function chop, which simply removes the last character, whatever it is.)
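The difference between chomp and chop mentioned in Note 2 can be seen in a short sketch like this one (the variable names are illustrative):

```perl
$with_newline    = "hello\n";
$without_newline = "hello";

chomp $with_newline;       # removes the trailing newline
chomp $without_newline;    # nothing to remove, string is unchanged

$chopped = "hello";
chop $chopped;             # removes the last character regardless: "hell"

print "$with_newline $without_newline $chopped\n";   # prints "hello hello hell"
```

This is why chop is called less safe: applied to a line that happens to lack a trailing newline, it silently removes a character of real data.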
Demonstration of the interactive program above:
    > perl example6.pl
    Tell me something: I'm warm.
    You said, quote: I'm warm.
     endquote
    Without the newline: I'm warm. endquote
    Tell me more:
    Can I have some water?
    You said, quote: Can I have some water?
     endquote
    Without the newline: Can I have some water? endquote
    Tell me more:

    You said, quote: 
     endquote
    Without the newline:  endquote
    Null input!
    Tell me more:
    ^D
    That's all!
    for (;;) {
        print '(',join(', ',@result),")\n? ";
        last unless $input = <STDIN>;
        $? = ''; $@ = ''; $! = '';
        @result = eval $input;
        if ($?) { print 'status=',$?,' ' }
        if ($@) { print 'errmsg=',$@,' ' }
        if ($!) { print 'errno=',$!+0,': ',$!,' ' }
    }
This reads a line from the terminal and executes it as a Perl program. The for (;;) {...} construct makes an endless loop. The last unless line might be equivalently specified:
$input = <STDIN>;                 # Get line from standard input.
if (! defined($input)) {last;}    # If no line, leave the loop.
The eval function in Perl evaluates a string as a Perl program. The special variable $@ contains the Perl error message from the last eval or do.
Demonstration (note that the statements system 'date' and $x=`date` invoke a system command named date and are therefore system-dependent; they work in a useful way mostly on Unix):
perl perls.pl
()
? Howdy
(Howdy)
? 2+5
(7)
? sqrt(2)
(1.4142135623731)
? $x=sqrt(19)
(4.35889894354067)
? $x+5
(9.35889894354067)
? 1/0
errmsg=Illegal division by zero at (eval 6) line 3, <STDIN> chunk 6.
()
? system 'date'
Fri Feb 5 15:33:47 EET 1999
(0)
? $x=`date`
(Fri Feb 5 15:34:06 EET 1999
)
? chomp $x
(1)
? @y=split(' ',$x)
(Fri, Feb, 5, 15:34:06, EET, 1999)
? @y[1,2,5]
(Feb, 5, 1999)
? localtime()
(39, 38, 15, 5, 1, 99, 5, 35, 0)
? foreach (1..3) {print sqrt(),' ';}
1 1.4142135623731 1.73205080756888 ()
? exit
The following program illustrates reading from a file and writing to a file. It also reads a character from standard input, in order to let the user control what happens. Moreover, it illustrates how the "short circuit" operator || can be used so that error processing can be written more conveniently. An expression of the form A||B is evaluated so that A is always evaluated first, and if the result is "true", the expression B is not evaluated at all.
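The evaluation rule for A||B can be made visible with a few lines of code. In this sketch the subroutines succeed, fail and handler, the @log array and the file path are all hypothetical; they exist only to record which operands were evaluated.

```perl
use strict;
use warnings;

my @log;   # records which operands were actually evaluated

sub succeed { push @log, 'succeed'; return 1 }
sub fail    { push @log, 'fail';    return 0 }
sub handler { push @log, 'handler'; return 'handled' }

my $r1 = succeed() || handler();   # A is true, so handler() is skipped
my $r2 = fail()    || handler();   # A is false, so handler() runs

print "r1=$r1 r2=$r2 evaluated: @log\n";

# The same rule drives the open-or-die idiom. The parentheses around
# the arguments of open matter: without them, || would bind to the
# last argument instead of to the result of open.
my $missing = '/no/such/dir/hypothetical-file.txt';   # assumed not to exist
my $ok = eval {
    open(my $fh, '<', $missing) || die "Can't open $missing: $!\n";
    1;
};
print "caught: $@" unless $ok;
```

Perl also has a low-precedence operator or with the same short-circuit behaviour; with it, open FILE, $name or die ... works even without the parentheses.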
# Function: Reverse each line of a file
# 1: Get command line values:
if ($#ARGV !=1) {
    die "Usage: $0 inputfile outputfile\n";
}
($infile,$outfile) = @ARGV;
if (! -r $infile) {
    die "Can't read input $infile\n";
}
if (! -f $infile) {
    die "Input $infile is not a plain file\n";
}
# 2: Validate files
open(INPUT,"<$infile") || die "Can't input $infile $!";
if ( -e $outfile) {
    print STDERR "Output file $outfile exists!\n";
    until ($ans eq 'r' || $ans eq 'a' || $ans eq 'e' ) {
        print STDERR "replace, append, or exit? ";
        $ans = getc(STDIN);
    }
    if ($ans eq 'e') {exit}
}
if ($ans eq 'a') {$mode='>>'}
else {$mode='>'}
open(OUTPUT,"$mode$outfile") || die "Can't output $outfile $!";
# 3: Read input, reverse each line, output it.
while (<INPUT>) {
    chomp $_;
    $_ = reverse $_;
    print OUTPUT $_,"\n";
}
# 4: Done!
close INPUT;    # close takes a single filehandle,
close OUTPUT;   # so each file is closed separately
exit;
This example produces a score summary report by combining data from a simple file of student info and a file of their scores.
Input file stufile is delimited with colons. Fields are Student ID, Name, Year:
123456:Washington,George:SR
246802:Lincoln,Abraham "Abe":SO
357913:Jefferson,Thomas:JR
212121:Roosevelt,Theodore "Teddy":SO
Input file scorefile is delimited with blanks. Fields are Student ID, Exam number, Score on exam. Note that Abe is missing exam 2:
123456 1 98
212121 1 86
246802 1 89
357913 1 90
123456 2 96
212121 2 88
357913 2 92
123456 3 97
212121 3 96
246802 3 95
357913 3 94
The desired report:
Stu-ID Name...                       1   2   3   Totals:

357913 Jefferson,Thomas             90  92  94       276
246802 Lincoln,Abraham "Abe"        89      95       184
212121 Roosevelt,Theodore "Teddy"   86  88  96       270
123456 Washington,George            98  96  97       291

                         Totals:   363 276 382
The program that made this report:
# Gradebook - demonstrates I/O, associative
# arrays, sorting, and report formatting.
# This accommodates any number of exams and students
# and missing data. Input files are:
$stufile='stufile';
$scorefile='scorefile';
open (NAMES,"<$stufile")
    || die "Can't open $stufile $!";
open (SCORES,"<$scorefile")
    || die "Can't open $scorefile $!";
# Build an associative array (%name) of student info
# keyed by student number
while (<NAMES>) {
    ($stuid,$name,$year) = split(':',$_);
    $name{$stuid}=$name;
    if (length($name)>$maxnamelength) {
        $maxnamelength=length($name);
    }
}
close NAMES;
# Build an assoc. array (%score) from the test scores:
while (<SCORES>) {
    ($stuid,$examno,$score) = split;
    $score{$stuid,$examno} = $score;
    if ($examno > $maxexamno) {
        $maxexamno = $examno;
    }
}
close SCORES;
# Print the report from accumulated data!
printf "%6s %-${maxnamelength}s ",
    'Stu-ID','Name...';
foreach $examno (1..$maxexamno) {
    printf "%4d",$examno;
}
printf "%10s\n\n",'Totals:';
# Subroutine "byname" is used to sort the %name array.
# The "sort" function gives variables $a and $b to
# subroutines it calls.
# "x cmp y" expression returns -1 if x lt y, 0 if x eq y,
# +1 if x gt y. See the Perl documentation for details.
sub byname { $name{$a} cmp $name{$b} }
# Order student IDs so the names appear alphabetically:
foreach $stuid ( sort byname keys(%name) ) {
    # Print scores for a student, and a total:
    printf "%6d %-${maxnamelength}s ",
        $stuid,$name{$stuid};
    $total = 0;
    foreach $examno (1..$maxexamno) {
        printf "%4s",$score{$stuid,$examno};
        $total += $score{$stuid,$examno};
        $examtot{$examno} += $score{$stuid,$examno};
    }
    printf "%10d\n",$total;
}
printf "\n%6s %${maxnamelength}s ",'',"Totals: ";
foreach $examno (1..$maxexamno) {
    printf "%4d",$examtot{$examno};
}
print "\n";
exit(0);
The foreach $stuid ... loop first calls the predefined function sort, passing the name of the ordering subroutine byname as the first parameter. That function returns a list, and the loop iterates over that list so that $stuid gets each of its values in succession.
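The ordering subroutine can also be written inline as a block, which is handy when it is used only once. The sketch below uses a cut-down copy of the %name data; the array names are invented for the illustration.

```perl
use strict;
use warnings;

my %name = (
    123456 => 'Washington,George',
    246802 => 'Lincoln,Abraham "Abe"',
    357913 => 'Jefferson,Thomas',
);

# Named comparison subroutine, as in the gradebook program.
sub byname { $name{$a} cmp $name{$b} }
my @ids_by_name = sort byname keys %name;

# The same ordering with an inline block.
my @ids_inline = sort { $name{$a} cmp $name{$b} } keys %name;

# Numeric ordering of the IDs themselves needs <=> rather than cmp.
my @ids_numeric = sort { $a <=> $b } keys %name;

print "by name: @ids_by_name\n";
print "numeric: @ids_numeric\n";
```

Note that $a and $b are set by sort itself, so they need no declaration even under use strict.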
More advanced applications could be written using the feature that Perl allows an associative array to be "tied" (using the tie function) to a genuine database, such that expressions like $record = $student{$key} use the database.
In this section, the first example illustrates how Perl can be used in a system-independent way for system-related tasks like renaming files. For more information, see especially section Functions for filehandles, files, or directories of perlfunc in the manual. The second example illustrates a job which is more deeply system-related and therefore needs system-specific (Unix-specific, in the example) methods.
Unix users often get frustrated when they need to rename files e.g. so that all file names ending with some suffix (like .for) are renamed by changing the suffix (e.g. to .f). In some operating systems this is easy, but in normal Unix command interpreters there is no direct way to do it. (A novice user often tries mv *.for *.f but it does not work at all in the way you would like.) No problem, it's easily done in Perl, for example as follows:
while(<*.for>) {
    $oldname = $_;
    s/\.for$/\.f/;
    rename $oldname, $_;
}
This works on any system with reasonably normal file naming conventions, not just Unix.
The while statement is different from what we have seen before. It means that all file names matching the pattern within the angle brackets (here *.for) are processed and assigned, each in turn, to the variable $_. In fact, the meaning of $_ is not simply 'the current input line' as told before but more generally 'the current data being processed', and the context defines in each case what this exactly means.
Within the loop, the file name is copied to variable $oldname and then modified using a construct which performs a substitution.
One might try to use simply s/.for/.f/; instead of s/\.for$/\.f/;. Although the simpler version works in most cases, it is buggy, because the symbol . stands for any character, not just the period, and because there is no requirement that the string .for must appear at the end of the file name only. Thus, the code would rename e.g. zapfor.for to za.f.for. To refer to the period character, one must use "escape" notation by prefixing it with a backslash. Moreover, if the trailing $ (denoting end of line) is omitted, the code would apply to the first appearance of .for in the filename.
Finally, the rename operation is performed using a Perl built-in function, rename, which takes two file names as arguments. Alternatively, we could also use the following:
system "mv $oldname $_";
which does the same operation (less efficiently, and in a system-dependent manner) by asking the Unix system to execute its mv command.
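The rename function returns false on failure, and on many systems it silently replaces an existing target file, so a more careful version checks both. The following is a sketch only: the subroutine rename_suffix is a hypothetical helper, it reads the directory with opendir instead of the <*.for> construct, and it assumes that "/" works as a path separator. It tries itself out in a scratch directory made with the standard File::Temp module.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Rename every file in $dir whose name ends in $old so that it ends
# in $new instead. Returns the number of files renamed.
sub rename_suffix {
    my ($dir, $old, $new) = @_;
    opendir(my $dh, $dir) or die "Can't read directory $dir: $!\n";
    my @names = grep { /\Q$old\E$/ } readdir $dh;
    closedir $dh;

    my $renamed = 0;
    foreach my $oldname (@names) {
        (my $newname = $oldname) =~ s/\Q$old\E$/$new/;
        if (-e "$dir/$newname") {
            warn "Skipping $oldname: $newname already exists\n";
            next;
        }
        if (rename("$dir/$oldname", "$dir/$newname")) {
            $renamed++;
        } else {
            warn "Can't rename $oldname: $!\n";
        }
    }
    return $renamed;
}

# Try it in a scratch directory that is removed when the program exits.
my $dir = tempdir(CLEANUP => 1);
foreach my $file ('a.for', 'zapfor.for', 'keep.txt') {
    open(my $fh, '>', "$dir/$file") or die "Can't create $file: $!\n";
    close $fh;
}
print "renamed: ", rename_suffix($dir, '.for', '.f'), "\n";
```

The \Q...\E notation quotes the suffix so that its period is taken literally, which avoids the zapfor.for problem described above without hand-written backslashes.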
This program works under Unix only.
The following program reports disk usage by the files specified as arguments. The Unix command du -sk ... (on BSD Unix, du -s ...) produces a series of lines like
1942 bin
2981 etc
listing the Kbytes used by each file or directory. It doesn't show other information, such as the modification date or owner. This program gets du's Kbytes and filename, and merges this info with other useful information for each file.
$files = join(' ',@ARGV);
# The trailing pipe "|" directs command output
# into our program:
if (! open (DUPIPE,"du -sk $files | sort -nr |")) {
    die "Can't run du! $!\n";
}
printf "%8s %-8s %-16s %10s %s\n",
    'K-bytes','Login','Name','Modified ','File';
while (<DUPIPE>) {
    # parse the du info:
    ($kbytes, $filename) = split;
    # Call system to look up file info like "ls" does:
    ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,
     $size,$atime,$mtime,$ctime) = stat($filename);
    # Call system to associate login & name with uid:
    if ($uid != $previous_uid) {
        ($login,$passwd,$uid,$gid,$quota,$comment,
         $realname,$dir,$shell) = getpwuid($uid);
        ($realname) = split(',',substr($realname,0,20));
        $previous_uid = $uid;
    }
    # Convert the modification-time to readable form:
    ($sec,$min,$hour,$mday,$mon,$myear) = localtime($mtime);
    $mmonth = $mon+1;
    $myear = 1900 + $myear;
    printf "%8s %-8s %-16s %4d-%02d-%02d %s\n",
        $kbytes, $login, $realname, $myear, $mmonth, $mday, $filename;
}
Demonstration output:
 K-bytes Login    Name              Modified  File
   40788 c527100  Fred Flintstone  1995-10-05 c527100
   32685 c565060  Peter Parker     1995-10-05 c565060
   24932 c579818  Clark Kent       1995-10-06 c579818
   15388 c576657  Lois Lane        1995-10-06 c576657
    9462 c572038  Bruce Wayne      1995-10-06 c572038
    8381 c517401  Eric McGregor    1995-10-05 c517401
    7022 c594912  Asterisk de Gaul 1995-10-05 c594912
The pattern matching and substitution operators are described in detail in section Regexp Quote-Like Operators of perlop in the manual. See also Regular expressions in Perl for a tabular summary with examples.
tr
Perl has powerful tools for string manipulation. But let us first take a simple example. One often wants to convert letters in input data to lower case. That's easy:
tr /A-Z/a-z/;
This can be read as follows: "translate all characters in the range from A to Z to the corresponding characters in the range from a to z".
The operation is applied to the value of $_, typically the current input line. If you would like it to be applied to the value of a variable $foo, you should write
$foo =~ tr /A-Z/a-z/;
Thus, the syntax is odd-looking, but once you get accustomed to it, the Perl string manipulation tools are easy to use.
=~
The =~ operator performs pattern matching. For example, if:
$s = 'One if by land and two if by sea';
then:
if ($s =~ /if by la/) {print "YES"} else {print "NO"}
prints YES, because the value of $s matches the simple constant pattern "if by la".
if ($s =~ /one/) {print "YES"} else {print "NO"}
prints NO, because the string does not match the pattern. However, by adding the i option to ignore case of letters, we would get a YES from the following:
if ($s =~ /one/i) {print "YES"} else {print "NO"}
Matching involves use of patterns called regular expressions. This, as you will see, leads to Perl Paradox Number Four: Regular expressions aren't. See section perlre in the manual.
Patterns can contain a mind-boggling variety of special directions that facilitate very general matching. For example, a period matches any character (except the "newline" \n character).
if ($x =~ /l.mp/) {print "YES"}
would print YES for $x = "lamp" or "lump" or "slumped", but not for $x = "lmp" or "less amperes".
Parentheses () group pattern elements. An asterisk * means that the preceding character, element, or group of elements may occur zero times, one time, or many times. Similarly, a plus + means that the preceding element or group of elements must occur at least once. A question mark ? matches zero or one times. So:
/fr.*nd/ matches "frnd", "friend", "front and back"
/fr.+nd/ matches "frond", "friend", "front and back" but not "frnd".
/10*1/ matches "11", "101", "1001", "100000001".
/b(an)*a/ matches "ba", "bana", "banana", "banananana"
/flo?at/ matches "flat" and "float" but not "flooat"
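Claims like the ones in this list are easy to check by letting Perl do the matching. The sketch below does that for a few of the patterns; the helper matches and the choice of test strings (including some that should fail) are additions made for the illustration.

```perl
use strict;
use warnings;

# Return 1 if $string matches the pattern given as a string, else 0.
sub matches {
    my ($pattern, $string) = @_;
    return $string =~ /$pattern/ ? 1 : 0;
}

# Each entry: a pattern followed by some candidate strings.
my @cases = (
    [ 'fr.*nd',  'frnd', 'friend', 'front and back' ],
    [ 'fr.+nd',  'frnd', 'frond' ],
    [ '10*1',    '11', '1001', '12' ],
    [ 'b(an)*a', 'ba', 'banana', 'bnn' ],
    [ 'flo?at',  'flat', 'float', 'flooat' ],
);

foreach my $case (@cases) {
    my ($pattern, @candidates) = @$case;
    foreach my $string (@candidates) {
        my $verdict = matches($pattern, $string) ? 'matches' : 'does not match';
        print "/$pattern/ $verdict \"$string\"\n";
    }
}
```

Adding your own strings to the list is a quick way to test your understanding of a pattern before using it in a real program.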
Square brackets [] match a class of single characters.
[0123456789] matches any single digit
[0-9] matches any single digit
[0-9]+ matches any sequence of one or more digits
[a-z]+ matches any lowercase word
[A-Z]+ matches any uppercase word
[ab n]* matches the null string "", "b", any number of blanks, "nab a banana"
[^class] matches those characters which do not match [class] (i.e., ^ denotes negation here - but something quite different outside brackets, see below):
[^0-9] matches any non-digit character.
Curly braces {} allow more precise specification of repeated fields. For example [0-9]{6} matches any sequence of 6 digits, and [0-9]{6,10} matches any sequence of 6 to 10 digits.
Patterns float, unless anchored. The circumflex ^ (outside []) anchors a pattern to the beginning, and dollar sign $ anchors a pattern at the end, so:
/at/ matches "at", "attention", "flat", & "flatter"
/^at/ matches "at" & "attention" but not "flat"
/at$/ matches "at" & "flat", but not "attention"
/^at$/ matches "at" and nothing else.
/^at$/i matches "at", "At", "aT", and "AT".
/^[ \t]*$/ matches a "blank line", one that contains nothing or any combination of blanks and tabs.
Other characters simply match themselves, but the characters +?.*^$()[]{}|\ and usually / must be escaped with a backslash \ to be taken literally.
/10.2/ matches "10Q2", "1052", and "10.2"
/10\.2/ matches "10.2" but not "10Q2" or "1052"
/\*+/ matches one or more asterisks
/A:\\DIR/ matches "A:\DIR"
/\/usr\/bin/ matches "/usr/bin"
If a backslash precedes an alphanumeric character, this sequence takes a special meaning, typically a short form of a [] character class. For example, \d is the same as the [0-9] digits character class.
/[-+]?\d*\.?\d*/ is the same as /[-+]?[0-9]*\.?\d*/
Either of the above matches decimal numbers: "-150", "-4.13", "3.1415", "+0000.00", etc.
A simple \s specifies "white space", the same as the character class [ \t\n\r\f] (blank, tab, newline, carriage return, formfeed). A character may be specified in hexadecimal as a \x followed by two hexadecimal digits which specify the ASCII code of the character; for example, \x1b is the ESC character.
A vertical bar | means "or".
if ($answer =~ /^yes|^yeah/i ) { print "Affirmative!"; }
prints Affirmative! for $answer equal to "yes" or "yeah" (or "YeS", or "yessireebob, that's right", but not "yep").
The =~ operator can be used for making substitutions in strings. An expression of the form $variable =~ /pattern/ tests whether the value of variable matches pattern. Normally such an expression is used as a condition (test) in an if statement or other control structure. But an expression of the form $variable =~ s/pattern/pattern2/ first tests for a match, and if there is a match, replaces, within the value of variable, the string that matched pattern by pattern2 or, if it contains special notations, by a string generated from pattern2. If you wish to modify the value of the predefined variable $_, you can write simply s/pattern/pattern2/
When you include parentheses () in a pattern, the text matched by each parenthesized group may subsequently be referenced via variables $1, $2, $3, ..., numbered in the order in which the left parentheses occur. These matches can also be assigned as sequential values of an array.
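Both ways of picking up the parenthesized matches can be sketched with a record in the colon-delimited stufile format used earlier. The subroutine parse_student and the particular pattern are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Split a stufile-style record into ID, family name, given name and
# year. In list context a match returns the parenthesized pieces;
# if the record does not match, the list is empty.
sub parse_student {
    my ($record) = @_;
    return $record =~ /^(\d+):([^,]+),([^:]+):(\w+)$/;
}

my $line = '246802:Lincoln,Abraham "Abe":SO';

# Way 1: test the match, then use $1, $2, ...
if ($line =~ /^(\d+):([^,]+),([^:]+):(\w+)$/) {
    print "id=$1 family=$2 given=$3 year=$4\n";
}

# Way 2: assign the matches to an array.
my @fields = parse_student($line);
print scalar(@fields), " fields: @fields\n";
```

Checking that the array is non-empty is a simple way to detect records that do not have the expected format.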
Example. Assume that we have a text file containing notations like U+0123 which we wish to modify by slapping the strings <code> and </code> around them. The exact format of those notations is U+ followed by one or more hexadecimal digits. Thus, the following program would do the job:
while(<>) {
s?U\+([0-9a-fA-F]+)?<code>U\+$1</code>?g;
print;}
Note: Although we normally use the slash character / when specifying substitutions, here it cannot be used as such, since the slash occurs in the patterns. We can then pick almost any character which does not occur in the patterns and use it as a separator; here we use the question mark ?. (Alternatively, we could "escape" the / as \/ in the patterns.) Notice that the plus sign + must be "escaped" (as \+) when it needs to stand for itself and not act as a special character in a regular expression.
The following program parses dates in a strange old format.
$s = 'There is 1 date 10/25/95 in here somewhere.';
print "\$s=$s\n";
$s =~ /(\d{1,2})\/(\d{1,2})\/(\d{2,4})/;
print "Trick 1: \$1=$1, \$2=$2, \$3=$3,\n",
      " \$\`=",$`," \$\'=",$',"\n";
($mo, $day, $year) = ( $s =~ /(\d{1,2})\/(\d{1,2})\/(\d{2,4})/ );
print "Trick 2: \$mo=$mo, \$day=$day, \$year=$year.\n";
($wholedate,$mo, $day, $year) = ( $s =~ /((\d{1,2})\/(\d{1,2})\/(\d{2,4}))/ );
print "Trick 3: \$wholedate=$wholedate, \$mo=$mo, ",
      "\$day=$day, \$year=$year.\n";
Results of above:
$s=There is 1 date 10/25/95 in here somewhere.
Trick 1: $1=10, $2=25, $3=95,
 $`=There is 1 date  $'= in here somewhere.
Trick 2: $mo=10, $day=25, $year=95.
Trick 3: $wholedate=10/25/95, $mo=10, $day=25, $year=95.
Note that in Perl 4, matching in an array context as in Tricks 2 and 3 did not set the special variables $1, $2, ..., and $`, $', and $&. In Perl 5 a successful match sets them in list context as well, so the matched pieces are available both ways.
Using a combination of Tricks 1 and 3, we can write the following program which processes its input by replacing notations like 10/25/95 (where the month appears first) by ISO 8601 conformant notations like 1995-10-25. The program uses a conditional operator to add 1900 to the year if it was less than 100.
while(<>) {
    while( m/((\d{1,2})\/(\d{1,2})\/(\d{2,4}))/ ) {
        $year = $4 < 100 ? 1900+$4 : $4;
        $newdate = sprintf "%04d-%02d-%02d", $year, $2, $3;
        s/$1/$newdate/;
    }
    print;
}
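The same conversion can be done in a single substitution, using two options of s that this tutorial has not described: g (replace every occurrence) and e (evaluate the replacement part as Perl code). This is a sketch of an alternative, not a replacement for the program above; the subroutine name to_iso_dates is made up, and the s{...}{...} form simply uses braces as delimiters so that the slashes need no escaping.

```perl
use strict;
use warnings;

# Replace every month/day/year date in $text by an ISO 8601 date.
# Two-digit years are assumed to belong to the 1900s, as above.
sub to_iso_dates {
    my ($text) = @_;
    $text =~ s{(\d{1,2})/(\d{1,2})/(\d{2,4})}{
        my $year = $3 < 100 ? 1900 + $3 : $3;
        sprintf "%04d-%02d-%02d", $year, $1, $2;
    }ge;
    return $text;
}

print to_iso_dates('There is 1 date 10/25/95 in here somewhere.'), "\n";
```

Because each date is matched and replaced in one step, there is no need for the inner while loop or for interpolating $1 into a second pattern.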
Consider the simple regular expression k.* which means 'the letter k followed by a sequence of any characters'. For a string containing k, there would be several possible matches if the language definition did not say how the matching is performed. In Perl, the definition is that the longest possible string is taken; in our example, that means that the expression matches the substring which extends from the first occurrence of k to the end of the string.
Regular expressions are greedy, seeking the longest possible match not the shortest match.
This rule applies to matches involving the repetition specifier * or +. It does not apply to selecting between alternatives separated with | in a regular expression. If we have, say, k.*|zap then the first substring that matches k.* or zap is taken, so if zap occurs first, it does not matter that a match to k.* would give a longer match. But if a k is found before any zap is found, then matching to k.* is done the normal way, taking the largest among possible matches (i.e. taking the rest of the string).
In the following example we try to match whatever is between "<" and ">":
$s = 'Beware of <STRONG>greedy</strong> regular expressions.';
print "\$s=$s\n";
($m) = ( $s =~ /<(.*)>/ );
print "Try 1: \$m=$m\n";
($m) = ( $s =~ /<([^>]*)>/ );
print "Try 2: \$m=$m\n";
This results in:
$s=Beware of <STRONG>greedy</strong> regular expressions.
Try 1: $m=STRONG>greedy</strong
Try 2: $m=STRONG
Thus, by using a more specific match (which says that the string between "<" and ">" must not contain ">") we get the result we want. In Perl 5, it would also be possible to use *? instead of * to request the shortest match to be made. That would mean using /<(.*?)>/ in our case. (This special meaning for the question mark in a specific context is rather strange, but useful.)
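The three variants can be compared side by side. In this sketch the helper capture and the use of precompiled patterns (the qr operator) are conveniences added for the illustration; the last line also uses the g option, not described above, to collect every tag rather than only the first.

```perl
use strict;
use warnings;

# Return the first parenthesized match of $pattern in $string.
sub capture {
    my ($string, $pattern) = @_;
    my ($match) = $string =~ $pattern;
    return $match;
}

my $s = 'Beware of <STRONG>greedy</strong> regular expressions.';

print "greedy:  ", capture($s, qr/<(.*)>/),     "\n";
print "negated: ", capture($s, qr/<([^>]*)>/),  "\n";
print "minimal: ", capture($s, qr/<(.*?)>/),    "\n";

# With the g option in list context, all matches are returned.
my @tags = $s =~ /<(.*?)>/g;
print "all tags: @tags\n";
```

The negated class and the minimal match give the same answer here; they differ when the text between the brackets may span characters that the class excludes.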
1. See preceding "Grade Book" example. Using the same stufile input, print a list of students ordered by family name, with any quoted nickname listed in place of the given name, and family name last. Produce output like this:
Student-ID Year Name
357913     JR   Thomas Jefferson
246802     SO   Abe Lincoln
212121     SO   Teddy Roosevelt
123456     SR   George Washington
To avoid wasting your time, please check - from applicable local documents or by contacting local webmaster - whether you can install and run CGI scripts written in Perl on a Web server. At the same time, please check how to do that in detail - specifically, where you need to put your CGI scripts.
Depending on the Web server where your pages are, you might, or might not, be able to install and use Perl programs as CGI scripts. This means that you can then e.g. set up HTML forms so that the submitted data is passed as input to the script (that is, your Perl program), which is executed so that its standard output is sent back to the browser from which the form was submitted.
If you know C, you may wish to take a look at Getting Started with CGI Programming in C for comparison, before or after studying how to write CGI scripts in Perl.
This document used to contain an example that was based on the old cgi-lib.pl module. Despite its simplicity, it can hardly be recommended even to novices. There is a more modern module for the purpose: CGI.pm, available from CPAN. I now have a separate document that introduces simple CGI in Perl: The Fool's Guide to CGI.pm.
Use the command-line option -w to warn about identifiers that are referenced only once, uninitialized scalars, redefined subroutines, undefined file handles, probable confusion of == and eq, and other things. On Unix, this can be coded in the first line:
#!/usr/local/bin/perl -w
where you need to replace the path by one that is applicable in your environment. (Cf. section Making it command-like on Unix.) Section perlrun in the manual explains how to simulate #! on non-Unix systems.
As you write your program, put in print statements to display variables as you proceed. Comment them out using # when you feel you don't need to see their output.
CGI scripts require some special attention in testing. In addition to checking server-dependent things, make sure you know where the problem is; you probably need to use a simple "echoing" script to see whether the problem is on the HTML document containing the form, or in a browser, or in your script.
Adapted from Programming Perl, page 361. For more traps, see section perltrap in the manual.
Skipping print scaffolding to dump values and show progress.
Not using the -w switch.
Leaving off $ or @ or % from the front of a variable.
Unbalanced () or {} or [] or "" or '' or `` or <>.
Mixing '' with quotation marks "" or slash / with backslash \.
Using == instead of eq, != instead of ne, = instead of ==, etc. ('White' == 'Black') and ($x = 5) evaluate as (0 == 0) and (5) and thus are true!
Using else if instead of elsif.
Not chomping the output of backquoted commands (like `date`) or not chomping input:
print "Enter y to proceed: ";
$ans = <STDIN>;
chomp $ans;
if ($ans eq 'y') { print "You said y\n";}
else { print "You did not say 'y'\n";}
Using $_, $1, or other side-effect variables, then modifying the code in a way that unknowingly affects or is affected by these.
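The == versus eq trap from the list above is easy to demonstrate. In this sketch the subroutines num_equal and str_equal are hypothetical helpers; the numeric warning is switched off inside num_equal only so that the demonstration runs quietly, whereas in a real program that warning is exactly what the -w switch gives you as a hint.

```perl
use strict;
use warnings;

# Compare as numbers: strings that are not numbers count as 0.
sub num_equal {
    my ($x, $y) = @_;
    no warnings 'numeric';   # silence "isn't numeric" for the demo
    return $x == $y ? 1 : 0;
}

# Compare as strings, character by character.
sub str_equal {
    my ($x, $y) = @_;
    return $x eq $y ? 1 : 0;
}

foreach my $pair (['White', 'Black'], ['10', '10.0']) {
    my ($x, $y) = @$pair;
    printf "%-7s %-7s  ==: %d  eq: %d\n",
        "'$x'", "'$y'", num_equal($x, $y), str_equal($x, $y);
}
```

The second pair shows the opposite trap: '10' and '10.0' are equal as numbers but different as strings, so eq is the wrong choice when comparing numeric input.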
This document is largely based on Introduction to Perl by Greg Johnson of MU Campus Computing, a document that now seems to have disappeared from the Web. Changes were made to make the presentation less Unix-specific. Some formulations were modified, links fixed, and so on. Most of the material from my Introduction to Perl was added. This happened in 2001 and 2002. As this document is still used by people, I made a basic cleanup (removing or fixing links that didn't work etc.) in 2011, without adding content.
Date of last update: 2014-03-16.
Jukka Korpela.