Perl Basics
This short guide will cover the basics of the Perl language. We will cover certain features in greater depth later on, but feel free to ask for clarification now. Also, please refer to the following resources for better introductions to Perl:
- UNIX and Perl Primer for Biologists – Perl content starts on p50, but the UNIX section is very useful. Highly recommended!
- Beginning Perl – free book for beginners
What is Perl?
Perl is a high-level programming language widely used by the bioinformatics community.
In this class, you will instruct your computer via statements in Perl. You might need a simple program to do this (in fact, you will for Project 1):
Take the reverse complement of a DNA sequence
Perl can do this, but it needs some help: you have to give it instructions in a form it understands (Perl code). Much like English, the Perl language is made up of a number of constructs and rules, but unlike English, it applies them consistently.
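As a preview of where this is heading, here is a minimal sketch of the reverse-complement task. The subroutine name reverse_complement and the sample sequence are made up for illustration, and the sketch assumes the input contains only the letters A, C, G, and T (upper or lower case). It is one possible approach, not necessarily the one expected for Project 1.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the reverse complement of a DNA string.
# Assumes the sequence contains only A, C, G, and T.
sub reverse_complement {
    my $dna = shift;
    my $revcomp = reverse $dna;          # reverse the characters of the string
    $revcomp =~ tr/ACGTacgt/TGCAtgca/;   # swap each base for its complement
    return $revcomp;
}

print reverse_complement('ATGCCG'), "\n";   #=> CGGCAT
```

Every piece used here (variables, subroutines, built-in functions) is introduced in the sections that follow.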
How do I speak Perl?
Perl programs are made up of statements, which in turn are built from
- comments,
- variables,
- data types,
- functions (subroutines in Perl),
- and conditionals.
Statements
Statements in Perl are like sentences in English, but less complicated. They look like this:
<SOMETHING>;
The important part is the semicolon at the end. This tells Perl your instruction has ended. Here are some examples:
my $var = 'Blah';          # set a variable
print 'wah-wah-wahhhh';    # call a print() function
delete_my_files();         # call a function
Where do you omit semicolons?
- if, elsif, and else statements
- subroutine definitions
Comments
Leave a note for yourself explaining the purpose of a code block. Place a pound/hash symbol (#) on a line, and Perl will ignore everything after it on that line.
# parse Vienna notation into something useful
# ...
Variables
These are symbols representing a value, like in math. In Perl, these are mutable, so you can have \(x = 2\) and then change it to \(x = 4\) later on.
In Perl, there are three kinds of variables you'll be dealing with:
Variable | Purpose |
---|---|
scalar | Holds a single value |
array | Holds a list of values |
hash | Holds a table of data |
my $cat = 'Martin';                      # a scalar
my @dogs = ('Gretel', 'Shep', 'Lola');   # an array
my %groceries = (                        # a hash
    'eggs'  => 6,
    'bread' => '1 loaf',
    'milk'  => '1 gal',
);
Variable restrictions
The variable name (the part after the $, @, or %) must start with a letter (a-z or A-Z) or an underscore. After that, it can contain letters, digits, and underscores.
Variable | Valid? |
---|---|
$1blah | N |
$blah1 | Y |
$blah'! | N |
$_Blah | Y |
Data Types
- A classification of data like words, letters, integers
- Perl has its own data types:
  - strings
    - sequence of characters
  - integers
    - whole numbers: {…, -2, -1, 0, 1, 2, …}
  - floats
    - 3.14159
  - arrays
    - a list
  - hashes
    - table of data
  - objects
    - user-defined data type (not covered in this class)
Strings
my $str = "The quick brown fox jumped over the lazy dog.";
my $borat = "Wow-wow-wowee!111!!\n";

# how do you put 2+ strings together? with the concatenation operator
# (the . operator)
print $str . ' ' . $borat;
#=> The quick brown fox jumped over the lazy dog. Wow-wow-wowee!111!!
#   (followed by a newline)
Integers
my $num_sequences = 30;
Floats
my $pizza_radius = 8.0;
my $area_of_pizza = 3.14159 * $pizza_radius**2;
Arrays/Lists
- These can contain other data types
- Important: These are indexed, but the first element is at position 0, not 1!
my @list = ('eggs', 'bread', 'milk');
# positions: 0       1        2

# ...or use the shorthand trick for string lists
# (a new name, because declaring @list twice triggers a warning)
my @same_list = qw(eggs bread milk);

my @useless_list = (1, 2, 3, 4);
You can access individual elements like so:
my @list = qw(eggs bread milk);
print "I need some ";
print $list[0];        # we use a $ to access a scalar inside the array
print "\n\n";          # some space
print "There are ";
print scalar(@list);   # print number of items ($#list would give the last index, 2)
print " items on my list";
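The last index and the number of items are easy to mix up, so here is a small sketch that shows both side by side. The variable names $count and $last_index and the extra 'coffee' item are made up for illustration.

```perl
use strict;
use warnings;

my @list = qw(eggs bread milk);

my $count      = scalar(@list);   # number of items: 3
my $last_index = $#list;          # index of the last item: 2

print "Items: $count\n";             #=> Items: 3
print "Last index: $last_index\n";   #=> Last index: 2
print "Last item: $list[-1]\n";      #=> Last item: milk

push @list, 'coffee';                       # add an item to the end
print "Now ", scalar(@list), " items\n";    #=> Now 4 items
```

Because indexing starts at 0, the last index is always one less than the item count.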
Hashes
my %itemized_list = (
    'eggs'  => 6,
    'bread' => '1 loaf',
    'milk'  => '1 gal',
);
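Defining a hash is only half the story, so here is a minimal sketch of reading from it and adding to it. The 'coffee' entry and the @items variable are made up for illustration.

```perl
use strict;
use warnings;

my %itemized_list = (
    'eggs'  => 6,
    'bread' => '1 loaf',
    'milk'  => '1 gal',
);

# look up one value by its key (note the $ and the curly braces)
print "Eggs: $itemized_list{'eggs'}\n";   #=> Eggs: 6

# add a new entry (or overwrite an existing one)
$itemized_list{'coffee'} = '1 bag';

# keys returns the keys in no particular order, so sort them first
my @items = sort keys %itemized_list;
print join(', ', @items), "\n";           #=> bread, coffee, eggs, milk
```

As with arrays, a single value is a scalar, which is why the lookup uses $ rather than %.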
Subroutines
In mathematics, you would write:
\[ f(x) = x^2 + 2x - 1 \]
\[ f(3) = 14 \]
In Perl, you would write:
sub f {
    my $x = shift;
    return $x**2 + 2*$x - 1;
}
print f(3); #=> 14
Built-in functions
Perl comes with a ton of built-in functions. Here are a few of them:
Function | Purpose | Example |
---|---|---|
print | Print a string to screen or a file | print "Hi Mom!"; |
uc | Uppercase a string | print uc("Hi Mom!") |
sqrt | Return the square root of a number | print sqrt(4) |
User-defined functions
- In \(f(x)\), \(x\) is the parameter
- In \(f(2)\), 2 is the argument
- In Perl, \(x\) is passed in as an array of arguments to \(f()\), so it's like writing \(f((x))\)
  - The array is available as @_, or you can shift each argument
  - It's a quirk of the language
sub fetch_dog {
    my $name = shift;
    return uc($name) . "!";
}
print fetch_dog('Gretel'); # prints "GRETEL!"
- sub fetch_dog: names the function fetch_dog
- shift: takes the first argument from the list of input ("Gretel")
- return: returns a value, in this case an uppercased string with an exclamation point on the end
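When a subroutine takes more than one argument, unpacking @_ in a single statement is a common alternative to calling shift repeatedly. The sketch below assumes a made-up subroutine, greet_dog, that takes a name and a number of exclamation points.

```perl
use strict;
use warnings;

# Hypothetical example: a subroutine that takes two arguments.
sub greet_dog {
    my ($name, $times) = @_;             # unpack both arguments at once
    return uc($name) . ('!' x $times);   # the x operator repeats a string
}

print greet_dog('Shep', 3), "\n";   #=> SHEP!!!
```

The parentheses around ($name, $times) matter: they make the assignment copy the arguments one by one instead of storing the argument count.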
Conditionals
Sometimes you need to do things based on the value of a variable or output of a function:
if ($on_phone) {
    print 'Corporate accounts payable Nina speaking. Just a moment!';
} elsif ($day eq 'Monday') {
    print "Sounds like somebody's got a case of the Mondays";
} else {
    print '...';
}
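- The expression inside the parentheses is evaluated as either true or false. Perl has no TRUE or FALSE keywords: 0, '0', the empty string, and undef count as false, and any other scalar value counts as true
- Comparison operators: numbers are compared with ==, !=, <, >, <=, and >=; strings are compared with eq, ne, lt, gt, le, and ge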
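To make the idea of true and false values concrete, here is a small sketch. The helper truthiness is made up for illustration. It simply reports which branch of an if statement a value would take, and the last two checks contrast numeric comparison (==) with string comparison (ne).

```perl
use strict;
use warnings;

# Hypothetical helper: reports how Perl treats a value in a condition.
sub truthiness {
    my $value = shift;
    if ($value) {
        return 'true';
    }
    else {
        return 'false';
    }
}

print truthiness(0),     "\n";   #=> false
print truthiness(''),    "\n";   #=> false
print truthiness('0'),   "\n";   #=> false
print truthiness(42),    "\n";   #=> true
print truthiness('dog'), "\n";   #=> true

# == compares as numbers, so 10 and 10.0 are equal
if (10 == 10.0) {
    print "equal as numbers\n";
}

# ne compares as strings, so '10' and '10.0' are different
if ('10' ne '10.0') {
    print "different as strings\n";
}
```

Using == on strings or eq on numbers is a common source of subtle bugs, so pick the operator that matches the kind of data being compared.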
Loops
If you have a list of something, you can iterate (loop) over the items in a few ways:
# the handy way, if you don't need to know the index
my @list = qw(eggs bread milk);
for my $item (@list) {
    print $item;
    print "\n";
}

# another way that involves keeping track of the index
# ($#list is the last index, so use <= to include the last item)
for (my $i = 0; $i <= $#list; $i++) {
    print $list[$i];
    print "\n";
}

# another way:
my $i = 0;
while ($i <= $#list) {
    print $list[$i];
    print "\n";
    $i += 1;
}
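If the index is needed but the three-part for loop feels error-prone, looping over a range of indexes is one alternative. The sketch below builds a numbered shopping list. The @lines and $position names are made up for illustration.

```perl
use strict;
use warnings;

my @list = qw(eggs bread milk);
my @lines;

# 0 .. $#list expands to every valid index: 0, 1, 2
for my $i (0 .. $#list) {
    my $position = $i + 1;                  # humans count from 1
    push @lines, "$position. $list[$i]";
}

print join("\n", @lines), "\n";
# prints:
# 1. eggs
# 2. bread
# 3. milk
```

Because the range ends at $#list, there is no comparison operator to get wrong.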
Running Perl
As we saw in the last tutorial, you can run Perl scripts from the command line like so:
perl my_file.pl
Always save your Perl scripts with the .pl extension.
A silly example
- Print a list of languages, "shouting" the first one
RUBY!
Shell
PHP
MySQL
Perl
sub format_languages {
    my @langs = @_;
    # <= so that the last language is printed too
    for (my $i = 0; $i <= $#langs; $i++) {
        if ($i == 0) {
            print uc($langs[$i]) . '!';
        }
        else {
            print $langs[$i];
        }
        print "\n";
    }
}
format_languages(qw(Ruby Shell PHP MySQL Perl));
Alternate definitions
sub format_languages {
    my ($favorite, @langs) = @_;
    print uc($favorite) . "!\n";
    # <= so that the last language is printed too
    for (my $i = 0; $i <= $#langs; $i++) {
        print $langs[$i] . "\n";
    }
}
format_languages(qw(Ruby Shell PHP MySQL Perl));
or…
sub format_languages {
    my @langs = @_;
    $langs[0] = uc($langs[0]) . '!';
    # <= so that the last language is printed too
    for (my $i = 0; $i <= $#langs; $i++) {
        print $langs[$i] . "\n";
    }
}
format_languages(qw(Ruby Shell PHP MySQL Perl));
or…
sub format_languages {
    my @langs = @_;
    $langs[0] = uc($langs[0]) . '!';
    print join("\n", @langs);
}
format_languages(qw(Ruby Shell PHP MySQL Perl));
Important boilerplate code to include
Perl can allow some really sloppy coding that can cause obnoxious, subtle errors. Make sure you put this at the top of every Perl file:
use diagnostics;
use strict;
use warnings;

# code goes here
Debugging Perl
If you want to know what a variable holds, print it out with Data::Dumper. Put this at the top of your file:
use Data::Dumper;
Then use it like so:
# the backslash passes a reference, so the hash prints as one structure
print Dumper(\%my_really_big_hash_table);
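Here is a complete, minimal sketch of Data::Dumper in use. The %groceries hash and the $dump variable are made up for illustration. The hash is passed as a reference (with a leading backslash) because a plain hash would be flattened into a list of separate values and dumped piece by piece.

```perl
use strict;
use warnings;
use Data::Dumper;

# optional setting: print hash keys in sorted order for stable output
$Data::Dumper::Sortkeys = 1;

my %groceries = (
    'eggs'  => 6,
    'bread' => '1 loaf',
);

# Dumper returns a string describing the whole structure
my $dump = Dumper(\%groceries);
print $dump;
```

The output is a block beginning with $VAR1 that lists each key next to its value, which makes it easy to spot a missing or misspelled key.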
Closing
Go forth and write some Perl!