Introduction to Perl
Purpose
For this project, our goal is to save the reverse complement of a sequence to a file.
Break it down!
Let's decompose this into input, output, and process:
- input
  - The sequence we need to reverse complement. This comes in the form of sequence.txt for this project, but it could be a FASTA text file, a database, an image, or something else.
- output
  - The reverse complement of the input sequence, saved as a file (rev_complement.txt).
For the process:
- Get/read sequence.txt into a variable
  - It's easier to work with files if we read them into variables. Better yet, since we're dealing with letters, we should read the file into a string variable.
  - In Perl, the open() function creates a file handle you can use to access the contents of the file. You can read the contents of the file after you've opened it.
  - You need to close() each file you open too.
- Take the reverse complement of the variable
  - Perl excels at manipulating strings, and it even provides us with a perfect operator, tr//. This is the transliteration operator, and we can use it like so: $sequence =~ tr/acgt/tgca/
  - Note that tr// only complements $sequence in place (and returns the number of characters it replaced), so the string still has to be reversed with reverse().
- Save the file as rev_complement.txt
  - open() a file for writing
  - print() to the file
  - close() the file after writing
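The middle step of the process can be sketched as one small helper. This is a minimal sketch, not the required solution: the name reverse_complement is hypothetical, and it assumes the sequence is lowercase and contains only a, c, g, and t.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the reverse complement of a lowercase DNA string.
sub reverse_complement {
    my ($seq) = @_;
    (my $comp = $seq) =~ tr/acgt/tgca/;  # complement each base on a copy
    return scalar reverse $comp;         # then reverse the whole string
}

print reverse_complement('gattaca'), "\n";  # prints "tgtaatc"
```

Copying the string before applying tr// leaves the caller's original sequence untouched.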
Useful Idioms
How do I read the contents of a file?
You open files like this:
    my $fn = 'sequence.txt'; # filename to read
    open my $in, '<', $fn;
    # ...do stuff with $in...
    close $in; # close the file
- The < character is used for reading, whereas the > character is used for writing.
- The my $in declaration creates the variable that holds the file handle, which makes the handle available for use.
You can read the contents of the file as an array like this:
    my $fn = 'sequence.txt'; # filename to read
    open my $in, '<', $fn or die("Could not open $fn: $!");
    my @lines = <$in>; # in list context, <$in> reads every line into the array
    close $in;
How do I convert my array into a string?
With the join() function:
    my @list = qw(d n a);
    my $list_as_a_string = join("", @list); #=> 'dna'
How do I get rid of the "\n" character on the end?
The \n character is a newline character, and it can be removed with the chomp() function.
    my $str = "dna\n";
    chomp $str; #=> "dna"
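When a file has been read into an array, chomp() and join() combine naturally. The sketch below uses a hard-coded array as a stand-in for the lines read from a file; the contents and the variable name $sequence are made up for illustration.

```perl
use strict;
use warnings;

# Stand-in for what `my @lines = <$in>;` might return for a two-line file.
my @lines = ("gatt\n", "aca\n");

chomp @lines;                    # chomp accepts a list and strips each element
my $sequence = join '', @lines;  # glue the lines into a single string

print "$sequence\n";             # prints "gattaca"
```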
How do I complement a string of DNA?
    my $str = 'cat';
    my $comp = $str; # work on a copy so $str is unchanged
    $comp =~ tr/acgt/tgca/; #=> "gta"
How do I reverse a string?
    my $str = 'cat';
    my $rev = reverse $str; #=> "tac"
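One thing worth knowing is that reverse() behaves differently depending on context: it reverses a string only in scalar context, and in list context it reverses the order of the list's elements instead. A short sketch of the difference (the variable names are illustrative only):

```perl
use strict;
use warnings;

my $str = 'cat';

my $rev     = reverse $str;         # scalar assignment: the string is reversed
my ($first) = reverse $str;         # list assignment: a one-element list, unchanged
my $forced  = scalar reverse $str;  # scalar() makes the intent explicit

print "$rev $first $forced\n";      # prints "tac cat tac"
```

This matters most with print, which puts its arguments in list context, so writing scalar reverse there is the safer habit.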
How do I write to a file?
    my $out_fn = 'output.txt';
    open my $out, '>', $out_fn or die("Could not open $out_fn");
    print $out "Hello, World!";
    close $out;
    # open output.txt on your computer to verify
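Instead of opening the file by hand, the write can also be verified from Perl by reading the file back. This is a sketch under one assumption not in the original: it writes into a scratch directory created with the core File::Temp module, so it does not touch any file in your project folder.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir    = tempdir(CLEANUP => 1);  # scratch directory, removed at exit
my $out_fn = "$dir/output.txt";

# Write one line to the file.
open my $out, '>', $out_fn or die "Could not open $out_fn for writing: $!";
print $out "Hello, World!\n";
close $out or die "Could not close $out_fn: $!";

# Read the line back to confirm it was written.
open my $in, '<', $out_fn or die "Could not open $out_fn for reading: $!";
my $line = <$in>;
close $in;
chomp $line;

print "Read back: $line\n";
```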
Final remarks
Submit the project on Moodle.