Snippets
Table of Contents
- How do I read the contents of a file?
- How do I write to a file?
- How do I print a row to a CSV file?
- How do I convert my array into a string?
- How do I get rid of the "\n" character on the end?
- How do I complement a string of DNA?
- How do I reverse a string?
- How do I remove the first element from an array?
- How do I get rid of newline characters (\n) in an array?
- How do I split a string into a letter-by-letter array?
- How do I increment or decrement something easily?
- How do I make reading files into an array easier?
- How do I make reading files into a string easier?
- How do I reuse subroutines without copy-pasting?
How do I read the contents of a file?
You open files like this:
    my $fn = 'sequence.txt';   # filename to read
    open my $in, '<', $fn;
    # ...do stuff with $in...
    close $in;                 # close the file
- The < character is used for reading, whereas the > character is used for writing.
- The my $in line makes the file handle available for use.
You can read the contents of the file as an array like this:
    my $fn = 'sequence.txt';   # filename to read
    open my $in, '<', $fn or die("Could not open $fn");
    my @lines = <$in>;         # special perl syntax: reads every line into the array
    close $in;
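Reading every line into an array keeps the whole file in memory. For large files you may prefer to process one line at a time. The following is a minimal sketch of that approach; the print_numbered_lines name is made up for this example and is not part of the original snippets.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original snippets): prints each line
# of a file prefixed with its line number and returns the line count.
sub print_numbered_lines {
    my $fn = shift;
    open my $in, '<', $fn or die "Could not open $fn: $!\n";
    my $count = 0;
    while (my $line = <$in>) {   # reads one line per iteration
        chomp $line;             # drop the trailing newline
        $count++;
        print "$count: $line\n";
    }
    close $in;
    return $count;
}
```

You would call it as print_numbered_lines('sequence.txt'), assuming that file exists next to your script.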
How do I write to a file?
    my $out_fn = 'output.txt';
    open my $out, '>', $out_fn or die("Could not open $out_fn");
    print $out "Hello, World!";
    close $out;
    # open output.txt on your computer to verify
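- IMPORTANT: Do not place a comma after the file handle ($out in the above example). With a comma, print treats $out as just another item to print, so the file handle reference (something like GLOB(0xAB01)) and your text go to the screen instead of the file.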
How do I print a row to a CSV file?
Use printf to make your life simpler:
    printf $out_fh "%d, %d\n", $x, $y;   # %d is a placeholder for an integer
Here's a more complete example:
    my @x = (0, 1, 2, 3, 4);
    my @y = (3, 1, 4, 1, 5);
    open my $fh, '>', 'amazing_discovery.csv' or die "ERROR: $!\n";
    # print a header
    print $fh "Solar Radiation, Stock Prices\n";
    # $#x is the LAST index, so use <= to include the final row
    for (my $i = 0; $i <= $#x; $i++) {
        printf $fh "%d, %d\n", $x[$i], $y[$i];
    }
    close $fh;
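When a row has more than a couple of columns, a printf format string gets tedious to maintain. Here is a small sketch that builds the row with join() instead. The csv_row name is invented for this example, and it assumes no field contains a comma, quote, or newline (data like that needs proper quoting, for instance with the Text::CSV module from CPAN).

```perl
use strict;
use warnings;

# Hypothetical helper: turns a list of fields into one CSV line.
# Assumes the fields contain no commas, quotes, or newlines.
sub csv_row {
    my @fields = @_;
    return join(',', @fields) . "\n";
}

print csv_row('Solar Radiation', 'Stock Prices');
print csv_row(0, 3);
```

To write to a file rather than the screen, pass the result to print with a file handle, for example print $fh csv_row($x[$i], $y[$i]);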
How do I convert my array into a string?
With the join() function:
    my @list = qw(d n a);
    my $list_as_a_string = join("", @list);   #=> 'dna'
How do I get rid of the "\n" character on the end?
The \n character is a newline character, and it can be removed with the chomp() function.
    my $str = "dna\n";
    chomp $str;   #=> "dna"
How do I complement a string of DNA?
    my $str = 'cat';
    my $comp = $str;             # work on a copy so $str stays intact
    $comp =~ tr/acgt/tgca/;      #=> "gta"
How do I reverse a string?
    my $str = 'cat';
    my $rev = scalar reverse $str;   #=> "tac"
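The two recipes above combine naturally into a reverse complement. This is a minimal sketch, assuming lowercase input as in the examples; the reverse_complement name is chosen for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the reverse complement of a lowercase
# DNA string. Characters other than a, c, g, t are left unchanged.
sub reverse_complement {
    my $seq = shift;
    my $rc  = scalar reverse $seq;   # reverse the string first
    $rc =~ tr/acgt/tgca/;            # then complement each base
    return $rc;
}

print reverse_complement('cat'), "\n";   # prints "atg"
```

To handle uppercase sequences as well, one option is to extend the translation to tr/acgtACGT/tgcaTGCA/.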
How do I remove the first element from an array?
Oh, so you have a pesky header line from a FASTA file? Let's get rid of it with shift:
    # your array, typically from reading the fasta file
    my @seq = ("> Obscure organism\n", "ACTGAAA\n", "AAAA");
    shift @seq;   # removes the first element
    # @seq is now: ("ACTGAAA\n", "AAAA")
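shift also returns the element it removes, which is handy if you want to keep the header rather than throw it away. A short sketch using the same example data (the $header variable is introduced here for illustration):

```perl
use strict;
use warnings;

my @seq = ("> Obscure organism\n", "ACTGAAA\n", "AAAA");

my $header = shift @seq;   # removes AND returns the first element
chomp $header;             # drop the header's trailing newline

print "Header: $header\n";
print "Remaining lines: ", scalar(@seq), "\n";
```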
How do I get rid of newline characters (\n) in an array?
The laziest way is to convert the array to a string and run a substitution command on it. It is not the most efficient, but it is perfect when you already planned on converting the array to a string.
    my @seq_ary = ("ACTGAAA\n", "AA\n\n\n\nAA\n");
    # convert it to a string
    my $seq = join('', @seq_ary);
    # replace all newlines
    $seq =~ s/\n//g;
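If you would rather keep the array than collapse it into a string, chomp() also accepts an array and strips one trailing newline from each element. Unlike the substitution above, it leaves newlines in the middle of an element alone. A short sketch with made-up data:

```perl
use strict;
use warnings;

my @seq_ary = ("ACTGAAA\n", "AAAA\n", "GG");

chomp @seq_ary;   # strips one trailing newline from every element

print join('|', @seq_ary), "\n";   # prints "ACTGAAA|AAAA|GG"
```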
How do I split a string into a letter-by-letter array?
So you want your sequence ACTG to become ('A', 'C', 'T', 'G')?
There's a recipe for that:
    my $seq = 'ACTG';
    # split on an empty pattern (nothing between the slashes);
    # this puts each character into its own array element
    my @nucs = split //, $seq;   #=> ('A', 'C', 'T', 'G')
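Once the sequence is a letter-by-letter array, per-letter work becomes a simple loop. As one possible illustration, this sketch counts how often each nucleotide occurs; the %count hash and the sample sequence are assumptions made for the example.

```perl
use strict;
use warnings;

my $seq  = 'ACTGAAA';
my @nucs = split //, $seq;

my %count;                 # illustrative hash: letter => occurrences
$count{$_}++ for @nucs;    # add one for every letter seen

for my $nuc (sort keys %count) {
    print "$nuc: $count{$nuc}\n";
}
```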
How do I increment or decrement something easily?
Use the ++ and -- operators.
    my $x = 0;
    # the annoying-to-type way:
    $x = $x + 1;   # $x is now 1
    # the less annoying way:
    $x += 1;       # $x is now 2
    # the lazier way:
    $x++;          # $x is now 3
    $x--;          # $x is now 2
How do I make reading files into an array easier?
We read files into arrays quite a bit. Let's make our code less repetitive by encapsulating all those open, <$fh>, and close statements into a function.
    # Returns a file's contents as an array of lines.
    #
    # USAGE:
    #   my @lines = file_to_array('my_fasta_file.fa');
    #
    # @param string filename
    # @return array
    sub file_to_array {
        my $fn = shift;
        open my $fh, '<', $fn or die "ERROR: Could not read $fn\n";
        my @lines = <$fh>;
        close $fh;
        return @lines;
    }
How do I make reading files into a string easier?
Let's write another subroutine to use our last subroutine:
    # Returns a file's contents as a string.
    #
    # USAGE:
    #   my $contents = file_to_string('my_fasta_file.fa');
    #
    # @param string filename
    # @return string
    sub file_to_string {
        my $fn = shift;
        my @lines = file_to_array($fn);
        # we can "glue" each line together with the join() function
        my $string = join('', @lines);
        return $string;
    }
How do I reuse subroutines without copy-pasting?
- Put all of the subroutines into a separate .pl file called lib.pl (or whatever you want). For example, place the file_to_array definition in lib.pl.
- In the script you want to use your subroutines in, add a call to do:
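    # my_script.pl
    use warnings;
    use strict;
    # the leading ./ is needed on Perl 5.26 and later, where the
    # current directory is no longer searched by default
    do './lib.pl';
    my @lines = file_to_array('sequence.fa');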