Perl Programming Exercises 1 - 'A B C'
For all of your programs, please use the modules strict and warnings to force declaration of
variables and to catch unintended errors. Both modules make it easier to produce error-free
programs that do exactly what they are intended to do.
Syntax Exercises
Exercise 1 - First print statement
#!/usr/bin/perl
use strict;
use warnings;
print "My first Perl program\n"; #try single quotes
print "First line\nsecond line and there is a tab\there\n";
Notes
1. I always use strict; and use warnings;, even on the shortest programs. Mighty
warts from tiny programs grow.
2. I always end a program with exit(); even though it is not necessary. Why? It
immediately tells me where the program ends and that I have copied it completely
from wherever I got it.
Exercise 2 - Numerical variables and operators
#!/usr/bin/perl
use strict;
use warnings;
#assign values to variables $x and $y and print them out
my $x = 4;
my $y = 5.7;
print "x is $x and y is $y\n";
#example of arithmetic expression
my $z = $x + $y**2;
$x++;
print "x is $x and z is $z\n";
#evaluating arithmetic expression within print command
print "add 3 to $z: $z + 3\n"; #did it work?
print "add 3 to $z:", $z + 3,"\n";
Notes
1. within double-quoted "strings", variables are interpolated, but expressions are not evaluated!
2. however, within single-quoted 'strings', variables are neither interpolated nor evaluated.
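The two notes can be checked with a short sketch. The variable name and its value are chosen here for illustration only and are not taken from the exercise.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $z = 7;   # illustrative value (an assumption, not from the exercise)

# Double quotes: $z is interpolated, but "+ 3" stays literal text.
my $interpolated = "z plus 3: $z + 3";

# Single quotes: nothing is interpolated, so "$z" stays literal text.
my $literal = 'z plus 3: $z + 3';

# To evaluate the expression, keep it outside the quotes.
my $evaluated = "z plus 3: " . ($z + 3);

print "$interpolated\n";   # z plus 3: 7 + 3
print "$literal\n";        # z plus 3: $z + 3
print "$evaluated\n";      # z plus 3: 10
```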
Exercise 5 - Arrays
#!/usr/bin/perl
use strict;
use warnings;
#initialize an array
my @bases = ("A","C","G","T");
#print two elements of the array
print $bases[0],$bases[2],"\n";
#print the whole array
print @bases,"\n"; #try with double quotes
#print the number of elements in the array
print scalar(@bases),"\n";
... =~ s/T/A/g;
... =~ s/A/T/g;
... =~ s/G/C/g;
... =~ s/C/G/g;
Exercise 9 - Subroutines
#!/usr/bin/perl
use strict;
use warnings;
#TASK: Make a subroutine that calculates the reverse
#complement of a DNA sequence and call it from the main program
#body of the main program with the function call
my $DNA = "GATTACACAT";
my $rcDNA = revcomp($DNA);
print "$rcDNA\n";
exit;
#definition of the function for reverse complement
sub revcomp{
my($DNAin) = @_;
my $DNAout = reverse($DNAin);
$DNAout =~ tr/ACGT/TGCA/;
return ($DNAout);
}
my @bases = split(//,$DNA);
#step through the array and count the occurrences of G and C
for (my $i=0;$i<$DNAlength;$i++){
if($bases[$i] =~ /[GC]/){
$count++;
}
}
#return the fraction of GC bases (a value between 0 and 1)
return $count/$DNAlength;
}
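For comparison, here is a minimal self-contained sketch of the same GC calculation. It swaps the explicit loop for Perl's tr operator, which returns the number of characters it matched. The subroutine name gc_fraction and the guard for an empty string are assumptions made for this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# gc_fraction is a hypothetical name chosen for this sketch.
sub gc_fraction {
    my ($DNA) = @_;
    my $DNAlength = length($DNA);
    return 0 if ($DNAlength == 0);    # avoid division by zero
    # tr with an empty replacement list only counts the matching characters
    my $count = ($DNA =~ tr/GC//);
    return $count / $DNAlength;
}

print gc_fraction("GATTACACAT"), "\n";   # 3 of 10 bases are G or C: 0.3
```

Like the loop version, this counts uppercase G and C only.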
Contents
Tasks
o Hello World
o cat()
o lc()
o max
o max (with subroutine)
o anagram
o anastring
o anastring (with commandline input)
o sort
o fastaParser
o factorial
o factorial(recursive)
Here are programming exercises that focus on translating a concept into a working script.
The preceding section contains what I have called syntax examples: simple tasks
that ask you to write functioning, syntactically correct code.
For each task you will find
1. a description of the task your code should achieve;
2. some hints how to go about solving it - which functions you might use or which
strategy; and
3. sample code for reference if you are stuck. Should you really need to look up the
samples, carefully study the code, put it away and then write your own script from
scratch, with different code and perhaps some variation in function. If you merely
copy code, or read it with mild interest and move on, you will probably be wasting
your time.
Tasks
Hello World
Executable program
Write a Perl program helloWorld.pl that prints out "Hello World" (or whatever you fancy)
to the terminal. Make your program executable (chmod u+x helloWorld.pl) so that you
don't need to invoke the Perl interpreter explicitly from the command line (i.e. just "$
./helloWorld.pl" should run it, you shouldn't need to type "$ perl helloWorld.pl").
Hints:
Simply use the print(); function.
Code:
#!/usr/bin/perl
use warnings;
use strict;
print("Hello World !\n");
exit();
cat()
Keyboard input
Write a Perl program cat.pl that prints to the terminal a single line that you type at the
keyboard.
Hints:
Use the line-input ("diamond") operator <STDIN> to read a line from STDIN, assign this to a
variable, then print the contents of the variable. Just one statement, no loop is required.
Code:
#!/usr/bin/perl
use warnings;
use strict;
my $line;
$line = <STDIN>;
print($line); # $line still ends with the newline that was typed
exit();
lc()
Write a Perl program lc.pl that reads one or many lines from STDIN, converts them to
lowercase and prints them to the terminal. Use this interactively, typing input (end by typing
<ctrl>D), then use this by redirecting a textfile to your program, then "pipe" the output of the
Unix "ls" command into your program.
Hints:
Use a while loop with the assignment of <STDIN> to a variable as its loop
condition. This way the loop runs until STDIN reaches EOF (End of File). Use the Perl lc();
function to change case. Assign the return value to a variable and print it.
Code:
#!/usr/bin/perl
use warnings;
use strict;
while (my $line = <STDIN>) {
$line = lc($line);
print($line); # each line still ends with its own newline
}
exit();
max
Condition
Write a Perl program max.pl that prompts for and reads two numbers from STDIN, and
outputs the larger of the two numbers to the terminal. Remember to consider the case that the
numbers may be equal.
Hints:
You need an
if (condition) { do ... }
construction to print one or the other numbers, depending on the result of the comparison.
Remember the difference between numeric (==, <, >) and string (eq, lt, gt) comparisons! You should
chomp(); your input variables to remove the trailing newline before you compare and print them.
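The difference between the two kinds of comparison can be seen in a short sketch. The two inputs are hard-coded here (an assumption, so the example runs without a keyboard) in the form they would arrive from STDIN, including the trailing newline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hard-coded stand-ins for two lines read from STDIN (illustrative values).
my $first  = "10\n";
my $second = "9\n";
chomp($first);
chomp($second);

# Numeric comparison: 10 is larger than 9.
my $numericMax;
if ($first > $second) {
    $numericMax = $first;
} else {
    $numericMax = $second;
}

# String comparison: "10" sorts before "9" because "1" comes before "9".
my $stringMax;
if ($first gt $second) {
    $stringMax = $first;
} else {
    $stringMax = $second;
}

print "numeric comparison picks $numericMax\n";   # 10
print "string comparison picks $stringMax\n";     # 9
```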
Code:
#!/usr/bin/perl
use warnings;
use strict;
print("Enter a number: ");
my $input1 = <STDIN>;
# User inputs
Subroutine
Rewrite max.pl so the comparison is done in a subroutine: pass the two numbers as
arguments into a subroutine and return the larger of the two. Such a program may be a useful
framework for comparing two datasets with a non-trivial metric. Instead of simply picking
the larger value, the subroutine could compare according to some sophisticated algorithm.
Hints:
Remember that Perl uses the default array "@_" to pass values into subroutines. You need to
assign the contents of @_ to variables (or other arrays) in order to be able to use the values.
The easiest way to do this is to assign the array to variables in a list - e.g.
my ($a) = @_; or ...
my ($a, $b) = @_;
Do not forget the parentheses. If you wrote
my $a = @_;
instead, you would be assigning an array "@" to a scalar "$". The problem is
that this is legal, the compiler does not complain or warn, but this does not assign the first
value in the array, it assigns the number of elements in the array! This is a
fine case of a statement being syntactically correct but logically wrong. If in doubt whether
you are doing the right thing, always print your values from within the subroutine, as a
development test, to make sure they are what you expect them to be.
Code:
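A minimal sketch of one possible solution, following the hints above. It is not the course's reference code. The subroutine names larger and countArgs are assumptions made for illustration, and the inputs are hard-coded instead of being read from STDIN.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Return the larger of two numbers (hypothetical name "larger").
sub larger {
    my ($first, $second) = @_;   # list assignment copies both arguments
    if ($first > $second) {
        return $first;
    }
    return $second;              # also covers the case that both are equal
}

# Shows the pitfall from the hints: without parentheses the
# assignment is in scalar context and yields the argument count.
sub countArgs {
    my $n = @_;
    return $n;
}

print larger(3, 7), "\n";        # 7
print larger(5, 5), "\n";        # 5
print countArgs(42, 17), "\n";   # 2, not 42
```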
anagram
Array
Write a Perl program anagram.pl that reads a string from STDIN and returns ten random
permutations of this string. This will require a number of concepts and techniques of working
with arrays - defining an array, assigning values to an array, or to individual fields of an array,
using a variable as an index to an array in order to read from or write to specific fields, and
more. First split your string into individual elements of an array. Use a subroutine that
randomizes this array by looping over every position of the array, and swapping the contents
of this position with a randomly chosen other position of the array, except itself. Write this in
pseudocode first. The Perl functions you will need are split(); and rand();.
Hints:
You have to chomp(); your input in order not to shuffle the newline character (return
character) into your randomized strings; otherwise you'll end up with strangely shortened
versions of your randomized string, split into two parts. To get the array size, use the index of
the last array position plus one (remember that array positions are numbered starting at 0, not
1 ! ). To split a string into individual elements of an array, use split(//, $input); with no
delimiter, i.e. with no other characters in between the slashes, not even a space. Assigning the
result of split(); to an array puts every character of the string into its own array field. When
randomizing the array, note that rand(); returns a random fractional number, not an integer, so
you may need to use int(); to truncate the result of rand(); and just return the integer part. Use
variables to store values from the array before the swap, otherwise the original value stored in
a given array position will be lost before it can be copied over to the new array position that
you want to swap it to. Also note that all array positions should be switched, so you need to
consider the case that your random integer is the same as the position of the original value.
When you are done, see what happens when you comment out the chomp(); function, for
effect.
Code:
#!/usr/bin/perl
use warnings;
use strict;
# Constants
my $COUNT = 10;
# Declare variables
my $stringInput;   # Initial input string
my @stringArray;   # Input string after splitting into an array
                   # that stores each character in one array element
print("Enter a string to randomize: ");
$stringInput = <STDIN>;
chomp($stringInput);
$randArray[$j] = $arrayPos2;
} # end for (iterating through elements in the array)
The construct
int(rand($j+1))
deserves some comment. $j starts as the index of the last element in the array. rand(n)
returns a random fractional number from the interval [0,n[, i.e. 0 <= number < n. Assume our
array had four elements: rand(3+1) would return numbers from 0.000... to 3.999... Since
int() does not round the number, but just truncates its decimals and returns its integer part, we
get random integers from 0 to 3, each with uniform probability. That happens to be exactly
the range of indices that can be used to randomly point somewhere into our array.
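The construct can be seen at work in a minimal sketch of a shuffle loop. This is a standard Fisher-Yates shuffle written for illustration and may differ from the listing above, most of which is missing here. The subroutine name shuffleArray and the index variable $k are assumptions.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Return a shuffled copy of the list passed in (hypothetical name).
sub shuffleArray {
    my @randArray = @_;
    # Walk from the last index down to 1.
    for (my $j = $#randArray; $j > 0; $j--) {
        my $k = int(rand($j + 1));        # random index from 0 to $j
        my $tmp = $randArray[$j];         # store before overwriting
        $randArray[$j] = $randArray[$k];
        $randArray[$k] = $tmp;
    }
    return @randArray;
}

my @stringArray = split(//, "GATTACA");
my @shuffled = shuffleArray(@stringArray);
print join("", @shuffled), "\n";
```

Note that this variant allows $k to equal $j, so a character may stay where it is. The task asks you to exclude that case, which needs an extra check.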
anastring
Substring calisthenics
Copy anagram.pl to a file named anastring.pl and use the Perl length(); and substr(); functions
to permute the string (in place !) instead of shuffling fields of an array. Make a point of
programming this incrementally step by step, writing output as you go along to make sure
you are doing it right. Of course you could also shuffle using the split() and join() functions
on a string ... but that would not be "in place".
Hints:
This is similar to anagram.pl but uses substr(); on the original string instead of shuffling
fields of an array. Remember that substr(); can be used to extract defined substrings as well as
to replace them. As with anagram.pl, use variables to store the characters that you want to
swap, to prevent the original character from being lost when you overwrite one of the two
positions in the string. Use int(); on the result of rand(); to get a random position in the string
and think carefully about the range of numbers that this should produce. The range is
obviously a function of the string-length - but does it start at 0 or 1 and does it extend to the
length itself, or more or less ? Test whether the range you produce is correct.
Code:
#!/usr/bin/perl
use warnings;
use strict;
my $count = 10;
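print("Enter a string to randomize: ");
my $string = <STDIN>;
chomp($string);
my $len = length($string);
die("Need at least two characters to permute.\n") if ($len < 2);
my $pos;
for (my $i=0; $i < $count; $i++) {
    for (my $j=0; $j < $len; $j++) {
        $pos = int(rand($len));
        while ($pos == $j) {            # never swap position $j with itself ...
            $pos = int(rand($len));     # ... draw again
        }
        my $tmp = substr($string, $j, 1);                    # store character at j
        substr($string, $j, 1) = substr($string, $pos, 1);   # copy pos to j
        substr($string, $pos, 1) = $tmp;                     # copy tmp to pos
    }
    print("$string\n");                 # print the permuted string
}
exit();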
Commandline arguments
Modify anastring.pl so you can pass the number of permutations to the program on the
command line (via @ARGV); make sure that the default is 1 if no argument is given. This
tool could be part of a routine to generate random data to test statistical significance.
Hints:
The arguments that you give to a Perl program on the command line are stored in the array variable
named @ARGV. $ARGV[0] is the first argument, $ARGV[1] is the second, and so on. To
check whether some variable is defined, use the function defined($someVariable); in an if
statement. If no command line argument has been typed, $ARGV[0] will be undefined.
Code:
Simply change:
my $count = 10;
to:
my $count = 1;
if (defined($ARGV[0]) ) { $count = $ARGV[0] };
Then run the program with its input redirected from a file (assuming you have a file named
test.txt with the contents you want to randomize), or pipe the input in:
$ echo "acdefghiklmnpqrstvwy" | anastring.pl 100
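The default handling can also be wrapped in a small subroutine, sketched below. The name getCount is hypothetical, and the check that the argument is a positive integer is an addition that the change above does not make.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Return the number of permutations requested, or 1 by default.
# getCount is a hypothetical helper; call it as getCount(@ARGV).
sub getCount {
    my @args = @_;
    my $count = 1;                        # default if no argument is given
    # Extra safety (an assumption): accept positive integers only.
    if (defined($args[0]) and $args[0] =~ /^[1-9][0-9]*\z/) {
        $count = $args[0];
    }
    return $count;
}

print getCount(), "\n";        # 1
print getCount("100"), "\n";   # 100
print getCount("ten"), "\n";   # 1
```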
sort
Sorting
Write a Perl program sort.pl that takes in strings (e.g. names) from STDIN, stores them in an
array, sorts them in alphabetical order, using the Perl sort(); function and prints them out to
the terminal.
Hints:
Declare a variable to use as an array index and initialize it with the value 0. Assign the entire
input string to the current array position $array[$index], then increment the index variable so
it points to the next available field. (The field of an array can hold integers, floats, strings,
other arrays, hashes, references to arrays, ...) Sort the array using the Perl sort(); function.
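One detail worth testing: by default sort(); compares strings by character code, so all uppercase letters sort before lowercase ones. The sketch below contrasts this with a case-insensitive comparison block. The names are illustrative data, not part of the task.

```perl
#!/usr/bin/perl
use warnings;
use strict;

my @names = ("bob", "Alice", "carol", "Dave");   # illustrative input

# Default sort: plain string comparison, uppercase before lowercase.
my @defaultOrder = sort @names;

# Case-insensitive sort: compare lowercased copies of each pair.
my @alphabetical = sort { lc($a) cmp lc($b) } @names;

print "@defaultOrder\n";   # Alice Dave bob carol
print "@alphabetical\n";   # Alice bob carol Dave
```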
Code:
#!/usr/bin/perl
use warnings;
use strict;
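my $index = 0;          # next free position in the array
my $currentInput;       # current line read from STDIN
my @arrayOfStrings;     # input strings in the order entered
my @sortedArray;        # input strings after sorting
while ($currentInput = <STDIN>) {
    chomp($currentInput);
    $arrayOfStrings[$index] = $currentInput;   # Store in array
    $index++;                                  # increment index
}
@sortedArray = sort(@arrayOfStrings);
foreach my $string (@sortedArray) {
    print("$string\n");
}
exit();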
fastaParser
Hints:
To parse out the definition line of a FASTA file, use substr(); to get the first character of
each line and test to see if it is ">". Read in each line of the FASTA file and store it as an
array, character by character (as with anagram.pl). Loop over the contents of the array and
retrieve the three-letter code for the amino acid, using a hash that maps one-letter amino acid
codes to three-letter amino acid codes.
Hint about the hash: it's similar in concept to the amino acid code hash that was used in one
of the programs written in class - think about which way the amino acid code mapping was
applied with that hash and try to apply the principles here.
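The steps in these hints can be sketched as follows. This is a minimal illustration under several assumptions: the subroutine name toThreeLetter is hypothetical, definition lines are returned unchanged, letters that are not in the hash are shown as 'Xaa', and the two sample lines are hard-coded instead of being read from a file.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Map one-letter to three-letter amino acid codes.
my %oneToThree = (
    'A' => 'Ala', 'C' => 'Cys', 'D' => 'Asp', 'E' => 'Glu', 'F' => 'Phe',
    'G' => 'Gly', 'H' => 'His', 'I' => 'Ile', 'K' => 'Lys', 'L' => 'Leu',
    'M' => 'Met', 'N' => 'Asn', 'P' => 'Pro', 'Q' => 'Gln', 'R' => 'Arg',
    'S' => 'Ser', 'T' => 'Thr', 'V' => 'Val', 'W' => 'Trp', 'Y' => 'Tyr',
);

# Convert one line of a FASTA file (hypothetical helper).
sub toThreeLetter {
    my ($line) = @_;
    chomp($line);
    # A definition line starts with ">" and is passed through as is.
    if (substr($line, 0, 1) eq ">") {
        return $line;
    }
    my @converted;
    foreach my $char (split(//, uc($line))) {
        if (exists($oneToThree{$char})) {
            push(@converted, $oneToThree{$char});
        } else {
            push(@converted, 'Xaa');   # assumption: placeholder for unknown letters
        }
    }
    return join(" ", @converted);
}

print toThreeLetter(">example protein\n"), "\n";   # >example protein
print toThreeLetter("MKV\n"), "\n";                # Met Lys Val
```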
Code:
#!/usr/bin/perl
use warnings;
use strict;
# Declare variables
my $line;
my $char;
my @oneLetterLine;   # line split into individual letters
my %oneToThree;
} # end for
} # end if
# end while
print("\n");
exit();
# =================================================================
# Subroutine to generate the hash that maps one-letter amino acid
# code to three-letter amino acid code
sub mapOneToThree {
$oneToThree{'A'} = 'Ala';
$oneToThree{'C'} = 'Cys';
$oneToThree{'D'} = 'Asp';
$oneToThree{'E'} = 'Glu';
$oneToThree{'F'} = 'Phe';
$oneToThree{'G'} = 'Gly';
$oneToThree{'H'} = 'His';
$oneToThree{'I'} = 'Ile';
$oneToThree{'K'} = 'Lys';
$oneToThree{'L'} = 'Leu';
$oneToThree{'M'} = 'Met';
$oneToThree{'N'} = 'Asn';
$oneToThree{'P'} = 'Pro';
$oneToThree{'Q'} = 'Gln';
$oneToThree{'R'} = 'Arg';
$oneToThree{'S'} = 'Ser';
$oneToThree{'T'} = 'Thr';
$oneToThree{'V'} = 'Val';
$oneToThree{'W'} = 'Trp';
$oneToThree{'Y'} = 'Tyr';
} # end sub
factorial
Hints:
Remember to think about all types of outcomes when designing your conditions in if/else
statements: the factorial of a negative number is undefined, and both 0! and 1! are equal to 1.
Use die(); rather than exit(); to indicate that an unexpected input has been entered that the
program cannot handle. Both cause the program to terminate, but die(); allows you to print an
error message on program exit, e.g. die("Negative factorial is undefined.");. Use a for loop to
multiply out the factorial of the input number, and use a variable to store the value of the
factorial during intermediate steps in calculation.
Code:
#!/usr/bin/perl
use warnings;
use strict;
my $number = <STDIN>;
chomp($number);
print(fact($number),"\n");
exit();
# ========================================================
sub fact {
my ($n) = @_;
my $factorial = 1;
if ($n < 0) {
die("panic: fact($n) negative factorial is undefined. ");
} elsif ($n == 0 or $n == 1) {
return 1;
} else {
for (my $i = 2; $i <= $n; $i++) {
$factorial = $factorial * $i;
}
} # end if
return $factorial;
} # end sub
factorial(recursive)
(OPTIONAL) ...the second way is to use a subroutine recursively to yield the factorial of a
number. Try programming it this way as well.
Hints:
Recursion means a function calls itself. Such a subroutine or program needs defined base
cases, for which the subroutine can return a value without having to call itself again
(allowing the program or subroutine to terminate, otherwise it would just go deeper, and
deeper...). The base cases for factRecurse.pl are exactly the same as for factorial.pl: a negative
factorial should return an error, and both 0! and 1! should return 1. In place of the for loop
used in factorial.pl, each recursion of the subroutine in factRecurse.pl performs one small
step (the small step that would be performed with each iteration of the for loop) and then
applies it to the next subroutine call (e.g. $resultOfSomeStep + subRoutine($currentCall - 1)).
Code:
#!/usr/bin/perl
use warnings;
use strict;
my $number = <STDIN>;
chomp($number);
print(fact($number),"\n");
exit();
# ========================================================
sub fact {
my ($n) = @_;
if ($n < 0) {
die("panic: fact($n) negative factorial is undefined. ");
} elsif ($n == 0 or $n == 1) {
return(1);
} else {
return( $n * fact($n-1) ); # recursive: subroutine calls itself
}
} # end sub