PERL Programming Basic
Presented By
Viswanatha Yarasi
viswanatha_yarasi@satyam.com
(9849743894)
PERL Programming
• Objectives
– Introduction to PERL Programming.
– Where to get perl.
– Writing a perl script
– How to execute the perl interpreter.
– Variables
• Scalars
• Arrays
• Hashes
Objectives Contd.
– Using strict.
– Built in help – perldoc.
– Conditional and looping statements
– Built in functions.
– Regular expressions.
– File/Directory handling.
– Input/Output
– Functions
What Is Perl
– Practical Extraction and Report Language
– Pathologically Eclectic Rubbish Lister?
– Perl is a High-level Scripting language
– Released in 1987 by Larry Wall
– Faster than sh or csh, slower than C
– More powerful than C, and easier to use
– No need for sed, awk, tr, wc, cut, …
Larry Wall
Larry Wall Quotes
• List processing
• Database access
• System language
What Is Perl Bad For?
– Compute-intensive applications (use C)
– Hardware interfacing (device drivers…)
Where To Get Perl
– Latest release is 5.8.5
– Most used is 5.6.1
– Download from http://www.cpan.org/ports/
for the OS being used
– Easy installation of Perl on Windows from
• http://www.cygwin.com
– For Linux/Windows/Solaris
• http://www.activestate.com/Products/ActivePerl/
• http://www.perl.com/download.csp
Perl is reasonably well documented!
• Programming Perl
– Wall & Schwartz; O’Reilly/Nutshell
– the “camel book”
• Programming Perl
– Wall, Christiansen, & Schwartz; O’Reilly
– the other camel book
• www-cgi.cs.cmu/cgi-bin/perl-man
– html-based manual
Perl is an interpreted language
• Program is text file
• Perl loads it, compiles into internal form
• Executes the intermediate code
Perl scripts
– Writing a perl script
#!/usr/bin/perl -w
Statements(;)
Comments(#)
while(defined($_ = <STDIN>)) {
    chomp($_);
    # other operations with $_ here
}
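A minimal sketch of the same read loop, assuming an in-memory filehandle stands in for STDIN so the example is self-contained. The helper name read_lines is hypothetical. Perl applies the defined() test implicitly when a lone line read is assigned in a while condition, so the shorter form here should behave like the explicit one above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read every line from a filehandle and strip newlines.
sub read_lines {
    my ($fh) = @_;
    my @lines;
    # Same as while (defined(my $line = <$fh>)); Perl adds defined() itself.
    while (my $line = <$fh>) {
        chomp($line);
        push @lines, $line;
    }
    return @lines;
}

# Assumption: an in-memory filehandle replaces STDIN for the demonstration.
my $text = "first line\nsecond line\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
print "$_\n" foreach read_lines($fh);
close($fh);
```

To read real standard input instead, pass \*STDIN to the helper.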
IF/ELSE
• A control expression: IF the condition is true,
one statement block is executed, ELSE a
different statement block is executed (ifelse.pl).
if (control_expression is TRUE) {
    do this;
    and this;
}
else {
    do that;
    and that;
}
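The template above is pseudocode. A small runnable sketch of the same shape follows; the helper describe_length and the 5-base threshold are invented for illustration and do not come from ifelse.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: say whether a sequence has at least $min_length bases.
sub describe_length {
    my ($sequence, $min_length) = @_;
    if (length($sequence) >= $min_length) {
        return "long enough";
    }
    else {
        return "too short";
    }
}

print describe_length("AGCTAGCT", 5), "\n";   # prints "long enough"
print describe_length("AGC", 5), "\n";        # prints "too short"
```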
ELSIF
• if/else is great for yes/no decisions. If you want to test multiple
conditions you can combine else and if to make 'elsif' (elsif.pl).
if (condition 1 is TRUE) {
    do this;
}
elsif (condition 2 is TRUE) {
    do that;
}
elsif (condition 3 is TRUE) {
    do the other;
}
else { # all tests have failed
    do whatever;
}
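As a concrete version of the elsif chain, here is a sketch that sorts a single base into a category. The helper classify_base and its category labels are assumptions made for this example, not the contents of elsif.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: classify one nucleotide letter.
sub classify_base {
    my ($base) = @_;
    $base = uc($base);                       # accept lower-case input too
    if ($base eq 'A' or $base eq 'G') {
        return "purine";
    }
    elsif ($base eq 'C' or $base eq 'T') {
        return "pyrimidine";
    }
    elsif ($base eq 'U') {
        return "pyrimidine (RNA only)";
    }
    else {                                   # all tests have failed
        return "unknown";
    }
}

print classify_base($_), "\n" foreach ('a', 'T', 'U', 'x');
```

Only the first branch whose condition is true runs; the else block is reached only when every test fails.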
WHILE
• Let's say you want to do a series of actions
whilst a certain condition is true (while.pl):
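The code for this slide did not survive, so the following is only a sketch of a while loop, not the original while.pl. The helper name countdown is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect the numbers from $count down to 1.
sub countdown {
    my ($count) = @_;
    my @seen;
    while ($count > 0) {      # the block repeats whilst the condition is true
        push @seen, $count;
        $count--;             # without this the loop would never end
    }
    return @seen;
}

print join(" ", countdown(3)), "\n";   # prints "3 2 1"
```

The condition is checked before every pass, so countdown(0) never enters the block and returns an empty list.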
• If you write foreach (@list) without naming a loop variable, Perl
assigns each element in turn to the scalar $_ by default.
foreach $_ (@list) {
    do this;
    do that;
    do the_other; #until no more $_'s
}
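To make the default-variable behaviour concrete, the sketch below writes the same loop twice, once naming $_ and once leaving it implicit. Both helper names and the sample list are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: loop variable $_ is named explicitly.
sub lengths_explicit {
    my @out;
    foreach $_ (@_) {
        push @out, length($_);
    }
    return @out;
}

# Hypothetical helper: no loop variable given, so Perl uses $_,
# and length with no argument also works on $_.
sub lengths_implicit {
    my @out;
    foreach (@_) {
        push @out, length;
    }
    return @out;
}

my @list = ('hickory', 'dickory', 'dock');
print join(" ", lengths_explicit(@list)), "\n";   # prints "7 7 4"
print join(" ", lengths_implicit(@list)), "\n";   # prints "7 7 4"
```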
FOREACH(4)
%hash = (Gabor => 123, Peter => 78, Adam => 10);
#EXAMPLE
%h = (Gabor => 123, Peter => 78, Adam => 10);
Output:
Adam 10
Gabor 123
Peter 78
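The loop that produced this output is missing from the slide. One way to get the same three lines, offered as a sketch rather than the original code, is to iterate over the sorted keys; the helper name sorted_pairs is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return "key value" strings ordered by key.
sub sorted_pairs {
    my (%hash) = @_;
    my @lines;
    foreach my $name (sort keys %hash) {   # keys come back unordered, so sort them
        push @lines, "$name $hash{$name}";
    }
    return @lines;
}

my %h = (Gabor => 123, Peter => 78, Adam => 10);
print "$_\n" foreach sorted_pairs(%h);
# prints:
# Adam 10
# Gabor 123
# Peter 78
```

Without sort, the order of keys is not predictable, so the alphabetical listing above would not be guaranteed.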
Array Functions
• pop – remove from right hand side
• push – add to right hand side
• shift – remove from left hand side
• unshift – add to left hand side
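The four functions pair up naturally, which the sketch below tries to show by rotating an array in each direction. The helpers rotate_right and rotate_left are invented for illustration; the sample values are the ones used in Script4.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: move the last element to the front (pop + unshift).
sub rotate_right {
    my @items = @_;
    unshift @items, pop @items if @items;
    return @items;
}

# Hypothetical helper: move the first element to the end (shift + push).
sub rotate_left {
    my @items = @_;
    push @items, shift @items if @items;
    return @items;
}

my @an_array = (8, 54, 78, 2, 5, 6, 4);
print join(" ", rotate_right(@an_array)), "\n";   # prints "4 8 54 78 2 5 6"
print join(" ", rotate_left(@an_array)), "\n";    # prints "54 78 2 5 6 4 8"
```

Each helper works on a copy of its arguments, so @an_array itself is left unchanged.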
Array Functions
• Script4 - pop and push
#create an array (indices 0 to 6)
@an_array = (8,54,78,2,5,6,4);
• POP into variable (variable=4), leaving 8 54 78 2 5 6
OUTPUT:
applepeach555
String Concatenation
• Strings can be concatenated with '.'
$string1 = "This";
$string2 = " is";
$string3 = " easy";
$string4 = " so far";
print $string1.$string2.$string3.$string4;
# prints This is easy so far
Changing Case on Strings
• Applications
– when comparing two strings, compare case-insensitively
• force the case, then compare the strings.
– keyword recognition in configuration files
– usernames, email addrs, …
• wrong: if ($email eq "pab\@sedona.intel.com")
• better: $email =~ tr/A-Z/a-z/;
if ($email eq "pab\@sedona.intel.com")
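The "force the case, then compare" idea can also be written with lc, which leaves the original string untouched. This is a sketch under that assumption; the helper name same_email is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: compare two addresses while ignoring case.
sub same_email {
    my ($left, $right) = @_;
    return lc($left) eq lc($right);   # lower-case copies; originals are not modified
}

# Single quotes keep Perl from treating @Sedona as an array.
my $typed = 'PAB@Sedona.Intel.COM';
print same_email($typed, 'pab@sedona.intel.com') ? "match\n" : "no match\n";   # prints "match"
print "$typed\n";   # still prints the address exactly as the user typed it
```

Unlike tr/A-Z/a-z/ applied to $email, this keeps the user's spelling available for output.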
Changing Case on Strings
• Well written programs observe this rule:
– If humans might try it,
your program ought to understand it.
• ignore case where it should be ignored
• respect case where it should be respected
– output to the user
– rewriting config files
Don’t program dangerous!
• $variable
• @variable
• %variable
OUTPUT:
• In sub printMsg, $my_var:1
$dna_strand = "AGCTATCGATGCTTTAAACGGCTATCGAGTTTTTTTT";
print "My DNA strand is: $dna_strand\n";
print "If we split this using TTTAAA we get the following fragments:\n";
@dna_fragments = split(/TTTAAA/,$dna_strand);
foreach $fragment (@dna_fragments) {
    print "$fragment\n";
}
JOIN
• join is the conceptual opposite of split. Let's think of it
in terms of a DNA ligation with a linker sequence (join.pl):
my ($ligated_fragments);
my (@dna_fragments);
@dna_fragments = ("AGGCTT", "AGCCCAAATT", "AGCCCCATTA");
$ligated_fragments = join("aaattt", @dna_fragments);
print "The fragments have been ligated with an aaattt linker:\n";
print "$ligated_fragments\n";
LENGTH
• length - finds the length of a scalar (or a bit of DNA!)
(length.pl).
#!/usr/bin/perl -w
use strict;
my ($genome, $genome_length);
$genome = "AGATCATCGATCGATCGATCAGCATTCAGCTACTAGCTAGCTGGGGGGATCATCTATC";
$genome_length = length($genome);
print "My genome sequence is:\n$genome\nand is $genome_length bases long\n";
SUBSTR
• substr extracts a specified part of a scalar (substr.pl).
• substr($scalar, $start_position, $length)
#!/usr/bin/perl -w
use strict;
my ($dna_sequence, $substring);
$dna_sequence = "AGCTATACGACTAGTCTGATCGATCATCGATGCTGA";
$substring = substr ($dna_sequence, 0, 5);
print "The first 5 bases of $dna_sequence are:\n$substring\n";
UC/LC
• uc (uppercase) and lc (lowercase) simply change the
case of a scalar (uclc.pl).
#!/usr/bin/perl -w
use strict;
my ($mixed_case, $uppercase, $lowercase);
$mixed_case = "AgCtAAGggGTCaCAcAAAAaCCCcATTTgcCC";
$uppercase = uc ($mixed_case);
$lowercase = lc ($mixed_case);
print "From $mixed_case we get:\n";
print "UPPERCASE: $uppercase\n";
print "lowercase: $lowercase\n";
S/// - SUBSTITUTE
• This is proper Perl :-)
• The obvious difference between DNA and RNA
is the replacement of T with U.
• Let's mimic the transcription of DNA to RNA with
our newfound Perl skills.
• We can use the substitution operator 's'.
• This can convert one element in a scalar to
another element.
• This takes the form s/[one thing]/[for another
thing]/
• Let's see it in action (transcription.pl).
S/// - SUBSTITUTE (2)
#!/usr/bin/perl -w
use strict;
my ($dna_molecule, $rna_molecule);
$dna_molecule = "AGCTATCGATGCTTTCGATCACCGGCTATCGAGTTTTTTTT";
print "My DNA molecule is $dna_molecule\n";
$rna_molecule = $dna_molecule;
$rna_molecule =~ s/T/U/g;
print "My RNA molecule is $rna_molecule\n";
exit();
=~
• What is that crazy =~ sign?
• This is the binding operator, "=~".
• It lets you specify the target of a pattern matching
operation (FYI the /[whatever]/ bit is a "matching
operator").
• By default matching operators act on $_, i.e. if you just
saw s/T/U/g; in a program on its own, it is acting on $_.
• We have $rna_molecule =~ s/T/U/g; which means
perform the s/T/U/g on $rna_molecule. We have
reassigned the target of the matching operator from $_ to
$rna_molecule.
• If you want the original scalar to remain unchanged but still
need an altered version, assign it to another scalar first and
substitute on the copy.
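The two cases described above, binding s/// to a named scalar and letting it fall back to $_, can be sketched side by side. Both helper names are hypothetical; each works on a copy so the caller's DNA string is not modified.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy first, then bind the substitution to the copy.
sub transcribe {
    my ($dna) = @_;
    (my $rna = $dna) =~ s/T/U/g;
    return $rna;
}

# Hypothetical helper: no =~ at all, so s/// acts on $_.
sub transcribe_default {
    local $_ = shift;   # private copy of the argument in $_
    s/T/U/g;
    return $_;
}

my $dna_molecule = "GATTACA";
print transcribe($dna_molecule), "\n";           # prints "GAUUACA"
print transcribe_default($dna_molecule), "\n";   # prints "GAUUACA"
print "$dna_molecule\n";                         # prints "GATTACA" (unchanged)
```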
REVERSE and TR
• So substitution allows you to change one thing into
another. This is great – we could use the same
technique to get the complement of a DNA strand!
• All we have to do is change all the A's to T's, all the G's
to C's, all the T's to A's and all the C's to G's.
• Then if we reverse it we get the reverse complement! Or
do we? See wrong_revcom.pl.
• I guess the game is given away in the filename that
there's something up with this.
• Look closely.
• Think about what each line is going to do to the scalar
$DNA.
• Tell me why the code is wrong.
REVERSE and TR (2)
#!/usr/bin/perl -w
$DNA = "AAAAGGGGCCCCTTTAGCTAGCT";
$DNA_UNTOUCHED = $DNA;
print "After no substitutions: DNA is : $DNA\n";
#substitute all the A's to T's
$DNA =~ s/A/T/g;
print "After A-T substitution: DNA is : $DNA\n";
#substitute all the G's to C's
$DNA =~ s/G/C/g;
print "After G-C substitution: DNA is : $DNA\n";
#substitute all the C's to G's
$DNA =~ s/C/G/g;
print "After C-G substitution: DNA is : $DNA\n";
#substitute all the T's to A's
$DNA =~ s/T/A/g;
print "After T-A substitution: DNA is : $DNA\n";
$DNA = reverse ($DNA);
print "$DNA_UNTOUCHED reverse complemented is:\n$DNA\n";
REVERSE and TR (3)
The answer
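The answer itself is missing from this slide, so what follows is a reconstruction from the slide title rather than the original text. Chained s///g calls undo each other: once every A has become T, the later T-to-A pass turns them all back (along with the original T's), and the same happens with G and C. The tr operator translates every character in a single pass, which avoids the problem. The helper name reverse_complement is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: complement with tr in one pass, then reverse.
sub reverse_complement {
    my ($dna) = @_;
    (my $complement = $dna) =~ tr/ACGT/TGCA/;   # A<->T and C<->G at the same time
    return scalar reverse $complement;          # scalar context reverses the string
}

my $DNA = "AAAAGGGGCCCCTTTAGCTAGCT";
print "$DNA reverse complemented is:\n", reverse_complement($DNA), "\n";
# second line printed: AGCTAGCTAAAGGGGCCCCTTTT
```

The scalar keyword matters here: in list context reverse would reverse a one-element list and return the string unchanged.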
opendir(DIR,"sStartDirectory/$cdir");
@f = readdir(DIR);
closedir(DIR);
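A slightly more defensive sketch of the same opendir/readdir/closedir sequence, assuming a lexical directory handle and an error check are acceptable. The helper name list_directory is hypothetical, and the current directory is used only as a stand-in path.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the sorted entries of a directory,
# leaving out the "." and ".." entries that readdir always reports.
sub list_directory {
    my ($path) = @_;
    opendir(my $dir_handle, $path) or die "Cannot open directory $path: $!";
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dir_handle);
    closedir($dir_handle);
    my @sorted = sort @entries;
    return @sorted;
}

# Assumption: "." (the current directory) is readable.
print "$_\n" foreach list_directory(".");
```

Checking the return value of opendir means a mistyped path stops the script with a message instead of silently producing an empty list.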
if ($_ =~ "hello") {
print "\nhello found\n\n";
}
}
foreach ('hickory','dickory','doc') {
print;
}
OUTPUT:
print "1\n";
what is this?
syntax error at (eval 1) line 3, at EOF
but we run fine
1
print "1\n";