Perl Scripting
1. Writing Perl Scripts
Boyce Thompson Institute for Plant Research
Tower Road
Ithaca, New York 14853-1801
U.S.A.
by
Aureliano Bombarely Gomez
2. Writing Perl Scripts:
1. Four mandatory lines.
2. Useful modules I: Files.
3. Useful modules II: Options.
4. Documentation and being verbose.
5. Exercise: Assembly stats.
4. 1. Four mandatory lines.
1.LINE: #!/usr/bin/perl
Where? On the first line of the script.
Why? It tells the operating system which program
to use to execute the script.
2.LINE: use warnings;
Where? Before declaring modules and variables
(the sooner the better).
Why? It prints warnings about questionable code, both at
compile time and at run time.
3.LINE: use strict;
Where? Before declaring modules and variables
(the sooner the better).
Why? It rejects unsafe constructs such as undeclared variables,
and it will not let a script with these errors run.
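5. 1. Four mandatory lines.
4.LINE: 1;
Where? At the end of the script.
Why? It makes the file end with a true value. Perl requires this
for modules loaded with "use" or "require"; in a standalone script
it is optional and simply marks the end of the code.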
#!/usr/bin/perl
use strict;
use warnings;
###########
## MY CODE
###########
1;
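To illustrate what "use strict" refuses to run, here is a minimal sketch. The helper name runs_under_strict is hypothetical (it is not part of Perl); it compiles a snippet with a string eval and reports whether strict accepted it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

## Hypothetical helper: try to compile and run a snippet under "use strict".
## Returns 1 if it compiled and ran, 0 otherwise (the error is left in $@).
sub runs_under_strict {
    my ($code) = @_;
    my $ok = eval "use strict; use warnings; $code; 1";
    return $ok ? 1 : 0;
}

## A variable declared with "my" is accepted: prints 1.
print "declared variable:   ", runs_under_strict('my $count = 1'), "\n";

## An undeclared variable is rejected at compile time: prints 0.
print "undeclared variable: ", runs_under_strict('$count = 1'), "\n";

## $@ holds the message that strict produced for the rejected snippet.
print "strict said: $@";
```

In a real script the same error simply stops the script before any line is executed, which is the behaviour described above.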
7. 2. Useful modules I: Files
JUST A REMINDER: How to open/read/write/close files.
1. OPEN FUNCTION.
open (FILEHANDLE, MODE, REFERENCE);
FILEHANDLE: an undefined scalar variable (open autovivifies it into a filehandle).
MODE: read, input only: <
write, output only (truncates the file): >
append to a file: >>
read/write update access: +<
write/read update access (truncates the file first): +>
read/append update access: +>>
REFERENCE: Filename or reference to open
8. 2. Useful modules I: Files
JUST A REMINDER: How to open/read/write/close files.
1. OPEN FUNCTION.
open (my $ifh, '<', $input_filename);
open (my $ofh, '>', $output_filename);
SUGGESTION: use "use autodie;" instead of "or die("my error")":
open (my $ifh, '<', $input_filename)
    or die("ERROR OPENING FILE: $!");
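The autodie suggestion can be sketched as follows. The helper copy_lines and the file names are hypothetical; the point is only that, with "use autodie" in scope, a failed open or close raises an exception with a descriptive message, so no "or die" is needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie;    ## open and close now die with a descriptive message on failure

## Hypothetical helper: copy a file line by line and return the line count.
sub copy_lines {
    my ($input_filename, $output_filename) = @_;
    open(my $ifh, '<', $input_filename);     ## no "or die" needed
    open(my $ofh, '>', $output_filename);
    my $n = 0;
    while (my $line = <$ifh>) {
        print $ofh $line;
        $n++;
    }
    close($ifh);
    close($ofh);
    return $n;
}

## A missing input file raises an exception, which eval can catch.
my $copied = eval { copy_lines('no_such_input_file.txt', 'copy.txt') };
print STDERR "Caught: $@" unless defined $copied;
```

Without the eval, the script would stop at the failing open and print the same message.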
9. 2. Useful modules I: Files
JUST A REMINDER: How to open/read/write/close files.
2. READING OPENED FILES.
while(<FILEHANDLE>) {
    ## BLOCK USING $_ as LINE (don't forget chomp)
}
SUGGESTION: "Know the status of the file" (this loads the whole file into memory)
my @filelines = <FILEHANDLE>;
my $L = scalar(@filelines);
my $l = 0;
foreach my $line (@filelines) {
    $l++;
    ## \r returns to the start of the line, so the counter is overwritten in place
    print STDERR "Reading line $l of $L lines \r";
}
10. 2. Useful modules I: Files
JUST A REMINDER: How to open/read/write/close files.
3. WRITING TO OPENED FILES.
print $ofh "This is printed to the file";
4. CLOSE FILES.
close($ofh);
11. 2. Useful modules I: Files
a) File::Basename;
Parse file paths into directory, filename and suffix.
use File::Basename;
## List context: name, directory path and matched suffix
my ($name, $path, $suffix) = fileparse($fullname, @suffixlist);
## Scalar context: only the name (no second "my", $name is already declared)
$name = fileparse($fullname, @suffixlist);
my $basename = basename($fullname, @suffixlist);
my $dirname = dirname($fullname);
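As a short worked example of these calls, the sketch below splits a made-up path. The path, the suffix list and the helper name split_path are assumptions for illustration, and the values in the comments assume Unix-style paths.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

## Hypothetical helper: thin wrapper around fileparse, used only in this example.
sub split_path {
    my ($fullname, @suffixlist) = @_;
    my ($name, $path, $suffix) = fileparse($fullname, @suffixlist);
    return ($name, $path, $suffix);
}

my $fullname   = '/data/assemblies/contigs.fasta';    ## made-up path
my @suffixlist = ('.fasta', '.fa');

my ($name, $path, $suffix) = split_path($fullname, @suffixlist);
## On Unix: $name is 'contigs', $path is '/data/assemblies/', $suffix is '.fasta'
print "name=$name path=$path suffix=$suffix\n";

## basename removes the directory and the suffix; dirname removes the filename.
print "basename=", basename($fullname, @suffixlist), "\n";    ## contigs
print "dirname=",  dirname($fullname), "\n";                  ## /data/assemblies
```

A typical use is building an output filename next to the input, for example "$path$name.stats".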
12. 2. Useful modules I: Files
b) File::Spec;
Portable operations on filenames.
use File::Spec;
my $curdir = File::Spec->curdir();
my $tmpdir = File::Spec->tmpdir();
my $path = File::Spec->catfile($curdir, $filename);
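A minimal sketch of how these methods combine, assuming a made-up filename; the helper name path_in_tmpdir is hypothetical. The actual strings depend on the operating system, so none are shown as expected output.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

## Hypothetical helper: build a path for a file inside the system temp directory.
sub path_in_tmpdir {
    my ($filename) = @_;
    return File::Spec->catfile(File::Spec->tmpdir(), $filename);
}

my $filename = 'contigs.fasta';    ## made-up filename

## Relative path in the current directory, then the same path made absolute.
my $relative = File::Spec->catfile(File::Spec->curdir(), $filename);
my $absolute = File::Spec->rel2abs($relative);

print "relative: $relative\n";
print "absolute: $absolute\n";
print "temp:     ", path_in_tmpdir($filename), "\n";
```

Using catfile instead of joining strings with "/" keeps the script portable to systems with a different directory separator.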
14. 3. Useful modules II: Options
Usual way to pass options: using @ARGV
user@comp$ myscript.pl argument1 argument2
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
my ($arg1, $arg2) = @ARGV;
1;
15. 3. Useful modules II: Options
Usual way to pass options: using @ARGV
PROBLEM:
With multiple arguments, the order can be confusing.
Mandatory arguments are difficult to check !!!
SOLUTION:
Use the modules Getopt::Std or Getopt::Long
16. 3. Useful modules II: Options
Getopt::Std;
Process single-character arguments from the command line
user@comp$ myscript.pl -i argument1 -o argument2 -V -H
use Getopt::Std;
our ($opt_i, $opt_o, $opt_V, $opt_H);
getopts('i:o:VH');
## i: and o: expect something after the switch.
my $input = $opt_i || die("ERROR: -i <input> was not supplied.");
my $output = $opt_o || die("ERROR: -o <output> was not supplied.");
## V and H don't expect anything after the switch.
if ($opt_H) {
    print $help;    ## $help is assumed to hold the help text
}
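Getopt::Long, the other module named as a solution, handles the same switches plus long names. The following is only a sketch: the long option names (--input, --output, --verbose, --help) and the helper parse_options are assumptions chosen to mirror the -i, -o, -V and -H switches above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

## Hypothetical helper: parse a list of arguments into a hash of options.
sub parse_options {
    my @args = @_;
    my %opt = (verbose => 0, help => 0);
    GetOptionsFromArray(
        \@args,
        'input|i=s'  => \$opt{input},      ## =s : expects a string value
        'output|o=s' => \$opt{output},
        'verbose|V'  => \$opt{verbose},    ## no value: simple on/off switch
        'help|H'     => \$opt{help},
    ) or die("ERROR: could not parse the command line options.\n");
    return %opt;
}

my %opt = parse_options(@ARGV);
print STDERR "input file: $opt{input}\n" if defined $opt{input};
```

With this sketch, "myscript.pl -i in.fasta -o out.txt -V" and "myscript.pl --input in.fasta --output out.txt --verbose" would be parsed the same way; mandatory arguments can then be checked with defined().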
18. 4. Documentation and being verbose
Three types of documentation:
1) Document code with #.
GOOD: Useful for developers.
BAD: Inaccessible to users unless they open the script.
2) Document using perldoc.
GOOD: Clear and formatted information.
BAD: perldoc is not always installed on the system.
3) Document using an inside print function.
GOOD: Usually easy to access. Intuitive.
BAD: It increases the size of your script.
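For the second type, documentation is written as POD inside the script and read with "perldoc myscript.pl". The sketch below is an assumption of how that could look for a script with the switches used in these slides; it also uses pod2usage from the core Pod::Usage module to print the same text when -H is given.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Std;
use Pod::Usage;

our ($opt_H);
getopts('H');

## -verbose => 1 prints the SYNOPSIS and ARGUMENTS sections of the POD below.
pod2usage(-verbose => 1, -exitval => 0) if $opt_H;

=head1 NAME

myscript.pl - My program description.

=head1 SYNOPSIS

myscript.pl [-H] [-V] -i input

=head1 ARGUMENTS

=over 4

=item -i input file (mandatory)

=item -H print help

=item -V be verbose

=back

=cut
```

Perl skips everything between "=head1" and "=cut" when it runs the script, so the documentation can sit next to the code it describes.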
20. 4. Documentation and being verbose
Documenting through a function;
sub help {
    ## The closing EOF must be alone on its line, with no semicolon.
    print STDERR <<EOF;
$0:
Description:
    My program description.
Synopsis:
    myscript.pl [-H] [-V] -i <input>
Arguments:
    -i <input> input file (mandatory)
    -H <help> print Help.
    -V <verbose> be verbose
EOF
    exit(1);
}
21. 4. Documentation and being verbose
Calling help;
use Getopt::Std;
our ($opt_i, $opt_o, $opt_V, $opt_H);
getopts('i:o:VH');
## V and H don't expect anything after the switch.
## Check -H first, so help is shown even when -i and -o are missing.
if ($opt_H) {
    help();
}
## i: and o: expect something after the switch.
my $input = $opt_i || die("ERROR: -i <input> was not supplied.");
my $output = $opt_o || die("ERROR: -o <output> was not supplied.");
22. 4. Documentation and being verbose
Being verbose;
use Getopt::Std;
our ($opt_i, $opt_o, $opt_V, $opt_H);
getopts('i:o:VH');
## i: and o: expect something after the switch.
my $input = $opt_i || die("ERROR: -i <input> was not supplied.");
my $output = $opt_o || die("ERROR: -o <output> was not supplied.");
if ($opt_V) {
    my $date = `date`;
    chomp($date);
    print STDERR "Step 1 [$date]:\n\tParsing -i $input file.\n";
}
23. 4. Documentation and being verbose
Being verbose;
my @filelines = <FILEHANDLE>;
my $L = scalar(@filelines);
my $l = 0;
foreach my $line (@filelines) {
    $l++;
    if ($opt_V) {
        print STDERR "Reading line $l of $L lines \r";
    }
}
25. 5. Exercise: Assembly Stats
GOAL: Create a script to calculate:
1) Number of sequences in a file.
2) Total bp in a file.
3) Longest sequence.
4) Shortest sequence.
5) Average length and SD.
6) N25, N50, N75, N90, N95 (length and indexes)
26. 5. Exercise: Assembly Stats
6) N25, N50, N75, N90, N95 (length and indexes)
Just a reminder:
Order the sequences by decreasing length and add them up
until they contain at least 50% of the size of the file (in bp).
N50 Length is the length of the shortest sequence in that set.
N50 Index is the number of sequences in that set.
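The N50 definition can be sketched on a list of sequence lengths. This is one possible starting point for the exercise, not a full solution: the helper name nx_stats and the lengths are made up, and reading the sequences from a file is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

## Hypothetical helper: returns (length, index) for a fraction such as 0.50.
sub nx_stats {
    my ($fraction, @lengths) = @_;
    my @sorted = sort { $b <=> $a } @lengths;    ## decreasing length
    my $total = 0;
    $total += $_ for @sorted;
    my ($cumulative, $index) = (0, 0);
    foreach my $len (@sorted) {
        $cumulative += $len;
        $index++;
        ## Stop at the first sequence that reaches the requested fraction.
        return ($len, $index) if $cumulative >= $fraction * $total;
    }
    return (0, 0);    ## empty input
}

my @lengths = (100, 80, 50, 40, 30);    ## made-up lengths, total = 300 bp
my ($n50_length, $n50_index) = nx_stats(0.50, @lengths);

## 50% of 300 bp is 150 bp: 100 + 80 = 180 >= 150, reached at the 2nd sequence.
print "N50 length: $n50_length, N50 index: $n50_index\n";    ## 80 and 2
```

The other values (N25, N75, N90, N95) follow by changing the fraction passed to the helper.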