Index of /examples/perl/examples/advanced/ex05-fancyio

Name               Last modified      Size
data_out/          2021-05-21 09:33   -
data_out2/         2021-05-21 09:33   -
fancyio.pl         2021-05-21 09:33   7.4K
fancyio_cmd.out    2021-05-24 11:06   4.2K

RCS Perl Example #5 - I/O concepts and more

Directory Structure


Notes

This example shows how to handle many basic tasks in Perl, such as file I/O, passing command-line options, and writing functions. It also shows how to load the SCC Perl module instead of the system default Perl, in order to use some extra modules that are not available in the system Perl.

There are six options for running this example. The steps below go through them in order.
STEP 1: Run the following commands first:
[advanced]$ cd ex05-fancyio/
[ex05-fancyio]$ module purge # start from a clean environment
[ex05-fancyio]$ perl fancyio.pl 
Can't locate Switch.pm in @INC (@INC contains: /usr2/collab/yshen16/lib/perl /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at fancyio.pl line 14.
BEGIN failed--compilation aborted at fancyio.pl line 14.
[ex05-fancyio]$ 

Explanation: The above command reports an error. This is because it uses the system Perl, which does not have the 'Switch' module (Switch.pm) installed, while the script loads that module at line 14.
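
As a side note, a script can test whether a module is available at run time instead of failing at compile time. The sketch below is not taken from fancyio.pl. The helper name 'module_available' is made up for illustration, and it assumes the module is given as a plain 'Module::Name' string.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded, 0 otherwise.
# The name is converted to a file path (Foo::Bar -> Foo/Bar.pm) so that
# require can search @INC for it.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# List::Util ships with Perl; Switch may or may not be installed.
for my $module ('List::Util', 'Switch') {
    printf "%-12s %s\n", $module,
        module_available($module) ? 'found' : 'not found in @INC';
}
```

Running this before and after 'module load perl/5.28.1' would be one way to see which Perl provides Switch.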


STEP 2: Now run the following:
[ex05-fancyio]$ module load perl/5.28.1
[ex05-fancyio]$ perl fancyio.pl 
Usage:
    ex05_fancy_io_w_cmdopt.pl 
        --example n [--in_dir xxx] [--in_file xxx] [--out_dir xxx] [--out_file xxx]

    where n=1 - use Perl default I/O setting, without explicitly specify anything;
          n=2 - introduce STDOUT, Perl standard output device. 
          n=3 - introduce STDERR, Perl standard error output device.
          n=4 - introduce die, exit more elegantly when encounter error.
          n=5 - introduce STDIN, Perl standard input device, which is keyboard generally speaking. 
          n=6 - introduce file I/O for input and output
          n=other - invalid option, report error
[ex05-fancyio]$ 

Explanation: The above command no longer reports an error. Instead, it displays the help message coded in the script. The original error is gone because 'module load perl/5.28.1' loads the Perl 5.28.1 module on SCC, which has the Switch module installed. The help message appears because we didn't supply a value for the mandatory option '--example'. Printing the usage is a common way to handle an invalid invocation of a script.
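
The pattern described above, a mandatory option with a usage message when it is missing, can be sketched with the core Getopt::Long module. This is only an illustration: it is an assumption that fancyio.pl uses Getopt::Long, and the sub names 'usage' and 'parse_options' are made up here.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical usage text, shortened from the help message shown above.
sub usage {
    return "Usage:\n    $0 --example n [--in_dir xxx] [--in_file xxx]"
         . " [--out_dir xxx] [--out_file xxx]\n";
}

# Returns a hash reference of parsed options, or undef when parsing fails
# or the mandatory --example option is missing.
sub parse_options {
    my @args = @_;
    my %opt;
    GetOptionsFromArray(\@args, \%opt,
        'example=i', 'in_dir=s', 'in_file=s', 'out_dir=s', 'out_file=s')
        or return;
    return unless defined $opt{example};
    return \%opt;
}

my $opt = parse_options(@ARGV);
if ($opt) {
    print "Running example $opt->{example}\n";
} else {
    print usage();    # invalid or missing options: show the help text
}
```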


STEP 3: Now run the script with option 1:

[ex05-fancyio]$ perl fancyio.pl --example 1 
Now get into Example 1:
entrez_gene_symbol,entrez_gene_id
FANCA,2175
FANCB,2187
FANCC,2176
FANCD1,675
FANCD2,2177
FANCE,2178
FANCF,2188
FANCG,2189
FANCI,55215
FANCJ,83990
FANCL,55120
FANCM,57697
FANCN,79728
FANCO,5889
FANCP,84464
Exit from Example 1.
[ex05-fancyio]$ 


Explanation: The above command now does something for us: it echoes the content of the file ../data_in/fanconi_genes.csv to the screen. We don't need to specify the input file because this file is set as the default input in the script.


STEP 4: Now run the script with option 2, which explicitly specifies STDOUT:

[ex05-fancyio]$ perl fancyio.pl --example 2
Now get into Example 2:
entrez_gene_symbol,entrez_gene_id
FANCA,2175
FANCB,2187
FANCC,2176
FANCD1,675
FANCD2,2177
FANCE,2178
FANCF,2188
FANCG,2189
FANCI,55215
FANCJ,83990
FANCL,55120
FANCM,57697
FANCN,79728
FANCO,5889
FANCP,84464
Exit from Example 2.
[ex05-fancyio]$ 


Explanation: The above command behaves exactly like option 1. The two are indeed equivalent, because the 'print' function prints to STDOUT by default, so naming STDOUT explicitly makes no difference.
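
A minimal illustration of this equivalence, not taken from fancyio.pl (the sub name 'print_twice' is hypothetical). Both statements write to standard output, provided the script has not changed the default output handle with 'select'.

```perl
use strict;
use warnings;

# Hypothetical helper: prints the same line twice, once with and once
# without naming the filehandle.
sub print_twice {
    my ($line) = @_;
    print $line;           # implicit: the selected handle, STDOUT by default
    print STDOUT $line;    # explicit STDOUT
    return;
}

print_twice("FANCA,2175\n");
```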


STEP 5: Now run the script with option 3 - output to STDERR instead:

[ex05-fancyio]$ perl fancyio.pl --example 3
Now get into Example 3:
entrez_gene_symbol,entrez_gene_id
FANCA,2175
FANCB,2187
FANCC,2176
FANCD1,675
FANCD2,2177
FANCE,2178
FANCF,2188
FANCG,2189
FANCI,55215
FANCJ,83990
FANCL,55120
FANCM,57697
FANCN,79728
FANCO,5889
FANCP,84464
Exit from Example 3.
[ex05-fancyio]$ 


Explanation: The above command behaves the same as option 2, right? It prints the input content to the screen. Yes and no! 'Yes' when seen from the outside, 'no' in the underlying mechanism: even though the content shown on screen is the same, it reaches the screen through a different output stream. The next step explains this further.


STEP 6: Now run the script with option 2 and option 3 in turn, redirecting the output to files:

[ex05-fancyio]$ perl fancyio.pl --example 2 1> data_out/option2.out 2>data_out/option2.err
[ex05-fancyio]$ perl fancyio.pl --example 3 1> data_out/option3.out 2>data_out/option3.err
[ex05-fancyio]$ cd data_out
[data_out]$ ls -l *.err *.out
total 1
-rw-r--r-- 1 yshen16 ysapp   0 May 20 12:02 option2.err
-rw-r--r-- 1 yshen16 ysapp 251 May 20 12:02 option2.out
-rw-r--r-- 1 yshen16 ysapp 251 May 20 12:02 option3.err
-rw-r--r-- 1 yshen16 ysapp   0 May 20 12:02 option3.out
[data_out]$ 


Explanation: Did you see the difference? Now that the output is redirected, nothing appears on screen. But in the data_out directory you will see four files: option2.out, option2.err, option3.out and option3.err. They mirror each other: option2.err and option3.out are empty, while option2.out and option3.err contain the script output.

Why is that? Because in option 2 we ask the script to print to STDOUT, and in option 3 we ask it to print to STDERR, and the script faithfully does so!

By the way, in Linux '1' stands for STDOUT and '2' stands for STDERR; these numbers are called 'file descriptors'.
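
The two streams can be reproduced with a very short script. This is a sketch rather than the code in fancyio.pl, and the sub name 'report' is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: writes one line to each stream so that shell
# redirection (1> and 2>) can separate them.
sub report {
    print STDOUT "this line goes to STDOUT (file descriptor 1)\n";
    print STDERR "this line goes to STDERR (file descriptor 2)\n";
    return;
}

report();
```

If this were saved as, say, streams.pl (a made-up name), then 'perl streams.pl 1> out.txt 2> err.txt' should leave one line in each file.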


STEP 7: Now run the script with option 4:

[data_out]$ cd ..
[ex05-fancyio]$ perl fancyio.pl --example 4
Now get into Example 4:
Sorry, can't open file ../data_in/fanconi_genes.csv1, No such file or directory at fancyio.pl line 147.
[ex05-fancyio]$ 

Explanation: The script prints an error message stating that it can't open the file '../data_in/fanconi_genes.csv1'. Indeed, there is no '../data_in/fanconi_genes.csv1'; we only have '../data_in/fanconi_genes.csv'. The script deliberately appends '1' to the file name in order to trigger the error. Thanks to the 'die' function, the program reports this fatal error and then exits cleanly.
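
The usual 'open ... or die' idiom behind this kind of message looks roughly like the sketch below. It is not the code from fancyio.pl; the sub name 'first_line' and the file name are hypothetical. Because the message does not end with a newline, Perl appends the script name and line number, as in the output above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first line of a file, or dies with a
# message that includes the system error text held in $!.
sub first_line {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or die "Sorry, can't open file $path, $!";
    my $line = <$fh>;
    close $fh;
    return $line;
}

# eval catches the exception so this demo can report it and carry on;
# without eval, die would print the message to STDERR and exit non-zero.
my $line = eval { first_line('no_such_file.csv1') };
print "Caught: $@" unless defined $line;
```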


STEP 8: Now run the script with option 5:

[ex05-fancyio]$ perl fancyio.pl --example 5 < ../data_in/fanconi_genes.csv 
Now get into Example 5:
entrez_gene_symbol,entrez_gene_id
FANCA,2175
FANCB,2187
FANCC,2176
FANCD1,675
FANCD2,2177
FANCE,2178
FANCF,2188
FANCG,2189
FANCI,55215
FANCJ,83990
FANCL,55120
FANCM,57697
FANCN,79728
FANCO,5889
FANCP,84464
Exit from Example 5.
[ex05-fancyio]$ 

Explanation: Option 5 is designed to read its input from STDIN, which is normally the keyboard. In the above command we instead feed it the data file we want to process, '../data_in/fanconi_genes.csv', using the '<' redirection, which shows how STDIN works.
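
Reading line by line works the same way for STDIN as for any other filehandle. The sketch below is an illustration, not the code from fancyio.pl; the sub name 'echo_lines' is hypothetical. So that it runs without waiting for input, the demo uses an in-memory handle as a stand-in; passing \*STDIN instead would read whatever is typed or redirected with '<'.

```perl
use strict;
use warnings;

# Hypothetical helper: copies every line from an input handle to STDOUT
# and returns the number of lines copied.
sub echo_lines {
    my ($in) = @_;
    my $count = 0;
    while (my $line = <$in>) {
        print $line;
        $count++;
    }
    return $count;
}

# Stand-in for redirected input: an in-memory handle with two sample lines.
# In a real script the call would be echo_lines(\*STDIN).
my $sample = "entrez_gene_symbol,entrez_gene_id\nFANCA,2175\n";
open(my $in, '<', \$sample) or die "Can't open in-memory handle, $!";
echo_lines($in);
close $in;
```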


STEP 9: Now run the script with option 6 without any other parameters:

[ex05-fancyio]$ perl fancyio.pl --example 6
Now get into Example 6:
The output is in data_out/fanconi_genes_dup.csv
Let's compare the in and output files
diff data_out/fanconi_genes_dup.csv ../data_in/fanconi_genes.csv
no output is good output, exit with 0
Exit from Example 6.
[ex05-fancyio]$


Explanation: Option 6 is designed to send output to a file instead of printing it on the screen. In this command we didn't supply a directory or file name for the output, so the script uses the defaults set in the code, 'data_out' and 'fanconi_genes_dup.csv', respectively.
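
A rough sketch of this kind of file-to-file copy, followed by a comparison of input and output, is shown below. It is not the code from fancyio.pl: the sub name 'copy_lines' and the file names are hypothetical, and it uses the core File::Compare module in place of the external diff command.

```perl
use strict;
use warnings;
use File::Compare qw(compare);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: copies $in_path to $out_path line by line.
sub copy_lines {
    my ($in_path, $out_path) = @_;
    open(my $in,  '<', $in_path)  or die "Can't open $in_path for reading, $!";
    open(my $out, '>', $out_path) or die "Can't open $out_path for writing, $!";
    while (my $line = <$in>) {
        print {$out} $line;
    }
    close $in;
    close $out or die "Can't close $out_path, $!";
    return;
}

# Demo in a temporary directory so that no real data is touched.
my $dir      = tempdir(CLEANUP => 1);
my $in_path  = File::Spec->catfile($dir, 'genes.csv');
my $out_path = File::Spec->catfile($dir, 'genes_dup.csv');

open(my $fh, '>', $in_path) or die "Can't create $in_path, $!";
print {$fh} "entrez_gene_symbol,entrez_gene_id\nFANCA,2175\n";
close $fh;

copy_lines($in_path, $out_path);

# compare() returns 0 when the files are identical, like diff's exit status.
print compare($in_path, $out_path) == 0 ? "files match\n" : "files differ\n";
```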


STEP 10: Now run the script with option 6, supplying command-line values to override the defaults:
[ex05-fancyio]$ perl fancyio.pl --example 6 --out_dir data_out2 --out_file option6.out
Now get into Example 6:
The output is in data_out2/option6.out
Let's compare the in and output files
diff data_out2/option6.out ../data_in/fanconi_genes.csv
no output is good output, exit with 0
Exit from Example 6.
[ex05-fancyio]$ 

Explanation: In this command we used '--out_dir' and '--out_file' to pass user-defined values, and we can see that they replaced the default output directory and the default output file.
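
One common way to get this behavior is to fill a hash with the defaults first and let the option parser overwrite only the keys that appear on the command line. The sketch below assumes Getopt::Long, which may or may not be what fancyio.pl uses; the sub name 'output_path' is hypothetical.

```perl
use strict;
use warnings;
use File::Spec;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: builds the output path from the defaults named in
# the text, letting command-line values replace them.
sub output_path {
    my @args = @_;
    my %opt = (
        out_dir  => 'data_out',
        out_file => 'fanconi_genes_dup.csv',
    );
    GetOptionsFromArray(\@args, \%opt, 'out_dir=s', 'out_file=s')
        or die "Invalid options\n";
    return File::Spec->catfile($opt{out_dir}, $opt{out_file});
}

# No options: the defaults are used.
print output_path(), "\n";

# Both options given: the defaults are replaced.
print output_path('--out_dir', 'data_out2', '--out_file', 'option6.out'), "\n";
```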


The commands and screen output from the steps above are also recorded in 'fancyio_cmd.out'.

Contact Information

Research Computing Services: help@scc.bu.edu

Note: RCS example programs are provided "as is" without any warranty of any kind. The user assumes the entire risk of quality, performance, and repair of any defect. You are welcome to copy and modify any of the given examples for your own use.