Index of /examples/bioinformatics/tophat

Name                 Last modified       Size
tophat_mt_job.qsub   27-Nov-2017 11:22   715
tophat_st_job.qsub   27-Nov-2017 11:23   625

Tophat package example

tophat_st_job.qsub
- a simple single-threaded tophat job

tophat_mt_job.qsub
- uses the -p option to run tophat with multiple threads

To run tophat at the command line, first load the prerequisite modules and then the tophat module:

scc4% module load intel/2016
scc4% module load bowtie2/2.3.3.1_intel-2016
scc4% module load boost/1.58.0
scc4% module load tophat/2.1.1


Then try the following command:
scc4% tophat -o tophat_st_out $SCC_TOPHAT_EXAMPLES/test_data/test_ref $SCC_TOPHAT_EXAMPLES/test_data/reads_1.fq $SCC_TOPHAT_EXAMPLES/test_data/reads_2.fq


The command above uses the test data provided in the Tophat examples directory to demonstrate a simple paired-end alignment of two read files (reads_1.fq and reads_2.fq). The alignment results are written to 'tophat_st_out/' under the current directory.
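
The multithreaded variant (tophat_mt_job.qsub) is described as using the -p option. A minimal sketch of the same alignment with -p follows. The thread count of 4, the output directory name tophat_mt_out, and the DATA_DIR variable are illustrative assumptions and are not taken from the example scripts; point DATA_DIR at the test_data directory used in the command above.

```shell
# Sketch only: thread count, output directory, and DATA_DIR are assumptions.
threads=4
data_dir=${DATA_DIR:-test_data}   # set DATA_DIR to the examples' test_data directory

# Build the argument list once so it can be inspected before running.
set -- -p "$threads" -o tophat_mt_out \
    "$data_dir/test_ref" "$data_dir/reads_1.fq" "$data_dir/reads_2.fq"
mt_cmd="tophat $*"

if command -v tophat >/dev/null 2>&1; then
    # Modules are loaded: run the alignment with the requested threads.
    tophat "$@"
else
    # Modules not loaded: only show what would have been run.
    echo "would run: $mt_cmd"
fi
```

If the tophat module has not been loaded, the sketch only prints the command it would have run.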


To submit a job, run the following command, replacing my_project with your project name and my_tophat_job.qsub with your job script:
scc4% qsub -P my_project my_tophat_job.qsub
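
For orientation, the sketch below writes a hypothetical my_tophat_job.qsub that could be submitted with the command above. It is not a copy of tophat_mt_job.qsub. The job name, the "omp" parallel environment name, the slot count, and the DATA_DIR variable are assumptions to adapt to your site; the module versions are the ones listed earlier.

```shell
# Sketch only: creates a hypothetical job script in the current directory.
cat > my_tophat_job.qsub <<'EOF'
#!/bin/bash -l
# Job name (assumed).
#$ -N tophat_mt
# Merge the error stream into the output stream.
#$ -j y
# Request 4 slots; the parallel environment name "omp" is an assumption.
#$ -pe omp 4

# Load the prerequisites and tophat, as at the command line.
module load intel/2016
module load bowtie2/2.3.3.1_intel-2016
module load boost/1.58.0
module load tophat/2.1.1

# Hypothetical: set DATA_DIR to the test_data directory of the examples.
DATA_DIR=${DATA_DIR:-test_data}

# NSLOTS is set by the scheduler to the number of granted slots.
tophat -p "$NSLOTS" -o tophat_mt_out \
    "$DATA_DIR/test_ref" "$DATA_DIR/reads_1.fq" "$DATA_DIR/reads_2.fq"
EOF
```

Passing "$NSLOTS" to -p keeps the thread count in step with the number of slots requested, so only the -pe line needs to change.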


Manual for Tophat:

https://ccb.jhu.edu/software/tophat/manual.shtml


Notes worth pointing out:

a.
When running TopHat with paired reads, it is critical that the *_1 files and the *_2 files appear in separate comma-delimited lists, and that the order of the files in the two lists is the same.
Usage: tophat [options]* <genome_index_base> <reads1_1[,...,readsN_1]> [reads1_2,...readsN_2]

b.
As of this writing, TopHat can align reads up to 1024 bp long.

c.
It is NOT recommended to mix paired-end and single-end reads in one run, because doing so gives suboptimal results.

d.
TopHat's default parameter values are tuned for processing mammalian RNA-Seq reads. For other species or organisms, it is recommended to set some of the parameters to stricter, more conservative values than their defaults.
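
Notes a and d can be sketched together as follows. The sample file names, the genome_index name, and the intron-length values are hypothetical placeholders, not recommendations. The -i (--min-intron-length) and -I (--max-intron-length) options are shown only as one example of parameters that might be tightened; consult the manual for the options and defaults that suit your organism.

```shell
# Sketch only: file names, index name, and intron bounds are hypothetical.

# Join all arguments with commas: a b c -> a,b,c
join_by_comma() (
    IFS=,
    printf '%s\n' "$*"
)

# Note a: list the *_1 and *_2 files in the same sample order.
left_list=$(join_by_comma sampleA_1.fq sampleB_1.fq)
right_list=$(join_by_comma sampleA_2.fq sampleB_2.fq)

# Note d: tighter intron-length bounds for a non-mammalian genome.
# -i is --min-intron-length and -I is --max-intron-length.
min_intron=30
max_intron=20000

# Assemble the command and print it for review before running it.
set -- -i "$min_intron" -I "$max_intron" -o tophat_out \
    genome_index "$left_list" "$right_list"
strict_cmd="tophat $*"
echo "$strict_cmd"
```

The sketch only prints the assembled command; once the placeholders are replaced with real files and values, the printed line can be run as is.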