FLORA is a FAIR, scalable Nextflow workflow for the reproducible analysis of RNAseq data from the AMOCYST project, built on state-of-the-art transcriptomics tools and statistical methods. FLORA first corrects raw RNAseq reads with Rcorrector, then removes uncorrectable reads with a Python script. rRNA contamination is removed using Bowtie2 and the SILVA database, after which the reads are quality-filtered with Trim Galore. Finally, the transcriptome is assembled with Trinity.
The FLORA workflow is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It automates the read correction, rRNA removal, quality filtering, and transcriptome assembly steps described above. It runs on DATARMOR (the Ifremer computing cluster) and uses local dependencies.
i. Download the pipeline
git clone https://gitlab.ifremer.fr/cn7ab95/FLORA.git
iii. Run the workflow
Standard launch:
nextflow run main.nf
Custom launch on our supercomputer DATARMOR:
qsub run-main.nf
See usage docs for a complete description of all of the options available when running the pipeline.
This workflow comes with documentation about the pipeline, found in the docs/ directory.
To use this workflow, you need to:
- complete the parameters defined in the config file
- have a SILVA database (SSU + LSU) indexed with Bowtie2
- create a file that identifies the condition and the replicate of each sample, as in the following example:
cond_A cond_A_rep1 reads_A_rep1_R1.fq reads_A_rep1_R2.fq
cond_A cond_A_rep2 reads_A_rep2_R1.fq reads_A_rep2_R2.fq
cond_B cond_B_rep1 reads_B_rep1_R1.fq reads_B_rep1_R2.fq
cond_B cond_B_rep2 reads_B_rep2_R1.fq reads_B_rep2_R2.fq
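Because a malformed samples file is easy to produce by hand, it can help to check it before launching the workflow. The following is a minimal sketch, not part of FLORA: the function name `check_samples` is hypothetical, and it assumes the layout shown in the example above (four whitespace-separated fields per line: condition, replicate name, R1 file, R2 file) with replicate names that must be unique.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with FLORA): validate a samples file.
# Assumes four whitespace-separated fields per line, as in the example:
#   condition  replicate_name  R1_reads  R2_reads
check_samples() {
    awk '
        BEGIN { bad = 0 }
        NF == 0 { next }    # ignore blank lines
        NF != 4 {
            printf "line %d: expected 4 fields, found %d\n", NR, NF
            bad = 1
            next
        }
        seen[$2]++ {        # true from the second occurrence of a replicate name
            printf "line %d: duplicate replicate name %s\n", NR, $2
            bad = 1
        }
        END { exit bad }    # exit status 0 if the file is valid, 1 otherwise
    ' "$1"
}
```

After sourcing the function, `check_samples samples.txt` (where `samples.txt` is a placeholder name) prints one message per problematic line and returns a non-zero exit status if any problem was found. It does not check that the read files actually exist.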
FLORA is written by Cyril NOEL, bioinformatics engineer at SeBiMER, the Bioinformatics Core Facility of IFREMER.
For further information or help, don't hesitate to get in touch with the developer: