RNA-seq Pipeline Comparison and Analyses

A Snakemake workflow for running TALON, FLAIR, and pipeline-nanopore-ref-isoforms. It performs a comparative analysis of the results using tools such as GffCompare.

Flowchart

Dependencies

Snakemake 7.3.1
A full Snakemake installation is recommended.
Singularity 3.7.0

Installation

Clone the repository to the desired location.

How to run

Set the parameters in config/config.yaml.
Run: snakemake -p --use-singularity --singularity-prefix "resources" --singularity-args "--bind *" --use-conda -j ** all --configfile "config/config.yaml"

Note * : Replace this with your own directory for the --bind option so that the data is accessible from within the Singularity containers.
Note ** : Replace this with the number of available threads.
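
The two placeholders in the run command can be filled in programmatically. The sketch below is only an illustration: the helper name build_snakemake_command and the directory /data/rnaseq are hypothetical, and the flags are copied unchanged from the command above.

```python
import shlex
from typing import List


def build_snakemake_command(bind_dir: str, threads: int,
                            configfile: str = "config/config.yaml") -> List[str]:
    """Assemble the README run command with * and ** filled in.

    bind_dir replaces the * placeholder, threads replaces **.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "snakemake", "-p",
        "--use-singularity",
        "--singularity-prefix", "resources",
        "--singularity-args", f"--bind {bind_dir}",
        "--use-conda",
        "-j", str(threads),
        "all",
        "--configfile", configfile,
    ]


if __name__ == "__main__":
    # /data/rnaseq is a hypothetical directory; use the one holding your data.
    command = build_snakemake_command("/data/rnaseq", 10)
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(command))
```

Printing the command instead of executing it makes it easy to check the quoting of the --singularity-args value before starting a long run.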

Snakemake report

After the workflow has finished, you can run snakemake --report report.html to create a report containing the results.

Notes

The GTF files located in the 03_combined and 05_matched_transcripts directories have an attribute called TPM. This is actually the raw number of counts, not a normalized TPM value. The attribute is hijacked to pass the counts to GffCompare.
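
When reading these GTF files downstream, the TPM value should therefore be treated as a count. The following is a minimal sketch, assuming the value is stored as a quoted attribute (TPM "42";) in the ninth GTF column; the function name read_counts and the example record are hypothetical and not part of the workflow.

```python
import re

# Matches an attribute of the form: TPM "42";
_TPM_PATTERN = re.compile(r'\bTPM "([^"]+)"')


def read_counts(gtf_line: str):
    """Return the raw count stored in the TPM attribute, or None if absent."""
    if gtf_line.startswith("#"):
        return None  # comment or header line
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return None  # not a complete GTF record
    match = _TPM_PATTERN.search(fields[8])
    if match is None:
        return None
    # Despite the attribute name, this number is a raw count.
    return float(match.group(1))


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    record = ('chr1\tTALON\ttranscript\t100\t900\t.\t+\t.\t'
              'gene_id "G1"; transcript_id "T1"; TPM "42";')
    print(read_counts(record))
```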

During testing, the workflow took about 18 hours on 10 threads with 100 GB of memory to process 6 human samples. With a much smaller RNA-virus dataset, it took about 8 hours for 6 samples.

The main bottleneck is TranscriptClean, which requires many hours and a large amount of memory to correct all samples.

Troubleshooting

TranscriptClean

TranscriptClean requires each header in the reference genome FASTA file to consist of a single string. To run TranscriptClean, you must edit the headers accordingly.
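
One way to edit the headers is to keep only the first whitespace-separated token of each header line. The sketch below assumes that "one string per header" means a header without spaces or further description fields; the function names and file names are hypothetical.

```python
def simplify_fasta_headers(lines):
    """Yield FASTA lines with each header cut down to its first token."""
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            # Keep only the sequence identifier, e.g. ">chr1".
            yield ">" + (tokens[0] if tokens else "") + "\n"
        else:
            # Sequence lines are passed through unchanged.
            yield line


def rewrite_fasta(source_path: str, target_path: str) -> None:
    """Write a copy of source_path with simplified headers to target_path."""
    with open(source_path) as source, open(target_path, "w") as target:
        target.writelines(simplify_fasta_headers(source))


if __name__ == "__main__":
    example = [">chr1 Homo sapiens chromosome 1\n", "ACGTACGT\n"]
    print("".join(simplify_fasta_headers(example)), end="")
```

Writing to a new file with rewrite_fasta("genome.fa", "genome.clean.fa") leaves the original reference untouched; the cleaned file can then be set as the reference in the configuration.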

Conda environment fails to build

There seems to be an issue with Snakemake 7.3.1 when building conda environments. If a timeout error occurs, you can try running the workflow with an older version of Snakemake, such as version 5.3.2.

License

MIT, see LICENSE
