git clone https://github.com/vitmy0000/SnaMP.git
module load python/anaconda2-4.2.0
conda create -c bioconda -m -p pyenvs/py35-snakemake python=3.5 pandas snakemake
module load python/anaconda2-4.2.0
source activate pyenvs/py35-snakemake
source deactivate
- Closed reference OTU picking
- BLAST against HOMD database
- Prepare sequencing data
Replace $SOURCE_FILES with the path to the zipped sequencing results, e.g. WHI_Repo/RT530_Batch2/*.gz
cd input
ln -s $SOURCE_FILES .
cd ..
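Because `ln -s` runs from inside `input`, a relative path in `$SOURCE_FILES` is expanded and resolved relative to `input`, not to the repository root, so the links can end up broken. The sketch below is one way to avoid that by resolving the source directory to an absolute path first. `link_inputs` is a hypothetical helper, not part of SnaMP, and the path in the example call is only the illustrative one from above.

```shell
#!/usr/bin/env bash
# link_inputs SRC_DIR [DEST_DIR]
# Hypothetical helper (not part of SnaMP): symlink every *.gz file in
# SRC_DIR into DEST_DIR (default: input) using absolute link targets.
link_inputs() {
  local src_dir dest_dir f
  src_dir=$(cd "$1" && pwd) || return 1   # resolve to an absolute path
  dest_dir=${2:-input}
  mkdir -p "$dest_dir"
  for f in "$src_dir"/*.gz; do
    [ -e "$f" ] || continue               # skip if the glob matched nothing
    ln -sf "$f" "$dest_dir/"
  done
}

# Example call from the repository root (path is illustrative):
# link_inputs WHI_Repo/RT530_Batch2 input
```

After linking, `ls -l input` should show each link pointing at an existing file.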
- Launch jobs
The pipeline uses CCR resources to run jobs in parallel. The OTU table and statistics on merge rate, filter rate, and hit rate will be placed under the table directory.
snakemake -p -j 100 --cluster-config cluster.json --cluster "sbatch --partition {cluster.partition} --time {cluster.time} --nodes {cluster.nodes} --ntasks-per-node {cluster.ntasks-per-node}"
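The `{cluster.*}` placeholders in the command above are filled from cluster.json. As a rough sketch of the shape such a file can take, the snippet below writes an example to a separate file so the repository's own cluster.json is left untouched. The `__default__` key is Snakemake's convention for settings that apply to every rule; the partition name and all values are placeholders, and the file shipped with the repository may differ.

```shell
#!/usr/bin/env bash
# Write a hypothetical example config; every value below is a placeholder
# and should be replaced with settings valid for your cluster account.
cat > cluster.example.json <<'EOF'
{
  "__default__": {
    "partition": "general-compute",
    "time": "01:00:00",
    "nodes": 1,
    "ntasks-per-node": 1
  }
}
EOF
```

The four keys match the four `sbatch` options passed through `--cluster`.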
- Results
Three result files are placed under the table directory.
- QC_table.txt
- raw_OTU_table_collapsed.txt
- raw_OTU_table_uncollapsed.txt
To remove generated files:
snakemake clean
Check snakemake version:
snakemake -v
# 3.13.3
- cluster time limit: if any job runs over the time limit, it is silently killed by CCR and the pipeline keeps waiting. To recover, terminate Snakemake, increase the time limit specified in cluster.json accordingly, and rerun the pipeline with the extra option --rerun-incomplete.
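A minimal sketch of the rerun step, assuming Snakemake has already been stopped (for example with Ctrl-C) and the time value in cluster.json has been raised. `rerun_pipeline` is a hypothetical wrapper name; the command inside it is the launch command from "Launch jobs" with `--rerun-incomplete` added.

```shell
#!/usr/bin/env bash
# rerun_pipeline [EXTRA_SNAKEMAKE_ARGS...]
# Hypothetical wrapper: relaunch the pipeline and redo any outputs that
# were left incomplete when a job was killed. Extra arguments are passed
# through to snakemake unchanged.
rerun_pipeline() {
  snakemake -p -j 100 --rerun-incomplete \
    --cluster-config cluster.json \
    --cluster "sbatch --partition {cluster.partition} --time {cluster.time} --nodes {cluster.nodes} --ntasks-per-node {cluster.ntasks-per-node}" \
    "$@"
}

# Example: preview what would be rerun without submitting any jobs.
# rerun_pipeline -n
```

Running with `-n` (dry run) first shows which jobs Snakemake considers incomplete before anything is submitted.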