Instructions for installing, setting up and using the tool CRABS to create a custom reference database for the trnL-trnF intergenic spacer, for use in amplicon sequencing analysis
Installation with conda, as described in the GitHub repository https://github.com/gjeunen/reference_database_creator
conda create -n CRABS
conda activate CRABS
conda install -c bioconda crabs
crabs db_download --source ncbi --database nucleotide --query '((((trnL-trnF[All Fields] OR trnL[All Fields]) OR trnF[All Fields]) OR tRNA-Leucine[All Fields]) OR tRNA-Phenylalanine[All Fields]) OR trnL-F[All Fields] AND plants[filter]' --output lach.fasta --keep_original yes --email [email protected] --batchsize 5000
crabs insilico_pcr --input lach.fasta --output output.fasta --fwd GGTTCAAGTCCCTCTATCCC --rev ATTTGAACTGGTGACACGAG --error 4.5
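A quick sanity check at this point is to compare how many sequences were downloaded with how many survived the in silico PCR. The sketch below is not part of CRABS: count_fasta_records is a hypothetical helper name, and it assumes standard FASTA files in which every record starts with a '>' header line.

```shell
# Hypothetical helper (not a CRABS command): count FASTA records
# by counting header lines, i.e. lines that start with '>'.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Print the record count for each file from the two steps above,
# skipping any file that has not been created yet.
for f in lach.fasta output.fasta; do
    if [ -f "$f" ]; then
        printf '%s\t%s\n' "$f" "$(count_fasta_records "$f")"
    fi
done
```

If output.fasta holds far fewer records than lach.fasta, the primers or the --error setting may be worth revisiting.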
For the next step (taxonomy assignment) you first need to download and extract these two files from NCBI:
'ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/nucl_gb.accession2taxid.gz'
'ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz'
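The download and extraction could be done along these lines. This is a sketch that assumes wget, gunzip and tar are installed; the two function names are made up for illustration. Note that the accession2taxid file is large, so the download can take a while and needs plenty of disk space.

```shell
# Base URL shared by the two files listed above.
NCBI_TAXONOMY_URL='ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy'

# Hypothetical helper: download both archives into the current directory.
download_taxonomy_archives() {
    wget "$NCBI_TAXONOMY_URL/accession2taxid/nucl_gb.accession2taxid.gz"
    wget "$NCBI_TAXONOMY_URL/taxdump.tar.gz"
}

# Hypothetical helper: unpack the archives in the current directory.
# gunzip replaces the .gz file with nucl_gb.accession2taxid;
# tar unpacks the taxonomy dump, which includes nodes.dmp and names.dmp.
extract_taxonomy_archives() {
    gunzip -f nucl_gb.accession2taxid.gz
    tar -xzf taxdump.tar.gz
}
```

Running download_taxonomy_archives && extract_taxonomy_archives in the working directory should leave nucl_gb.accession2taxid, nodes.dmp and names.dmp next to the FASTA files, which are the three files passed to crabs assign_tax.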
crabs assign_tax --input output.fasta --output output.tsv --acc2tax nucl_gb.accession2taxid --taxid nodes.dmp --name names.dmp
crabs dereplicate --input output.tsv --output wank.tsv --method uniq_species
crabs visualization --method diversity --input wank.tsv --level class
crabs tax_format --input wank2.tsv --output output.fasta --format dad
library(dada2)
its2taxa <- assignTaxonomy(itsasvs, "output.fasta", multithread=TRUE)