Reference sequence databases: UNITE

dickgroenenberg edited this page Jan 18, 2022 · 12 revisions

Create UNITE blast database

Download UNITE fasta file
The download link changes between releases; check https://unite.ut.ee/repository.php for the latest UNITE+INSDC database.
Select the "all eukaryotes" repository (the example below, dated 2021-05-10, is version 8.3). Make sure the file type
matches the link: occasionally, presumed .gz files turned out to be .tar.gz archives.

wget https://files.plutof.ut.ee/public/orig/28/88/28881015F8784D2A68C3F8C6CB851EAAE3803A8BFDA735C6390CE05DB4E34851.gz

gunzip 28881015F8784D2A68C3F8C6CB851EAAE3803A8BFDA735C6390CE05DB4E34851.gz 

mv 28881015F8784D2A68C3F8C6CB851EAAE3803A8BFDA735C6390CE05DB4E34851 input.fa
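The .gz versus .tar.gz caveat mentioned above can be checked right after the download, before running gunzip. The following is a minimal sketch, assuming gzip and tar are available; the function name archive_kind is hypothetical and not part of the original workflow.

```shell
# Hypothetical helper: report whether a downloaded file is a tar.gz archive,
# a plain gzip file, or not gzip-compressed at all.
archive_kind() {
    if ! gzip -t "$1" 2>/dev/null; then
        # gzip -t fails when the file is not valid gzip data
        echo "not-gzip"
    elif tar -tzf "$1" >/dev/null 2>&1; then
        # tar could list members, so the gzip stream wraps a tar archive
        echo "tar.gz"
    else
        echo "gzip"
    fi
}
```

If the result is tar.gz, the file would need to be unpacked with tar -xzf instead of gunzip, and the extracted FASTA file renamed to input.fa.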

Change header and remove old input file

sed -e 's/^>/>UNITE|/' -e 's/ë/e/g' -e 's/×/x/g' input.fa > UNITE.fa

rm input.fa
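A quick sanity check on the relabelled file could be run before input.fa is removed. The sketch below compares the number of FASTA headers in the original file with the number of headers carrying the UNITE| prefix in the new one; check_relabel is a hypothetical name, not part of the original workflow.

```shell
# Hypothetical helper: succeed only if the original file has at least one
# header and the relabelled file has the same number of UNITE|-prefixed headers.
check_relabel() {
    original=$1
    relabelled=$2
    n_in=$(grep -c '^>' "$original")
    n_out=$(grep -c '^>UNITE|' "$relabelled")
    [ "$n_in" -gt 0 ] && [ "$n_in" -eq "$n_out" ]
}
```

For example, check_relabel input.fa UNITE.fa && rm input.fa would delete the original only when the counts agree.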

Make blast database

makeblastdb2.8.1 -in UNITE.fa -dbtype nucl

Normally the above command would be run as makeblastdb -in UNITE.fa -dbtype nucl, but because of
issues with version 2.9 we fell back on the previous version (2.8.1).
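On systems where the versioned binary may or may not be installed, the choice of executable could be scripted. This is only a sketch: pick_first_available is a hypothetical helper, and whether a plain makeblastdb on the PATH is an acceptable fallback depends on which version it is (makeblastdb -version prints it).

```shell
# Hypothetical helper: print the first command name that exists on the PATH,
# or return a non-zero status if none of the candidates is found.
pick_first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage (prefers the pinned 2.8.1 binary named on this page):
#   MAKEBLASTDB=$(pick_first_available makeblastdb2.8.1 makeblastdb) || exit 1
#   "$MAKEBLASTDB" -version
#   "$MAKEBLASTDB" -in UNITE.fa -dbtype nucl
```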