Reference sequence databases: UNITE
dickgroenenberg edited this page Jan 18, 2022 · 12 revisions
Download UNITE fasta file
The download link may change; check https://unite.ut.ee/repository.php for the latest UNITE+INSDC database.
Select the "all eukaryotes" repository (the example below, dated 2021-05-10, is version 8.3). Make sure the filename
in the commands below corresponds to the link, and check the file type: occasionally, presumed .gz files turned out to be .tar.gz archives.
wget https://files.plutof.ut.ee/public/orig/28/88/28881015F8784D2A68C3F8C6CB851EAAE3803A8BFDA735C6390CE05DB4E34851.gz
gunzip 28881015F8784D2A68C3F8C6CB851EAAE3803A8BFDA735C6390CE05DB4E34851.gz
mv 28881015F8784D2A68C3F8C6CB851EAAE3803A8BFDA735C6390CE05DB4E34851 input.fa
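Because a presumed .gz download can turn out to be a .tar.gz archive, the unpacking step can be made tolerant of both cases. The following is a minimal sketch, not part of the original procedure: `unpack_unite` is a hypothetical helper name, and the sketch assumes that a .tar.gz archive contains a single FASTA file (if it holds several files, they would be concatenated into the output).

```shell
# Hypothetical helper: unpack a UNITE download whether it is a plain
# gzip-compressed file or a gzip-compressed tar archive.
# Usage: unpack_unite <downloaded file> <output fasta>
unpack_unite() {
    archive=$1
    out=$2
    if tar -tzf "$archive" >/dev/null 2>&1; then
        # tar can list the contents, so this is a .tar.gz archive:
        # write the member file(s) to standard output and capture them.
        tar -xzOf "$archive" > "$out"
    else
        # Not a tar archive: treat it as a plain gzip-compressed file.
        gunzip -c "$archive" > "$out"
    fi
}
```

With this helper, the gunzip and mv steps above could be replaced by a single call such as `unpack_unite 28881015F8784D2A68C3F8C6CB851EAAE3803A8BFDA735C6390CE05DB4E34851.gz input.fa`. The downloaded file is left in place and can be removed afterwards.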
Change header and remove old input file
sed -e 's/^>/>UNITE|/' -e 's/ë/e/g' -e 's/×/x/g' input.fa > UNITE.fa
rm input.fa
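Before building the database it can be worth confirming that the header rewrite did what was intended. The sketch below is an optional sanity check, not part of the original procedure: `check_unite_fasta` is a hypothetical function name, and the check assumes that, apart from the replaced ë and × characters, the file should contain only ASCII text.

```shell
# Hypothetical check: succeed only if the file has at least one header,
# every header starts with ">UNITE|", and no non-ASCII bytes remain.
# Usage: check_unite_fasta <fasta file>
check_unite_fasta() {
    fasta=$1
    # grep -c prints the count even when it is 0 (and then exits non-zero),
    # so "|| true" keeps the function usable under "set -e".
    total=$(grep -c '^>' "$fasta" || true)
    tagged=$(grep -c '^>UNITE|' "$fasta" || true)
    # In the C locale every byte above 0x7E falls outside [:print:].
    non_ascii=$(LC_ALL=C grep -c '[^[:print:][:space:]]' "$fasta" || true)
    echo "headers: $total, tagged: $tagged, lines with non-ASCII bytes: $non_ascii"
    [ "$total" -gt 0 ] && [ "$total" -eq "$tagged" ] && [ "$non_ascii" -eq 0 ]
}
```

Running `check_unite_fasta UNITE.fa` prints the three counts and returns a non-zero exit status if any header lacks the prefix or if other non-ASCII characters are still present, in which case additional substitutions may be needed.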
Make blast database
makeblastdb2.8.1 -in UNITE.fa -dbtype nucl
Normally the above command would be run as makeblastdb -in UNITE.fa -dbtype nucl, but because of
issues with v2.9 we fell back on the previous version (makeblastdb2.8.1).
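Since the versioned binary name makeblastdb2.8.1 is specific to this setup, a script that has to run on other machines could pick whichever name is available. The following is a minimal sketch under that assumption; `first_available` is a hypothetical helper, and the preference order (versioned binary first, plain makeblastdb second) is an illustration rather than a recommendation.

```shell
# Hypothetical helper: print the first command name from the argument
# list that can be found on this system; fail if none of them exists.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage (commented out so the sketch has no side effects):
# MAKEBLASTDB=$(first_available makeblastdb2.8.1 makeblastdb) || exit 1
# "$MAKEBLASTDB" -in UNITE.fa -dbtype nucl
```

Note that falling back to a plain makeblastdb may select the version that caused the issues mentioned above, so the chosen binary's version is worth checking before relying on the resulting database.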