Downloading Genomes

Sam Minot edited this page May 25, 2022 · 7 revisions

While gig-map can align genes against a collection of genomes which you have already assembled locally, it can also download a set of genomes from the NCBI database. The NCBI website provides an interface for selecting a group of genomes, which gig-map can then download.

Identify Genomes in NCBI

To use this feature, first visit the NCBI Genome Portal and select the collection of genomes which you would like to download. Once you have filtered the list down to the set of genomes you want, click on the "Download" button (in the top-right corner of the genome list) to save a file (often called prokaryotes.csv) to your computer. This file can then be used as the input to the gig-map download utility, which saves that group of genomes to your computer.
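
Before passing the CSV to gig-map, it can help to confirm that the file contains what you expect. The following is a minimal sketch in Python, not part of gig-map itself: the function name summarize_genome_csv is hypothetical, and the code makes no assumption about which columns NCBI includes in the export. It only reports the header row and the number of genome rows.

```python
import csv
from pathlib import Path


def summarize_genome_csv(path):
    """Return (column_names, row_count) for a CSV such as prokaryotes.csv.

    Hypothetical helper for illustration; not part of gig-map.
    """
    # utf-8-sig transparently drops a byte-order mark if the export has one
    with Path(path).open(newline="", encoding="utf-8-sig") as handle:
        reader = csv.reader(handle)
        # The first row is treated as the header; an empty file gives []
        header = next(reader, [])
        # Count only rows that contain at least one non-blank cell
        row_count = sum(
            1 for row in reader if any(cell.strip() for cell in row)
        )
    return header, row_count
```

For example, `columns, n = summarize_genome_csv("prokaryotes.csv")` followed by `print(n, columns)` shows how many genomes were selected and which fields the export contains.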

Downloading Genomes or Genes

To download the genomes or genes for every NCBI accession listed in the CSV, use the download_genomes or download_genes tool, respectively. Each tool takes a single argument: the CSV containing the list of NCBI accessions. The output files are written to the genomes/ or genes/ sub-folder of your working directory.

Inputs and Outputs

Inputs

  • genome_csv: A CSV file containing the list of genomes to download, as described above

Outputs

  • genomes/: A folder containing the downloaded genomes
  • genomes.annot.csv.gz: A table of annotations for all of the downloaded genomes

or

  • genes/: A folder containing the downloaded genes (one file per genome)
  • genes.annot.csv.gz: A table of annotations for all of the genomes from which genes were downloaded
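
Once a download has finished, the outputs listed above can be checked with a short script. This is a sketch under stated assumptions rather than anything gig-map provides: the helper names load_annotations and check_download are hypothetical, the annotation table is read generically without assuming any column names, and comparing the two counts is only a rough consistency check.

```python
import csv
import gzip
from pathlib import Path


def load_annotations(table_path):
    """Read a gzip-compressed CSV into a list of dicts keyed by column name."""
    with gzip.open(table_path, mode="rt", newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def check_download(workdir=".", kind="genomes"):
    """Return (n_files, n_annotation_rows) for a finished download.

    kind is "genomes" or "genes", matching the output names listed above.
    Hypothetical helper for illustration; not part of gig-map.
    """
    workdir = Path(workdir)
    folder = workdir / kind                   # genomes/ or genes/
    table = workdir / f"{kind}.annot.csv.gz"  # matching annotation table
    # Count regular files only, ignoring any nested directories
    n_files = sum(1 for entry in folder.iterdir() if entry.is_file())
    n_rows = len(load_annotations(table))
    return n_files, n_rows
```

Calling `check_download(".", kind="genes")` from the working directory returns the number of downloaded files and the number of annotated genomes. A large difference between the two may indicate that some accessions were not retrieved, although whether the counts should match exactly is an assumption.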

Download Genomes

download_genomes

Download Genes

download_genes

Useful References

Other useful references may be: