Rendering Displays

Sam Minot edited this page Dec 14, 2021 · 4 revisions

Interactive Displays

After performing a gig-map alignment, it can be very helpful to render a visual display of the results. One of the nice features of the gig-map utility is the ability to generate an HTML file which uses the Plotly library to make an interactive map with features for zooming in and expanding specific regions of genetic space. In a previous step, the user should have used the alignment utility of gig-map to generate a set of detailed outputs. In this step, those alignment outputs will be transformed into an HTML file which can be opened and explored by the user.

Setting Up

To start, create or identify a folder which will be used to run this analysis. Next, download these two template files to help you set up the rendering process: render.params.json and render.sh.

The render.params.json file allows you to specify which set of alignment results will be used, and in what location the output files will be placed. The render.sh file is a script which will launch the appropriate utility within gig-map using the parameters specified in render.params.json.

To list the complete set of options available for the rendering utility, run the following command:

bash render.sh --help

By default, these files are set up to read the alignments saved to the raw binary file alignments/alignments.rdb and save an output file named gig-map.html in the output/ folder. Please modify any of the values in the render.params.json file as appropriate for your use-case.

Once you are satisfied that the render.params.json file is pointing to the right set of inputs and outputs, start the rendering process by running:

bash render.sh
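Before launching the renderer, it can be useful to confirm that the alignment file named in render.params.json actually exists. The following is a minimal sketch of such a check, not part of gig-map itself: the preflight function is a hypothetical helper, and it assumes the simple one-key-per-line JSON layout used in the examples on this page.

```shell
# Hypothetical pre-flight check (not provided by gig-map).
# Usage: preflight render.params.json
preflight() {
    local params="$1"

    # The params file must exist before render.sh can read it.
    if [ ! -f "$params" ]; then
        echo "missing params file: $params" >&2
        return 1
    fi

    # Extract the value of the "rdb" key with sed.
    # Assumption: each key sits on its own line, as in the examples here.
    local rdb
    rdb=$(sed -n 's/^[[:space:]]*"rdb"[[:space:]]*:[[:space:]]*"\([^"]*\)".*$/\1/p' "$params")

    if [ -z "$rdb" ]; then
        echo "no \"rdb\" entry found in $params" >&2
        return 1
    fi

    # The alignment file itself must be present.
    if [ ! -f "$rdb" ]; then
        echo "alignment file not found: $rdb" >&2
        return 1
    fi

    echo "ok: $rdb"
}
```

With this function defined, one possible pattern is to run preflight render.params.json && bash render.sh, so that the renderer only starts when the input file is in place.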

Adding Custom Annotations

There are many parameters which can be modified when rendering a display. One of the most useful additions to a gig-map display is a human-readable annotation of the genes and genomes that were used. To make it easy for the user to produce a customized display, the gig-map utility is set up to read in a CSV file which contains the exact text which the user would like to use for each gene and/or genome.

Annotation tables for genes and genomes can be added to render.params.json with the following params:

{
    "rdb": "alignments/alignments.rdb",
    "output_prefix": "gig-map",
    "output_folder": "output",
    "gene_annotations": "gene_annotations.csv",
    "genome_annotations": "genome_annotations.csv"
}

The gene annotation CSV must contain a column labeled gene_id which matches the name of the gene in the input, while the genome annotation CSV must contain a column named genome_id. Note that both of the gig-map utilities for downloading genes and genomes from NCBI will automatically create a suitable CSV with this format.
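For users who are writing these tables by hand, the sketch below creates a minimal pair of annotation files and checks that the required ID columns are present. All of the IDs, the Organism column, and the has_column helper are made-up placeholders for illustration; only the gene_id and genome_id column names come from the requirements above.

```shell
# Write a minimal gene annotation table.
# The gene IDs and label text are placeholders, not real data.
cat > gene_annotations.csv <<'EOF'
gene_id,Combined Name
geneA,Hypothetical protein A
geneB,Hypothetical protein B
EOF

# Write a minimal genome annotation table ("Organism" is a made-up column).
cat > genome_annotations.csv <<'EOF'
genome_id,Organism
genome1,Example organism 1
genome2,Example organism 2
EOF

# has_column FILE COLUMN: succeed if the header row of FILE contains COLUMN.
# This splits on plain commas, so it assumes the header has no quoted fields.
has_column() {
    head -n 1 "$1" | tr ',' '\n' | grep -qxF "$2"
}

has_column gene_annotations.csv gene_id && echo "gene table ok"
has_column genome_annotations.csv genome_id && echo "genome table ok"
```

Any additional columns (such as Combined Name here) are free-form and can later be referenced by name when choosing how genes or genomes are labeled.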

Formatting the Display

When rendering the gig-map display, there is a longer list of options which can be used to control many different aspects of its formatting. For reasons that are too tedious to mention, these options are specified in the render.params.json file in a slightly more complex way than the parameters above. All of these additional display options are provided together, as a single string of command-line flags, inside one options field with the following syntax:

{
    "rdb": "alignments/alignments.rdb",
    "output_prefix": "gig-map",
    "output_folder": "output",
    "gene_annotations": "gene_annotations.csv",
    "genome_annotations": "genome_annotations.csv",
    "options": "--min-pctid 95 --min-cov 95 --tree-width 0.2 --label-genes-by 'Combined Name'"
}

Note that in the example above, the name used to label each gene will be read from the column in gene_annotations.csv with the header Combined Name.

The complete list of options which can be used in the options field is:

  • --min-pctid: Minimum amino acid similarity threshold for displayed alignments (default: 90)
  • --min-cov: Minimum alignment coverage threshold for displayed alignments (default: 90)
  • --color-genes-by: Indicate a column from the gene annotation table to use for coloring genes
  • --label-genomes-by: Indicate a column from the genome annotation table used for labeling
  • --figure-height: Specify an integer number of pixels to set the total figure height
  • --figure-width: Figure width in pixels (default: 1200)
  • --max-genome-label-len: Maximum number of characters allowed for genome labels (default: 60)
  • --max-gene-label-len: Maximum number of characters allowed for gene labels (default: 60)
  • --label-genes-by: Indicate a column from the gene annotation table used for labeling
  • --clustering-method: Method used to cluster genomes, either "ani" or the name of a specific marker (default: ani)
  • --min-genes-per-genome: Do not display any genome which contains fewer than this number of aligned genes
  • --min-genomes-per-gene: Do not display any gene found in fewer than this number of genomes
  • --max-n-genomes: Set a maximum number of genomes to display, removing genomes with the fewest aligned genes
  • --query: Filter the genes for display based on a string containing boolean logic to be applied to gene annotations. For example, if the gene annotation file contains a column of numeric values with a header of length, then the query string "length >= 100" would limit the set of genes which are ultimately displayed to only those genes for which the value in the length column is >= 100.
  • --colorscale: Plotly colorscale used for heatmap
  • --tree-width: Proportional size of tree used for plotting (default: 0.4)
  • --skip-gene-resort: If specified, use the pre-computed gene order. This will prevent the time-consuming recalculation of linkage clustering
  • --show_hovertext: If specified, include hovertext in the HTML display (which increases the file size considerably, and is rather buggy)
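To get a feel for what a --query filter such as "length >= 100" would keep, the same condition can be applied to an annotation table by hand. The sketch below is only a rough emulation using awk on a toy table; it is not how gig-map evaluates the query, and the file name, gene IDs, and length values are invented for illustration.

```shell
# Toy annotation table with a numeric "length" column (placeholder values).
cat > example_annotations.csv <<'EOF'
gene_id,length
geneA,250
geneB,80
geneC,100
EOF

# Skip the header row, then print the gene_id of every row whose
# second column (length) is >= 100. Column positions are hard-coded
# for this toy table.
kept=$(awk -F',' 'NR > 1 && $2 >= 100 { print $1 }' example_annotations.csv)
echo "$kept"
```

Here geneA and geneC pass the filter while geneB (length 80) is dropped, which mirrors the effect described for the length column above.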