ucsc-hub-js

read and write UCSC track and assembly hub files in node or the browser

Usage

Read about hub.txt, genomes.txt, and trackDb.txt files here: https://genome.ucsc.edu/goldenpath/help/hgTrackHubHelp.html

Files are essentially JavaScript Maps. A hub.txt file is a Map in which each key is the first word of a line and each value is the rest of that line, like this:

Map {
  "hub" => "UCSCHub",
  "shortLabel" => "UCSC Hub",
  "longLabel" => "UCSC Genome Informatics Hub for human DNase and RNAseq data",
  "genomesFile" => "genomes.txt",
  "email" => "[email protected]",
  "descriptionUrl" => "ucscHub.html",
}
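
To make the key/value split concrete, here is a minimal sketch of how a single-stanza file such as hub.txt could be turned into a Map. It is an illustration only and not the library's implementation: parseSingleStanza is a hypothetical helper name, and skipping lines that start with # as comments is an assumption of this sketch.

```javascript
// Illustrative sketch only; parseSingleStanza is a hypothetical helper,
// not part of this library's API.
function parseSingleStanza(text) {
  const stanza = new Map()
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim()
    // Assumption: blank lines and lines starting with '#' carry no settings
    if (line === '' || line.startsWith('#')) continue
    // The key is the first word; the value is the rest of the line
    const match = /^(\S+)\s*(.*)$/.exec(line)
    stanza.set(match[1], match[2])
  }
  return stanza
}

const hubText = [
  'hub UCSCHub',
  'shortLabel UCSC Hub',
  'genomesFile genomes.txt',
].join('\n')

console.log(parseSingleStanza(hubText).get('shortLabel'))
// ↳ UCSC Hub
```

The HubFile class described below performs additional validation that this sketch leaves out.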

genomes.txt and trackDb.txt files are two-level Maps: each outer key is the value of the first line of a section, and each outer value is a Map of all the lines in that section. For example, a genomes.txt file and then a trackDb.txt file look like this:

Map {
  "hg18" => Map {
    "genome" => "hg18",
    "trackDb" => "hg18/trackDb.txt",
  },
  "hg19" => Map {
    "genome" => "hg19",
    "trackDb" => "hg19/trackDb.txt",
  },
  "newOrg1" => Map {
    "genome" => "newOrg1",
    "trackDb" => "newOrg1/trackDb.txt",
    "twoBitPath" => "newOrg1/newOrg1.2bit",
    "groups" => "newOrg1/groups.txt",
    "description" => "Big Foot V4",
    "organism" => "BigFoot",
    "defaultPos" => "chr21:33031596-33033258",
    "orderKey" => "4800",
    "scientificName" => "Biggus Footus",
    "htmlPath" => "newOrg1/description.html",
  },
}

Map {
  "dnaseSignal" => Map {
    "track" => "dnaseSignal",
    "bigDataUrl" => "dnaseSignal.bigWig",
    "shortLabel" => "DNAse Signal",
    "longLabel" => "Depth of alignments of DNAse reads",
    "type" => "bigWig",
  },
  "dnaseReads" => Map {
    "track" => "dnaseReads",
    "bigDataUrl" => "dnaseReads.bam",
    "shortLabel" => "DNAse Reads",
    "longLabel" => "DNAse reads mapped with MAQ",
    "type" => "bam",
  },
}
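
The two-level layout can be sketched in a few lines of plain JavaScript. This is an illustration under stated assumptions, not the library's parser: parseStanzas is a hypothetical helper name, sections are assumed to be separated by blank lines, and lines starting with # are assumed to be comments.

```javascript
// Illustrative sketch only; parseStanzas is a hypothetical helper,
// not part of this library's API.
function parseStanzas(text) {
  const file = new Map()
  // Normalize CRLF, then split into sections on one or more blank lines
  const blocks = text.replace(/\r\n/g, '\n').trim().split(/\n\s*\n/)
  for (const block of blocks) {
    const stanza = new Map()
    for (const rawLine of block.split('\n')) {
      const line = rawLine.trim()
      if (line === '' || line.startsWith('#')) continue
      const [, key, value] = /^(\S+)\s*(.*)$/.exec(line)
      stanza.set(key, value)
    }
    if (stanza.size === 0) continue
    // The outer key is the value of the section's first line
    const [name] = stanza.values()
    file.set(name, stanza)
  }
  return file
}

const genomes = parseStanzas(`genome hg18
trackDb hg18/trackDb.txt

genome hg19
trackDb hg19/trackDb.txt
`)
console.log(genomes.get('hg19').get('trackDb'))
// ↳ hg19/trackDb.txt
```

Because a Map preserves insertion order, the first value of each inner Map is always the value of the section's first line.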

Example usage:

const fs = require('fs')
const { HubFile, GenomesFile, TrackDbFile } = require('@gmod/ucsc-hub')

const hubFile = new HubFile(fs.readFileSync('hub.txt', 'utf8'))
console.log(hubFile.get('genomesFile'))
// ↳ genomes.txt

const genomesFile = new GenomesFile(fs.readFileSync('genomes.txt', 'utf8'))
console.log(genomesFile.get('hg19').get('trackDb'))
// ↳ hg19/trackDb.txt

const trackDbFile = new TrackDbFile(fs.readFileSync('hg19/trackDb.txt', 'utf8'))
console.log(trackDbFile.get('dnaseSignal').get('shortLabel'))
// ↳ DNAse Signal

API

GenomesFile

Extends RaFile

Class representing a genomes.txt file.

Parameters

  • genomesFile (string | Array<string>) A genomes.txt file as a string (optional, default [])

  • Throws Error Throws if the first line of the genomes.txt file doesn't start with "genome <genome_name>" or if it has invalid entries

HubFile

Extends RaStanza

Class representing a hub.txt file.

Parameters

  • hubFile (string | Array<string>) A hub.txt file as a string (optional, default [])

  • Throws Error Throws if the first line of the hub.txt file doesn't start with "hub <hub_name>", if it has invalid entries, or if it is missing required entries

RaFile

Extends Map

Class representing an ra file. Each file is composed of multiple stanzas, and each stanza is separated by one or more blank lines. Each stanza is stored in a Map with the key being the value of the first key-value pair in the stanza. The usual Map methods can be used on the file. An additional method add() is available to take the raw text of a stanza, parse it, and add it to the file. This should be favored over set() when possible, as it performs more validity checks than using set().

Parameters

  • raFile (string | Array<string>) An ra file, either as a single string or an array of strings with one stanza per entry. Supports both LF and CRLF line terminators. (optional, default [])
  • options object (optional, default {checkIndent:true})
    • options.checkIndent boolean [true] - Check if the stanzas within the file are indented consistently and keep track of the indentation

Properties

  • nameKey (undefined | string) The key of the first line of all the stanzas (undefined if the stanza has no lines yet).

  • Throws Error Throws if an empty stanza is added, if the key in the first key-value pair of each stanza isn't the same, or if two stanzas have the same value for the key-value pair in their first lines.

add

Add a single stanza to the file

Parameters
  • stanza string A single stanza

Returns RaFile The RaFile object

update

Use add() if possible instead of this method. If using this, be aware that no checks are made for comments, empty stanzas, duplicate keys, etc.

Parameters
  • key string The key of the RaFile stanza
  • value RaStanza The RaFile stanza used to replace the prior one

delete

Delete a stanza

Parameters
  • stanza string The name of the stanza to delete (the value in its first key-value pair)

Returns boolean true if the deleted stanza existed, false if it did not

clear

Clear all stanzas and comments

toString

Returns string Returns the file as a string fit for writing to an ra file. Original leading indent is preserved. It may not be the same as the input file, as lines that were joined with \ in the input will be output as a single line and all comments will have the same indentation as the rest of their stanza. Comments between joined lines will move before that line.

RaStanza

Extends Map

Class representing an ra file stanza. Each stanza line is split into its key and value and stored as a Map, so the usual Map methods can be used on the stanza. An additional method add() is available to take a raw line of text and break it up into a key and value and add them to the class. This should be favored over set() when possible, as it performs more validity checks than using set().

Parameters

  • stanza (string | Array<string>) An ra file stanza, either as a string or an array of strings with one line per entry. Supports both LF and CRLF line terminators. (optional, default [])
  • options object (optional, default {checkIndent:true})
    • options.checkIndent boolean [true] - Check if a stanza is indented consistently and keep track of the indentation

Properties

  • nameKey (undefined | string) The key of the first line of the stanza (undefined if the stanza has no lines yet).

  • name (undefined | string) The value of the first line of the stanza, by which it is identified in an ra file (undefined if the stanza has no lines yet).

  • indent (undefined | string) The leading indent of the stanza, which is the same for every line (undefined if the stanza has no lines yet, '' if there is no indent).

  • Throws Error Throws if the stanza has blank lines, if the first line doesn't have both a key and a value, if a key in the stanza is duplicated, or if lines in the stanza have inconsistent indentation.
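
The error conditions listed above can be sketched as a standalone check. This is only an approximation of the documented rules, not the library's code: validateStanzaLines is a hypothetical helper name, the error messages are invented for illustration, and the input is assumed to contain no comment lines.

```javascript
// Illustrative sketch only; validateStanzaLines is a hypothetical helper
// and the error messages are made up for this example.
// Assumption: the input array holds one stanza line per entry, no comments.
function validateStanzaLines(lines) {
  if (lines.length === 0) return
  const indentOf = line => /^\s*/.exec(line)[0]
  const indent = indentOf(lines[0])
  const seenKeys = new Set()
  lines.forEach((line, i) => {
    if (line.trim() === '') {
      throw new Error(`Blank line at index ${i}`)
    }
    if (indentOf(line) !== indent) {
      throw new Error(`Inconsistent indentation at index ${i}`)
    }
    const [, key, value] = /^(\S+)\s*(.*)$/.exec(line.trim())
    if (i === 0 && value === '') {
      throw new Error('First line must have both a key and a value')
    }
    if (seenKeys.has(key)) {
      throw new Error(`Duplicate key: ${key}`)
    }
    seenKeys.add(key)
  })
}

try {
  validateStanzaLines(['track dnaseSignal', 'type bigWig', 'type bam'])
} catch (error) {
  console.log(error.message)
  // ↳ Duplicate key: type
}
```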

add

Add a single line to the stanza. If the exact line already exists, does nothing.

Parameters
  • line string A single stanza line

Returns RaStanza The RaStanza object

set

Use add() if possible instead of this method. If using this, be aware that no checks are made for comments, indentation, duplicate keys, etc.

Parameters
  • key string The key of the stanza line
  • value string The value of the stanza line

Returns RaStanza The RaStanza object

delete

Delete a line

Parameters
  • key string The key of the line to delete

Returns boolean true if the deleted line existed, false if it did not

clear

Clear all lines and comments

toString

Returns string Returns the stanza as a string fit for writing to an ra file. Original leading indent is preserved. It may not be the same as the input stanza, as lines that were joined with \ in the input will be output as a single line and all comments will have the same indentation as the rest of the stanza. Comments between joined lines will move before that line.
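
To illustrate why the output can differ from the input, here is a small sketch of joining lines that end with \ into a single logical line. It is not the library's implementation: joinContinuedLines is a hypothetical helper name, and stripping the leading whitespace of continuation lines is an assumption of this sketch.

```javascript
// Illustrative sketch only; joinContinuedLines is a hypothetical helper,
// not part of this library's API.
function joinContinuedLines(text) {
  const joined = []
  let pending = ''
  let continuing = false
  for (const line of text.split(/\r?\n/)) {
    // Assumption: leading whitespace of a continuation line is dropped
    const piece = continuing ? line.trimStart() : line
    if (piece.endsWith('\\')) {
      // Drop the trailing backslash and keep accumulating
      pending += piece.slice(0, -1)
      continuing = true
    } else {
      joined.push(pending + piece)
      pending = ''
      continuing = false
    }
  }
  // A dangling continuation at the end of the input is kept as its own line
  if (continuing) joined.push(pending)
  return joined
}

console.log(
  joinContinuedLines('longLabel Depth of \\\n    alignments\ntype bigWig'),
)
// ↳ [ 'longLabel Depth of alignments', 'type bigWig' ]
```

Once lines are joined this way the original line breaks are no longer known, which is why a round trip through toString() may not reproduce the input exactly.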

TrackDbFile

Extends RaFile

Class representing a trackDb.txt file.

Parameters

  • trackDbFile (string | Array<string>) A trackDb.txt file as a string (optional, default [])

  • Throws Error Throws if "track" is not the first key in each track or if a track is missing required keys

settings

Gets all track entries including those of parent tracks, with closer entries overriding more distant ones

Parameters
  • trackName string The name of a track

  • Throws Error Throws if track name does not exist in the trackDb
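
The override rule can be sketched with plain nested Maps. This is an illustration, not the library's code: resolveSettings is a hypothetical helper name, and it assumes a track names its parent through a parent setting whose first word is the parent track name (optionally followed by on or off, as in UCSC trackDb files).

```javascript
// Illustrative sketch only; resolveSettings is a hypothetical helper,
// not part of this library's API.
function resolveSettings(trackDb, trackName) {
  if (!trackDb.has(trackName)) {
    throw new Error(`Track ${trackName} does not exist`)
  }
  // Collect the track and its ancestors, closest first
  const chain = []
  const seen = new Set() // guards against cyclic parent references
  let current = trackName
  while (current !== undefined && trackDb.has(current) && !seen.has(current)) {
    seen.add(current)
    const stanza = trackDb.get(current)
    chain.push(stanza)
    const parent = stanza.get('parent')
    // Assumption: keep only the parent's name, ignoring a trailing on/off
    current = parent === undefined ? undefined : parent.split(/\s+/)[0]
  }
  // Apply the most distant ancestor first so closer entries override it
  const settings = new Map()
  for (const stanza of chain.reverse()) {
    for (const [key, value] of stanza) settings.set(key, value)
  }
  return settings
}

const trackDb = new Map([
  ['dnase', new Map([['track', 'dnase'], ['type', 'bigWig'], ['visibility', 'dense']])],
  ['dnaseSignal', new Map([['track', 'dnaseSignal'], ['parent', 'dnase on'], ['visibility', 'full']])],
])
const settings = resolveSettings(trackDb, 'dnaseSignal')
console.log(settings.get('type'), settings.get('visibility'))
// ↳ bigWig full
```

Here type is inherited from the parent track, while visibility comes from the child because it is the closer entry.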

License

MIT © Generic Model Organism Database Project
