Merge pull request #36 from agduncan94/develop
Release 0.1.3
agduncan94 authored Nov 15, 2019
2 parents c50d404 + 4bb94bd commit 67a7950
Showing 68 changed files with 3,144 additions and 298 deletions.
5 changes: 4 additions & 1 deletion .gitignore
@@ -1,3 +1,6 @@
.DS_Store
node_modules/
docker/data/
docker/data/
cypress/data/
cypress/screenshots/
cypress/videos/
39 changes: 39 additions & 0 deletions .travis.yml
@@ -0,0 +1,39 @@
language: node_js

node_js:
- 10

addons:
apt:
packages:
- libgconf-2-4

cache:
npm: true
directories:
- ~/.cache

services:
- docker

before_install:
- mkdir docker/data
- cp cypress/data/tracks.conf docker/data/tracks.conf
- cp cypress/data/Homo_sapiens.GRCh38.dna.chromosome.1.fa.fai docker/data/Homo_sapiens.GRCh38.dna.chromosome.1.fa.fai
- cp data/jbrowse.conf docker/data/jbrowse.conf
- cd docker/data
- curl -o Homo_sapiens.GRCh38.dna.chromosome.1.fa.gz http://ftp.ensembl.org/pub/release-94/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.chromosome.1.fa.gz
- gzip -d Homo_sapiens.GRCh38.dna.chromosome.1.fa.gz
- cd ../../

install:
- npm ci
- cd docker
- 'if [ "$TRAVIS_PULL_REQUEST" != "false" ]; then docker-compose build --build-arg plugin_version=${TRAVIS_PULL_REQUEST_BRANCH}; fi'
- 'if [ "$TRAVIS_PULL_REQUEST" = "false" ]; then docker-compose build --build-arg plugin_version=${TRAVIS_BRANCH}; fi'
- docker-compose up -d
- cd ../

script:
# - $(npm bin)/cypress run --record
- $(npm bin)/cypress run
155 changes: 105 additions & 50 deletions README.md
@@ -1,3 +1,4 @@
[![Build Status](https://travis-ci.org/agduncan94/gdc-viewer.svg?branch=develop)](https://travis-ci.org/agduncan94/gdc-viewer)
# GDC JBrowse Plugin
A plugin for [JBrowse](https://jbrowse.org/) for viewing [GDC](https://gdc.cancer.gov/) data. For any bugs, issues, or feature recommendations please create an issue through GitHub.

@@ -13,34 +14,19 @@ For installing gdc-viewer plugin:
2. Add 'gdc-viewer' to the array of plugins in the `jbrowse_conf.json`.

## 3. Install Reference Sequence Data
Now setup the reference sequence used. GDC requires the GRCh38 Human reference files, which can be found at http://ftp.ensembl.org/pub/release-94/fasta/homo_sapiens/dna/. You'll want to download the files of the form `Homo_sapiens.GRCh38.dna.chromosome.1.fa.gz`.
Now setup the reference sequence used. GDC requires the GRCh38 Human reference files.

Then you can use the `bin/prepare-refeqs.pl` command to generate the RefSeq information.
Download the GRCh38 `.fa` and `.fa.fai` files online (ex. http://bioinfo.hpc.cam.ac.uk/downloads/datasets/fasta/grch38/). Then put the following in `./data/tracks.conf` (note files may be named something else).

Below is an example of these two steps for Chr1.

Ex. Chromosome 1
1. Download Homo_sapiens.GRCh38.dna.chromosome.1.fa.gz from the above site.
```
wget http://ftp.ensembl.org/pub/release-94/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.chromosome.1.fa.gz
```
2. Setup refeq with the following command
```
bin/prepare-refseqs.pl --fasta Homo_sapiens.GRCh38.dna.chromosome.1.fa.gz
refSeqs=GRCh38.genome.fa.fai
[tracks.refseqs]
urlTemplate=GRCh38.genome.fa
```
Note that you can specify multiple fast in one command by doing `--fasta fasta1.fa.gz --fasta fasta2.fa.gz ...`
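
As a rough illustration of what a `.fa.fai` index holds, the sketch below parses index lines into sequence names and lengths. It assumes the usual five-column faidx layout (name, length, byte offset, bases per line, bytes per line). The `parse_fai` helper and `FaiRecord` type are hypothetical names used for illustration only; they are not part of JBrowse or this plugin. The sample line is the chromosome 1 entry from `cypress/data/Homo_sapiens.GRCh38.dna.chromosome.1.fa.fai`.

```python
from typing import List, NamedTuple


class FaiRecord(NamedTuple):
    """One line of a FASTA index (hypothetical helper type)."""
    name: str        # reference sequence name
    length: int      # number of bases in the sequence
    offset: int      # byte offset of the first base in the FASTA file
    line_bases: int  # bases per FASTA line
    line_width: int  # bytes per FASTA line, including the newline


def parse_fai(text: str) -> List[FaiRecord]:
    """Parse the contents of a .fai file, skipping blank lines."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Assumption: columns are whitespace-separated, five per line.
        name, length, offset, line_bases, line_width = line.split()[:5]
        records.append(
            FaiRecord(name, int(length), int(offset),
                      int(line_bases), int(line_width))
        )
    return records


# Chromosome 1 entry taken from the index file used by the tests.
sample = "1\t248956422\t56\t60\t61\n"
for record in parse_fai(sample):
    print(record.name, record.length)
```

A check like this can help confirm that the sequence names in the index match the names the tracks expect before starting JBrowse.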

## 4. Adding new tracks
We have some basic example tracks in `data/tracks.conf`. You can also add new tracks by using the GDC dialog accessible within JBrowse. These are present in the menu under `GDC`.

### A. Explore cases, genes and mutations
This dialog is similar to the Exploration section of the GDC data portal. As you apply facets on the left-hand side, updated results will be shown on the right side. You can create donor specific SSM, Gene, and CNV tracks, along with GDC-wide SSM, Gene and CNV tracks.

### B. Explore Projects
This dialog shows the projects present on the GDC Data Portal. You can add SSM, Gene, and CNV tracks for each project.

### C. Explore Primary Sites
This dialog shows the primary sites present on the GDC Data Portal. You can add SSM, Gene, and CNV tracks for each primary site.
We have some basic example tracks in `data/tracks.conf`. You can also add new tracks by using the GDC dialog accessible within JBrowse. These are present in the menu under `GDC`. See [Dynamic Track Generation](#dynamic-track-generation) for more details.

## 5. Run JBrowse
You'll have to run the following commands:
@@ -72,9 +58,11 @@ Note that this will only show preloaded tracks as well as tracks you have added
# Available Store SeqFeature
## A note on filters
All SeqFeatures support filters as they are defined in the [GDC API Documentation](https://docs.gdc.cancer.gov/API/Users_Guide/Search_and_Retrieval/#filters-specifying-the-query).

Note that filters should have the filter type prepended to the front. Ex. Case filters use `cases.`, SSM filters use `ssms.`, and Gene filters use `genes.`. GraphQL is used to retrieve results, so if the filters work there, they work with these Store classes.

The following shows a filter for cases by ethnicity:
```
{
"op":"in",
"content":{
@@ -84,81 +72,148 @@ The following shows a filter for cases by ethnicity:
]
}
}
```

You can view/edit the filters associated with a track by clicking the down arrow for the track menu and selecting `View Applied Filters`. Be careful, there are currently no checks to see if the filters are valid before applying them.
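
Because the plugin does not validate filters, it may help to sanity-check a filter string locally before pasting it into a track. The following is a minimal sketch, not part of the plugin: `check_filter` is a hypothetical helper that only confirms the string is valid JSON with the top-level `op` and `content` keys used by the README's examples. It does not check that the field names or values exist in GDC.

```python
import json
from typing import List


def check_filter(raw: str) -> List[str]:
    """Return a list of problems found in a GDC-style filter string.

    An empty list means the basic shape looks right. This is only a
    structural check; it cannot tell whether GDC will accept the filter.
    """
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]

    if not isinstance(parsed, dict):
        return ["top level must be a JSON object"]

    problems = []
    for key in ("op", "content"):
        if key not in parsed:
            problems.append(f"missing key: {key}")
    return problems


good = '{"op":"=","content":{"field":"ssms.reference_allele","value":"G"}}'
print(check_filter(good))         # no problems reported
print(check_filter('{"op":"="}')) # reports the missing "content" key
```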

## Genes
A simple view of all of the genes seen across all cases.

You can view case specific genes by setting the `case` field.

You can apply filters to the track too, in the same format as GDC. The below example only shows Genes whose biotype is not 'protein_coding'.

```
{
"op":"!=",
"content":{
"field":"cases.biotype",
"value":"protein_coding"
}
}
```

To put it in the track config you may want to minimize it as such:
```
filters={"op":"!=","content":{"field":"cases.biotype","value":"protein_coding"}}
```

Example Track:
```
[tracks.GDC_Genes]
storeClass=gdc-viewer/Store/SeqFeature/Genes
type=JBrowse/View/Track/GeneTrack
key=GDC Genes
metadata.datatype=Gene
unsafePopup=true
```

You can apply filters to the track too, in the same format as GDC. The below example only shows Genes whose biotype is not 'protein_coding'.

```
filters={"op":"!=","content":{"field":"cases.biotype","value":"protein_coding"}}
```

You can set the max number of genes to return with the `size` field. It defaults to 100.
You can view case specific genes by setting the `case` field.
![GDC Genes](images/GDC-genes-protein-coding.png)

### Extra notes
You can set the max number of genes to return with the `size` field (per panel). It defaults to 100. The smaller the number, the faster the results will appear.

## SSMs
A simple view of all of the simple somatic mutations seen across all cases.

You can view case specific SSMs by setting the `case` field.

You can apply filters to the track too, in the same format as GDC. The below example only shows SSMs whose reference allele is 'G'.
```
{
"op":"=",
"content":{
"field":"ssms.reference_allele",
"value":"G"
}
}
```

To put it in the track config you may want to minimize it as such:
```
filters={"op":"=","content":{"field":"ssms.reference_allele","value":"G"}}
```
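
The minimized form can also be produced programmatically instead of by hand. This is a small sketch using Python's standard `json` module; `to_track_config_line` is a hypothetical helper name, and the filter is the SSM reference allele example from above.

```python
import json

# The SSM filter from the README, written as a Python dict.
ssm_filter = {
    "op": "=",
    "content": {
        "field": "ssms.reference_allele",
        "value": "G",
    },
}


def to_track_config_line(filter_obj: dict) -> str:
    """Serialize a filter without whitespace and prefix it with `filters=`."""
    # separators=(",", ":") drops the spaces json.dumps adds by default.
    compact = json.dumps(filter_obj, separators=(",", ":"))
    return "filters=" + compact


print(to_track_config_line(ssm_filter))
# filters={"op":"=","content":{"field":"ssms.reference_allele","value":"G"}}
```

Generating the line this way avoids stray whitespace or mismatched braces when a filter is edited later.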

Example Track:
```
[tracks.GDC_SSM]
storeClass=gdc-viewer/Store/SeqFeature/SimpleSomaticMutations
type=gdc-viewer/View/Track/SSMVariants
key=GDC SSM
metadata.datatype=SSM
unsafePopup=true
```

You can apply filters to the track too, in the same format as GDC. The below example only shows SSMs whose reference allele is 'G'.

```
filters={"op":"=","content":{"field":"ssms.reference_allele","value":"G"}}
```

You can set the max number of SSMs to return with the `size` field. It defaults to 100.
You can view case specific SSMs by setting the `case` field.
![GDC SSMs](images/GDC-mutations-base-g.png)

### Extra notes
You can set the max number of SSMs to return with the `size` field (per panel). It defaults to 100. The smaller the number, the faster the results will appear.

## CNVs
A simple view of all of the CNVs seen across all cases.

You can view case specific CNVs by setting the `case` field.

You can apply filters to the track too, in the same format as GDC. The below example only shows CNVs that are 'Gains'.
```
{
"op":"=",
"content":{
"field":"cnv_change",
"value":[
"Gain"
]
}
}
```

To put it in the track config you may want to minimize it as such:
```
filters={"op":"=","content":{"field":"cnv_change","value":["Gain"]}}
```

Example Track:
```
[tracks.GDC_CNV]
storeClass=gdc-viewer/Store/SeqFeature/CNVs
type=gdc-viewer/View/Track/Wiggle/XYPlot
type=gdc-viewer/View/Track/CNVTrack
key=GDC CNV
metadata.datatype=CNV
autoscale=local
bicolor_pivot=0
unsafePopup=true
```

You can apply filters to the track too, in the same format as GDC. The below example only shows CNVs that are 'Gains'.
![GDC CNVs](images/GDC-cnv-gain.png)

```
filters={"op":"=","content":{"field":"cnv_change","value":["Gain"]}}
```
### Extra notes
You can set the max number of CNVs to return with the `size` field (per panel). It defaults to 500. The smaller the number, the faster the results will appear.

You can set the max number of CNVs to return with the `size` field. It defaults to 500.
You can view case specific CNVs by setting the `case` field.
# Dynamic Track Generation
## Explore cases, genes and mutations
This dialog is similar to the Exploration section of the GDC data portal. As you apply facets on the left-hand side, updated results will be shown on the right side. You can create donor specific SSM, Gene, and CNV tracks, along with GDC-wide SSM, Gene and CNV tracks.
![GDC Portal](images/GDC-portal-explore.png)

Note: You can also use a density plot for the copy number data. Simply change the type from `JBrowse/View/Track/Wiggle/XYPlot` to `JBrowse/View/Track/Wiggle/Density.`
## Explore Projects
This dialog shows the projects present on the GDC Data Portal. You can add SSM, Gene, and CNV tracks for each project.
![GDC projects](images/GDC-project-browser.png)

## Explore Primary Sites
This dialog shows the primary sites present on the GDC Data Portal. You can add SSM, Gene, and CNV tracks for each primary site.
![GDC primary sites](images/GDC-primary-sites.png)

# Export Types
The following export types are supported by both GDC Genes and SSMs. To export, select `Save track data` in the track dropdown. Note that not all track information is carried over to the exported file.
* BED
* GFF3
* Sequin Table
* CSV
* Track Config
* Track Config

# Automated testing
Cypress.io is used for testing this plugin. The following steps show how to run the tests locally.
1. Install JBrowse but don't install chromosome files.
2. Download Chr 1 fasta from `http://ftp.ensembl.org/pub/release-94/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.chromosome.1.fa.gz`. There should be the fasta index file in `cypress/data/Homo_sapiens.GRCh38.dna.chromosome.1.fa.fai`. Put these files into `jbrowse/data/`.
3. Install Cypress.io with `npm install`.
4. Place `cypress/data/tracks.conf` into your `jbrowse/data/` directory. Make sure no other tracks are present.
5. Run `npx cypress open` or `npx cypress run` or `npm run e2e`

**Note** while some tests have mocked endpoints, not all endpoints are mocked. This could lead to breakage of tests in the future.
5 changes: 5 additions & 0 deletions cypress.json
@@ -0,0 +1,5 @@
{
"viewportHeight": 900,
"viewportWidth": 1440,
"projectId": "d9b81g"
}
1 change: 1 addition & 0 deletions cypress/data/Homo_sapiens.GRCh38.dna.chromosome.1.fa.fai
@@ -0,0 +1 @@
1 248956422 56 60 61
96 changes: 96 additions & 0 deletions cypress/data/jbrowse.conf
@@ -0,0 +1,96 @@
#### JBrowse main configuration file

## uncomment the section below to customize this browser's title and description
[aboutThisBrowser]
title = JBrowse GDC
description = View GDC Data Portal SSMs, Genes, and CNVs

## uncomment and edit the example below to configure a faceted track selector
[trackSelector]
type = Faceted
displayColumns =
+ label
+ key
+ datatype
+ case
+ project
+ primarySite
## optionally sort the faceted track selector by column (use the names from displayColumns)
# initialSortColumn=label
## optionally give different names to some of the data facets using renameFacets
# [trackSelector.renameFacets]
# submission = Submission ID
# developmental-stage = Conditions
# cell-line = Cell Line
# key = Dataset
# label = Track

## uncomment this section to get hierarchical trackselector options
# [trackSelector]
## optionally turn off sorting for the hierarchical track selector
# sortHierarchical = false
## set collapsed categories for the hierarchical track selector
# collapsedCategories = Reference sequence,Quantitative / XY Plot
## set category ordering in the hierarchical track selector
# categoryOrder = BAM, Transcripts, Quantitative/Density, VCF

## configure where to get metadata about tracks. always indexes the
## `metadata` part of each track config, but this can be used to load
## additional metadata from CSV or JSON urls
# [trackMetadata]
# sources = data/trackMetadata.csv


[GENERAL]


## add a document.domain to set the same-origin policy
# documentDomain=foobar.com

## use classic jbrowse menu with file instead of track and genome
#classicMenu = true

## hide open genome option
#hideGenomeOptions = true

## enable or disable high resolution rendering for canvas features. set to auto, disabled, or numerical scaling factor. default: 2
# highResolutionMode=auto

## uncomment to change the default sort order of the reference
## sequence dropdown
# refSeqOrder = length descending


## to set a default data directory other than 'data', uncomment and
## edit the line below
# dataRoot = data

## optionally add more include statements to load and merge in more
## configuration files
include = {dataRoot}/trackList.json
include += {dataRoot}/tracks.conf
# include += ../url/of/my/other/config.json
# include += another_config.conf

## uncomment and edit the example below to enable one or more
## JBrowse plugins
[ plugins.gdc-viewer ]
location = plugins/gdc-viewer

# [ plugins.AnotherPlugin ]
# location = ../plugin/dir/someplace/else

## edit the datasets list below to add datasets to the jbrowse dataset
## selector

# [datasets.volvox]
# url = ?data=sample_data/json/volvox
# name = Volvox Example

# [datasets.modencode]
# url = ?data=sample_data/json/modencode
# name = MODEncode Example

# [datasets.yeast]
# url = ?data=sample_data/json/yeast
# name = Yeast Example