Terra
Terra is a collaboration between the Broad Institute, Microsoft, and Alphabet’s Verily that provides access to Google Cloud computing resources through a simple web-based user interface, aimed at biologists who do not code. You need to register with a Google account. Terra's Getting Started documentation will help you get up and running. Using Terra's cloud storage and processing capabilities costs money: in our experience, running WAT3R on one biological sample with 8 million reads costs about $5.
Data can be stored in a Google Cloud Storage bucket. To run WAT3R, you need two fastq files (one with the cell barcode and unique molecular identifier, and one with the TCR sequences) and, optionally, a cell annotation file. Once you create a Terra Workspace, you can find the associated Google Bucket on the DASHBOARD tab. More information can be found in Terra's help pages "Terra architecture and where your files live in it" and "Working with workspaces".
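Before uploading, it can be worth checking that the two fastq files are consistent with each other. The following is a minimal sketch, not part of WAT3R itself: it assumes the two files are paired reads from the same sequencing run (so they should contain the same number of records) and that each record spans exactly four lines. The function names `count_fastq_records` and `check_fastq_pair` are hypothetical helpers introduced here for illustration.

```python
import gzip


def count_fastq_records(path):
    """Count records in a FASTQ file (plain or .gz), assuming 4 lines per record."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def check_fastq_pair(bc_fastq, tcr_fastq):
    """Return the shared record count, or raise if the two files disagree.

    Assumption: bc_fastq (CB + UMI) and tcr_fastq (TCR reads) are paired,
    so both files should hold the same number of records.
    """
    n_bc = count_fastq_records(bc_fastq)
    n_tcr = count_fastq_records(tcr_fastq)
    if n_bc != n_tcr:
        raise ValueError(
            f"record counts differ: {bc_fastq} has {n_bc}, {tcr_fastq} has {n_tcr}"
        )
    return n_bc
```

Calling `check_fastq_pair("sample_R1.fastq.gz", "sample_R2.fastq.gz")` (file names are placeholders) either returns the number of read pairs or raises a `ValueError` describing the mismatch.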
From Terra's WORKFLOWS tab, you can "Find a Workflow". WAT3R is available in the Broad Methods Repository. Export the workflow to your workspace. This will allow you to set the following parameters:
- bc_fastq should be a fastq file with a cell barcode (CB) and a unique molecular identifier (UMI)
- tcr_fastq should be a fastq file with the 150 bp TCR sequences
- disk_space specifies the amount of disk space the workflow should request in GiB; we suggest 100
- memory specifies the amount of memory the workflow should request in GB
- num_threads specifies the number of CPUs the workflow should request
- annotation should be "true" or "false", indicating whether you are providing cell annotations
- cluster_annotation can refer to a txt file with cell type annotations (example)
- sample_name will be used in the run output
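To show how these parameters fit together, here is a minimal sketch that assembles them into a WDL-style inputs JSON, where each key has the form `workflow.input`. Several things are assumptions rather than facts from the WAT3R script: the workflow name `WAT3R` used as the key prefix, the helper `build_inputs`, the bucket and file names, and the `memory` and `num_threads` values in the example. Check the WDL script for the actual workflow name and input types.

```python
import json

# Hypothetical workflow name: replace with the name declared in the WAT3R WDL script.
WORKFLOW_NAME = "WAT3R"


def build_inputs(bucket, sample_name, bc_fastq, tcr_fastq, memory, num_threads,
                 disk_space=100, cluster_annotation=None):
    """Build a dict of workflow inputs keyed as '<workflow>.<parameter>'.

    File names are resolved against the workspace bucket as gs:// URIs.
    disk_space defaults to the suggested 100 GiB; memory (GB) and
    num_threads have no suggested value, so the caller must supply them.
    """
    def gs(name):
        return f"gs://{bucket}/{name}"

    inputs = {
        "bc_fastq": gs(bc_fastq),
        "tcr_fastq": gs(tcr_fastq),
        "disk_space": disk_space,
        "memory": memory,
        "num_threads": num_threads,
        # "true" or "false", depending on whether cell annotations are provided
        "annotation": "true" if cluster_annotation else "false",
        "sample_name": sample_name,
    }
    if cluster_annotation:
        inputs["cluster_annotation"] = gs(cluster_annotation)
    return {f"{WORKFLOW_NAME}.{key}": value for key, value in inputs.items()}


if __name__ == "__main__":
    # Placeholder bucket, file names, and resource values for illustration only.
    example = build_inputs(
        bucket="my-workspace-bucket",
        sample_name="sample1",
        bc_fastq="sample1_R1.fastq.gz",
        tcr_fastq="sample1_R2.fastq.gz",
        memory=16,
        num_threads=4,
    )
    print(json.dumps(example, indent=2))
```

Keeping `annotation` derived from whether `cluster_annotation` is given avoids the two settings drifting out of sync.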
Details about the WDL workflow language are found here, and the WAT3R WDL script is here.
Click "SAVE", then "RUN ANALYSIS". Output from the WAT3R commands wat3r and downstream will be written to the workspace-associated Google Bucket.