Peter van Galen edited this page Apr 5, 2022 · 10 revisions

Running WAT3R in Terra

Signing up

Terra is a collaboration between the Broad Institute, Microsoft, and Alphabet's Verily. It provides access to Google Cloud computing resources through a simple web-based user interface aimed at biologists who do not code. You need to register with a Google account. Terra's Getting Started documentation will help you get up and running. Terra's cloud storage and processing capabilities are not free: in our experience, running WAT3R on one biological sample with 8 million reads costs about $5.

Getting data in place

Data can be stored in a Google Cloud Storage bucket. To run WAT3R, you need two fastq files (one with the cell barcode and unique molecular identifier, and one with the TCR sequences) and, optionally, a cell annotation file. Once you create a Terra Workspace, you can find the associated Google Bucket on the DASHBOARD tab. More information is available in Terra's help pages "Terra architecture and where your files live in it" and "Working with workspaces".
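As a small illustration of the file layout described above, the following Python sketch builds gs:// paths for the two required fastq files and the optional annotation file. The function names, the bucket name, and the file names are hypothetical, and the fastq suffix check reflects common naming conventions rather than a documented WAT3R requirement.

```python
from typing import Dict, Optional

# Common fastq suffixes (an assumption, not a WAT3R requirement).
FASTQ_SUFFIXES = (".fastq", ".fastq.gz", ".fq", ".fq.gz")


def gcs_uri(bucket: str, object_path: str) -> str:
    """Join a bucket name and an object path into a gs:// URI."""
    if bucket.startswith("gs://"):
        bucket = bucket[len("gs://"):]
    return f"gs://{bucket.strip('/')}/{object_path.lstrip('/')}"


def wat3r_input_uris(
    bucket: str,
    bc_fastq: str,
    tcr_fastq: str,
    cluster_annotation: Optional[str] = None,
) -> Dict[str, str]:
    """Return gs:// URIs for the two required fastq files and,
    if given, the optional cell annotation file."""
    for name in (bc_fastq, tcr_fastq):
        if not name.endswith(FASTQ_SUFFIXES):
            raise ValueError(f"{name!r} does not look like a fastq file")
    uris = {
        "bc_fastq": gcs_uri(bucket, bc_fastq),
        "tcr_fastq": gcs_uri(bucket, tcr_fastq),
    }
    if cluster_annotation is not None:
        uris["cluster_annotation"] = gcs_uri(bucket, cluster_annotation)
    return uris


if __name__ == "__main__":
    # Placeholder bucket and file names; copy the real bucket name
    # from the DASHBOARD tab of your workspace.
    for key, uri in wat3r_input_uris(
        "your-workspace-bucket", "sample1_R1.fastq.gz", "sample1_R2.fastq.gz"
    ).items():
        print(key, uri)
```

The resulting paths can be pasted into the workflow input fields once the files have been uploaded to the bucket.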

Running the workflow

From Terra's WORKFLOWS tab, you can "Find a Workflow". WAT3R is available in the Broad Methods Repository. Export the workflow to your workspace. You can then set the following parameters on the workflow setup page (image: screenshot of Terra workflow setup page):

  • bc_fastq should be a fastq file with a cell barcode (CB) and a unique molecular identifier (UMI)
  • tcr_fastq should be a fastq file with the 150 bp TCR sequences
  • disk_space specifies the amount of disk space the workflow should request, in GiB (we suggest 100)
  • memory specifies the amount of memory the workflow should request, in GB
  • num_threads specifies the number of CPUs the workflow should request
  • annotation should be "true" or "false", indicating whether you are providing cell annotations
  • cluster_annotation can refer to a txt file with cell type annotations (example)
  • sample_name will be used in the run output
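The parameters above can also be collected in an inputs JSON file. The Python sketch below assembles one, assuming that keys follow the usual WDL "workflow.input" naming and that the workflow is named WAT3R; check the WDL script for the actual workflow name and input types. The memory and num_threads values in the usage example are placeholders, not recommendations, and all paths are hypothetical.

```python
import json
from typing import Any, Dict, Optional

# Assumed workflow name used as the key prefix; verify against the WDL script.
WORKFLOW_NAME = "WAT3R"


def build_inputs(
    sample_name: str,
    bc_fastq: str,
    tcr_fastq: str,
    memory: int,
    num_threads: int,
    disk_space: int = 100,  # GiB; 100 is the value suggested above
    cluster_annotation: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect the workflow parameters into a dict keyed by 'workflow.input'."""
    params: Dict[str, Any] = {
        "bc_fastq": bc_fastq,
        "tcr_fastq": tcr_fastq,
        "disk_space": disk_space,
        "memory": memory,
        "num_threads": num_threads,
        # "true" or "false", depending on whether annotations are provided.
        # Passing it as a string is an assumption about the input type.
        "annotation": "true" if cluster_annotation else "false",
        "sample_name": sample_name,
    }
    if cluster_annotation:
        params["cluster_annotation"] = cluster_annotation
    return {f"{WORKFLOW_NAME}.{key}": value for key, value in params.items()}


if __name__ == "__main__":
    inputs = build_inputs(
        sample_name="sample1",
        bc_fastq="gs://your-workspace-bucket/sample1_R1.fastq.gz",
        tcr_fastq="gs://your-workspace-bucket/sample1_R2.fastq.gz",
        memory=16,       # placeholder value
        num_threads=4,   # placeholder value
    )
    print(json.dumps(inputs, indent=2))
```

Keeping the parameters in a file like this makes it easier to rerun the workflow on several samples with consistent settings.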

Details about the WDL workflow language are found here and the WAT3R WDL script is here.

Click "SAVE", then "RUN ANALYSIS". Output from the WAT3R commands wat3r and downstream will be written to the workspace-associated Google Bucket.
