roughing out a sample analysis workflow #14

Draft
wants to merge 1 commit into
base: main
10 changes: 10 additions & 0 deletions workflows-htc/bioinformatics-samples/README.md
@@ -0,0 +1,10 @@
# Submitting a Workflow that Analyzes Sequencing Samples

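This workflow assumes that each sample you want to process on CHTC
can be analyzed independently of the others.

It also assumes that the input files are in `/staging`.

Before submitting, create a `logs` directory for the job log, error, and output files.

With the provided `samples.txt`, submitting `sample-analysis.sub` queues 3 jobs, one per sample ID.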
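
As a rough sketch, the setup and submission steps could look like the commands below, run from the directory that holds `sample-analysis.sub`. The `command -v` guard is an illustrative addition and not part of the workflow files; it only skips submission on a machine where HTCondor or the submit file is not present.

```shell
#!/bin/bash

# Create the directory that sample-analysis.sub points its
# log, error, and output files at (no error if it already exists)
mkdir -p logs

# Illustrative guard (an assumption, not part of the workflow):
# submit only where condor_submit and the submit file are available
if command -v condor_submit >/dev/null 2>&1 && [ -f sample-analysis.sub ]; then
    condor_submit sample-analysis.sub
fi
```

On a submit server this reduces to `mkdir -p logs` followed by `condor_submit sample-analysis.sub`.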
19 changes: 19 additions & 0 deletions workflows-htc/bioinformatics-samples/sample-analysis.sh
@@ -0,0 +1,19 @@
#!/bin/bash

# Set the sample id:
SAMPLEID=$1
# if you used the alternate form of the submit file,
# which passes a lane as a second argument,
# uncomment to set the lane value:
# LANE=$2

# build up the filenames - edit this to match
# the filenames you have!
READ1=${SAMPLEID}_R1_001.fastq
READ2=${SAMPLEID}_R2_001.fastq

# insert any needed software setup here, if relevant

# run your program
# (replace 'head' with your analysis pipeline)
# (quoted so that filenames with spaces or wildcards are passed intact)
head "${READ1}"
head "${READ2}"
24 changes: 24 additions & 0 deletions workflows-htc/bioinformatics-samples/sample-analysis.sub
@@ -0,0 +1,24 @@
universe = vanilla

executable = sample-analysis.sh
arguments = $(SampleID)
# alternate form, passing both a sample ID and a lane
# arguments = $(SampleID) $(Lane)

# transfer_input_files = /path/to/file1,/path/to/file2,etc
should_transfer_files = YES
when_to_transfer_output = ON_EXIT

# adjust these based on the resources needed
# to analyze ONE set of paired-end reads
request_cpus = 1
request_memory = 1GB
request_disk = 1GB

log = logs/$(Cluster).log
error = logs/$(Cluster)_$(SampleID).err
output = logs/$(Cluster)_$(SampleID).out

queue SampleID from samples.txt
# alternate form, reading both a sample ID and a lane from each line
# queue SampleID,Lane from samples.txt
3 changes: 3 additions & 0 deletions workflows-htc/bioinformatics-samples/samples.txt
@@ -0,0 +1,3 @@
CS001
CS006
CS905
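
A list like this does not have to be typed by hand. The sketch below is one possible way to build it, assuming the read files follow the `<SampleID>_R1_001.fastq` naming used in `sample-analysis.sh`. The function name `list_sample_ids` is hypothetical and is not part of this pull request.

```shell
#!/bin/bash

# Hypothetical helper: print one sample ID per line for every
# *_R1_001.fastq file found in the directory given as the first argument.
list_sample_ids() {
    local dir="$1"
    local f
    for f in "$dir"/*_R1_001.fastq; do
        # Skip the unexpanded pattern when no files match
        [ -e "$f" ] || continue
        # Strip the directory and the _R1_001.fastq suffix, leaving the ID
        basename "$f" _R1_001.fastq
    done
}

# Example use (replace the path with your own input directory):
# list_sample_ids /path/to/your/fastq/files > samples.txt
```

Only the R1 files are matched, so each pair of reads contributes a single line.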