GitHub - Tongdongq/darwin-gpu: GPU accelerated version of Darwin, a DNA alignment algorithm

Repository files (176 commits):
.gitignore
.measure_sensitivity_NPBSS.py
Chameleon.cpp
Chameleon.h
ConfigFile.cpp
ConfigFile.h
Makefile
README
align.cpp
align.h
benchmark.py
convert.sh
cuda_header.h
cuda_host.cu
darwin.cpp
fasta.cpp
fasta.h
gact.cpp
gact.h
gdb.sh
generate.sh
generateperfect.py
gmon.out
measure_sensitivity_PBSIM.py
ntcoding.cpp
ntcoding.h
ntcoding.o
params.cfg
profile.sh
reads.fasta
run.sh
script_gdb
seed_pos_table.cpp
seed_pos_table.h
seed_pos_table.o
w_run.sh
x_scalingrun.sh
y_measure_mem_usage.sh
z_compile.sh

This repository contains a GPU implementation of Darwin [1][2], a hardware-friendly DNA aligner.
Darwin consists of two parts, D-SOFT and GACT, which together form a typical seed-and-extend method. D-SOFT (Diagonal-band Seed Overlapping based Filtration Technique) filters the search space by counting the non-overlapping bases of matching k-mers in a band of diagonals. GACT (Genome Alignment using Constant memory Traceback) can align reads of arbitrary length using constant memory for the compute-intensive step.
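
The D-SOFT counting step described above can be sketched in a few lines of Python. This is an illustration only, not the code in this repository: the function names (build_seed_index, dsoft_filter), the parameter values (k, bin_size, threshold) and the toy sequences are assumptions, and every k-mer of the query is used as a seed.

```python
from collections import defaultdict


def build_seed_index(reference, k):
    """Map every k-mer of the reference to the positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def dsoft_filter(reference, query, k=4, bin_size=8, threshold=8):
    """Count non-overlapping seed bases per diagonal band (hypothetical sketch).

    A hit of the k-mer at query offset j on reference position i lies on
    diagonal i - j; diagonals are grouped into bands of bin_size diagonals.
    Returns the bands whose count reached the threshold, plus all counts.
    """
    index = build_seed_index(reference, k)
    last_hit = {}             # band -> query offset of the last seed hit in it
    bases = defaultdict(int)  # band -> non-overlapping bases counted so far
    candidates = []
    for j in range(len(query) - k + 1):
        for i in index.get(query[j:j + k], ()):
            band = (i - j) // bin_size
            prev = last_hit.get(band)
            # Bases of this seed already covered by the previous hit in the band.
            overlap = 0 if prev is None else max(0, prev + k - j)
            last_hit[band] = j
            before = bases[band]
            bases[band] += k - overlap
            # Report a band once, at the moment it crosses the threshold.
            if before < threshold <= bases[band]:
                candidates.append(band)
    return candidates, dict(bases)


reference = "GGGGGGGG" + "ACGTCATGCTAG"
query = "ACGTCATGCTAG"
print(dsoft_filter(reference, query))  # ([1], {1: 12})
```

In this toy case all nine seeds of the query hit the same diagonal (offset 8, band 1), and their overlapping bases are counted only once, so the band reaches 12 bases, the query length. Only the bands returned as candidates would be handed to the extension step.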

This implementation can run on the CPU only or use GPU acceleration. To choose between individual optimizations, go back to commit e472745e.
Compile for the CPU with './z_compile.sh', or with './z_compile.sh GPU' for the GPU version.
Other compile options are 'TIME', which measures the CPU and GPU time spent in GACT for the GPU version, and 'NOSCORE', which removes the score calculation; in that case all overlaps are reported with a score of 0.

To allow a more flexible substitution matrix, put back the 'gact_sub_mat' variable in darwin.cpp.

Usage: ./darwin <REFERENCE>.fasta <READS>.fasta [CPU_THREADS NUM_BLOCKS THREADS_PER_BLOCK]
For de novo alignment, the reference and reads files should be the same file.
CPU_THREADS is the number of CPU threads.
NUM_BLOCKS is the number of GPU blocks each CPU thread launches, ignored when only using CPU.
THREADS_PER_BLOCK is the number of GPU threads per GPU block, ignored when only using CPU.
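
As a rough illustration of the usage line above, the following Python sketch assembles the argument list and works out how many GPU threads one CPU thread launches. The helper names (darwin_command, gpu_threads_per_cpu_thread) and the file name ref.fasta are hypothetical. The sketch assumes that the three optional arguments are given together or not at all, which is how the single pair of brackets in the usage line reads.

```python
def darwin_command(reference, reads, cpu_threads=None, num_blocks=None,
                   threads_per_block=None, binary="./darwin"):
    """Build the argv list for one darwin run (hypothetical helper)."""
    cmd = [binary, reference, reads]
    optional = (cpu_threads, num_blocks, threads_per_block)
    if any(value is not None for value in optional):
        # Assumption: the usage line brackets all three values together.
        if any(value is None for value in optional):
            raise ValueError("give all three optional arguments or none")
        cmd.extend(str(value) for value in optional)
    return cmd


def gpu_threads_per_cpu_thread(num_blocks, threads_per_block):
    """GPU threads launched by one CPU thread: blocks times threads per block."""
    return num_blocks * threads_per_block


print(" ".join(darwin_command("ref.fasta", "reads.fasta", 8, 32, 64)))
# ./darwin ref.fasta reads.fasta 8 32 64
print(gpu_threads_per_cpu_thread(32, 64))  # 2048
```

The resulting list can be passed to subprocess.run. With 8 CPU threads, 32 blocks and 64 threads per block, each CPU thread launches 2048 GPU threads, so up to 8 x 2048 = 16384 GPU threads are in flight if all CPU threads have a launch active at the same time.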

For 50MB of PacBio human data, taken from the 54x dataset, '8 32 64' (CPU_THREADS NUM_BLOCKS THREADS_PER_BLOCK) was found to be the best run configuration.
The included reads.fasta is a 10x E. coli dataset generated by PBSIM. The name of each read contains its origin in the genome and its read length; these are used by the measure_sensitivity_PBSIM.py script.

The Makefile assumes Compute Capability 3.5.

Typical run:
./z_compile.sh GPU
./run.sh 8 32 64
cat darwin.*.out | sort | uniq > out.darwin
./measure_sensitivity_PBSIM.py
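
The merge step in the typical run above ('cat darwin.*.out | sort | uniq > out.darwin') combines the darwin.*.out files into one de-duplicated, sorted file. A minimal Python sketch of the same step is given below for systems without those shell tools. The function name merge_outputs is hypothetical, and Python sorts lines by code point, so the line order can differ from 'sort' under some locales.

```python
import glob


def merge_outputs(pattern="darwin.*.out", merged="out.darwin"):
    """Write the sorted, de-duplicated lines of all matching files to `merged`.

    Returns the number of unique lines written.
    """
    unique = set()
    for path in glob.glob(pattern):
        with open(path) as handle:
            for line in handle:
                # Strip the newline so a final line without one still de-duplicates.
                unique.add(line.rstrip("\n"))
    lines = sorted(unique)
    with open(merged, "w") as out:
        for line in lines:
            out.write(line + "\n")
    return len(lines)
```

Calling merge_outputs() in the run directory after ./run.sh has finished produces out.darwin with the default file names used above.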

[1] Darwin: A Hardware-acceleration Framework for Genomic Sequence Alignment
https://www.biorxiv.org/content/early/2017/01/24/092171

[2] Darwin: A Genomics Co-processor Provides up to 15,000X Acceleration on Long Read Assembly
https://dl.acm.org/citation.cfm?id=3173193