generated from BGAcademy23/template
Commit
added test files and readme for gfastats
Showing 13 changed files with 214 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
|
||
.DS_Store |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
Help: | ||
`gfastats -h` | ||
File: | ||
`cat testFiles/random1.fasta` | ||
Summary statistics: | ||
`gfastats testFiles/random1.fasta` | ||
Tabular output: | ||
`gfastats testFiles/random1.fasta -t` | ||
Change locale: | ||
`gfastats large_input.fasta.gz --locale en_US.UTF-8` | ||
Full output: | ||
`gfastats testFiles/random1.fasta --nstar-report` | ||
Report by sequence: | ||
`gfastats testFiles/random1.fasta --seq-report` | ||
Original file: | ||
`gfastats testFiles/random1.fasta -ofa` | ||
Line length: | ||
`gfastats testFiles/random1.fasta -ofa --line-length 2` | ||
Subset: | ||
`gfastats testFiles/random1.fasta Header2 -ofa` | ||
Subset with bed: | ||
`gfastats testFiles/random1.fasta -e <(echo Header2) -ofa` | ||
`cat testFiles/random1.fasta.bed` | ||
`gfastats testFiles/random1.fasta -ofa -e testFiles/random1.fasta.bed` | ||
`gfastats testFiles/random1.fasta -ofa -i testFiles/random1.fasta.bed` | ||
Size of components: | ||
`gfastats testFiles/random1.fasta -s s` | ||
`gfastats testFiles/random1.fasta -s c` | ||
`gfastats testFiles/random1.fasta -s g` | ||
AGP: | ||
`gfastats testFiles/random1.fasta -b a` | ||
BED coordinates: | ||
`gfastats testFiles/random1.fasta -b s` | ||
`gfastats testFiles/random1.fasta -b c` | ||
`gfastats testFiles/random1.fasta -b g` | ||
Sorting: | ||
`gfastats testFiles/random1.fasta -ofa --sort largest` | ||
`gfastats testFiles/random1.fasta -ofa --sort descending` | ||
`gfastats testFiles/random1.fasta -ofa --sort test.sort` | ||
GFA2: | ||
`gfastats testFiles/random1.gfa2 -o gfa2` | ||
GFA2 to FASTA conversion: | ||
`gfastats testFiles/random1.gfa2 -o fasta` | ||
GFA2 to GFA1 conversion: | ||
`gfastats testFiles/random1.gfa2 -o gfa` | ||
GFA1: | ||
`gfastats testFiles/random2.gfa -o gfa` | ||
GFA1 to FASTA: | ||
`gfastats testFiles/random2.gfa -o fasta` | ||
GFA1 to GFA2: | ||
`gfastats testFiles/random2.gfa -o gfa2` | ||
GFA1 no sequence: | ||
`gfastats testFiles/random2.noseq.gfa -o gfa` | ||
GFA1 no sequence to FASTA: | ||
`gfastats testFiles/random2.noseq.gfa -o fa` | ||
Homopolymer compression: | ||
`gfastats testFiles/random1.fasta --homopolymer-compress 1 -ofa` | ||
Find terminal overlaps: | ||
`gfastats testFiles/random5.findovl.gfa -ogfa` | ||
`gfastats testFiles/random5.findovl.gfa --discover-terminal-overlaps 3 -ogfa` | ||
Discover paths: | ||
`gfastats testFiles/random1.fasta -ogfa | grep -v "^P" > test.gfa` | ||
`gfastats test.gfa -ogfa` | ||
`gfastats test.gfa -ogfa2 --discover-paths` | ||
Superimpose AGP: | ||
`gfastats testFiles/random1.fasta -a testFiles/random1.agp -ofa` | ||
SAK reverse complement: | ||
`cat testFiles/random1.rvcp.sak` | ||
`gfastats testFiles/random1.fasta -ofa` | ||
`gfastats testFiles/random1.fasta -k testFiles/random1.rvcp.sak -ofa` | ||
Other SAK instructions: | ||
`cat testFiles/random1.instructions.sak` | ||
`gfastats testFiles/random1.fasta -ofa` | ||
`gfastats testFiles/random1.fasta -ofa -k <(head -1 testFiles/random1.instructions.sak)` | ||
`gfastats testFiles/random1.fasta -ofa -k <(head -2 testFiles/random1.instructions.sak)` | ||
`gfastats testFiles/random1.fasta -ofa -k <(head -3 testFiles/random1.instructions.sak)` | ||
`gfastats testFiles/random1.fasta -ofa -k <(head -4 testFiles/random1.instructions.sak)` | ||
`gfastats testFiles/random1.fasta -ogfa2 -k <(head -4 testFiles/random1.instructions.sak)` | ||
`gfastats testFiles/random1.fasta -ofa -k <(head -5 testFiles/random1.instructions.sak)` | ||
`gfastats testFiles/random1.fasta -ogfa2 -k <(head -5 testFiles/random1.instructions.sak)` | ||
`gfastats testFiles/random1.fasta -ofa -k <(head -6 testFiles/random1.instructions.sak)` | ||
`gfastats testFiles/random1.fasta -ogfa2 -k <(head -6 testFiles/random1.instructions.sak)` | ||
`gfastats testFiles/random1.fasta -ogfa2 -k <(head -6 testFiles/random1.instructions.sak)` | ||
`gfastats testFiles/random1.fasta -ogfa2 -k <(head -7 testFiles/random1.instructions.sak)` | ||
`gfastats testFiles/random1.fasta -ofa -k <(head -8 testFiles/random1.instructions.sak)` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
``` | ||
kreeq validate -f input.[fasta|fastq][.gz] -r reads1.fastq[.gz] reads2.fastq[.gz] [...] [-k 21] | ||
``` | ||
|
||
It accepts multiple read files as input, separated by spaces. To see all options and flags, use `kreeq -h`. | ||
|
||
You can test some typical usage with the files in the `testFiles` folder, e.g.: | ||
|
||
``` | ||
kreeq validate -f testFiles/random1.fasta -r testFiles/random1.fastq | ||
``` | ||
|
||
Importantly, the kreeq database needs to be computed only once per read set and can then be reused for multiple analyses to save runtime: | ||
|
||
``` | ||
kreeq validate -r testFiles/random1.fastq -o db.kreeq | ||
kreeq validate -f testFiles/random1.fasta -d db.kreeq | ||
``` | ||
|
||
Similarly, kreeq databases can be generated separately for multiple inputs and then combined, which can improve performance in HPC environments: | ||
|
||
``` | ||
kreeq validate -r testFiles/random1.fastq -o random1.kreeq | ||
kreeq validate -r testFiles/random2.fastq -o random2.kreeq | ||
kreeq union -d random1.kreeq random2.kreeq -o union.kreeq | ||
kreeq validate -f testFiles/random1.fasta -d union.kreeq | ||
``` |
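The per-input workflow above can be scripted. The following is a minimal sketch, assuming a POSIX shell: `kreeq_union_plan` is a hypothetical helper (not part of kreeq) that only prints the commands, using just the `validate -r/-o/-f/-d` and `union -d/-o` forms shown in this README. Naming each database after its read file is an assumption made for illustration.

```shell
# Hypothetical helper: print (do not run) the kreeq commands that build one
# database per read file, merge them, and validate an assembly against the union.
# Usage: kreeq_union_plan ASSEMBLY READS...
kreeq_union_plan() {
    assembly=$1
    shift
    dbs=""
    for reads in "$@"; do
        # Assumed naming scheme: random1.fastq -> random1.kreeq
        db="$(basename "$reads" .fastq).kreeq"
        echo "kreeq validate -r $reads -o $db"
        dbs="$dbs $db"
    done
    echo "kreeq union -d$dbs -o union.kreeq"
    echo "kreeq validate -f $assembly -d union.kreeq"
}

kreeq_union_plan testFiles/random1.fasta testFiles/random1.fastq testFiles/random2.fastq
```

Because the helper only prints, each `validate -r` line can be reviewed first or submitted as its own job before the `union` step is run.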
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
newpath1 1 5 1 W Header1 2 5 + | ||
newpath1 6 10 2 N 5 scaffold yes | ||
newpath1 11 13 3 W Header2 1 3 - | ||
newpath1 14 18 4 N 5 scaffold yes | ||
newpath1 19 24 5 W Header3 4 8 + | ||
newpath2 1 5 1 W Header5 3 7 - | ||
newpath2 6 10 2 N 5 scaffold yes | ||
newpath2 11 25 3 W Header4 1 15 + |
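The AGP lines above can be sanity-checked by comparing each part's span on the new path with the span of the component (or gap) it refers to. This is a rough sketch, assuming the standard AGP column order (object, begin, end, part number, type, then component id/begin/end/orientation for `W` lines or gap length for `N` lines) and 1-based inclusive coordinates. `agp_spans` is a hypothetical helper and says nothing about how gfastats itself treats the file.

```shell
# Hypothetical helper: for each AGP line print
#   object  part  type  object_span  component_or_gap_span  ok|mismatch
agp_spans() {
    awk '
        { obj_span = $3 - $2 + 1 }                  # 1-based, inclusive
        $5 == "N" { part_span = $6 + 0 }            # gap length is given directly
        $5 != "N" { part_span = $8 - $7 + 1 }       # component begin/end
        {
            status = (obj_span == part_span) ? "ok" : "mismatch"
            print $1, $4, $5, obj_span, part_span, status
        }
    '
}

agp_spans <<'EOF'
newpath1 1 5 1 W Header1 2 5 +
newpath1 6 10 2 N 5 scaffold yes
newpath1 11 13 3 W Header2 1 3 -
newpath1 14 18 4 N 5 scaffold yes
newpath1 19 24 5 W Header3 4 8 +
newpath2 1 5 1 W Header5 3 7 -
newpath2 6 10 2 N 5 scaffold yes
newpath2 11 25 3 W Header4 1 15 +
EOF
```

On these lines, parts 1 and 5 of `newpath1` are reported as `mismatch` (object spans of 5 and 6 against component spans of 4 and 5). The commit does not say whether that is intentional test input.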
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
>Header1 5bp sequence with no gaps and 2 lowercase bases | ||
CGa | ||
cT | ||
>Header2 5bp sequence with internal 1bp non-canonical gap | ||
CG | ||
AXT | ||
>Header3 10bp sequence with internal 4bp and 1bp terminal canonical gap | ||
TGANA | ||
TNCTN | ||
>Header4 15bp sequence with start 3bp canonical gap and 3 lowercase bases | ||
NNNTTCC | ||
TcgCACtC | ||
>Header5 15bp sequence with terminal 3bp canonical gap | ||
AACTCGAT | ||
CACGNNN |
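The header descriptions in this test FASTA can be cross-checked against the sequences themselves. This is a minimal sketch, assuming a POSIX shell with awk: `fasta_summary` is a hypothetical helper that joins wrapped sequence lines and prints, per record, the name, total length, number of `N`/`n` bases, and number of lowercase bases. It is independent of gfastats and does not reproduce its statistics.

```shell
# Hypothetical helper: per-record length, N count and lowercase count
# for a (possibly line-wrapped) FASTA read from standard input.
fasta_summary() {
    awk '
        function report(   len, n, lower, tmp) {
            if (name == "") return
            len = length(seq)
            tmp = seq; n = gsub(/[Nn]/, "", tmp)        # gsub returns the number of matches
            tmp = seq; lower = gsub(/[a-z]/, "", tmp)
            print name, len, n, lower
        }
        /^>/ { report(); name = substr($1, 2); seq = ""; next }
        { seq = seq $0 }                                # concatenate wrapped lines
        END { report() }
    '
}

fasta_summary <<'EOF'
>Header1 5bp sequence with no gaps and 2 lowercase bases
CGa
cT
>Header4 15bp sequence with start 3bp canonical gap and 3 lowercase bases
NNNTTCC
TcgCACtC
EOF
```

For the two records shown, the sketch prints `Header1 5 0 2` and `Header4 15 3 3`, which agrees with the lengths, gap sizes, and lowercase counts stated in their headers.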
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
Header1 0 5 | ||
Header2 0 3 | ||
Header2 4 5 | ||
Header3 0 3 | ||
Header3 4 6 | ||
Header3 7 9 | ||
Header4 2 13 | ||
Header5 3 14 |
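BED intervals are zero-based and half-open, so an interval covers `end - start` bases. The sketch below, assuming a POSIX shell with awk, totals the bases covered per sequence for the intervals above. `bed_covered` is a hypothetical helper, shown only to make the coordinate convention concrete.

```shell
# Hypothetical helper: sum (end - start) per sequence name, keeping input order.
bed_covered() {
    awk '
        !($1 in total) { order[++n] = $1 }     # remember first appearance
        { total[$1] += $3 - $2 }               # half-open interval length
        END { for (i = 1; i <= n; i++) print order[i], total[order[i]] }
    '
}

bed_covered <<'EOF'
Header1 0 5
Header2 0 3
Header2 4 5
Header3 0 3
Header3 4 6
Header3 7 9
Header4 2 13
Header5 3 14
EOF
```

This prints 5, 4, 7, 11, and 11 covered bases for `Header1` through `Header5`.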
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
H VN:Z:2.0 | ||
S Header1.1 5 CGacT | ||
S Header2.1 3 CGA | ||
S Header2.3 1 T | ||
S Header3.1 3 TGA | ||
S Header3.3 2 AT | ||
S Header3.5 2 CT | ||
S Header4.2 12 TTCCTcgCACtC | ||
S Header5.1 12 AACTCGATCACG | ||
G Header2.2 Header2.1+ Header2.3+ 1 | ||
G Header3.2 Header3.1+ Header3.3+ 1 | ||
G Header3.4 Header3.3+ Header3.5+ 1 | ||
G Header3.6 Header3.5+ Header3.5- 1 | ||
G Header4.1 Header4.2+ Header4.2+ 3 | ||
G Header5.2 Header5.1+ Header5.1- 3 | ||
O Header1 Header1.1+ 5bp sequence with no gaps and 2 lowercase bases | ||
O Header2 Header2.1+ Header2.2 Header2.3+ 5bp sequence with internal 1bp non-canonical gap | ||
O Header3 Header3.1+ Header3.2 Header3.3+ Header3.4 Header3.5+ Header3.6 10bp sequence with internal 4bp and 1bp terminal canonical gap | ||
O Header4 Header4.1 Header4.2+ 15bp sequence with start 3bp canonical gap and 3 lowercase bases | ||
O Header5 Header5.1+ Header5.2 15bp sequence with terminal 3bp canonical gap |
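In GFA2, an `S` line carries the segment id, its declared length, and the sequence, so the declared length can be compared with the actual sequence length. This is a small sketch, assuming whitespace-separated fields as shown above. `gfa2_check_segments` is a hypothetical helper and not a gfastats feature.

```shell
# Hypothetical helper: for each GFA2 segment print
#   id  declared_length  sequence_length  ok|mismatch
# Segments whose sequence is "*" are reported as no-sequence.
gfa2_check_segments() {
    awk '
        $1 == "S" {
            if ($4 == "*") { print $2, $3, "-", "no-sequence"; next }
            seqlen = length($4)
            status = (seqlen == $3 + 0) ? "ok" : "mismatch"
            print $2, $3, seqlen, status
        }
    '
}

gfa2_check_segments <<'EOF'
H VN:Z:2.0
S Header1.1 5 CGacT
S Header2.1 3 CGA
S Header2.3 1 T
S Header3.1 3 TGA
S Header3.3 2 AT
S Header3.5 2 CT
S Header4.2 12 TTCCTcgCACtC
S Header5.1 12 AACTCGATCACG
EOF
```

All eight segments in this file are reported as `ok`.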
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
JOIN Header1+ Header2+ 5 newGap1 Scaffold1 | ||
JOIN Header4+ Header5+ 5 newGap2 Scaffold2 | ||
JOIN Scaffold1+ Header3+ 10 newGap3 FinalScaffold | ||
SPLIT Header2.1 Header2.3 Scaffold3 Scaffold4 | ||
EXCISE Header3.3 3 newGap4 | ||
INVERT Header5.1 | ||
REMOVE Header1.1 | ||
RESIZE newGap2 10 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
RVCP Header4 | ||
RVCP Header3 |
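The two `RVCP` instructions above ask for `Header4` and `Header3` to be reverse-complemented. For reference, this is a minimal sketch of the operation itself, assuming a POSIX shell with awk. `revcomp` is a hypothetical helper that preserves case and leaves `N` and other non-ACGT symbols unchanged. How gfastats handles case and gaps internally is not shown in this commit.

```shell
# Hypothetical helper: reverse-complement each input line,
# preserving case and passing non-ACGT symbols (e.g. N) through unchanged.
revcomp() {
    awk '
        BEGIN {
            from = "ACGTacgt"; to = "TGCAtgca"
            for (i = 1; i <= length(from); i++)
                comp[substr(from, i, 1)] = substr(to, i, 1)
        }
        {
            out = ""
            for (i = length($0); i >= 1; i--) {     # walk the sequence backwards
                base = substr($0, i, 1)
                if (base in comp) base = comp[base]
                out = out base
            }
            print out
        }
    '
}

echo TGANATNCTN | revcomp        # Header3 from random1.fasta
echo NNNTTCCTcgCACtC | revcomp   # Header4 from random1.fasta
```

Under these assumptions the two lines print `NAGNATNTCA` and `GaGTGcgAGGAANNN`.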
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
H VN:Z:1.2 | ||
S 11 ACCTT LN:i:5 QL:Z:?@97? | ||
S 12 TCAAGG LN:i:6 QL:Z:@6?84@ | ||
S 13 CTTgaTT LN:i:7 QL:Z:>=?@877 | ||
L 11 + 12 - 4M | ||
L 12 - 13 + 5M | ||
L 11 + 13 + 3M | ||
J 11 + 13 - 5 SC:i:1 | ||
J 13 - 12 + 3 SC:i:1 | ||
P 14 11+;13-;12+ 5,3 | ||
P 15 11+,12-,13+ 4M,5M |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
H VN:Z:1.2 | ||
S 11 * LN:i:5 QL:Z:?@97? | ||
S 12 * LN:i:6 QL:Z:@6?84@ | ||
S 13 * LN:i:7 QL:Z:>=?@877 | ||
L 11 + 12 - 4M | ||
L 12 - 13 + 5M | ||
L 11 + 13 + 3M | ||
J 11 + 13 - 5 SC:i:1 | ||
J 13 - 12 + 3 SC:i:1 | ||
P 14 11+;13-;12+ 5,3 | ||
P 15 11+,12-,13+ 4M,5M |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
H VN:Z:1.2 | ||
S 11 CCGTTCCATGAAGGCCAGAGTTACTTACCGGCCCTTTCCATGCGCGCGCCATAAA LN:i:55 | ||
S 12 GATTTAAGAATATGTTAACGGAGGATTGCACGATCTTCTCTCCTCGTGAGAGAATTTATG LN:i:60 | ||
S 13 AAATCGCATAGCTATGTATTTTGCAGAGGTAGCGACATCTTGACGGGCACTTCACAGATAGTGGG LN:i:65 | ||
J 11 + 13 - 5 SC:i:1 | ||
J 13 - 12 + 3 SC:i:1 | ||
P 14 11+;13-;12+ 5,3 | ||
P 15 11+,12-,13+ 6M,5M |