Novel-X supplementary data

This repository contains supplementary data for the paper "Efficient detection and assembly of non-reference DNA sequences with linked-reads".

10X_insertions_supplementary.pdf contains the supplementary tables and figures for the Novel-X paper.

VCF-files

The results folder contains all VCF files used for the benchmarks that generated Figures 4-5 and Tables S1-S3 of the article.

NA12878, NA19240, CHM1, CHM13, and HG002 contain VCFs created using 10X Chromium datasets.

NA12878_tellseq, HG002_tellseq - UST Tell-Seq datasets.

NA12878_stlfr, HG002_stlfr - stLFR datasets.

The simulated folder contains several VCFs:

No postfix - simple simulated dataset created using LRSIM.
80, 60, 40, 20 postfix - datasets created by downsampling the original simulated dataset, removing 80/60/40/20 percent of reads.
spades_first, supernova_first, velvel_second, supernova_second - VCFs that were used to create Table S2. For example, spades_first means that SPAdes was used instead of Velvet in the first assembly round.
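The downsampled datasets above remove a fixed percentage of reads. The exact procedure used for the paper is not described here, so the following is only a minimal sketch of one way to do it, assuming uniform random removal of read names with a fixed seed. The function name `downsample_read_names` and its parameters are hypothetical and are not part of this repository.

```python
import random
from typing import Iterable, Set


def downsample_read_names(read_names: Iterable[str],
                          remove_fraction: float,
                          seed: int = 0) -> Set[str]:
    """Return the read names kept after removing `remove_fraction` of them.

    Hypothetical helper: uniform random removal is an assumption, not the
    documented procedure behind the 80/60/40/20 datasets.
    """
    if not 0.0 <= remove_fraction <= 1.0:
        raise ValueError("remove_fraction must be between 0 and 1")
    # Sort the unique names so the result depends only on the seed,
    # not on the input order.
    names = sorted(set(read_names))
    remove_count = round(len(names) * remove_fraction)
    keep_count = len(names) - remove_count
    rng = random.Random(seed)
    return set(rng.sample(names, keep_count))
```

Selecting by read name rather than by record keeps both mates of a pair together when the resulting set is used to filter paired FASTQ files.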

Important Python scripts

compare_vcf.py - script that was used to compare events by position and SV length. To run it, go to the repository root folder and pass the dataset name as a parameter (simulated, HG002, NA12878_tellseq, ...).
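To illustrate what comparing events by position and SV length can look like, here is a minimal sketch. It is not the code of compare_vcf.py: the `Insertion` type, the function names, and the tolerances `max_distance` and `min_length_ratio` are assumptions chosen for illustration, and only insertions with an explicit ALT sequence are handled.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Insertion(NamedTuple):
    chrom: str
    pos: int
    length: int


def parse_insertions(vcf_lines: Iterable[str]) -> List[Insertion]:
    """Collect insertions from VCF text lines (explicit ALT sequences only)."""
    events = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        if alt.startswith("<"):
            continue  # symbolic alleles such as <INS> are skipped in this sketch
        length = len(alt) - len(ref)
        if length > 0:
            events.append(Insertion(chrom, pos, length))
    return events


def events_match(a: Insertion, b: Insertion,
                 max_distance: int = 100,
                 min_length_ratio: float = 0.8) -> bool:
    """Two events match if they are close in position and similar in length.

    Both thresholds are hypothetical defaults, not values from the paper.
    """
    if a.chrom != b.chrom or abs(a.pos - b.pos) > max_distance:
        return False
    shorter, longer = sorted((a.length, b.length))
    return shorter / longer >= min_length_ratio


def compare_events(calls: List[Insertion],
                   truth: List[Insertion]) -> Tuple[int, int, int]:
    """Greedy one-to-one matching; returns (true pos., false pos., false neg.)."""
    used = set()
    true_positives = 0
    for call in calls:
        for i, expected in enumerate(truth):
            if i not in used and events_match(call, expected):
                used.add(i)
                true_positives += 1
                break
    return true_positives, len(calls) - true_positives, len(truth) - true_positives
```

The greedy matching lets each truth event be claimed by at most one call, so duplicate calls of the same insertion count as false positives.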

Novel-X enables discovery of novel insertions in a diverse cohort

The 68_samples folder contains the VCF files that were used for the analysis of a diverse cohort.

Novel-X flowchart

flowchart.pdf contains a detailed pipeline scheme.

Read and barcode usage statistics

read_barcode_statistics.pdf contains statistics on read and barcode usage at the key steps of Novel-X for each dataset from the paper.

