Nanopore data analysis #484

enc-kcotto · 2024-09-03T17:12:42Z

Is your feature request related to a problem? Please describe.
When I try to analyze nanopore sequencing data with CRISPResso2, the run fails due to insufficient memory. I believe this arises from how CRISPResso caches reads: nanopore data is noisier than Illumina data, so a lower percentage of reads match the cache, and the cache eventually grows so large that it crashes the run. As a workaround, I've tried splitting the input FASTQ, running CRISPResso on each piece, and then running CRISPRessoAggregate. The aggregate function works well, but since I'm splitting a single sample, it would be nice if Aggregate could sum the information from all runs and report and plot it as one sample.
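For reference, the splitting step of the workaround above can be sketched roughly as follows. This is a minimal illustration, not part of CRISPResso2: the function name `split_fastq`, the chunk naming scheme, and the assumption of an uncompressed FASTQ with exactly four lines per read are all hypothetical choices.

```python
from itertools import islice
from pathlib import Path


def split_fastq(fastq_path, reads_per_chunk, out_dir):
    """Write consecutive chunks of an uncompressed FASTQ and return their paths.

    Hypothetical helper: assumes every read occupies exactly 4 lines
    (no wrapped sequence or quality lines).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(fastq_path).stem
    chunk_paths = []
    with open(fastq_path) as handle:
        while True:
            # Pull the next block of reads without loading the whole file.
            lines = list(islice(handle, 4 * reads_per_chunk))
            if not lines:
                break
            chunk_path = out_dir / f"{stem}.part{len(chunk_paths):04d}.fastq"
            chunk_path.write_text("".join(lines))
            chunk_paths.append(chunk_path)
    return chunk_paths
```

Each resulting chunk would then be passed to its own CRISPResso run, with the chunk size chosen so that a single run fits in memory.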

Describe the solution you'd like
It would be nice if Aggregate let you specify whether to merge the information from all of the runs into a single sample or keep the runs separate as distinct samples.
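To make the "merge" behavior concrete, here is a rough sketch of what summing per-run results could look like. It is not how CRISPRessoAggregate works today: the function `merge_allele_counts` and the input shape (one dict of allele sequence to read count per run) are assumptions made for illustration. The key point is that percentages are recomputed from the summed counts rather than averaged across runs.

```python
from collections import Counter


def merge_allele_counts(per_run_counts):
    """Sum allele read counts across runs of one sample.

    per_run_counts: iterable of {allele_sequence: read_count} dicts,
    one per chunk (hypothetical input shape).
    Returns {allele_sequence: (total_reads, percent_of_all_reads)}.
    """
    merged = Counter()
    for counts in per_run_counts:
        merged.update(counts)  # adds counts for alleles seen in several runs
    total = sum(merged.values())
    # Percentages come from the pooled total, so chunks of unequal size
    # are weighted correctly.
    return {allele: (n, 100.0 * n / total) for allele, n in merged.items()}
```

For example, two chunks with 10 and 30 reads would contribute in proportion to their read counts, which a simple average of the per-run percentages would not do.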

Describe alternatives you've considered
A potential alternative is to be able to turn off caching in order to prevent the creation of a large cache dictionary that eventually causes the run to fail.
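A middle ground between full caching and no caching would be a cache with a size limit. The sketch below is only an illustration of that idea, assuming a least-recently-used eviction policy. `BoundedCache` and `get_or_compute` are hypothetical names and do not reflect CRISPResso's actual cache implementation.

```python
from collections import OrderedDict


class BoundedCache:
    """Least-recently-used cache holding at most max_entries results."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def __len__(self):
        return len(self._data)

    def __contains__(self, key):
        return key in self._data

    def get_or_compute(self, key, compute):
        """Return the cached value for key, computing and storing it on a miss."""
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]
        value = compute(key)
        self._data[key] = value
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict the least recently used entry
        return value
```

With noisy reads that rarely repeat, a cap like this would keep memory flat at the cost of recomputing results for evicted reads.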

Additional context
Happy to hear any other ideas or discuss possible other implementations.

kclem · 2024-09-04T03:37:01Z

Hi @enc-kcotto, we're working on tools for analyzing longer sequences and will keep you updated as they progress.

How are you creating your sequencing libraries? Are you using hybrid selection or a primer-based method?

If you'd like to give input, feel free to reach out to me at [email protected].

kclem added the enhancement New feature or request label Sep 4, 2024
