Output GFA missing unitigs in S lines #73
Comments
@atabeerk Hi, I have the same problem: Bandage warns me that the format is not correct. See the attached warning. How should I solve it? Thanks,
Hi @jianshu93, thanks for reaching out. We will look into this. Ataberk
Thanks for the quick response. The code is well written; I can run it without any problems and it produces the expected output. Only the format is off (it feels like a small bug). Let me know if you want my data to reproduce the error. Best, Jianshu
@jianshu93, if you can share
that would be very helpful. Feel free to attach the files to this issue or send an email to [email protected] if that is what you prefer. Ataberk
Hi @atabeerk, I shared with you the reads, the metaFlye assembly graph and the strainy graph output. I followed exactly the same scripts as you suggested: I've shared the input and output files with you via Google Drive; let me know if you cannot access them. Best,
Hi @jianshu93, I got the files. I will keep you updated. Best,
Hi @atabeerk, are there any updates on this error? I just need a GFA plot that can be visualized in Bandage for comparison purposes. Thanks,
Hi @jianshu93,
Oh, I did not realize that, not sure what happened. I can confirm that it works. But it seems that not many new strain genomes were created. Do you have any idea why, given that with a simple LJA/metaFlye assembly we can get 5 complete/circular genomes? Thanks,
Hi @jianshu93,
Hi @jianshu93, I was able to run strainy on my end and got the results. The statistics ( Let me know if you have more questions. Best,
Hi @atabeerk, Many thanks for confirming it. So the hairpin problem (several big assemblies in the plot) is not due to strain heterogeneity but to the coverage being too high, right? How can I resolve those hairpins if not via phasing? Thanks,
Hi @jianshu93, We were looking at the flye assembly and there are regions with heterogeneity. This may indicate that flye was able to phase some of these regions already. When we input a graph that is already partially phased (such as this one), it is not surprising that strainy produces a disconnected output. This is one possible reason. We are also aware that strainy may create disconnected graphs even with completely unphased input assemblies, and we are working on a fix that should be available in the next update. One way to alleviate these issues on this data with flye+strainy is to change flye parameters (e.g., sensitivity) so that the assembly produced by flye is simpler and more contiguous. Strainy may be able to work better with that kind of input. Best,
In the output GFA file (strainy_final.gfa), some unitigs in L lines do not have corresponding S lines. This may be due to attempting to remove the unitigs at some point (and removing their S lines) but forgetting to remove the L lines in which these unitigs are used.
The attached file is the output of the mock ONT dataset. Some unitigs that have that issue:
edge_1291_139, edge_956_40, edge_874_33, edge_3054_s1_3041692, edge_3024_11380, edge_1553_1030193, edge_2864_1000769
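For anyone who wants to check their own output for the same problem, here is a minimal sketch (not strainy's code) that lists L lines whose unitigs have no matching S line. It assumes a plain GFA1 file with tab-separated fields, where an S line holds the segment name in field 2 and an L line holds the two segment names in fields 2 and 4. The function name `find_dangling_links` and the toy segment names below are made up for illustration.

```python
from typing import Iterable, List, Set, Tuple


def find_dangling_links(gfa_lines: Iterable[str]) -> Tuple[Set[str], List[str]]:
    """Return (missing segment names, offending L lines) for GFA1 text.

    Hypothetical helper: assumes tab-separated GFA1 records.
    """
    segments: Set[str] = set()
    links: List[Tuple[str, str, str]] = []

    # Collect everything first, since L lines may precede their S lines.
    for raw in gfa_lines:
        line = raw.rstrip("\n")
        fields = line.split("\t")
        if fields[0] == "S" and len(fields) >= 2:
            segments.add(fields[1])
        elif fields[0] == "L" and len(fields) >= 4:
            # L <from> <from_orient> <to> <to_orient> <overlap>
            links.append((fields[1], fields[3], line))

    missing: Set[str] = set()
    bad_links: List[str] = []
    for src, dst, line in links:
        absent = {name for name in (src, dst) if name not in segments}
        if absent:
            missing |= absent
            bad_links.append(line)
    return missing, bad_links


# Toy input (invented, not taken from strainy_final.gfa).
example = [
    "H\tVN:Z:1.0",
    "S\tedge_a\tACGT",
    "S\tedge_b\tTTGA",
    "L\tedge_a\t+\tedge_b\t+\t0M",
    "L\tedge_b\t+\tedge_gone\t-\t0M",
]
missing, bad_links = find_dangling_links(example)
print(sorted(missing))
```

To run it on a real file, pass an open file handle instead of the list, e.g. `find_dangling_links(open("strainy_final.gfa"))`. Dropping the lines returned in `bad_links` should be enough to let Bandage load the graph, although that only hides the symptom and does not tell you whether the removed unitigs were meant to be there.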