Hi @BjornFJohansson, this is what we discussed the other day and that I could not explain clearly. Here is an example. When calling Assembly.assemble_linear, the only assemblies returned are those that start from the first fragment (in either orientation) and finish with the last fragment (in either orientation). See the minimal example below, where the same inputs are provided but their order is changed:
# Current pydna assembly implementation
from pydna import assembly
from pydna.dseqrecord import Dseqrecord

fragments = [
    Dseqrecord('aaaTCGATGGGaaa', id='f_1'),
    Dseqrecord('ccTCGATGGGcccCTCTCATAcc', id='f_2'),
    Dseqrecord('ggCTCTCATAggg', id='f_3'),
]

print('Old implementation, original order')
asm = assembly.Assembly(fragments, limit=8)
for output in asm.assemble_linear():
    print(output.seq)
print()

print('Old implementation, change order')
# Change the order, now fragment f_1 is last
asm = assembly.Assembly(fragments[1:] + fragments[:1], limit=8)
for output in asm.assemble_linear():
    print(output.seq)
print()
This prints
Old implementation, original order
aaaTCGATGGGcccCTCTCATAggg
Old implementation, change order
ggTATGAGAGgggCCCATCGAttt < This is (f_2 inverted + f_1 inverted)
ccTCGATGGGaaa < This is f_2 + f_1
As you can see, it only returns assemblies that start from the first fragment and finish with the last fragment (each in either orientation), even when the first result is only a subassembly of f_1 + f_2 + f_3.
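The annotations on the reordered output can be checked by hand without pydna. The following is a minimal sketch in plain Python: `revcomp` and `join_on_overlap` are hypothetical helpers written for this illustration (they are not part of pydna), and the joining logic is only a stand-in for what the assembly does with the shared 8 bp overlap TCGATGGG.

```python
# Case-preserving complement table for DNA (assumption: only A, C, G, T appear).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq: str) -> str:
    """Return the reverse complement, keeping upper/lower case."""
    return seq.translate(COMPLEMENT)[::-1]


def join_on_overlap(left: str, right: str, overlap: str) -> str:
    """Join two sequences through a shared overlap (case-insensitive match).

    Keeps `left` up to the end of the overlap, then appends the part of
    `right` that follows the overlap.
    """
    i = left.upper().index(overlap.upper())
    j = right.upper().index(overlap.upper())
    return left[: i + len(overlap)] + right[j + len(overlap):]


f_1 = "aaaTCGATGGGaaa"
f_2 = "ccTCGATGGGcccCTCTCATAcc"
overlap = "TCGATGGG"  # 8 bp shared by f_1 and f_2, matching limit=8

# f_2 + f_1
forward = join_on_overlap(f_2, f_1, overlap)
# f_2 inverted + f_1 inverted, joined on the reverse complement of the overlap
inverted = join_on_overlap(revcomp(f_2), revcomp(f_1), revcomp(overlap))

print(forward)   # ccTCGATGGGaaa
print(inverted)  # ggTATGAGAGgggCCCATCGAttt
```

Both strings match the two lines printed for the changed order, which supports reading them as f_2 + f_1 and its inverted counterpart rather than as full three-fragment assemblies.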
In contrast, the new implementation ignores the order of inputs for linear assemblies and always returns the same output. See how all possibilities are returned.
To reproduce the old behaviour and pass most old tests, I introduced the parameter use_fragment_order. If you agree, I think this can be removed after the merge (I will fix the tests).
import assembly2

print('New implementation, original order')
# New implementation
asm = assembly2.Assembly(fragments, limit=8, use_fragment_order=False)
for output in asm.assemble_linear():
    print(output.seq)
print()

print('New implementation, change order')
asm = assembly2.Assembly(fragments[1:] + fragments[:1], limit=8, use_fragment_order=False)
for output in asm.assemble_linear():
    print(output.seq)
print()

print('New implementation, original order, start from first')
# To reproduce the old behavior, just set use_fragment_order=True
asm = assembly2.Assembly(fragments, limit=8, use_fragment_order=True)
for output in asm.assemble_linear():
    print(output.seq)
print()

print('New implementation, change order, start from first')
asm = assembly2.Assembly(fragments[1:] + fragments[:1], limit=8, use_fragment_order=True)
for output in asm.assemble_linear():
    print(output.seq)
This prints
New implementation, original order
aaaTCGATGGGcccCTCTCATAggg < f_1 + f_2 + f_3
ccTCGATGGGaaa < f_2 + f_1
ggCTCTCATAcc < f_3 + f_2
New implementation, change order
aaaTCGATGGGcccCTCTCATAggg
ccTCGATGGGaaa
ggCTCTCATAcc
New implementation, original order, start from first
aaaTCGATGGGcccCTCTCATAggg
New implementation, change order, start from first
ccTCGATGGGaaa
ggTATGAGAGgggCCCATCGAttt
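One way the updated tests could assert order independence is to compare sets of products rather than lists. This is a minimal sketch that uses the sequences printed above as plain strings instead of calling `assembly2`. `revcomp`, `canonical` and `product_set` are hypothetical helpers, and treating a product and its reverse complement as the same molecule is an assumption of this sketch, not something the new implementation is confirmed to do.

```python
# Case-preserving complement table for DNA (assumption: only A, C, G, T appear).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Return the reverse complement, keeping upper/lower case."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(seq):
    """Strand-independent key: the smaller of a sequence and its reverse complement."""
    return min(seq, revcomp(seq))


def product_set(seqs):
    """Collapse a list of products into an order- and strand-independent set."""
    return {canonical(s) for s in seqs}


# Outputs copied from the printout above.
new_original = ["aaaTCGATGGGcccCTCTCATAggg", "ccTCGATGGGaaa", "ggCTCTCATAcc"]
new_changed = ["aaaTCGATGGGcccCTCTCATAggg", "ccTCGATGGGaaa", "ggCTCTCATAcc"]
ordered_original = ["aaaTCGATGGGcccCTCTCATAggg"]
ordered_changed = ["ccTCGATGGGaaa", "ggTATGAGAGgggCCCATCGAttt"]

# use_fragment_order=False: same products whatever the input order.
order_independent = product_set(new_original) == product_set(new_changed)
# use_fragment_order=True: products depend on which fragment comes first.
order_dependent = product_set(ordered_original) != product_set(ordered_changed)

print(order_independent, order_dependent)  # True True
```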
To reproduce the old behaviour and pass most old tests, I introduced the parameter use_fragment_order. If you agree, I think this can be removed after the merge (I will fix the tests).
Should I then remove this behaviour after the merge?
cc @hiyama341 @JamesBagley since they might be interested