Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring and tests of dnatracing.py #183

Closed
6 of 8 tasks
ns-rse opened this issue Jun 17, 2022 · 8 comments
Closed
6 of 8 tasks

Refactoring and tests of dnatracing.py #183

ns-rse opened this issue Jun 17, 2022 · 8 comments
Assignees
Labels
duplicate This issue or pull request already exists enhancement New feature or request Medium Priority

Comments

@ns-rse
Copy link
Collaborator

ns-rse commented Jun 17, 2022

topostats.tracing.dnatracing is the module that covers the processing after "topotracing" has processed an image via Filters(), Grains() and GrainStats(). It contains a couple of classes dnaTrace() and traceStats() and some work has been done on these to align them with the new workflow (see #166).

But there are no tests in place and the code can be refactored to be cleaner which in turn will allow for finder-grained tests to be written. Further docstrings need writing for all modules, classes and methods, there is some information there but not fully qualified numpy docstrings.

Many of the methods have for: loops in them, each of the contents of these loops should be abstracted out into its own _[method] and that method then called from within a loop of the associated [method]. This makes it easier to write tests on a single case.

There is a lot of old code left in various comments or documentation blocks that will need cleaning out.

@ns-rse
Copy link
Collaborator Author

ns-rse commented Jul 20, 2022

One thing to consider is replacing the skeletonize solution that dnatracing calls with the same functionality from scikit-image skeletonize.

@alicepyne
Copy link
Member

alicepyne commented Jul 20, 2022 via email

@ns-rse
Copy link
Collaborator Author

ns-rse commented Sep 16, 2022

This is quite a big refactor and so I'm going to break it down into smaller tasks, and so will use this issue as an "Epic" and break it down into smaller tasks and list the issues below.

@ns-rse
Copy link
Collaborator Author

ns-rse commented Dec 23, 2022

I've started work on this and am aiming to simplify a lot of the code by making the class process a single grain at a time, that removes all the complexity that is scattered around that repeats each step on multiple grains.

Processing multiple grains is then simplified by iterating over a dictionary of grains which itself may contain several different sets of information such as the grain in its raw processed form (post Filters), the mask of the grain (as skeletonisation currently works on binary arrays), the original coordinates should there be a desire to reconstruct the whole image with skeletons for example.

@ns-rse
Copy link
Collaborator Author

ns-rse commented Jan 20, 2023

Belatedly adding notes from meeting 2023-01-17

  • Make binary dilation optional, this sometimes results in grains touching the border and when this happens skeletonisation doesn't always work correctly (unclear what method though whether this is scikit-image or "Joe's". A possible solution to deal with this might be to np.pad() the dilated arrays (masked or otherwise) so that there are blanks/zero's and dilated images don't touch the edges.

  • Rather than purge_obvious_crap we need to have the option to remove objects that after skeletonisation are smaller than a minimum size. Despite removing small objects during filtering and grain detection we can still end up with blobs (roughly circular artefacts) that when skeletonised reduce down to very small skeletons.

As a dummy example...

[
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
 [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
 [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
 [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
 [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
 [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
 [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
 [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
 [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

...might skeletonise to...

[
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

Strictly it has skeletonised, but its neither circular nor linear and should be removed.

@ns-rse
Copy link
Collaborator Author

ns-rse commented Jun 1, 2023

Add a function to topostats.io to write co-ordinates of traces to CSV.

Documented in #595 and added to the todo list.

@ns-rse
Copy link
Collaborator Author

ns-rse commented Aug 10, 2023

After #610 has been completed the next step is to spline the traces, current code-base only performs this for linear molecules, be sure to git rebase and include the changes in #653.

@ns-rse
Copy link
Collaborator Author

ns-rse commented Sep 23, 2024

Closing as work has been undertaken under the EPIC #800 and related issues linked from there.

@ns-rse ns-rse closed this as completed Sep 23, 2024
@ns-rse ns-rse added the duplicate This issue or pull request already exists label Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duplicate This issue or pull request already exists enhancement New feature or request Medium Priority
Projects
None yet
Development

No branches or pull requests

4 participants