Allow bonds etc to be additively guessed when present #4761

Merged: 32 commits into MDAnalysis:develop on Nov 11, 2024

Conversation

Member

@lilyminium lilyminium commented Oct 25, 2024

Fixes #4759

Changes made in this Pull Request:

  • Fixes bond caching when adding or removing bonds with guess_TopologyAttrs
  • bonds, etc. in force_guess now delete the existing bonds, etc. and replace them with a new guess
  • bonds, etc. in to_guess are simply applied additively on top of the existing bonds
  • Changes force-guessing bonds (as in the original guesser PR) to additively guessing them, which was the spirit of the previous bond guessing (as shown by some failing PDB tests in commit 1fc4514)
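
To illustrate the distinction the bullets above draw between to_guess and force_guess, here is a toy sketch. It is not MDAnalysis code: `apply_bond_guess` and its `force` flag are hypothetical names, and bonds are modelled as plain sets of atom-index pairs.

```python
def apply_bond_guess(existing, guessed, force=False):
    """Toy model of the two modes (hypothetical helper, not the library API).

    Bonds are pairs of atom indices; (2, 1) and (1, 2) count as the same bond.
    """
    def normalise(bonds):
        return {tuple(sorted(bond)) for bond in bonds}

    existing, guessed = normalise(existing), normalise(guessed)
    if force:
        # force_guess-like: discard the existing bonds, keep only the new guess
        return guessed
    # to_guess-like: add the guessed bonds on top of the existing ones
    return existing | guessed


existing = [(0, 1), (1, 2)]
guessed = [(2, 1), (2, 3)]

additive = apply_bond_guess(existing, guessed)
replaced = apply_bond_guess(existing, guessed, force=True)
```

In this sketch `additive` keeps the (0, 1) bond that the guess did not find, while `replaced` drops it. Using sets also means a bond that is both present and guessed is stored once.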

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Developers certificate of origin


📚 Documentation preview 📚: https://mdanalysis--4761.org.readthedocs.build/en/4761/

pep8speaks commented Oct 25, 2024

Hello @lilyminium! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 461:80: E501 line too long (86 > 79 characters)
Line 1244:1: W293 blank line contains whitespace
Line 1674:80: E501 line too long (87 > 79 characters)

Line 233:1: W293 blank line contains whitespace

Comment last updated at 2024-11-11 08:38:31 UTC

@lilyminium
Member Author

(I have a fix for this but can't get tests to work due to #4762; fix incoming in #4763.)

@lilyminium
Member Author

(Current status of PR: tests are failing due to #4762. Assuming they pass once that's resolved, this should be ready for review.)

@lilyminium lilyminium marked this pull request as ready for review October 26, 2024 04:42
@lilyminium
Member Author

@yuxuanzhuang if you have time to look at this, it'd be great if you could see if this resolves the original issue in #4759!

codecov bot commented Oct 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.64%. Comparing base (800b4b2) to head (839e7f3).
Report is 3 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4761      +/-   ##
===========================================
+ Coverage    93.59%   93.64%   +0.04%     
===========================================
  Files          177      189      +12     
  Lines        21710    22808    +1098     
  Branches      3052     3055       +3     
===========================================
+ Hits         20320    21358    +1038     
- Misses         943     1003      +60     
  Partials       447      447              


@lilyminium lilyminium added this to the Release 2.8.0 milestone Oct 27, 2024
Member

@IAlibay IAlibay left a comment

First half of a review - blocking over the NoDataError check, but also some suggestions for some code comments to help future devs / reviewers.

package/MDAnalysis/core/universe.py (three resolved threads)
@@ -1640,23 +1649,32 @@ def guess_TopologyAttrs(
     fg = attr in force_guess
     try:
         values = guesser.guess_attr(attr, fg)
-    except ValueError as e:
+    except NoDataError as e:
Member

Do we really want to do this? There are a couple of cases in guess_bonds where we can encounter a ValueError instead of a NoDataError.

e.g. https://github.com/MDAnalysis/mdanalysis/blob/develop/package/MDAnalysis/guesser/default_guesser.py#L430

Member

Also what should happen when you encounter a KeyError? (i.e. the unrecognised topologyattr check you added)

Member Author

Yup for this exact case -- I didn't want to change existing behaviour too much in case downstream packages were relying on it, and this is an existing test for vdwradii that wasn't introduced with guessers. The NoDataError is meant to catch cases where (e.g.) masses can't be guessed since there aren't types, not if the vdwradii is incomplete, so you can turn it off with error_if_missing. Same with the KeyError.

Member

I'm sorry, I just don't follow what you mean here.

Member Author

Oops. I hallucinated that you linked a test instead of a spot in the code, sorry about that. Basically IIRC there's two tests that test for this ValueError. One is definitely

def test_universe_guess_bonds_no_vdwradii(self):
    """Make a Universe that has atoms with unknown vdwradii."""
    with pytest.raises(ValueError):
        mda.Universe(two_water_gro_nonames, guess_bonds=True)
and the other one I can't immediately recall. To preserve existing pre-guesser behaviour, we want the ValueError to pass through this check and get caught by those tests.

These passed in the initial guesser PR because that was pre #4754.
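
The behaviour discussed in this thread can be sketched as follows. This is a self-contained toy, not the code in universe.py: `guess_attrs` and `toy_guesser` are hypothetical, `NoDataError` is a local stand-in for the library's exception, and the way `error_if_missing` switches between raising and skipping is an assumption based on the comments above.

```python
import warnings


class NoDataError(Exception):
    """Local stand-in so the sketch runs without MDAnalysis installed."""


def guess_attrs(attrs, guess_one, error_if_missing=True):
    """Guess each attribute; only NoDataError is handled here (toy loop)."""
    results = {}
    for attr in attrs:
        try:
            results[attr] = guess_one(attr)
        except NoDataError as err:
            # Assumed behaviour: missing input data is fatal only on request.
            if error_if_missing:
                raise
            warnings.warn(f"could not guess {attr}: {err}")
        # ValueError is deliberately not caught, so it reaches the caller.
    return results


def toy_guesser(attr):
    """Hypothetical guesser that fails in the two ways discussed above."""
    if attr == "masses":
        raise NoDataError("no types to guess masses from")
    if attr == "bonds":
        raise ValueError("unknown vdw radius")
    return f"guessed {attr}"


with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    skipped = guess_attrs(["types", "masses"], toy_guesser, error_if_missing=False)
```

With `error_if_missing=False` the masses failure is downgraded to a warning and `skipped` contains only the types entry, whereas asking for "bonds" still raises ValueError, which is what a test using `pytest.raises(ValueError)` relies on.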

Contributor

Yes, keeping the original behavior makes sense to me (and I confirm it is the original behavior).

Member

> The NoDataError is meant to catch cases where (e.g.) masses can't be guessed since there aren't types, not if the vdwradii is incomplete, so you can turn it off with error_if_missing. Same with the KeyError.

I'm not sure I understand what you mean by "same with the KeyError" - i.e. there is nothing here that catches the KeyError as far as I can tell.

Member

For the sake of not stalling things more I'm going to go with it (i.e. the KeyError shouldn't happen under normal circumstances), but I would like to know what you meant here @lilyminium - it's unclear if you meant that the KeyError should behave like the NoDataError, in which case this except should include the KeyError too?

package/MDAnalysis/core/universe.py (outdated, resolved)
Member

IAlibay commented Oct 27, 2024

Thanks for the work here @lilyminium - overall this looks good, but I'm blocking on a bit more discussion for the NoDataError and the "duplicate old bonds" test.

@orbeckst
Member

I don't know the topology/guessers well enough to give informed input here, sorry.

Contributor

@yuxuanzhuang yuxuanzhuang left a comment

After testing a few corner cases and comparing it with the previous behavior, I think it looks good to me!

Member

@IAlibay IAlibay left a comment

I have one follow-up question that I'm going to punt to later, and one formatting thing that I'll self merge. Otherwise lgtm.

testsuite/MDAnalysisTests/guesser/test_base.py (outdated, resolved)

@IAlibay IAlibay merged commit e6bc096 into MDAnalysis:develop Nov 11, 2024
24 checks passed
Development

Successfully merging this pull request may close these issues.

Unclear Error When Using to_guess=['bonds'] with Existing Bonds in Topology
5 participants