Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: modernise HBA to use AG as objects internally instead of selection strings - code only PR (Issue #3933) #4533

Open
wants to merge 11 commits into
base: develop
Choose a base branch
from
21 changes: 21 additions & 0 deletions package/CHANGELOG
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,27 @@ The rules for this file:


-------------------------------------------------------------------------------
03/27/23 lunamorrow

* 2.8.1

Fixes

Enhancements
* modernise `HydrogenBondAnalysis` to use `AtomGroups` as objects internally
instead of selection strings

Changes
* `guess_acceptors`, `guess_hydrogens` and `guess_donators` no longer return
selection strings (return `AtomGroups`instead)
* `_prepare` and `_single_frame` no longer pass selection strings to other
class methods (pass `AtomGroups` instead)

Deprecations
* Selection string return of `guess_acceptors`, `guess_hydrogens` and
`guess_donators` has been deprecated in favour of return of `AtomGroups`
and will removed in removed in version 3.0.0 (PR #4533)

??/??/?? IAlibay, HeetVekariya, marinegor, lilyminium, RMeli,
ljwoods2, aditya292002, pstaerk, PicoCentauri, BFedder,
tyler.je.reddy
Expand Down
141 changes: 79 additions & 62 deletions package/MDAnalysis/analysis/hydrogenbonds/hbond_analysis.py
Original file line number Diff line number Diff line change
Expand Up @@ -406,9 +406,9 @@ def guess_hydrogens(self,

Returns
-------
potential_hydrogens: str
String containing the :attr:`resname` and :attr:`name` of all
hydrogen atoms potentially capable of forming hydrogen bonds.
potential_hydrogens: AtomGroup
AtomGroup corresponding to all hydrogen atoms potentially capable
of forming hydrogen bonds.

Notes
-----
Expand All @@ -425,9 +425,9 @@ def guess_hydrogens(self,
the selection.

Alternatively, this function may be used to quickly generate a
:class:`str` of potential hydrogen atoms involved in hydrogen bonding.
This str may then be modified before being used to set the attribute
:attr:`hydrogens_sel`.
:class:`AtomGroup` of potential hydrogen atoms involved in hydrogen
bonding. This :class:`AtomGroup` may then be modified before being used
to set the attribute :attr:`hydrogens_sel`.


.. versionchanged:: 2.4.0
Expand All @@ -447,7 +447,7 @@ def guess_hydrogens(self,
))
]

return self._group_categories(hydrogens_ag)
return hydrogens_ag

def guess_donors(self, select='all', max_charge=-0.5):
"""Guesses which atoms could be considered donors in the analysis. Only
Expand All @@ -465,9 +465,9 @@ def guess_donors(self, select='all', max_charge=-0.5):

Returns
-------
potential_donors: str
String containing the :attr:`resname` and :attr:`name` of all atoms
that are potentially capable of forming hydrogen bonds.
potential_donors: AtomGroup
AtomGroup corresponding to all atoms potentially capable of forming
hydrogen bonds.

Notes
-----
Expand All @@ -485,9 +485,9 @@ def guess_donors(self, select='all', max_charge=-0.5):
selection.

Alternatively, this function may be used to quickly generate a
:class:`str` of potential donor atoms involved in hydrogen bonding.
This :class:`str` may then be modified before being used to set the
attribute :attr:`donors_sel`.
:class:`AtomGroup` of potential donor atoms involved in hydrogen
bonding. This :class:`AtomGroup` may then be modified before being used
to set the attribute :attr:`donors_sel`.


.. versionchanged:: 2.4.0
Expand All @@ -498,11 +498,11 @@ def guess_donors(self, select='all', max_charge=-0.5):
# We need to know `hydrogens_sel` before we can find donors
# Use a new variable `hydrogens_sel` so that we do not set
# `self.hydrogens_sel` if it is currently `None`
hydrogens_ag = self.guess_hydrogens()
if self.hydrogens_sel is None:
hydrogens_sel = self.guess_hydrogens()
hydrogens_sel = self._group_categories(hydrogens_ag)
else:
hydrogens_sel = self.hydrogens_sel
hydrogens_ag = self.u.select_atoms(hydrogens_sel)

# We're using u._topology.bonds rather than u.bonds as it is a million
# times faster to access. This is because u.bonds also calculates
Expand All @@ -521,9 +521,7 @@ def guess_donors(self, select='all', max_charge=-0.5):
)
)

donors_ag = donors_ag[donors_ag.charges < max_charge]

return self._group_categories(donors_ag)
return donors_ag[donors_ag.charges < max_charge]

def guess_acceptors(self, select='all', max_charge=-0.5):
"""Guesses which atoms could be considered acceptors in the analysis.
Expand All @@ -542,9 +540,9 @@ def guess_acceptors(self, select='all', max_charge=-0.5):

Returns
-------
potential_acceptors: str
String containing the :attr:`resname` and :attr:`name` of all atoms
that potentially capable of forming hydrogen bonds.
potential_acceptors: AtomGroup
AtomGroup corresponding to all atoms potentially capable of forming
hydrogen bonds.

Notes
-----
Expand All @@ -560,20 +558,19 @@ def guess_acceptors(self, select='all', max_charge=-0.5):
the selection.

Alternatively, this function may be used to quickly generate a
:class:`str` of potential acceptor atoms involved in hydrogen bonding.
This :class:`str` may then be modified before being used to set the
attribute :attr:`acceptors_sel`.
:class:`AtomGroup` of potential acceptor atoms involved in hydrogen
bonding. This :class:`AtomGroup` may then be modified before being used
to set the attribute :attr:`acceptors_sel`.


.. versionchanged:: 2.4.0
Added ability to use atom types

"""

ag = self.u.select_atoms(select)
acceptors_ag = ag[ag.charges < max_charge]
ag = self.u.select_atoms(select, updating=self.update_selections)

return self._group_categories(acceptors_ag)
return ag[ag.charges < max_charge]

@staticmethod
def _group_categories(group):
Expand Down Expand Up @@ -621,37 +618,50 @@ def _get_dh_pairs(self):
produce a list of donor-hydrogen pairs.
"""

# If donors_sel is not provided, use topology to find d-h pairs
if self.donors_sel is None:

# We're using u._topology.bonds rather than u.bonds as it is a million times faster to access.
# This is because u.bonds also calculates properties of each bond (e.g bond length).
# See https://github.com/MDAnalysis/mdanalysis/issues/2396#issuecomment-596251787
if not (hasattr(self.u._topology, 'bonds') and len(self.u._topology.bonds.values) != 0):
raise NoDataError('Cannot assign donor-hydrogen pairs via topology as no bond information is present. '
'Please either: load a topology file with bond information; use the guess_bonds() '
'topology guesser; or set HydrogenBondAnalysis.donors_sel so that a distance cutoff '
'can be used.')

hydrogens = self.u.select_atoms(self.hydrogens_sel)
donors = sum(h.bonded_atoms[0] for h in hydrogens) if hydrogens \
else AtomGroup([], self.u)

# Otherwise, use d_h_cutoff as a cutoff distance
else:

hydrogens = self.u.select_atoms(self.hydrogens_sel)
donors = self.u.select_atoms(self.donors_sel)
donors_indices, hydrogen_indices = capped_distance(
donors.positions,
hydrogens.positions,
max_cutoff=self.d_h_cutoff,
box=self.u.dimensions,
return_distances=False
).T

donors = donors[donors_indices]
hydrogens = hydrogens[hydrogen_indices]
# # If donors_sel is not provided, use topology to find d-h pairs
# if self.donors_sel is None:

# # We're using u._topology.bonds rather than u.bonds as it is a million times faster to access.
# # This is because u.bonds also calculates properties of each bond (e.g bond length).
# # See https://github.com/MDAnalysis/mdanalysis/issues/2396#issuecomment-596251787
# if not (hasattr(self.u._topology, 'bonds') and len(self.u._topology.bonds.values) != 0):
# raise NoDataError('Cannot assign donor-hydrogen pairs via topology as no bond information is present. '
# 'Please either: load a topology file with bond information; use the guess_bonds() '
# 'topology guesser; or set HydrogenBondAnalysis.donors_sel so that a distance cutoff '
# 'can be used.')

# hydrogens = self.u.select_atoms(self.hydrogens_sel)
# donors = sum(h.bonded_atoms[0] for h in hydrogens) if hydrogens \
# else AtomGroup([], self.u)

# # Otherwise, use d_h_cutoff as a cutoff distance
# else:

# hydrogens = self.guess_hydrogens()
# donors = self.guess_donors()
# donors_indices, hydrogen_indices = capped_distance(
# donors.positions,
# hydrogens.positions,
# max_cutoff=self.d_h_cutoff,
# box=self.u.dimensions,
# return_distances=False
# ).T

# donors = donors[donors_indices]
# hydrogens = hydrogens[hydrogen_indices]

hydrogens = self.hydrogens_ag #NOTE: do I need this?
donors = self.donors_ag #NOTE: do I need this?
donors_indices, hydrogen_indices = capped_distance(
donors.positions,
hydrogens.positions,
max_cutoff=self.d_h_cutoff,
box=self.u.dimensions,
return_distances=False
).T

donors = donors[donors_indices]
hydrogens = hydrogens[hydrogen_indices]

return donors, hydrogens

Expand Down Expand Up @@ -699,15 +709,22 @@ def _filter_atoms(self, donors, acceptors):
def _prepare(self):
self.results.hbonds = [[], [], [], [], [], []]

# Set unpaired acceptor and hydrogen AtomGroups
self.acceptors_ag = self.guess_acceptors() #NOTE: do I need this?
self.hydrogens_ag = self.guess_hydrogens()
self.donors_ag = self.guess_donors()

# Set atom selections if they have not been provided
if self.acceptors_sel is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to make sure acceptors_ag, hydrogens_ag, etc. always exist.

else:
    self.acceptors_ag = self.u.select_atoms(self.acceptors_sel)

self.acceptors_sel = self.guess_acceptors()
self.acceptors_sel = self._group_categories(self.acceptors_ag) #NOTE: do I need this?
if self.hydrogens_sel is None:
self.hydrogens_sel = self.guess_hydrogens()
self.hydrogens_sel = self._group_categories(self.hydrogens_ag)
#NOTE: add in self.donors_sel? Is this not done for a reason?
if self.donors_sel is None:
self.donors_sel = self._group_categories(self.donors_ag)

# Select atom groups
self._acceptors = self.u.select_atoms(self.acceptors_sel,
updating=self.update_selections)
self._acceptors = self.guess_acceptors()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You already have self.acceptors_ag above so you don't need to guess it again. Simply parse self._acceptors = self.acceptors_ag

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops I missed this one, thank you for picking it up :)

self._donors, self._hydrogens = self._get_dh_pairs()

def _single_frame(self):
Expand Down
Loading