Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bond environments: bond-atom neighbouring #240

Open
wants to merge 3 commits into
base: system-with-data
Choose a base branch
from

Conversation

liam-o-marsh
Copy link
Contributor

@liam-o-marsh liam-o-marsh commented Oct 3, 2023

the changes I talked about, split into three commits that can be reviewed separately

There's just one point where I know feedback is needed, it's for the new API for the systems.
(specifically, the huge incoherence between pairs_per_species and triplets_per_species, that can be seen in the comments of the two methods)


📚 Documentation preview 📚: https://rascaline--240.org.readthedocs.build/en/240/

⚙️ Download Python wheels for this pull-request (you can install these with pip)

@Luthaf
Copy link
Member

Luthaf commented Oct 3, 2023

Thank you for the PR!

These changes are quite big, so could you actually make one PR with only the first commit, so we can discuss it & improve on it, and then later add the other commits in separate PRs?

I see a lot of changes to the C API, which I would prefer to keep minimal, since any additional function there impose requirements to new system implementations.

Could you explain what the batriplet is and why it is necessary to have this information in the C API? From a very quick look, It seems that this information could be rebuilt inside the calculator that needs it using only the neighbors list.

@liam-o-marsh
Copy link
Contributor Author

sure!
"BAtriplet"s are short for Bond/Atom triplets. They contain three atoms, two grouped in a bond (that acts as a center), and a third atom which acts as a neighbour.

I'm not sure what alternatives there would be to using this object explicitely.

If I were to create a new system with "ghost atoms" to serve as bond centers, I could represent the old system's triplets with the new system's pairs? But a lot of the pairs would end up being bond/bond or atom/atom, and would go to waste. Unless I deliberately break compute_pairs to not compute all of them.
Plus, I need the orientation of the bond, and I don't think I can fit that info in the ghost atoms.

I guess one way to bypass changing the system trait would be to have calculators::bondatom_neighbor_list::BANeighborList call directly systems::neighbors::TripletNeighborsList but I'm not sure how many things I would have to reroute and how rascaline-idiomatic it is to do so.

@Luthaf
Copy link
Member

Luthaf commented Oct 4, 2023

I'm not sure what alternatives there would be to using this object explicitely.

Inside SphericalExpansionForBonds or the other calculator, you could get two neighbors list, one with the bond cutoff and one with the neighbors cutoff:

system.compute_neighbors(bond_cutoff);
bonds = system.pairs();

// atoms_cutoff needs to be a bit bigger than the one in the current 
// implementation to be sure we get the same set of neighbors.
system.compute_neighbors(atoms_cutoff); 

let mut ba_triplets = vec![];
for bond in bonds {
    let mid_point = positions[bond.first] + bond.vector / 2;

    for pair in system.pairs_containing(bond.first) {
        if distance_from_mid_point_is_below_cutoff {
            ba_triplets.push(bond, other_atom_in_the_pair);
        }
    }

    for pair in system.pairs_containing(bond.second) {
        if distance_from_mid_point_is_below_cutoff {
            ba_triplets.push(bond, other_atom_in_the_pair);
        }
    }
}

This way, no changes to the system are required, and you can still get the data you need.


I guess one way to bypass changing the system trait would be to have calculators::bondatom_neighbor_list::BANeighborList call directly systems::neighbors::TripletNeighborsList but I'm not sure how many things I would have to reroute and how rascaline-idiomatic it is to do so.

This is not possible, because the System trait is here to abstract over system providers, meaning the system you get as input to your calculator are not necessarily systems::SimpleSystem. They could be something provided in rascaline-torch: https://github.com/Luthaf/rascaline/blob/55ad1f14363812639c865801a6d517d8c5f363f6/rascaline-torch/include/rascaline/torch/system.hpp#L19-L24

or even a Python implementation, where compute_neighbors will call pure Python code: https://github.com/Luthaf/rascaline/blob/55ad1f14363812639c865801a6d517d8c5f363f6/python/rascaline/rascaline/systems/ase.py#L18

This is also why I am reluctant to add new functions to the System API, since that mean implementing this function everywhere, and while there is a neighbor list that can be adapted relatively easily in ASE/LAMMPS/…, nothing similar exists for bond-atom triplets. Computing the bond-atom triplets from two pairs list would solve this.


EDIT: sorry, clicked the wrong button.

@Luthaf Luthaf closed this Oct 4, 2023
@Luthaf Luthaf reopened this Oct 4, 2023
@liam-o-marsh
Copy link
Contributor Author

oh, I see, thanks!
Now that I managed to work on this a little, I found another problem:
No matter how i slice it, the calculators I create need to have access to this list of triplets in their .keys(), .samples(), .positions_gradient_samples() and .compute().
And for now, the only possibilities I see are really not satisfactory:

  • recomputing the triplet list every time
  • implementing some sort of global cache that keeps them per-system (which… would be a hashmap based on… the memory address of the systems? that doesn't seem sound. But then again good luck having a HashMap<dyn System,_>)
  • having the calculator contain a Vec<_> cache whose lifetime depends (explicitely or not) on the lifetime of the &mut [Box<dyn System>] used with the calculator?
    The explicit dependency does not seem possible. For the implicit dependency, how do we detect that it is the same &mut [Box<dyn System>] passed to the different methods of the calculator?

If it is true that none of those ideas would cut it, how do you think I should proceed?

Thanks, and sorry for the late reply.

@Luthaf
Copy link
Member

Luthaf commented Oct 6, 2023

Another solution would be to allow the calculator to store data in the system, but I don't love the implications of this either.

I'll think about this over the weekend, and then maybe we can have a chat to find the best way forward!

@liam-o-marsh
Copy link
Contributor Author

sure!
have a nice weekend, and see you then!

@liam-o-marsh
Copy link
Contributor Author

A thing I thought about for this would be the possibility to add other methods to BaseCalculator that would be called with a &mut [Box<dyn System>], but also caches: &mut [BTreeMap<String,Any>].
and the default version of these new methods (implemented in trait) would fall back to keys,samples,compute,etc...

not exactly great either (and it feels way jankier than just allowing systems to carry a cache around), but probably less disruptive than changing the System trait.

@Luthaf
Copy link
Member

Luthaf commented Oct 9, 2023

Yes, this kinda what I landed on over the weekend. Replace the input of compute from &mut [Box<dyn System>] to &mut [SystemWithData] (name to be improved) where

struct SystemWithData {
    system: Box<dyn System>,
    data: BTreeMap<String,Any>
}

I think this is the cleanest solution here

@liam-o-marsh
Copy link
Contributor Author

that does sound way better than what I suggested.
Thank you!

@Luthaf
Copy link
Member

Luthaf commented Oct 13, 2023

I've implemented this in the system-with-data branch (the struct is called System, the trait is now SystemBase). Could you try to use it and let me know if this works for you?

For now, I allowed the data to be Any, but I'm also thinking to restrict it to TensorBlock instead (since we will have a similar mechanism in metatensor for atomistic simulations). Here as well, let me know if a TensorBlock would be enough for you or if you need the extra flexibility of Any!

@liam-o-marsh
Copy link
Contributor Author

ah, should have replied earlier.
thanks, I'll try to rebase my changes on top of that, sometimes next week.
(found a thorny math bug in some of my own code, it will take a while to actually pinpoint)

@liam-o-marsh
Copy link
Contributor Author

liam-o-marsh commented Oct 24, 2023

and that should be it
(…just note that it now has the changes from the systems_with_data branch, so I had to switch the target branch of the PR)

@liam-o-marsh liam-o-marsh changed the base branch from master to system-with-data October 24, 2023 15:03
@liam-o-marsh
Copy link
Contributor Author

I'll be using a PR within m-stack-org to make sure it passes CI tests. (which catches more errors than cargo test and cargo bench)

Copy link
Member

@Luthaf Luthaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only gave a quick look from a very high level, checking code organization and documentation.

I'm not sure I understand the separation between SphericalExpansionForBondType and SphericalExpansionForBond`. Do you plan on using both separately? I could not find any registration of the calculator, so I'm not sure which one you need to use for the actual ML work.

I would only include the ones you directly need.

In the current code, NeighborsList, SphericalExpansionByPair and SphericalExpansion are all used directly in various ML pipelines of the lab.

rascaline/src/pre_calculators/mod.rs Outdated Show resolved Hide resolved
rascaline/src/systems/mod.rs Outdated Show resolved Hide resolved
rascaline/src/systems/neighbors.rs Outdated Show resolved Hide resolved
rascaline/src/calculators/bondatom_neighbor_list.rs Outdated Show resolved Hide resolved
rascaline/src/calculators/bondatom_neighbor_list.rs Outdated Show resolved Hide resolved
rascaline/src/calculators/bondatom_neighbor_list.rs Outdated Show resolved Hide resolved
rascaline/src/calculators/bondatom_neighbor_list.rs Outdated Show resolved Hide resolved
rascaline/src/calculators/bondatom_neighbor_list.rs Outdated Show resolved Hide resolved
rascaline/src/calculators/bondatom_neighbor_list.rs Outdated Show resolved Hide resolved
@liam-o-marsh
Copy link
Contributor Author

and most of your remarks should be taken into account (notable exception: the ones that were about "TODO"s left in the code)

@liam-o-marsh
Copy link
Contributor Author

liam-o-marsh commented Dec 7, 2023

question:
are there specific things I need to do here before it gets merged?

EDIT: let me merge to take the clebsch-gordan restructure into account

@Luthaf
Copy link
Member

Luthaf commented Dec 7, 2023

Sorry for the delay, I was deep in another part of development which is now reaching its end; so I should have more time for rascaline work.

There is not much you need to do for now, I have to do another full round of review on this & finish + merge #260.


One point I wanted to discuss with you was how we could expose the fact that this is a very new, still experimental representation that might be changed if we realise it needs to do some things differently. My idea here was to have the code in the main library and make sure it continues working & passing tests, but force the user to opt-in and acknowledge this is experimental code if they want to use it.

An environment variable would be a good way to do this:

export RASCALINE_EXPERIMENTAL_BOND_ATOM_SPX=1
python script.py

or

import os
os.environ["RASCALINE_EXPERIMENTAL_BOND_ATOM_SPX"] = "1"

from rascaline import BondAtomSphericalExpansion

What do you think?

Another point about this is when will the code move out of this experimental state and always be available. I'm thinking having one published paper with it + the main author (you in this case, but we will use similar mechanism for other new representation later) agreeing that the code is ready for general use.

@@ -32,6 +33,15 @@ def __init__(self):
.. _chemfiles.Frame: http://chemfiles.org/chemfiles.py/latest/reference/frame.html
"""

if HAVE_PYSCF:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make the PYSCF system into a separate PR?

Copy link
Contributor Author

@liam-o-marsh liam-o-marsh Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

certainly!
for now, I'm bisecting a failing test on debian12 (that was there on debian11): valgrind is annoyed at entries 5 and 9 of cargo test -p rascaline-c-api --test run-cxx-tests
(a problem that pre-dates my changes)

Copy link
Contributor Author

@liam-o-marsh liam-o-marsh Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...bisecting only found the commit where equistore was renamed to metatensor, breaking the build on all older rascaline commits, due to a failed hash check on the zip archive containing metatensor at a given commit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

valgrind is annoyed at entries 5 and 9 of cargo test -p rascaline-c-api --test run-cxx-tests
(a problem that pre-dates my changes)

Hmm, can you open a separate issue for this? Or just share the setup with me on Slack? We run the valgrind tests on CI, so this is strange.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well nevermind the error disappears for a fresh clone of master

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll need to look more into this, but when it does/doesn't happen seems to be nonsense for now.

Copy link
Contributor Author

@liam-o-marsh liam-o-marsh Dec 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 'bug' was in the Cargo.lock file, it seems. Do you want me to send it to you, see if a dependency needs to be updated or something?

@liam-o-marsh
Copy link
Contributor Author

liam-o-marsh commented Dec 7, 2023

Sorry for the delay, I was deep in another part of development which is now reaching its end; so I should have more time for rascaline work.
There is not much you need to do for now, I have to do another full round of review on this & finish + merge #260.

OK! To be honest, I was also busy, with my candidacy exam.

export RASCALINE_EXPERIMENTAL_BOND_ATOM_SPX=1

Sure!

Another point about this is when will the code move out of this experimental state and always be available. I'm thinking having one published paper with it + the main author (you in this case, but we will use similar mechanism for other new representation later) agreeing that the code is ready for general use.

That seems like a good rule of thumb for this sort of thing. Thank you for thinking about this!

@liam-o-marsh
Copy link
Contributor Author

liam-o-marsh commented Apr 9, 2024

oh no
er.... I might have broken the PR... (I merged the changes from main)
I forgot it targeted the system-with-data branch rather than the main branch...
(speaking of, isn't system-with-data ready for prime time by now?)

@Luthaf
Copy link
Member

Luthaf commented Apr 9, 2024

speaking of, isn't system-with-data ready for prime time by now?

Yes, it should be. I mainly need to document and merge it!

EDIT: PR breakage is fine, we can fix it =)

[restructured contribution, commit 1]
[restructured contribution, commit 2]
[restructured contribution, commit 3]
@liam-o-marsh
Copy link
Contributor Author

aaand rebased.
also for some reason trying to sign commits with ssh rather than gpg makes github unhappy

@liam-o-marsh
Copy link
Contributor Author

except I probably need to rebase this further because of the name change, which seems to break some tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants