Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coordination_geometry int64 indices breaks on 32-bit systems #4161

Open
drew-parsons opened this issue Nov 9, 2024 · 4 comments · May be fixed by #4194
Open

coordination_geometry int64 indices breaks on 32-bit systems #4161

drew-parsons opened this issue Nov 9, 2024 · 4 comments · May be fixed by #4194
Labels

Comments

@drew-parsons
Copy link
Contributor

Python version

Python 3.12.7

Pymatgen version

2024.10.29

Operating system version

Debian unstable

Current behavior

pymatgen coordination geometry sets an explicit int64 for separation indices at

_sep = [np.array(ss, dtype=np.int64) for ss in separation]

This breaks on 32-bit systems at

inp = self.local_geometry.coords.take(separation_indices[1], axis=0)

inp = self.local_geometry.coords.take(separation_indices[1], axis=0)

(also ll.1981,1989).

On 32-bit systems array indices are int32 (or more precisely, np.intp). So using int64 as the argument to the take() function results in an error

        if sepplane.ordered_plane:
>           inp = self.local_geometry.coords.take(separation_indices[1], axis=0)
E           TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'

/<<PKGBUILDDIR>>/.pybuild/test_python3.12/pymatgen/analysis/chemenv/coordination_environments/coordination_geometry_finder.py:1979: TypeError

Error logs can be found at https://buildd.debian.org/status/fetch.php?pkg=pymatgen&arch=armel&ver=2024.10.29%2Bdfsg1-1&stamp=1731068454&raw=0
also armhf,

Expected Behavior

The index type provided to take() needs to be compatible.
Down-conversion could be down by changing l.1979 to

inp = self.local_geometry.coords.take(separation_indices[1].astype(np.intp), axis=0)

(likewise at the other 2 lines).

Down conversion could be a problem if you really wanted to use 64-bit indexing on 32-bit systems, but perhaps that's unavoidable. One way of avoiding unexpected problems is to not use int64 in the first place, and define the separation indices at l.1738 as

 _sep = [np.array(ss, dtype=np.intp) for ss in separation]

(in which case the astype patch at l.1979 would not be needed)

Minimal example

pytest-3 -v -k "test_real_systems or TestCoordinationGeometryFinder or test_from_structure_environments" tests/analysis/chemenv

Relevant files to reproduce this bug

No response

@DanielYang59
Copy link
Contributor

DanielYang59 commented Nov 10, 2024

Hi @drew-parsons thanks for reporting this and the analysis, sorry for the trouble caused.

We explicitly cast int to int64 in a previous migration to NumPy 2.x #3992, as NumPy 2.x changed the default NumPy/Cython integer on 64-bit Windows platform to int64 (was previously int32). Also np.intp was not used for the same reason (we want to consistently use int64 onwards, over declaring fused type).

This decision was made with the assumption that very few, if any, users would still be running a 32-bit platform. However, it apparently you are using one. Currently, pymatgen’s CI does not cover 32-bit systems, and I don’t have access to one for testing. I’ll need to set up a 32-bit test environment to investigate this further later.

@drew-parsons
Copy link
Contributor Author

Not using it personally, and you're right in practice there are probably few users (maybe some on Raspberry Pi). But the Debian project tries to support as many architectures as feasible, including a few 32-bit ones. Our build logs across the various architectures are listed at https://buildd.debian.org/status/package.php?p=pymatgen.

Debian hasn't migrated to numpy 2 yet, so those needful updates are still ahead for us.

@DanielYang59
Copy link
Contributor

But the Debian project tries to support as many architectures as feasible, including a few 32-bit ones.

Yep I agree having better compatibility is almost always a good thing, as long as the price is not too high. I would have to look at this a bit later, thanks for sharing the info.

@DanielYang59
Copy link
Contributor

DanielYang59 commented Nov 21, 2024

Okay I just set up a Debian 12 32-bit VM and I could recreate this, would try to fix this soon.

    def _cg_csm_separation_plane_optim2(
        self,
        coordination_geometry,
        sepplane,
        local_plane,
        points_perfect=None,
        separation_indices=None,
    ):
        argref_separation = sepplane.argsorted_ref_separation_perm
        permutations = []
        permutations_symmetry_measures = []
        stop_search = False
        # TODO: do not do that several times ... also keep in memory
        if sepplane.ordered_plane:
>           inp = self.local_geometry.coords.take(separation_indices[1], axis=0)
E           TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'

/home/yang/pymatgen/src/pymatgen/analysis/chemenv/coordination_environments/coordination_geometry_finder.py:1979: TypeError

4 failed, 20 passed, 2 skipped in 6.13s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants