Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: allow int as placeholder array index type #3341

Closed
wants to merge 2 commits into from

Conversation

pfackeldey
Copy link
Collaborator

This PR allows integer indexing on Placeholder arrays, because we know that an int will return a new_length of 1. This is a typical operation that happens when calling repr of an awkward-array.

This is useful, because instead the repr building for (usually broken) awkward-arrays that contain Placeholder arrays will currently generate a rather cryptic error:

...
TypeError: PlaceholderArray supports only trivial slices, not int

where it's rather unclear what the underlying reason was.

With this PR, the repr can be built, and it shows the missing entries as ??:

<Array [{run: [??], ...}, ..., {run: ..., ...}] type='40 * {run: uint32, lu...'>

which at least gives a better hint on which columns are still placeholders given that they have ?? as entry.

This is a problem that typically occurs if the ak.Form of an array has entries without matching buffers.

cc @ikrommyd

@pfackeldey pfackeldey changed the title allow int as placeholder array index type fix: allow int as placeholder array index type Dec 13, 2024
@pfackeldey
Copy link
Collaborator Author

I think this would help for the errors seen in scikit-hep/uproot5#1349 and scikit-hep/coffea#1231

@pfackeldey
Copy link
Collaborator Author

This is also maybe worth a discussion on what the policy should be here.
Allowing the arrays to be constructed instead of throwing an error when building the repr has the advantage of inspecting the array to understand what went wrong. But it also may hide the fact that the arrays is broken in the first place...

Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this work if it returned

type(self)(self._nplike, (), self._dtype)

instead? That is, a zero-dimensional array, which is more like a scalar than a one-dimensional, length-one array.

Also, the shape of a PlaceholderArray can be more than 1-dimensional, right? In that case, the return value wouldn't be scalar.

@agoose77
Copy link
Collaborator

@pfackeldey my feeling here is that we shouldn't allow indexing on placeholders; they should only arise in contexts where indexing hasn't taken place.

The motivating issue for this PR is that an operation that does not occur at typetrace time is being performed after typetracing, which therefore performs an operation that wasn't anticipated, and fails.

My feeling is that we should watch the array repr to know about placeholders, instead!

@jpivarski
Copy link
Member

I agree, @agoose77! It would be preferable to just have a better repr.

@pfackeldey
Copy link
Collaborator Author

I agree! I was already a bit uncomfortable with this PR, my proposal would be to close this one and I'll look into having a better repr.

@pfackeldey pfackeldey closed this Dec 14, 2024
@pfackeldey pfackeldey deleted the pfackeldey/allow_int_as_placeholder_slice branch December 14, 2024 00:25
@ikrommyd
Copy link

ikrommyd commented Dec 14, 2024

FYI: This did solve the problems in scikit-hep/coffea#1231. Just mentioning it here for documentation purposes.

@pfackeldey
Copy link
Collaborator Author

@ikrommyd the fix for scikit-hep/coffea#1231 is something else I assume. I'll try to understand what goes on there in detail. This PR would allow you to run the reproducer in this issue, but the output array can not be trusted to be a valid array - it was not fixing the root cause of it.
PR #3342 won't fix your reproducer either, but the error message is likely better, because awkward-array likely can build a representation of the array now (this is my current understanding/assumption).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants