Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Android-specific platform differences #118

Merged
merged 3 commits into from
Apr 10, 2023
Merged

Conversation

notgull
Copy link
Member

@notgull notgull commented Apr 4, 2023

Makes Active a no-op on all platforms.

cc @rib and @MarijnS95 Is this valid? I want to make extra sure, I don't want to screw this up twice

Post-Mortem

Originally I had created a window-handle crate to act as a safe window handle abstraction over raw-window-handle and proposed it be added to winit in rust-windowing/winit#2744. It was suggested that it should be moved into raw-window-handle so that both systems could be developed in lockstep, so I merged the crates in #110.

I also opened #111 for further discussion of raw window handles, to make sure that they worked for all platforms, present and future. At the time, I used a strategy where there would be a single Active token that only existed when the application was active. However, I soon realized that there were soundness flaws in this strategy, among other issues. Therefore, in #116, I implemented the current strategy, where an RwLock is used on Android to ensure that window handles can only be active when the application is.

Between the closing of #116 and the release, I'd implemented the new strategy in the form of other PRs in the rust-windowing ecosystem: rust-windowing/winit#2744, rust-windowing/glutin#1582 and rust-windowing/softbuffer#82. I'd also written a blog post about the initiative and posted it on Reddit, where it had received some visibility. Between both of these, I'd decided that the proposal had received enough eyes to be ready. At this point, I was unaware that both rib and MarjinS95 were predisposed, and took their perceived silence as approval. I pushed for a release and it was eventually released as v0.5.2.

In retrospect, I should have waited for one of the Android platform maintainers to directly respond to the issue.

Two days later, rib responded to #111 and pointed out an issue that I was previously unaware of: it is possible to increase the refcount on ANativeWindow such that it is not deleted. After some discussion that got leaked elsewhere (here, here, here), the conclusion we came to was:

  • The current Active system can be bypassed by cloning an ANativeWindow pointer.
  • However, since the ANativeWindow can be cloned and the reference sticks around, the reason why the Active system exists (to prevent using the reference after it's been deleted) can be worked around safely. This means that the Active isn't necessary.

However, since there's been a release we can't actually remove the Active/ActiveHandle types from the crate, until the next breaking release. So, this is the next best thing; just turning the Active into a no-op.

@MarijnS95
Copy link
Member

@notgull thanks for sending this in, I wouldn't call the original a screw-up outright but haven't caught up to the discussion in existing issues PRs (which should definitely be linked and explained as part of the context for why this PR is making this change 😉). Will take a closer look tomorrow, and we can always push users to not use the no-op via #[deprecated] without breaking changes. On that note I'm sorry for repeating the below if it has already been discussed:

It quite surprises me that these changes have been made and pressured into a release within a timespan of about two weeks, without any apparent community discussion/review not to mention a comment from active Android+Rust platform maintainers (turns out @rib was on holiday, and I had a massive backlog to process after also being out of country for a conference). I get it, it's sometimes hard to not have a release while porting other crates in the ecosystem to the new thing, but using temporary git references (for example) allows the new API to settle for a while to get more eyes on it and work out the final kinks (for a critical crate like RWH). We used to do that for the ndk in winit, so that it could use the latest features and force the ndk to finally make a release with winit in lockstep. It does help that these are (or were...) all under the same GitHub organization back then, and now with RWH.
But that's just my 2cts on the process. Still, thanks again for working on this crate and coming up with a safe abstraction!

(And for the commit title: s/no-ip/no-op)

@notgull
Copy link
Member Author

notgull commented Apr 4, 2023

Will take a closer look tomorrow, and we can always push users to not use the no-op via #[deprecated] without breaking changes.

Would probably be for the best (although I doubt there are any extant users aside from my PRs).

It quite surprises me that these changes have been made and pressured into a release within a timespan of about two weeks, without any apparent community discussion/review not to mention a comment from active Android+Rust platform maintainers (turns out @rib was on holiday, and I had a massive backlog to process after also being out of country for a conference).

Yes, this was undeniably a mistake on my part with no excuses. At this stage, I'd been able to implement Has[Display/Window]Handle on most types in the ecosystem without any real friction. In addition, I'd posted about my initiative to Reddit and had not received any real opposition to my design. My rationale was that at least one person familiar with Android would read my post and point out flaws if there were any. I took this silence as approval and pushed for a release.

In the future, I'll make sure to check with rust-windowing-specific platform experts before pushing for any kind of release.

I get it, it's sometimes hard to not have a release while porting other crates in the ecosystem to the new thing, but using temporary git references (for example) allows the new API to settle for a while to get more eyes on it and work out the final kinks (for a critical crate like RWH).

This is actually the original path I took. See rust-windowing/winit#2744, rust-windowing/glutin#1582 and rust-windowing/softbuffer#82. Like I said above, my rationale was that I'd already gotten enough eyes on the system that it was ready to be published.

But, ah well, this is recoverable. Chances are, with #104 we're likely to have a breaking release in the future anyways.

@notgull notgull force-pushed the notgull/de-active branch from 1be0e38 to ac9636b Compare April 4, 2023 23:03
@rib
Copy link

rib commented Apr 5, 2023

Cool, thanks for the follow up @notgull

Sorry that I'm again a bit slow with responding - I actually just started a new job this week, so been a bit busy :) as it happens though I'll be responsible for lots of Android + Rust enabling though, so fingers crossed I'll be in a good position to actively help with this kind of thing.

Yeah I wouldn't call it a screw up either - it's an awkward corner of Android - and I'll even be ready to accept egg on my face at some point when we discover there is some weird safety issue here that we've all overlooked.

To the best of my understanding currently though I think the conclusion above is correct. ANativeWindow is taken out of our hands and referenced by graphics API surfaces, so even if there were a safety issue with touching an ANativeWindow outside of the create/destroy protocol (which I also haven't seen anything to suggest that's the case, if holding a reference) then I don't think the Active system would be enough to address that issue.

Btw, although I generally try to follow r/rust and r/rust_gamedev I missed your blog post unfortunately. I guess the challenge with aiming to get feedback via Reddit is that (even though there are lots of people that read reddit) there are still very few people actively involved with Rust + Android atm, so your best chance at getting feedback is to probably reach out to people directly.

@notgull
Copy link
Member Author

notgull commented Apr 5, 2023

I actually just started a new job this week, so been a bit busy :) as it happens though I'll be responsible for lots of Android + Rust enabling though, so fingers crossed I'll be in a good position to actively help with this kind of thing.

Ooo, sounds fun! Good luck!

Yeah I wouldn't call it a screw up either - it's an awkward corner of Android - and I'll even be ready to accept egg on my face at some point when we discover there is some weird safety issue here that we've all overlooked.

Thanks. My gut still tells me that there's some kind of edge case here that we're forgetting. I feel like, every time a project touches Java, it gets ten times more complex for no reason.

Btw, although I generally try to follow r/rust and r/rust_gamedev I missed your blog post unfortunately. I guess the challenge with aiming to get feedback via Reddit is that (even though there are lots of people that read reddit) there are still very few people actively involved with Rust + Android atm, so your best chance at getting feedback is to probably reach out to people directly.

I guess that's the curse of trying to do things that not many people are doing; there just aren't enough experts to go around.

To the best of my understanding currently though I think the conclusion above is correct. ANativeWindow is taken out of our hands and referenced by graphics API surfaces, so even if there were a safety issue with touching an ANativeWindow outside of the create/destroy protocol (which I also haven't seen anything to suggest that's the case, if holding a reference) then I don't think the Active system would be enough to address that issue.

I'll take this as approval of the new system. Once @MarijnS95 signs off on this I'll go ahead and merge it.

@rib
Copy link

rib commented Apr 6, 2023

Thanks. My gut still tells me that there's some kind of edge case here that we're forgetting. I feel like, every time a project touches Java, it gets ten times more complex for no reason.

Yeah, generally, that's probably a reasonable concern.

Since (accidentally) becoming maintainer of the jni crate and trying to help get it in shape I feel like I've come across countless sharp-edged corners. In general JNI is pretty tricky to abstract safely and the jni crate in particular has started out as a relatively thin layer over the C JNI API, and in some places that's given a false sense of safety with how various APIs have been misleadingly marked as safe. I think that's also given me a healthy fear of Rust + JNI + Java in general.

For this particular issue I feel like it's something that I've looked at quite a few times and I think my focus is generally on the requirement that Android apps need to drop their graphics surfaces when suspending - which is slightly different to what the Active protocol was tackling. I probably wouldn't poke too hard with seeing how ANativeWindow behaves outside of the create/destroy protocol but at the same time I know I have unwitingly done that before without having things blow up. From everything I've seen by looking through the implementation of the view.Surface + ANativeWindow and surrounding abstraction then I think it's reasonable to expect that owning a reference is enough to keep an ANativeWindow alive and safe.

Just in terms of a C API it would be particularly cruel to provide a ref-counting API for some state and somehow let that state get pulled from under your feet while you think you hold a reference to it.

I guess that's the curse of trying to do things that not many people are doing; there just aren't enough experts to go around.

yeah, I guess as you get into the weeds there will always be fewer and fewer people that are coming along with you :) This is a particularly gnarly area I think. You need some interest in lower-level graphics stacks in addition to Android architecture and Rust <-> Java JNI.

I'll take this as approval of the new system. Once @MarijnS95 signs off on this I'll go ahead and merge it.

sound good to me

@notgull
Copy link
Member Author

notgull commented Apr 10, 2023

@MarijnS95 Poke, does this look good to you?

@MarijnS95
Copy link
Member

@notgull perhaps, I haven't had the time to look into anything over the Easter weekend, sorry for enjoying time off and slacking on this!

Comment on lines +25 to +28
/// On Android, it was previously believed that the application could enter the suspended state
/// and immediately invalidate all window handles. However, it was later discovered that the
/// handle actually remains valid, but the window does not produce any more GPU buffers. This
/// type is a no-op and will be removed at the next major release.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tad odd that some of the original documentation is still up above, and being nullified again right here?

src/borrowed.rs Outdated
@@ -96,8 +65,9 @@ impl Active {
/// use raw_window_handle::Active;
/// let active = Active::new();
/// ```
#[deprecated = "Will be removed at next major release, use ActiveHandle::new() for now"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this sensible (same for the other deprecated attributes on member fns) if the Active type as a whole is already marked as deprecated?

src/borrowed.rs Outdated
Comment on lines 45 to 48
/// On non-Android platforms, this is a ZST. On Android, this is a reference counted handle that
/// keeps the application active while it is alive.
#[derive(Clone)]
pub struct ActiveHandle<'a>(imp::ActiveHandle<'a>);
pub struct ActiveHandle<'a>(PhantomData<&'a UnsafeCell<()>>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the docs here be updated?

Copy link
Member

@MarijnS95 MarijnS95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with the new "borrowed" API of raw-window-handle and have just pointed out some generic findings on the diff. Overall the gain in removing this Active handling on Android is solid since the handle won't be pulled out from under you if holding a reference to it (which the NDK crate does for you; and graphics API's like Vulkan do too internally). The handle will just be "useless".

@notgull notgull merged commit 7967619 into master Apr 10, 2023
@notgull notgull deleted the notgull/de-active branch April 10, 2023 16:15
@notgull
Copy link
Member Author

notgull commented Apr 10, 2023

Thank you!

@notgull notgull mentioned this pull request Jun 10, 2023
notgull added a commit that referenced this pull request Jun 10, 2023
Lokathor pushed a commit that referenced this pull request Jun 10, 2023
* Fix changelog

* Add #118
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants