Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add safer borrowed handles #110

Merged
merged 2 commits into from
Mar 17, 2023

Conversation

notgull
Copy link
Member

@notgull notgull commented Mar 17, 2023

Pretty much a copy-paste of my window-handle repo to this crate. See here for rationale.

By the way, would it be possible to add me to the @rust-windowing/raw-window-handle team so that I can maintain this code?

@Lokathor
Copy link
Contributor

I'd be happy to have you on the team but I don't think I personally have permissions to edit the team. One of the higher level rust-windowing admins should be able to add you.

@kchibisov
Copy link
Member

By the way, would it be possible to add me to the @rust-windowing/raw-window-handle team so that I can maintain this code?

Done.

@notgull
Copy link
Member Author

notgull commented Mar 17, 2023

Thanks @kchibisov! I'll merge this patch in.

@notgull notgull merged commit d0eb070 into rust-windowing:master Mar 17, 2023
@notgull notgull deleted the borrowed-handles branch March 17, 2023 20:07
@Lokathor
Copy link
Contributor

@notgull can you also do a follow up PR to put this in the changelog so i can just pull the latest and release the patch once I'm free later?

notgull added a commit that referenced this pull request Mar 17, 2023
Lokathor pushed a commit that referenced this pull request Mar 17, 2023
@rib
Copy link

rib commented Apr 2, 2023

Hey, sorry for being late to the party here. I'm just trying to follow the recent changes here and get a clearer understanding.

Here are a few initial comments I made on related issue #111 where I was pinged #111 (comment)

I'm not sure that there was a safety issue on Android here - assuming a valid reference to an ANativeWindow is held.

Android apps need to be aware to recreate render surfaces when they resume but it was possible to handle that safely with the previous raw-window-handle + Winit API (E.g. here are some examples that work on Android and are also portable to desktop window systems: https://github.com/rust-mobile/rust-android-examples)

I'm maybe missing a full understanding currently of how consumers of this are intended to handle the introduced Active state here.

Are consumers now required to poll when a display handle becomes active/inactive and maybe automatically re-create surfaces when transitioning to active, and NOP when inactive?

My gut feeling at the moment is that Android applications should be driven by the app lifecycle events directly and instead of having handle consumers poll for lifecycle changes and automatically re-creating surfaces it seems like we should give out new handles to consumers when resumed/re-activated.

There can surely be lots of, potentially app-specific, details involved in creating graphics API surfaces from a handle so I'm very unclear if the implication with this design is that consumers polling for active state changes need to always be able to automatically/transparently re-create a surface that was equivalent to a previously created/configured surface? I don't think I'd expect that to always be feasible.

@notgull
Copy link
Member Author

notgull commented Apr 2, 2023

I'm maybe missing a full understanding currently of how consumers of this are intended to handle the introduced Active state here.

Are consumers now required to poll when a display handle becomes active/inactive and maybe automatically re-create surfaces when transitioning to active, and NOP when inactive?

My gut feeling at the moment is that Android applications should be driven by the app lifecycle events directly and instead of having handle consumers poll for lifecycle changes and automatically re-creating surfaces it seems like we should give out new handles to consumers when resumed/re-activated.

The theory is that consumers would actually hold references to the underlying HasWindowHandle object (such as the winit::window::Window). Under the hood, the lifecycle system would call Active::set_inactive() and Active::set_active() in order to set whether or not it's valid to create new window handles.

There isn't really any kind of "polling" here; when you want to access the window you check whether or not the application is suspended. While you hold the window handle, the application won't enter the suspended state until you stop holding it.

I'm not sure that there was a safety issue on Android here - assuming a valid reference to an ANativeWindow is held.

I'm not sure that the reference stays valid forever, as the docs say:

You MUST ensure that you do not touch the window object after returning from this function

The docs also seem to imply that the pointer could change in between the onNtiveWindowDestroyed and onNativeWindowCreated.

@MarijnS95
Copy link
Member

There isn't really any kind of "polling" here; when you want to access the window you check whether or not the application is suspended. While you hold the window handle, the application won't enter the suspended state until you stop holding it.

This sounds problematic. As mentioned in #111 (comment) the typical graphics API doesn't "want to access the window" to check whether it may be suspended at random: this (creating a Surface/Swapchain) is a one-off setup at the start of an app, and the lifecycle events currently provided by ndk-glue/android-activity and winit inform the user that it's time to clean up their implicit references to ANativeWindow by destroying their Surface/Swapchain.

The only odd case for add-hoc getting the window/handle is to read the size, or call ANativeWindow_lock() for CPU-based pixel blitting (the latter will happen on the NativeWindow struct directly though, not via its abstracted window handle).

@rib
Copy link

rib commented Apr 2, 2023

I'm maybe missing a full understanding currently of how consumers of this are intended to handle the introduced Active state here.
Are consumers now required to poll when a display handle becomes active/inactive and maybe automatically re-create surfaces when transitioning to active, and NOP when inactive?
My gut feeling at the moment is that Android applications should be driven by the app lifecycle events directly and instead of having handle consumers poll for lifecycle changes and automatically re-creating surfaces it seems like we should give out new handles to consumers when resumed/re-activated.

The theory is that consumers would actually hold references to the underlying HasWindowHandle object (such as the winit::window::Window). Under the hood, the lifecycle system would call Active::set_inactive() and Active::set_active() in order to set whether or not it's valid to create new window handles.

There isn't really any kind of "polling" here; when you want to access the window you check whether or not the application is suspended. While you hold the window handle, the application won't enter the suspended state until you stop holding it.

I'm not sure how this will be manageable. If you create a surface, such as an EGLSurface that's effectively holding on to the window handle indefinitely. Creating a surface is going to hold a reference to the ANativeWindow, so conceptually I think this lock needs to be held all the time that the surface exists - not just temporarily in some event handler callbacks. If that were the case though you'd never be able to acquire the lock to suspend?

E.g. see docs for creating an Vulkan surface which say how it takes a reference to the ANativeWindow: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/vkCreateAndroidSurfaceKHR.html#_c_specification

I'm not sure that there was a safety issue on Android here - assuming a valid reference to an ANativeWindow is held.

I'm not sure that the reference stays valid forever, as the docs say:

You MUST ensure that you do not touch the window object after returning from this function

The docs also seem to imply that the pointer could change in between the onNtiveWindowDestroyed and onNativeWindowCreated.

The ANativeWindow is effectively the backend for android.view.Surface and you can also explicitly get an ANativeView from a JNI reference to a surface.

You could e.g. call ANativeWindow_fromSurface and get an ANativeWindow outside of the onNativeWindowCreated/Destroyed protocol - in which case that documentation could be said to not apply. The problem is that that documentation isn't really for ANativeWindow itself - it's for ANativeActivity.

Your app won't work properly if you do just continue trying to use the window after the NativeActivity has informed you that its "terminated" so it's at least prudent that applications shouldn't do that but I'm not sure this is the same as saying it's going to be unsafe/unsound to access the ANativeWindow API while you own a reference to it.

Considering how the Surface Java API is a thin layer over ANativeWindow: https://android.googlesource.com/platform/frameworks/base/+/master/core/java/android/view/Surface.java i'd be surprised if ANativeWindow wasn't always safe to access while you have a valid reference.

As another point of reference it's also possible to acquire an ANativeWindow without using NativeActivity. E.g. in the implementation of the GameActivity backend of android-activity you can see how the backend simply creates an ANativeWindow from a surface and which it's finished with it then it _releases it via its ref count: https://github.com/rust-mobile/android-activity/blob/36ddfaa9cea7af9377aa43e0db6cc56e80ccc234/android-activity/game-activity-csrc/game-activity/GameActivity.cpp#L335

At least anecdotally I can also say that when I've accessed ANativeWindow I've not seen that lead to an abort/segfault etc - it's just resulted in a black screen from not being able to render to the surface any more.

@notgull
Copy link
Member Author

notgull commented Apr 2, 2023

As mentioned in #111 (comment) the typical graphics API doesn't "want to access the window" to check whether it may be suspended at random: this (creating a Surface/Swapchain) is a one-off setup at the start of an app, and the lifecycle events currently provided by ndk-glue/android-activity and winit

The idea is that, before you call any unsafe code that utilizes the event handle, you would run:

let _guard = match self.window_type.window_handle() {
    Err(HandleError::Inactive) => return Err(MyError::WindowWasInactive),
    Ok(handle) => {
        if handle != self.raw_window_handle_thats_registered_in_gl_or_whatever {
            return Err(MyError::UserDidntDropUsWhenWeRequestedIt);
        }

        handle
    }
};

The idea is that:

  • The _guard makes sure the handle isn't dropped out from under us.
  • If the handle changed, we should invalidate ourselves, so just error out.
  • If we're called while the window is inactive, error out.

As far as I know, this is the best way of delivering the notification that the window is suspended in a way that can be done 100% safely by the end user.

(Now that I think about it, the above code might be useful to add to HasWindowHandle as an extension trait...)

The ANativeWindow is effectively the backend for android.view.Surface and you can also explicitly get an ANativeView from a JNI reference to a surface.

You could e.g. call ANativeWindow_fromSurface and get an ANativeWindow outside of the onNativeWindowCreated/Destroyed protocol - in which case that documentation could be said to not apply. The problem is that that documentation isn't really for ANativeWindow itself - it's for ANativeActivity.

This is an interesting point that I wasn't aware of. In this case, I think the current API can accommodate this, though; we just add documentation to the ActiveHandle::new_unchecked function that it's allowed if the ANativeWindow was derived from an external JNI reference.

I'm not sure how this will be manageable. If you create a surface, such as an EGLSurface that's effectively holding on to the window handle indefinitely. Creating a surface is going to hold a reference to the ANativeWindow, so conceptually I think this lock needs to be held all the time that the surface exists - not just temporarily in some event handler callbacks. If that were the case though you'd never be able to acquire the lock to suspend?

I think that the system I mentioned above should counteract this.

@rib
Copy link

rib commented Apr 2, 2023

I'm not sure how this will be manageable. If you create a surface, such as an EGLSurface that's effectively holding on to the window handle indefinitely. Creating a surface is going to hold a reference to the ANativeWindow, so conceptually I think this lock needs to be held all the time that the surface exists - not just temporarily in some event handler callbacks. If that were the case though you'd never be able to acquire the lock to suspend?

I think that the system I mentioned above should counteract this.

I'm not sure that it does - or I didn't follow how it does.

this protocol is concerned with (not) letting the app touch the ANativeWindow outside of onNativeWindowCreated/Destroyed - due to a concern that it may be unsafe.

If an EGLSurface/VkSurface takes a reference to an ANativeWindow then that effectively continues to hold the handle (not from a Rust pov necessarily but it holds a reference to the thing we're trying to control) outside of the guard. You don't necessarily know what actions might lead a graphics API to poke at a surface.

I don't see how the transient guard will be able to affect the retention of surface references passed on to C APIs that are going to be held for long periods of time.

What's missing is that Android apps need to go and explicitly drop all graphics surfaces that could be holding on to an ANativeWindow, but I think that's a higher-level concern that probably can't be done automatically at the raw-window-handle level.

@MarijnS95
Copy link
Member

Your app won't work properly if you do just continue trying to use the window after the NativeActivity has informed you that its "terminated"

What's missing is that Android apps need to go and explicitly drop all graphics surfaces that could be holding on to an ANativeWindow

For completeness we deduced that holding on to ANativeWindow - with the appropriate refcount increase - is totally fine, but rendering to it won't have any effect. It's not just about dropping the graphics surface on "suspend", but also about picking up a new, fresh ANativeWindow on "resume" 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants