COM convenience methods should take *mut this rather than &self #1025
Comments
I recently came to this same conclusion and have been worrying about it. Theoretically this could be worked around with exposed provenance: Of course, these are nightly-only APIs, which is a little problematic. The tracking issue FAQ links
Using exposed provenance is trickier than I'd hoped.

    #![feature(strict_provenance)]
    #![forbid(unsafe_op_in_unsafe_fn)]
    #![allow(dead_code)]
    #![allow(non_snake_case)]

    use std::sync::{Arc, atomic::*};
    use core::ffi::c_void;

    fn main() {
        let com = CComObject::new();
        unsafe { Arc::increment_strong_count(com) };
        unsafe { (*com).AddRef() };
        unsafe { (*com).Release() };
    }

    #[repr(C)] pub struct CComObject {
        vtbl: *const IUnknownVtbl,
        // above is accessible with &IUnknown's strict spatial provenance
        // below isn't accessible with &IUnknown's strict spatial provenance
        member: AtomicUsize,
    }

    impl core::ops::Deref for CComObject {
        type Target = IUnknown;
        fn deref(&self) -> &IUnknown {
            // This dance does nothing of value, since converting to &IUnknown
            // immediately re-introduces a new restricted spatial provenance?
            unsafe { &*core::ptr::from_exposed_addr(<*const Self>::expose_addr(self)) }
        }
    }

    impl CComObject {
        pub fn new() -> *const CComObject {
            Arc::into_raw(Arc::new(Self { vtbl: &Self::VTBL, member: AtomicUsize::new(0) }))
        }

        const VTBL : IUnknownVtbl = IUnknownVtbl {
            AddRef: Self::add_ref,
            Release: Self::release,
            QueryInterface: Self::query_interface,
        };

        extern "system" fn add_ref(unknown: *mut IUnknown) -> ULONG {
            // Rely on (&IUnknown -> *IUnknown -> usize) having the same value as:
            // <*const CComObject>::expose_addr(com_object) ?
            let this : *const CComObject = core::ptr::from_exposed_addr_mut(unknown.addr());
            //let this : *const CComObject = unknown.cast(); // breaks fetch_add if uncommented

            // MIRI at least no longer complains about this:
            unsafe { (*this).member.fetch_add(1, Ordering::Relaxed) };

            // MIRI still complains about this though:
            // error: Undefined Behavior: trying to retag from <wildcard> for SharedReadWrite permission at alloc1566[0x0], but no exposed tags have suitable permission in the borrow stack for this location
            unsafe { Arc::increment_strong_count(this) };

            dbg!("add_ref finished");
            0
        }

        // Stubby placeholders
        extern "system" fn release(_this: *mut IUnknown) -> ULONG { 0 }
        extern "system" fn query_interface(_this: *mut IUnknown, _riid: REFIID, _ppv_object: *mut *mut c_void) -> HRESULT { 0 }
    }

    // winapi fills
    pub type HRESULT = i32;
    pub type ULONG = u32;
    #[repr(C)] pub struct GUID { pub Data1: u32, pub Data2: u16, pub Data3: u16, pub Data4: [u8; 8] }
    pub type REFIID = *const GUID;

    #[repr(C)] pub struct IUnknown {
        lpVtbl: *const IUnknownVtbl,
    }

    #[repr(C)] pub struct IUnknownVtbl {
        pub QueryInterface: unsafe extern "system" fn(This: *mut IUnknown, riid: REFIID, ppvObject: *mut *mut c_void) -> HRESULT,
        pub AddRef: unsafe extern "system" fn(This: *mut IUnknown) -> ULONG,
        pub Release: unsafe extern "system" fn(This: *mut IUnknown) -> ULONG,
    }

    impl IUnknown {
        pub unsafe fn QueryInterface(&self, riid: REFIID, ppvObject: *mut *mut c_void) -> HRESULT { unsafe { ((*self.lpVtbl).QueryInterface)(self as *const _ as *mut _, riid, ppvObject) } }
        pub unsafe fn AddRef(&self) -> ULONG { unsafe { ((*self.lpVtbl).AddRef)(self as *const _ as *mut _) } }
        pub unsafe fn Release(&self) -> ULONG { unsafe { ((*self.lpVtbl).Release)(self as *const _ as *mut _) } }
    }
Seems more "necessary" than "would be nice"
I believe Miri is complaining in your example because you're only exposing the provenance of the So, I think you are right that it can be sound (under permissive provenance but not strict provenance!) for COM methods to take
Like I said in the issue description, it's perfectly possible to write COM code in a way which is unambiguously sound with respect to strict provenance by simply passing around raw pointers instead of converting them to references and back, with the only downside being that method calls have to look like
Ahh, good catch!
The other downside is that this would necessitate a winapi 0.4, or else abandoning any attempt to maintain soundness for existing winapi 0.3-based code.
This is more or less what windows-rs does FWIW (although they impl Drop and make it a COM-style pointer AFAIK). There's also
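
For illustration, here is a minimal sketch of a COM-style owning pointer of that general shape. It is not the actual windows-rs implementation. `ComPtr`, `Object`, `new_object`, `OBJECT_VTBL`, the `FREED` counter, and the snake_case vtable fields are all hypothetical names, and the two-entry vtable omits `QueryInterface`, so it is not layout-compatible with a real `IUnknown`. The point is only that `&self` on the wrapper borrows the wrapper (a single pointer) and never the COM object, so method-call syntax comes back without creating a narrow reference to the object.

```rust
use std::ptr::NonNull;
use std::sync::atomic::{fence, AtomicU32, Ordering};

// Simplified, hypothetical vtable: real IUnknown also has QueryInterface first.
#[repr(C)]
pub struct IUnknownVtbl {
    pub add_ref: unsafe extern "system" fn(this: *mut IUnknown) -> u32,
    pub release: unsafe extern "system" fn(this: *mut IUnknown) -> u32,
}

#[repr(C)]
pub struct IUnknown {
    vtbl: *const IUnknownVtbl,
}

/// Hypothetical owning pointer. It stores a raw pointer and never forms `&IUnknown`.
pub struct ComPtr(NonNull<IUnknown>);

impl ComPtr {
    /// Safety: `raw` must point to a live object and carry one owned reference.
    pub unsafe fn from_raw(raw: *mut IUnknown) -> Option<Self> {
        NonNull::new(raw).map(ComPtr)
    }

    pub fn as_raw(&self) -> *mut IUnknown {
        self.0.as_ptr()
    }
}

impl Clone for ComPtr {
    fn clone(&self) -> Self {
        let raw = self.as_raw();
        // Read the vtable through the raw pointer, then call AddRef.
        unsafe { ((*(*raw).vtbl).add_ref)(raw) };
        ComPtr(self.0)
    }
}

impl Drop for ComPtr {
    fn drop(&mut self) {
        let raw = self.as_raw();
        unsafe { ((*(*raw).vtbl).release)(raw) };
    }
}

// A toy object implemented in Rust, used only to exercise the wrapper.
#[repr(C)]
struct Object {
    base: IUnknown, // first field, so both pointers share an address
    refs: AtomicU32,
}

// Counts how many objects have been freed (for demonstration only).
static FREED: AtomicU32 = AtomicU32::new(0);

static OBJECT_VTBL: IUnknownVtbl = IUnknownVtbl {
    add_ref: object_add_ref,
    release: object_release,
};

unsafe extern "system" fn object_add_ref(this: *mut IUnknown) -> u32 {
    let refs = unsafe { &(*this.cast::<Object>()).refs };
    refs.fetch_add(1, Ordering::Relaxed) + 1
}

unsafe extern "system" fn object_release(this: *mut IUnknown) -> u32 {
    let this = this.cast::<Object>();
    let remaining = unsafe { &(*this).refs }.fetch_sub(1, Ordering::Release) - 1;
    if remaining == 0 {
        fence(Ordering::Acquire);
        // The raw pointer still covers the whole allocation, so freeing is fine.
        drop(unsafe { Box::from_raw(this) });
        FREED.fetch_add(1, Ordering::Relaxed);
    }
    remaining
}

fn new_object() -> *mut IUnknown {
    let obj = Object { base: IUnknown { vtbl: &OBJECT_VTBL }, refs: AtomicU32::new(1) };
    Box::into_raw(Box::new(obj)).cast()
}

fn main() {
    let before = FREED.load(Ordering::Relaxed);
    let a = unsafe { ComPtr::from_raw(new_object()) }.expect("non-null");
    let b = a.clone();
    drop(a); // one reference left, object stays alive
    assert_eq!(FREED.load(Ordering::Relaxed) - before, 0);
    drop(b); // last reference released, object freed
    assert_eq!(FREED.load(Ordering::Relaxed) - before, 1);
}
```

Because the wrapper is just a pointer, cloning and dropping it map directly onto `AddRef` and `Release`, and callers never need `unsafe` after construction.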
I suppose defining the COM interfaces as extern types rather than structs would solve the problem of out of bounds access.
The `RIDL` macro generates a nice set of Rust methods on the class struct for invoking the underlying interface methods without having to manually look them up in the vtable yourself. These methods do make COM APIs more convenient and idiomatic to use from Rust; however, I believe that they make it easy to trigger undefined behavior and should probably be replaced with associated functions that take a `*mut this` argument instead.

The most straightforward example of this is `IUnknown::Release`. `Release` can deallocate the object pointed to by the `&self` reference, invalidating that reference, which results in undefined behavior. The only way to avoid this problem is for `Release` to take `self` as a raw pointer instead.

The `Release` method could be special-cased to avoid this problem, but I believe it's unsound for other methods to take `&self` as well, due to Rust's pointer provenance rules. Under Stacked Borrows, a pointer derived from an `&T` may only be used to access the contents of that particular `T`, and not whatever larger object that `T` may happen to be a part of. Since COM methods will almost always result in accesses to memory outside the bounds of the base class struct, calling these methods will usually result in undefined behavior.

This is probably a mostly theoretical issue in the case of virtual calls across an FFI boundary to dynamically linked Windows code, since there's not much that the Rust optimizer can do with that. It's also not specifically an issue when Rust code implements a COM interface to be called by external system code (e.g. the element provider interfaces from the UI Automation API), since currently that would be done by directly populating the vtable struct with user-defined functions that take a `this: *mut T`. I would be most worried about a scenario where a single Rust codebase both defines a class that implements a COM interface and then calls that class's methods using the `RIDL`-generated convenience methods. However, I think it's desirable to avoid undefined behavior even if it currently seems unlikely for it to result in negative consequences in practice.

While it is invalid to use a pointer formed from an `&T` to access memory outside the bounds of the `T`, it's perfectly valid to take a pointer derived from a reference to a container, form a pointer to one of its members (using e.g. `addr_of`), and then form another pointer back to the original container and use it to access memory outside the bounds of the member. The issue at hand only occurs if the member pointer is converted to a member reference and then back. So, exposing the COM methods as associated functions of the base class struct (e.g. `fn IUnknown::Release(this: *mut Self) -> ULONG`) should be perfectly sound. It would be a little bit less convenient to use, and having classes `Deref` to their base class would no longer work the way it does currently, but it would prevent undefined behavior from occurring.
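
A minimal sketch of the proposed shape, under simplifying assumptions: this is not `RIDL` output, the vtable has only two entries (no `QueryInterface`, so it is not layout-compatible with a real `IUnknown`), and `Counter`, `COUNTER_VTBL`, `create`, and the snake_case names are hypothetical. It shows the raw-pointer round trip described above (container pointer, member pointer via `addr_of_mut!`, back to the container) with no reference to the base struct ever being created.

```rust
use std::ptr::addr_of_mut;
use std::sync::atomic::{fence, AtomicU32, Ordering};

// Simplified, hypothetical vtable: real IUnknown also has QueryInterface first.
#[repr(C)]
pub struct IUnknownVtbl {
    pub add_ref: unsafe extern "system" fn(this: *mut IUnknown) -> u32,
    pub release: unsafe extern "system" fn(this: *mut IUnknown) -> u32,
}

#[repr(C)]
pub struct IUnknown {
    vtbl: *const IUnknownVtbl,
}

impl IUnknown {
    // Associated functions, not methods: there is no `&self`, so no reference
    // narrower than the whole object is created on the way to the vtable call.
    pub unsafe fn add_ref(this: *mut Self) -> u32 {
        unsafe { ((*(*this).vtbl).add_ref)(this) }
    }

    pub unsafe fn release(this: *mut Self) -> u32 {
        unsafe { ((*(*this).vtbl).release)(this) }
    }
}

// A class implemented in Rust whose state lives outside the base struct.
#[repr(C)]
struct Counter {
    base: IUnknown, // first field, so base and container share an address
    refs: AtomicU32,
}

static COUNTER_VTBL: IUnknownVtbl = IUnknownVtbl {
    add_ref: counter_add_ref,
    release: counter_release,
};

unsafe extern "system" fn counter_add_ref(this: *mut IUnknown) -> u32 {
    // Member pointer back to container pointer: plain cast, provenance intact.
    let this = this.cast::<Counter>();
    let refs = unsafe { &(*this).refs };
    refs.fetch_add(1, Ordering::Relaxed) + 1
}

unsafe extern "system" fn counter_release(this: *mut IUnknown) -> u32 {
    let this = this.cast::<Counter>();
    let remaining = unsafe { &(*this).refs }.fetch_sub(1, Ordering::Release) - 1;
    if remaining == 0 {
        fence(Ordering::Acquire);
        // Deallocating here is fine: no caller holds a `&IUnknown` to invalidate.
        drop(unsafe { Box::from_raw(this) });
    }
    remaining
}

impl Counter {
    fn create() -> *mut IUnknown {
        let obj = Box::into_raw(Box::new(Counter {
            base: IUnknown { vtbl: &COUNTER_VTBL },
            refs: AtomicU32::new(1),
        }));
        // Container pointer to member pointer without going through a reference.
        unsafe { addr_of_mut!((*obj).base) }
    }
}

fn main() {
    let unknown = Counter::create();
    unsafe {
        assert_eq!(IUnknown::add_ref(unknown), 2);
        assert_eq!(IUnknown::release(unknown), 1);
        assert_eq!(IUnknown::release(unknown), 0); // frees the object
    }
}
```

The call sites read `IUnknown::release(ptr)` rather than `ptr.Release()`, which is the loss of convenience mentioned above.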