Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misaligned pointer dereference #210

Closed
fgadaleta opened this issue Aug 29, 2023 · 24 comments · May be fixed by #223
Closed

Misaligned pointer dereference #210

fgadaleta opened this issue Aug 29, 2023 · 24 comments · May be fixed by #223
Labels
bug Something isn't working

Comments

@fgadaleta
Copy link

fgadaleta commented Aug 29, 2023

Describe the bug

2023-08-29T08:14:34.620649Z  INFO bevy_winit::system: Creating new window "Bevy App" (0v0)
2023-08-29T08:14:34.621476Z  INFO winit::platform_impl::platform::x11::window: Guessed window scale factor: 1    
2023-08-29T08:14:35.225130Z  INFO bevy_render::renderer: AdapterInfo { name: "NVIDIA GeForce RTX 3060 Ti", vendor: 4318, device: 9353, device_type: DiscreteGpu, driver: "NVIDIA", driver_info: "525.125.06", backend: Vulkan }
thread 'main' panicked at /home/frag/.cargo/git/checkouts/bevy_mod_physx-ec641f3181f0becd/467a452/src/physx/src/traits/user_data.rs:26:41:
misaligned pointer dereference: address must be a multiple of 0x8 but is 0x7ffff4a09606
stack backtrace:
   0: rust_begin_unwind
             at /rustc/4e78abb437a0478d1f42115198ee45888e5330fd/library/std/src/panicking.rs:619:5
   1: core::panicking::panic_nounwind_fmt
             at /rustc/4e78abb437a0478d1f42115198ee45888e5330fd/library/core/src/panicking.rs:106:14
   2: core::panicking::panic_misaligned_pointer_dereference
             at /rustc/4e78abb437a0478d1f42115198ee45888e5330fd/library/core/src/panicking.rs:193:5
   3: physx::traits::user_data::UserData::init_user_data
             at /home/frag/.cargo/git/checkouts/bevy_mod_physx-ec641f3181f0becd/467a452/src/physx/src/traits/user_data.rs:26:41
   4: physx::material::Material::from_raw
             at /home/frag/.cargo/git/checkouts/bevy_mod_physx-ec641f3181f0becd/467a452/src/physx/src/material.rs:122:25
   5: physx::physics::Physics::create_material
             at /home/frag/.cargo/git/checkouts/bevy_mod_physx-ec641f3181f0becd/467a452/src/physx/src/physics.rs:418:13
   6: <bevy_physx::PhysicsCore as bevy_app::plugin::Plugin>::finish
             at /home/frag/.cargo/git/checkouts/bevy_mod_physx-ec641f3181f0becd/467a452/src/lib.rs:334:22
   7: bevy_app::app::App::finish
             at /home/frag/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bevy_app-0.11.0/src/app.rs:313:13
   8: bevy_winit::winit_runner::{{closure}}
             at /home/frag/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bevy_winit-0.11.0/src/lib.rs:352:17
   9: winit::platform_impl::platform::sticky_exit_callback
             at /home/frag/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.28.6/src/platform_impl/linux/mod.rs:884:9
  10: winit::platform_impl::platform::x11::EventLoop<T>::run_return::single_iteration
             at /home/frag/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.28.6/src/platform_impl/linux/x11/mod.rs:334:17
  11: winit::platform_impl::platform::x11::EventLoop<T>::run_return
             at /home/frag/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.28.6/src/platform_impl/linux/x11/mod.rs:443:31
  12: winit::platform_impl::platform::x11::EventLoop<T>::run
             at /home/frag/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.28.6/src/platform_impl/linux/x11/mod.rs:498:25
  13: winit::platform_impl::platform::EventLoop<T>::run
             at /home/frag/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.28.6/src/platform_impl/linux/mod.rs:792:56
  14: winit::event_loop::EventLoop<T>::run
             at /home/frag/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.28.6/src/event_loop.rs:305:9
  15: bevy_winit::run
             at /home/frag/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bevy_winit-0.11.0/src/lib.rs:186:5
  16: bevy_winit::winit_runner
             at /home/frag/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bevy_winit-0.11.0/src/lib.rs:787:9
  17: core::ops::function::FnOnce::call_once
             at /rustc/4e78abb437a0478d1f42115198ee45888e5330fd/library/core/src/ops/function.rs:250:5
  18: core::ops::function::FnOnce::call_once{{vtable.shim}}
             at /rustc/4e78abb437a0478d1f42115198ee45888e5330fd/library/core/src/ops/function.rs:250:5
  19: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/4e78abb437a0478d1f42115198ee45888e5330fd/library/alloc/src/boxed.rs:2007:9
  20: bevy_app::app::App::run
             at /home/frag/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bevy_app-0.11.0/src/app.rs:292:9
  21: intrepid_sim::main
             at ./src/main.rs:129:5
  22: core::ops::function::FnOnce::call_once
             at /rustc/4e78abb437a0478d1f42115198ee45888e5330fd/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread caused non-unwinding panic. aborting.
Aborted


@fgadaleta fgadaleta added the bug Something isn't working label Aug 29, 2023
@tgolsson
Copy link
Member

Please rerun with RUST_BACKTRACE=1 so we can see the callstack.

@tgolsson
Copy link
Member

tgolsson commented Aug 29, 2023

Thanks for the update! The misalignment here makes me wonder whether the passed in material itself is dead or incorrect. Is there a repository where I can run this code with a debugger attached?

@rlidwka CC :-)

@rlidwka
Copy link
Contributor

rlidwka commented Aug 29, 2023

@tgolsson, can you maybe spot anything that can cause this error in the following else branch? UserData being stored here is empty tuple, ().

} else {
// DATA_SIZE <= VOID_SIZE
unsafe {
*self.user_data_ptr_mut() =
*(&user_data as *const Self::UserData as *const *mut c_void)
}
}

This is a spurious error. It failed at the first material being created, only on one machine, only with one example, only in debug mode, and it worked on the same machine before. I cannot reproduce it myself.


The line it fails at is here:
https://github.com/rlidwka/bevy_mod_physx/blob/467a452eb94b069a6c997eadb0dcd13211e44673/src/lib.rs#L334

Be aware that that branch uses older physx-rs version that works with physx4 (because bindings for vehicles from physx5 don't exist yet), but relevant code in physx-rs hasn't changed since then.

If you want, you can run cargo run --example vehicle on the repository https://github.com/rlidwka/bevy_mod_physx, branch physx4_20230717, but I wouldn't expect it to do very much.

@rlidwka
Copy link
Contributor

rlidwka commented Aug 29, 2023

gonna debug it later today

@tgolsson
Copy link
Member

I'm really confused to be honest. Either the write is misaligned, or the read is... () has alignment of 1, but I don't think it could ever be aligned < 8 without breaking stack alignment in this code. Similarly, the write target is definitely offset by 16 from the start of the struct, which means that the PxMaterial would also be misaligned, which seems odd.

@tgolsson
Copy link
Member

tgolsson commented Aug 29, 2023

Ok, as usual, the problem is apparent after refuting it. The read must be the problem (if the material is valid). We're reading an 8 byte word from a location containing 1 byte. And reading those bytes together enforces the 8-byte alignment. Doesn't matter how we get to the invalid alignment, it's clearly UB. We should probably cast the target instead of the source... writing 1 byte to an 8-byte location is legal.

@rlidwka if you have repro, do you care to try switching the cast to the other side and seeing what falls out?

@rlidwka
Copy link
Contributor

rlidwka commented Aug 29, 2023

do you care to try switching the cast to the other side and seeing what falls out?

@tgolsson, on @fgadaleta's machine read seems to have alignment of 1 for the reasons I don't fully understand.

The following code works. Is this what you had in mind?

unsafe fn init_user_data(&mut self, user_data: Self::UserData) -> &mut Self {
    if size_of::<Self::UserData>() > size_of::<*mut c_void>() {
        // Too big to pack into a *mut c_void, kick it to the heap.
        let data = Box::into_raw(Box::new(user_data));
        *(self.user_data_ptr_mut() as *mut *mut c_void as *mut *mut Self::UserData) = data;
    } else {
        // DATA_SIZE <= VOID_SIZE
        *(self.user_data_ptr_mut() as *mut *mut c_void as *mut Self::UserData) = user_data;
    }
    self
}

@tgolsson
Copy link
Member

tgolsson commented Aug 29, 2023

@tgolsson, on @fgadaleta's machine read seems to have alignment of 1 for the reasons I don't fully understand.

You mean that the pointee has a 1-byte alignment? That's correct given that () has 1-byte alignment. I can't repro it actually having that alignment using std::mem::align_of-- when we take the address it would be spilled to stack and be 8-byte aligned as far as I can tell. But it might be debug vs release differences, architecture, etc.

The following code works. Is this what you had in mind?

<snip>

Yes, pretty much. The userData is going to be aligned by 8-bytes inside its parent object in every situation, and always 8 bytes big.

@fgadaleta
Copy link
Author

@tgolsson, on @fgadaleta's machine read seems to have alignment of 1 for the reasons I don't fully understand.

You mean that the pointee has a 1-byte alignment? That's correct given that () has 1-byte alignment. I can't repro it actually having that alignment using std::mem::align_of-- when we take the address it would be spilled to stack and be 8-byte aligned as far as I can tell. But it might be debug vs release differences, architecture, etc.

The following code works. Is this what you had in mind?

<snip>

Yes, pretty much. The userData is going to be aligned by 8-bytes inside its parent object in every situation, and always 8 bytes big.

Indeed in release there is no alignment problem.

@iddm
Copy link

iddm commented Oct 6, 2023

I have just updated Rustc 1.73 and have the same problem. I have never had it before. The problem is exactly the same - creating a material without user data (so with a unit struct). I am also confused now as to why this has never been a problem before. And what should I do now? Also, why doesn't the example have the same problem? Or does it? I'll check.

The example crashes funny:

❯ /home/user/workspace/physx-rs/target/debug/examples/ball_physx
zsh: IOT instruction (core dumped)  /home/user/workspace/physx-rs/target/debug/examples/ball_physx
Thread 1 "ball_physx" received signal SIGABRT, Aborted.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
    at pthread_kill.c:44
44	      return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6,
    no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007ffff79bf8a3 in __pthread_kill_internal (signo=6, threadid=<optimized out>)
    at pthread_kill.c:78
#2  0x00007ffff796f668 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007ffff79574b8 in __GI_abort () at abort.c:79
#4  0x00005555555f6a06 in physx::PxCreateDynamic (sdk=..., transform=..., geometry=..., material=...,
    density=10, shapeOffset=...)
    at /home/user/workspace/physx-rs/physx-sys/physx/physx/source/physxextensions/src/ExtSimpleFactory.cpp:91
#5  0x00005555555ad001 in physx::rigid_dynamic::RigidDynamic::new<physx::rigid_dynamic::PxRigidDynamic<(), physx::shape::PxShape<(), physx::material::PxMaterial<()>>>, physx::physics::PhysicsFoundation<physx::foundation::DefaultAllocator, physx::shape::PxShape<(), physx::material::PxMaterial<()>>>, physx_sys::PxSphereGeometry> (physics=0x7fffffffd958, transform=..., geometry=0x7fffffffdbd4,
    material=0x555555d40510, density=10, shape_transform=..., user_data=())
    at physx/src/rigid_dynamic.rs:114
#6  0x00005555555a544a in physx::physics::Physics::create_rigid_dynamic<physx::physics::PhysicsFoundation<physx::foundation::DefaultAllocator, physx::shape::PxShape<(), physx::material::PxMaterial<()>>>, (), physx_sys::PxSphereGeometry> (self=0x7fffffffd958, transform=..., geometry=0x7fffffffdbd4,
    material=0x555555d40510, density=10, shape_transform=..., user_data=()) at physx/src/physics.rs:497
#7  0x00005555555afda3 in ball_physx::main () at physx/examples/ball_physx.rs:104

@tgolsson
Copy link
Member

tgolsson commented Oct 6, 2023

😕 Are you able to repro with the test too?

@rlidwka
Copy link
Contributor

rlidwka commented Oct 7, 2023

I am also confused now as to why this has never been a problem before.

Rust added these asserts in debug mode in Rust 1.70 (rust-lang/rust#98112).

@iddm
Copy link

iddm commented Oct 7, 2023

I am also confused now as to why this has never been a problem before.

Rust added these asserts in debug mode in Rust 1.70 (rust-lang/rust#98112).

I have always been using the latest Rust and started seeing this issue just now after the upgrade to 1.73.

@tgolsson tgolsson reopened this Oct 7, 2023
@iddm
Copy link

iddm commented Oct 7, 2023

😕 Are you able to repro with the test too?

By test, what exactly do you mean?

@tgolsson
Copy link
Member

tgolsson commented Oct 7, 2023

There's a unit test here:

#[cfg(test)]
mod tests {
use std::{ffi::c_void, fmt::Debug, marker::PhantomData, ptr::null_mut};
use super::UserData;
struct TestUserData<U> {
user_data: *mut c_void,
phantom: PhantomData<U>,
}
impl<U> Default for TestUserData<U> {
fn default() -> Self {
Self {
user_data: null_mut(),
phantom: PhantomData,
}
}
}
unsafe impl<U> UserData for TestUserData<U> {
type UserData = U;
fn user_data_ptr(&self) -> &*mut c_void {
&self.user_data
}
fn user_data_ptr_mut(&mut self) -> &mut *mut c_void {
&mut self.user_data
}
}
fn do_test<U: PartialEq + Clone + Debug>(user_data: U) {
unsafe {
let mut object: TestUserData<U> = TestUserData::default();
object.init_user_data(user_data.clone());
assert_eq!(UserData::get_user_data(&object), &user_data);
assert_eq!(UserData::get_user_data_mut(&mut object), &user_data);
}
}
#[test]
fn test_user_data() {
do_test(()); // unit type
do_test(100u8); // smaller than pointer
do_test(100usize); // same size as pointer
do_test([100usize; 4]); // larger than pointer
}

@iddm
Copy link

iddm commented Oct 7, 2023

There's a unit test here:

#[cfg(test)]
mod tests {
use std::{ffi::c_void, fmt::Debug, marker::PhantomData, ptr::null_mut};
use super::UserData;
struct TestUserData<U> {
user_data: *mut c_void,
phantom: PhantomData<U>,
}
impl<U> Default for TestUserData<U> {
fn default() -> Self {
Self {
user_data: null_mut(),
phantom: PhantomData,
}
}
}
unsafe impl<U> UserData for TestUserData<U> {
type UserData = U;
fn user_data_ptr(&self) -> &*mut c_void {
&self.user_data
}
fn user_data_ptr_mut(&mut self) -> &mut *mut c_void {
&mut self.user_data
}
}
fn do_test<U: PartialEq + Clone + Debug>(user_data: U) {
unsafe {
let mut object: TestUserData<U> = TestUserData::default();
object.init_user_data(user_data.clone());
assert_eq!(UserData::get_user_data(&object), &user_data);
assert_eq!(UserData::get_user_data_mut(&mut object), &user_data);
}
}
#[test]
fn test_user_data() {
do_test(()); // unit type
do_test(100u8); // smaller than pointer
do_test(100usize); // same size as pointer
do_test([100usize; 4]); // larger than pointer
}

This one passes, but another one fails:

test flags ... FAILED

failures:

---- flags stdout ----
thread 'flags' panicked at physx-sys/pxbind/tests/enums.rs:58:6:
called `Result::unwrap()` on an `Err` value: failed to consume Record { id: None, name: Some("PxProcessPxBaseCallback"), tag_used: Some(Class), definition_data: Some(DefinitionData { dtor: Dtor { irrelevant: false, simple: false }, is_abstract: true, is_polymorphic: true }), bases: [] }

Caused by:
    0: failed to parse parameter 'anon_param0 (PxBase &)' for function 'process'
    1: Unknown type 'Simple("PxBase")'
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    flags

test result: FAILED. 3 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 1.71s

error: test failed, to rerun pass `-p pxbind --test enums`

Note that the ball_physx example fails.

@tgolsson
Copy link
Member

This looks like a different issue; cc @Jake-Shadle -- do you recognize it from rewriting the pxbind setup?

@iddm
Copy link

iddm commented Oct 10, 2023

This looks like a different issue; cc @Jake-Shadle -- do you recognize it from rewriting the pxbind setup?

Please note that there are two issues - one is the example of ball_physx.rs, and another is this test. Neither works fine.

One piece of news: this is probably related to the changes in 1.73. Before the update of Rustc, I had never had this issue. Now, I also have other issues with Vulkan, really-really-really weird ones (my code is plain simple, and I can't use it wrong, besides the fact that since 1.6x and until just 1.73, it has always been working). I suspect a regression in 1.73.

UPD: 1.72.1 works okay (in debug), I don't have this issue. Contrary, 1.73, 1.74 and 1.75 have the same behaviour.

@tgolsson
Copy link
Member

tgolsson commented Oct 10, 2023

I can repro the issue with flags test, but we'll track that in the other issue. I can't repro the issue with ball_physx, and detect no issues with sanitizers. What platform/architecture are you using?

@iddm
Copy link

iddm commented Oct 10, 2023

I can repro the issue with flags test, but we'll track that in the other issue. I can't repro the issue with ball_physx, and detect no issues with sanitizers. What platform/architecture are you using?

Interesting, what that might be then? ... 🤔
I am using AMD 7950x3d, ArchLinux (the latest), and the latest Rust via Rustup.

@iddm
Copy link

iddm commented Oct 11, 2023

Fixed the ball_physx.rs issue myself. Now, the only problem is this pointer.

If we trust gdb, then there is probably a bit more information on where this misaligned pointer dereference happens:

Thread 1 "test-binary" hit Breakpoint 1.3, physx::traits::user_data::UserData::init_user_data<physx::material::PxMaterial<()>> (self=0x5555623d5170, user_data=())
    at /home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/physx-0.18.0/src/traits/user_data.rs:30
30	                    *(&user_data as *const Self::UserData as *const *mut c_void)
(gdb) n
30	                    *(&user_data as *const Self::UserData as *const *mut c_void)
(gdb)
INFO: Message: misaligned pointer dereference: address must be a multiple of 0x8 but is 0x7fffffffa606
Location: Location { file: "/home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/physx-0.18.0/src/traits/user_data.rs", line: 30, col: 21 }
thread caused non-unwinding panic. aborting.

Thread 1 "test-binary" received signal SIGABRT, Aborted.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44	      return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;

I guess the problem is that the fix for the issue hasn't been published as a new version of the crate, and I still depend on the version in crates.io instead of this git repository. Will the fix ever be uploaded to crates.io?

@iddm
Copy link

iddm commented Oct 11, 2023

One last update: the problem is actually fixed but hasn't been published to crates.io. I do not see this issue anymore with the changes made to fix it earlier by changing the dependency from crates.io to this git repository. I, however, would like to depend on it through crates.io to avoid some in-between breakages.

@tgolsson
Copy link
Member

Ah, ok. Closing again! I've wanted to make a release but haven't had the time. Will close this and try to do a release today.

@iddm
Copy link

iddm commented Oct 11, 2023

Sure thing! I apologise for having taken your time, but I didn't know I had been using an outdated version all this time. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants