memory-model design discussion #16

Closed
andreeaflorescu opened this issue Feb 11, 2019 · 11 comments

Comments

@andreeaflorescu
Member

No description provided.

@jiangliu
Member

jiangliu commented Feb 11, 2019

Hi all,
I have posted the first version of the memory-model crate under rust-vmm/memory-model, which does not follow the repository inclusion process of the rust-vmm community. So I have created a personal GitHub repository (https://github.com/jiangliu/memory-model) for v2, and the rust-vmm/memory-model repository will be deleted soon. Sorry for the inconvenience!
The main change from v1 to v2 is the introduction of the AddressSpace abstraction, which is used to represent the physical address space of a virtual machine. An AddressSpace object contains guest memory (RAM) regions and MMIO regions for devices. There are two possible ways to make use of the memory-model crate:
1) Use GuestMemory to represent a virtual machine address space, as it is currently used by the firecracker and crosvm projects.
2) Use AddressSpace to represent a virtual machine address space, and build GuestMemory objects from the AddressSpace object on demand. This way, different permission and protection mechanisms may be applied to different regions of the guest address space. For example, we may protect the guest kernel code region with advanced EPT permission flags, which may help mitigate the security concerns mentioned at the last meeting.

On the other hand, the memory-model crate needs to satisfy requirements from both crosvm and firecracker, and currently the most sensitive conflict is that crosvm uses u64 for memory-related fields while firecracker uses usize. As in the valid use case Zack mentioned:
"On 64-bit arm devices, we usually run a 32-bit userspace with a 64-bit kernel. In this case, the machine word size (usize) that crosvm is compiled with (32-bit) isn't the same as the one the guest kernel, host kernel, hardware is using (64-bit). We used u64 to ensure that the size was always at least as big as needed."

So we can’t simply replace u64 with usize. With the introduction of the AddressSpace abstraction, the proposal to solve this conflict is:
1) Use u64 for all fields related to the virtual machine physical address space. Most fields of the AddressSpace and GuestAddress structures fall into this category.
2) Use usize for all fields representing an address/size in the current process (VMM). Most fields of the MemoryMapping and GuestMemory structures fall into this category.
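
A minimal sketch of what this split could look like (the type names follow the existing crosvm/firecracker types; the field layout is illustrative, not the actual v2 code):

/// Guest physical address: always u64, regardless of the VMM's pointer width,
/// so a 32-bit VMM can still address a 64-bit guest physical address space.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct GuestAddress(pub u64);

/// A mapping in the current process (VMM): offset and size are usize,
/// because they must fit in the host process address space.
pub struct MemoryMapping {
    addr: *mut u8,
    size: usize,
}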
If the proposal is the right way to go, I will post a v3 with the proposed solution.

Thanks,
Gerry

@bonzini
Member

bonzini commented Feb 11, 2019

This review is based on commit ab37cf3324310b074869680b3a471ca436f51a68 at jiangliu/memory-model.

A memory model crate is certainly a very good starting point for rust-vmm, but it shouldn't include the implementation of the backend. Instead, for rust-vmm we should focus on defining common traits that can be used by any VMM.

In this light, GuestMemory is composed of two parts that are completely independent. The first is a way to convert a GuestAddress to a MemoryMapping and an offset, which can be a very simple MemoryMap trait:

pub trait MemoryMap {
    type Region: Bytes<usize> /* see below */ ;

    fn do_in_region<F, T>(&self, guest_addr: GuestAddress,
                          size: usize, cb: F) -> Result<T>
    where
        F: FnOnce(&Self::Region, usize) -> Result<T>;

    fn do_in_region_partial<F>(&self, guest_addr: GuestAddress,
                               cb: F) -> Result<usize>
    where
        F: FnOnce(&Self::Region, usize) -> Result<usize>;
}

This can be implemented with linear lookup as is currently the case in firecracker, or it could use a binary search or a radix tree. rust-vmm shouldn't care. It should only define the trait so that (for example) virtio crates can receive a MemoryMap and access it.
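
For illustration, a linear-lookup implementation of the trait above could look roughly like this (LinearMemoryMap, MemoryRegion and the error type are hypothetical names; the Bytes<usize> impl for MemoryMapping is assumed):

// Hypothetical linear-lookup implementation of the MemoryMap trait above.
#[derive(Debug)]
pub enum Error {
    InvalidGuestAddress(GuestAddress),
}
pub type Result<T> = std::result::Result<T, Error>;

pub struct MemoryRegion {
    mapping: MemoryMapping,   // host mapping backing this region (Bytes<usize>)
    guest_base: GuestAddress, // start of the region in guest physical space
    size: usize,
}

pub struct LinearMemoryMap {
    regions: Vec<MemoryRegion>,
}

impl MemoryMap for LinearMemoryMap {
    type Region = MemoryMapping;

    fn do_in_region<F, T>(&self, guest_addr: GuestAddress, size: usize, cb: F) -> Result<T>
    where
        F: FnOnce(&Self::Region, usize) -> Result<T>,
    {
        // Walk the (few) regions in order; a binary search or radix tree
        // could be dropped in behind the same trait.
        for region in &self.regions {
            let start = region.guest_base.0;
            let end = start + region.size as u64;
            if guest_addr.0 >= start && guest_addr.0.saturating_add(size as u64) <= end {
                let offset = (guest_addr.0 - start) as usize;
                return cb(&region.mapping, offset);
            }
        }
        Err(Error::InvalidGuestAddress(guest_addr))
    }

    fn do_in_region_partial<F>(&self, guest_addr: GuestAddress, cb: F) -> Result<usize>
    where
        F: FnOnce(&Self::Region, usize) -> Result<usize>,
    {
        // Only require that guest_addr itself falls in a region; the callback
        // decides how much of the region past the offset it actually uses.
        self.do_in_region(guest_addr, 1, cb)
    }
}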

The second part is the convenience API to access memory as slices/streams/objects. This part of the API is actually shared by MemoryMapping and GuestMemory:

// From MemoryMapping
pub fn read_to_memory<F>(&self, mem_offset: usize, src: &mut F,
                         count: usize) -> Result<()>
    where F: Read;

// From GuestMemory
pub fn read_to_memory<F>(&self, guest_addr: GuestAddress,
                         src: &mut F, count: usize) -> Result<()>
    where F: Read;

... though sometimes the methods have different names:

// From MemoryMapping
pub fn write_slice(&self, buf: &[u8], offset: usize) -> Result<usize>;
pub fn read_obj<T: DataInit>(&self, offset: usize) -> Result<T>;

// From GuestMemory
pub fn write_slice_at_addr(&self, buf: &[u8], guest_addr: GuestAddress)
    -> Result<usize>;
pub fn read_obj_from_addr<T: DataInit>(&self, guest_addr: GuestAddress)
    -> Result<T>;

This API should be a trait that is implemented by both MemoryMapping and GuestMemory. For example if we call it Bytes<O>, MemoryMapping would implement Bytes<usize> and GuestMemory would implement Bytes<GuestAddress>.

// O for offset
pub trait Bytes<O> {
    type Error;

    // Perhaps rename it to "read_from"?
    fn read_to_memory<F>(&self, offset: O, src: &mut F,
                         count: usize) -> Result<(), Self::Error>
        where F: Read;
    // Perhaps rename it to just "read"?
    fn read_obj<T: DataInit>(&self, offset: O) -> Result<T, Self::Error>;
    ...
    fn read_slice(&self, buf: &mut [u8], mem_offset: O) ->
        Result<usize, Self::Error>;
    ...
}

(Alternatively, the offset type could be an associated type. Opinions are welcome from people more familiar with Rust API design, but I think it's better to have it as an argument so that MemoryMap can declare type Region: Bytes<usize>).

The memory-model crate should definitely provide an implementation of Bytes<GuestAddress> for MemoryMap, based on the current implementation of GuestMemory. However, the rest of GuestMemory and MemoryMapping could be provided in a separate repository (rust-vmm-examples?) as a reference implementation. There's always time to "graduate" things from rust-vmm-examples to rust-vmm.

AddressSpace is too specialized. I would leave it out completely for the time being; again, there's always time to add stuff, removing is harder.

On the other hand, I am very positive that endian.rs should be part of this crate from the very beginning, so that you can write

let x: LE<u32> = mem.read_obj(ofs);

No objections from me of course on other parts of the crate, for example VolatileMemory or DataInit.

@jiangliu
Member

@bonzini
So the key idea is to build a layered architecture: [vm_mem_accessor] -> [vm_mem_access_mechanism] -> [vm_mem_content_backend]. The vm_mem_accessor layer defines interfaces to access VM memory, and all clients such as VMMs, virtio drivers and vhost-user drivers depend only on this layer. The vm_mem_access_mechanism layer implements the mechanism to access VM memory by means of mmap, file read/write, ioctl or even RPC. And the vm_mem_content_backend may be anonymous memory, memfd, a file or even some virtual content.
For the memory-model crate, it should only cover the vm_mem_accessor layer; the two lower layers should be covered by other crates and may have different implementations for qemu and firecracker.
This way, it may need more changes to the crosvm and firecracker projects, so we need feedback from them :)

@bonzini
Member

bonzini commented Feb 11, 2019

[vm_mem_accessor] -> [vm_mem_access_mechanism] -> [vm_mem_content_backend].

Right, which in my description are Bytes<GuestAddress>, MemoryMap and Bytes<usize>. All three are traits so you're free to plug in whatever implementation you want.

For the memory-model crate, it should only cover the vm_mem_accessor layer; the two lower layers
should be covered by other crates and may have different implementations for qemu and firecracker.

Yes. More precisely memory-model should provide the traits for all three layers. It turns out that:

  • two layers are just specializations of the same "byte-level access" trait

  • what you called vm_mem_accessor (Bytes<GuestAddress>) admits a useful implementation for any vm_mem_access_mechanism (i.e. an implementation in terms of MemoryMap). Therefore, impl<T> Bytes<GuestAddress> for T where T: MemoryMap can be in memory-model too.
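
For concreteness, a rough sketch of that blanket impl (only read_obj is shown; error conversion between the region's Bytes<usize> error and the map's error type is glossed over):

impl<M: MemoryMap> Bytes<GuestAddress> for M {
    type Error = Error; // illustrative: the map lookup's error type

    fn read_obj<T: DataInit>(&self, addr: GuestAddress) -> Result<T, Self::Error> {
        // The MemoryMap lookup yields (region, offset); the region's
        // Bytes<usize> impl performs the actual access.
        self.do_in_region(addr, std::mem::size_of::<T>(), |region, offset| {
            region.read_obj(offset)
        })
    }

    // ... read_to_memory, read_slice and the write counterparts forward to
    // the region in the same way.
}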

As you say, it's not just QEMU and Firecracker: vhost-user crates should also be able to use MemoryMap and Bytes<>, so that they can share the virtqueue implementation with Firecracker. Starting with traits makes it harder to over-specify things.

This way, it may need more changes to the crosvm and firecracker projects, so we need feedback from them :)

Sure. Hopefully the changes to crosvm and firecracker amount to little more than search-and-replace.

@andreeaflorescu
Member Author

@alexandruag @dhrgit can you also take a look at this?

@bonzini
Member

bonzini commented Feb 19, 2019

One issue regarding testing: even if we keep the reference implementation in a different “example” crate, we still need an implementation of the traits to test the whole thing.

What does everyone think of implementing Bytes<usize> for &mut [u8] inside this crate? This would replace MemoryMapping in the tests.
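
A test-only impl could look roughly like this (only read_slice is shown; Error::OutOfBounds is an illustrative variant):

impl<'a> Bytes<usize> for &'a mut [u8] {
    type Error = Error;

    fn read_slice(&self, buf: &mut [u8], offset: usize) -> Result<usize, Self::Error> {
        // Plain bounds-checked copy out of the slice, standing in for a real
        // MemoryMapping in unit tests.
        let end = offset.checked_add(buf.len()).ok_or(Error::OutOfBounds)?;
        let src = self.get(offset..end).ok_or(Error::OutOfBounds)?;
        buf.copy_from_slice(src);
        Ok(buf.len())
    }

    // ... the remaining Bytes methods perform the same bounds checks on the slice.
}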

@alexandruag
Contributor

Hi everyone,

Apologies for joining the discussion so late. I really like @bonzini's point of view; it makes sense to have memory_model expose an interface, and then also have some sort of default/reference implementation in another crate. Here are just a few remarks/questions I kinda have on this topic.

The implementation of GuestMemory in crosvm (and by extension Firecracker) exhibits interior mutability, so mutable operations (like writes) are done using a regular &GuestMemory borrow. This is very helpful when multiple references to the same guest memory are lying around (especially for different threads), otherwise we'd have to use RefCells or Mutexes and the like. I was just wondering if we're missing something with this abstraction. Can anyone imagine a (useful) scenario where having a meaningful distinction between a &GuestMemory and a &mut GuestMemory makes sense?

If Region is a part of the interface exposed by MemoryMap, maybe we should have an actual trait like:

pub trait GuestMemoryRegion<T>: Bytes<T> {
    fn host_address(&self) -> usize;
    fn guest_address(&self) -> usize;
    fn size(&self) -> usize;
}  

I used usize as the return type for the methods above, but it might as well be anything else. If components such as KVM high-level wrappers also end up interacting with things that implement GuestMemory, as opposed to a particular implementation, we need more info about the regions for setting up memory slots, etc. I imagine there are other use cases where this is nice to know, at least for info/debugging purposes.
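
As a purely hypothetical illustration of that use case (the Vm type and its set_memory_slot method are stand-ins, not a real KVM wrapper API):

// Register every region with the hypervisor; GuestMemoryRegion supplies the
// guest address, host address and size needed to describe a memory slot.
fn register_memory_slots<R>(vm: &mut Vm, regions: &[R]) -> Result<()>
where
    R: GuestMemoryRegion<usize>,
{
    for (slot, region) in regions.iter().enumerate() {
        vm.set_memory_slot(
            slot as u32,
            region.guest_address() as u64, // guest physical start
            region.size() as u64,          // length of the slot
            region.host_address() as u64,  // userspace address backing it
        )?;
    }
    Ok(())
}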

Speaking of regions, what do you think about having some sort of slice (or view if you will) analogue for GuestMemory. For example, let's say I'm writing some device model code and I'm afraid a malicious driver will try to make me access certain memory areas by providing invalid descriptor indices. I can add bound checks and all that, but it seems nicer if I could operate on a 'slice' of the GuestMemory, restricted to precisely the memory area(s) used to interact with the suspicious device. I don't know if this makes that much sense, just throwing the idea out there :D

The u64 vs usize debate raised a very interesting point. In the end, is it ok to have a predefined underlying datatype (u64 in this case) for the GuestAddress offset? Would it make sense for GuestMemory to become generic in that regard? It looks tricky to define an abstraction that's not awkward to use, but it should be doable if there are good reasons to think about it :D

At this point I guess the next step is having some sort of initial version/proposal for the complete interface. Thanks for taking the time to read all of this!

@jiangliu
Member

jiangliu commented Feb 20, 2019

Speaking of regions, what do you think about having some sort of slice (or view if you will) analogue for GuestMemory. For example, let's say I'm writing some device model code and I'm afraid a malicious driver will try to make me access certain memory areas by providing invalid descriptor indices. I can add bound checks and all that, but it seems nicer if I could operate on a 'slice' of the GuestMemory, restricted to precisely the memory area(s) used to interact with the suspicious device. I don't know if this makes that much sense, just throwing the idea out there :D

How about rethinking the AddressSpace abstraction?
I tried to solve the above issue by introducing the AddressSpace abstraction, which contains a group of guest memory regions with different types/properties. GuestMemory is then used to map some or all of the regions in the AddressSpace into the current process. This way, we may build one GuestMemory object for a device and another GuestMemory object for the boot loader.
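
A hedged sketch of that idea, with hypothetical names and signatures (not taken from the v2 code):

// Each region in the guest physical address space carries a type/property,
// so consumers can select only the regions they are allowed to touch.
#[derive(PartialEq)]
enum RegionType {
    Ram,
    Mmio,
    KernelCode, // e.g. could be mapped with stricter protection
}

struct AddressSpaceRegion {
    base: GuestAddress, // guest physical start (u64, per the earlier proposal)
    size: u64,
    kind: RegionType,
}

struct AddressSpace {
    regions: Vec<AddressSpaceRegion>,
}

impl AddressSpace {
    // Collect the ranges matching a filter; a GuestMemory restricted to those
    // ranges would then be built by mapping them into the current process.
    fn ranges_for<F>(&self, filter: F) -> Vec<(GuestAddress, u64)>
    where
        F: Fn(&AddressSpaceRegion) -> bool,
    {
        self.regions
            .iter()
            .filter(|&r| filter(r))
            .map(|r| (r.base, r.size))
            .collect()
    }
}

A device-facing GuestMemory could then be built from ranges_for(|r| r.kind == RegionType::Ram), while a stricter view could leave out the kernel code region entirely.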

@bonzini
Member

bonzini commented Feb 20, 2019

The implementation of GuestMemory in crosvm (and by extension Firecracker) exhibits interior mutability, so mutable operations (like writes) are done using a regular &GuestMemory borrow. This is very helpful when multiple references to the same guest memory are lying around (especially for different threads), otherwise we'd have to use RefCells or Mutexes and the like. I was just wondering if we're missing something with this abstraction. Can anyone imagine a (useful) scenario where having a meaningful distinction between a &GuestMemory and a &mut GuestMemory makes sense?

I suppose it is sound because all reads/writes are for types that are DataInit? If so I guess it's okay, even if you lose a little bit in safety. Most guest memory accesses are atomic-like anyway. A Mutex would make no sense when the guest can perform accesses without taking it anyway.

Regarding GuestMemoryRegion, I think that Region could instead become a generic argument, MemoryMap<R>, rather than an associated type. Then KVM bindings could be written to expect a MemoryMap<MemoryMapping> and use that to access the host address in the callers of do_in_region and the like.
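
For illustration, the trait with a type parameter might look like this (a sketch only; register_regions is a hypothetical consumer):

// Region as a type parameter instead of an associated type.
pub trait MemoryMap<R: Bytes<usize>> {
    fn do_in_region<F, T>(&self, guest_addr: GuestAddress, size: usize, cb: F) -> Result<T>
    where
        F: FnOnce(&R, usize) -> Result<T>;
}

// A KVM-facing consumer can then require the concrete region type it needs
// (MemoryMapping), while virtio code stays generic over any R.
fn register_regions<M: MemoryMap<MemoryMapping>>(mem: &M) {
    // ... look up host addresses via do_in_region and program the memory slots
}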

@jiangliu
Member

Hi all,
I have posted the design doc at #22 and you may access the latest code at my personal GitHub repository: https://github.com/jiangliu/memory-model/tree/v2
Thanks,
Gerry

@bonzini
Member

bonzini commented Feb 25, 2019

Thanks! Let's move the discussion there. @andreeaflorescu can you close this one?
