This repository has been archived by the owner on Oct 26, 2022. It is now read-only.

Support batch messages #170

Draft
wants to merge 8 commits into master

Conversation

stbuehler
Contributor

This is motivated by netfilter; changes to netfilter must be done through a series of messages (started by NFNL_MSG_BATCH_BEGIN, ended by NFNL_MSG_BATCH_END). The full batch needs to be submitted to the kernel in one write/sendto/..., otherwise the kernel will abort the batch.

@cathay4t
Collaborator

@stbuehler Thanks for the patch. Could you provide test or example code to help me understand how this works at the upper level?

@stbuehler
Contributor Author

Hi. I'm working on some netfilter code; I don't think it's ready to show yet.

But here is a small example. netfilter changes need to be done in a "batch" request; it starts with a BatchBegin message and ends with a BatchEnd message. The kernel wants to read the end message in the same "buffer" as the begin message (I suppose so that userspace can't block the "lock" held during a change). The batch is also locked to a certain netfilter subsystem.

let mut batch = handle.batch(SocketAddr::new(0, 0));
batch.notify(NetfilterMessage {
	header: NetfilterHeader {
		resource_id: (libc::NFNL_SUBSYS_NFTABLES as u16) * 256,
		..Default::default()
	},
	body: NetfilterBody::BatchBegin,
}.request_no_ack());
let c_table_response = batch.request(NetfilterMessage {
	header: NetfilterHeader {
		family: Family::INET,
		..Default::default()
	},
	body: NetfilterBody::NewTable(NftTable {
		name: Some(std::ffi::OsString::from("filter")),
		flags: Some(0),
		..Default::default()
	}),
}.create());
let c_table_set = batch.request(NetfilterMessage {
	header: NetfilterHeader {
		family: Family::INET,
		..Default::default()
	},
	body: NetfilterBody::NewSet(NftSet {
		table: Some(std::ffi::OsString::from("filter")),
		name: Some(std::ffi::OsString::from("allow")),
		key_type: Some(9), // "ether_addr" in userspace nft; kernel doesn't care
		key_len: Some(6),
		id: Some(1),
		flags: Some(libc::NFT_SET_TIMEOUT as u32),
		// userdata: b"\x00\x04\x02\x00\x00\x00",
		..Default::default()
	}),
}.create());
batch.notify(NetfilterMessage {
	header: NetfilterHeader {
		resource_id: (libc::NFNL_SUBSYS_NFTABLES as u16) * 256,
		..Default::default()
	},
	body: NetfilterBody::BatchEnd,
}.request_no_ack());
batch.send()?;


@cathay4t cathay4t left a comment


1. The API changed; please increase the version number.
2. As a lower-level library, a single fault can require hours of debugging when it is used in other tools. Please at least include a unit test case to ensure a batch message is encoded to the expected u8 array and also decoded in the expected way.

netlink-proto/src/protocol/request.rs (outdated review thread, resolved)
@stbuehler stbuehler force-pushed the batch-messages branch 2 times, most recently from 3147601 to 7253106 Compare October 30, 2021 21:07
@stbuehler stbuehler force-pushed the batch-messages branch 2 times, most recently from 08def5c to 0cbe2e0 Compare November 7, 2021 15:00
@little-dude
Owner

FWIW this is a feature I'd love to see supported. Thanks for working on this @stbuehler :)

@stbuehler stbuehler mentioned this pull request Nov 13, 2021
@little-dude
Owner

Hey @stbuehler do you think you'll pursue this PR now that #195 is in? I'd love to see batch messages landing so I might pick it up next week if you're not planning to come back to it.

@stbuehler
Contributor Author

> Hey @stbuehler do you think you'll pursue this PR now that #195 is in? I'd love to see batch messages landing so I might pick it up next week if you're not planning to come back to it.

Yeah, sorry, got... distracted. It's quite late here, so I should probably take another look tomorrow, but let's see how this is looking now. (I should also test whether my netfilter code still works with this.)

@dzamlo dzamlo mentioned this pull request Nov 26, 2021
The definition of the `NLMSG_NEXT` macro in the kernel makes it clear
that netlink messages are padded to a multiple of 4 bytes; this padding
is not part of the `nlmsg_len` field of the message header.

Luckily `nlmsg_len` is usually already a multiple of 4 (as the netlink
attributes are also padded to multiple of 4 bytes, and fixed-length
parts at the beginning of messages are usually aligned too), so the
padding is usually empty.

Anyway, fix this:

1. Make sure encoded messages are padded to a multiple of 4 bytes

Technically the last message in a datagram doesn't need padding (i.e.
the kernel doesn't require it), but it's easier to be consistent.

2. Skip padding in decoder

Similar to the kernel, we ignore missing padding for now. This would be really bad if the buffer were received via a "stream" instead of a "datagram" and might include partial datagrams (i.e. where the padding is read later and is not yet in the buffer).

Netlink sockets are datagram sockets though, so this is fine.

Note: NetlinkAuditCodec might need similar changes, but audit is quite
special.
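The alignment rule described in this commit message can be sketched in a few lines. This is a minimal illustration of the kernel's 4-byte `NLMSG_ALIGN` arithmetic, not code from this crate; the function names are illustrative:

```rust
// Netlink messages are padded to a multiple of NLMSG_ALIGNTO (4) bytes;
// the padding is NOT counted in the nlmsg_len header field.
const NLMSG_ALIGNTO: usize = 4;

// Round `len` up to the next multiple of 4 (equivalent to NLMSG_ALIGN).
fn nlmsg_align(len: usize) -> usize {
    (len + NLMSG_ALIGNTO - 1) & !(NLMSG_ALIGNTO - 1)
}

// Number of padding bytes to append after a message of `len` bytes.
fn nlmsg_padding(len: usize) -> usize {
    nlmsg_align(len) - len
}

fn main() {
    // nlmsg_len is usually already a multiple of 4 (attributes and fixed
    // headers are themselves aligned), so the padding is usually empty.
    assert_eq!(nlmsg_align(16), 16);
    assert_eq!(nlmsg_padding(16), 0);
    // But when it isn't, the encoder must pad and the decoder must skip:
    assert_eq!(nlmsg_align(17), 20);
    assert_eq!(nlmsg_padding(18), 2);
}
```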
- inner `FramedIO` only generic over socket type, not codec / message
  type
- prepare for second `Sink` impl on `NetlinkFramed` with less code
  duplication
Also drop tuple conversions; preparing for a Batch variant.
Prepares for handling of batch messages.
This is motivated by netfilter; changes to netfilter must be done
through a series of messages (started by NFNL_MSG_BATCH_BEGIN, ended by
NFNL_MSG_BATCH_END).  The full batch needs to be submitted to the kernel
in one write/sendto/..., otherwise the kernel will abort the batch.

(And sending a batch without an END message is interpreted as a query to
verify the batch without actually committing it.)
@stbuehler
Contributor Author

First of all: sorry, I didn't work on this for a long time. I wasn't satisfied with how I had split the changes into single commits, and I think it is looking better now.

Sadly I discovered a new design issue (I was working on similar things in Python over the last two weeks, which made me think about it again):
I think the handling of pending_requests is dangerous (#138). Basically all messages to the kernel require NLM_F_REQUEST, and not adding NLM_F_ACK means the pending_requests entry will stay around forever unless the request either returns an error or a "done" response (after possibly other messages). Imho this means you should always set NLM_F_ACK and hope the kernel honors the flag, just to make sure pending_requests is cleaned up.

Now regarding batch messages and pending_requests:

  • The really bad news is: neither NFNL_MSG_BATCH_BEGIN nor NFNL_MSG_BATCH_END respects NLM_F_ACK, i.e. the kernel will never send a message on success. A successful empty batch (i.e. just begin+end) returns no message at all... (and if the batch was aborted due to a failure of an inner message, the begin/end will still not fail...)
  • On the other hand, even a NFNL_MSG_BATCH_END message can fail with an error (for example if it is missing NLM_F_REQUEST), and NFNL_MSG_BATCH_BEGIN can fail due to lots of problems.
  • Some errors on single messages won't abort the full batch; instead the kernel collects all errors (and ACKs) for all messages. But some errors (afaict: all BEGIN failures, OOM, ...) will abort the batch, which means there won't be any ACKs.
  • Afaict an ACK on a single message in the batch doesn't mean the batch actually got committed.

Not sure how to handle this in a sane way. I think it will at least require a manual way to remove entries from pending_requests once we know they won't receive any messages anymore. And perhaps a flag whether a "response" message was the last one in a datagram - because I think we will receive all errors/acks for a batch in a single datagram; once we saw the last response in a datagram that matches a batch, we know we got all responses ("implicit ack").

Another way could be using the same sequence number for all messages in a batch, and collect all responses for it from a single datagram before dropping the pending_requests entry (instead of dropping it on the first ack). It would be a challenge to map errors to single messages of the batch, but otoh that is probably not important anyway - the batch should either succeed or fail completely, at least in a first iteration of the feature :)
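The manual-cleanup idea above could be sketched roughly as follows. This is a hypothetical illustration, not the crate's real API: a pending_requests table keyed by sequence number, where an entry is normally removed when a response arrives, plus a manual `forget` escape hatch for messages (like NFNL_MSG_BATCH_BEGIN/END) that may never be answered at all:

```rust
use std::collections::HashMap;

// Hypothetical sketch of a pending_requests table (names are illustrative).
struct PendingRequests {
    pending: HashMap<u32, String>, // sequence number -> request description
}

impl PendingRequests {
    fn new() -> Self {
        Self { pending: HashMap::new() }
    }

    fn insert(&mut self, seq: u32, what: &str) {
        self.pending.insert(seq, what.to_string());
    }

    // A response (NLMSG_ERROR with code 0 == ACK, a real error, or
    // NLMSG_DONE) arrived for `seq`: the entry can be dropped.
    fn complete(&mut self, seq: u32) -> Option<String> {
        self.pending.remove(&seq)
    }

    // Manual cleanup once we know `seq` will never receive a response
    // (e.g. batch begin/end, which are never ACKed on success).
    fn forget(&mut self, seq: u32) {
        self.pending.remove(&seq);
    }

    fn outstanding(&self) -> usize {
        self.pending.len()
    }
}

fn main() {
    let mut reqs = PendingRequests::new();
    reqs.insert(1, "NFNL_MSG_BATCH_BEGIN"); // never ACKed on success
    reqs.insert(2, "NFT_MSG_NEWTABLE");     // sent with NLM_F_ACK
    reqs.insert(3, "NFNL_MSG_BATCH_END");   // never ACKed on success

    reqs.complete(2); // ACK for the table arrives
    reqs.forget(1);   // begin/end must be cleaned up manually
    reqs.forget(3);
    assert_eq!(reqs.outstanding(), 0);
}
```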

@stbuehler
Contributor Author

Just checked and it doesn't seem like we get the errors in one go; so that is not an option. Next idea: use a dummy request (with a simple response) after a batch: as the kernel handles the messages in order the response to the dummy request marks the end of all errors for the batch. Still need a way to stop waiting for sequence ids so we can detect problems with the BATCH_END message.
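The "dummy request" idea could look something like this. A speculative sketch, assuming the kernel processes the messages of one datagram in order: once any response for the dummy's sequence number is seen, no more responses for the earlier (batch) sequence numbers can arrive, so their pending entries can be dropped. The function names here are made up for illustration:

```rust
// Has the trailing dummy request been answered yet? If so, every error/ACK
// the kernel will ever send for the batch has already been delivered.
fn batch_responses_complete(seen_seqs: &[u32], dummy_seq: u32) -> bool {
    seen_seqs.contains(&dummy_seq)
}

// Batch sequence numbers still worth waiting for. Once the dummy response
// arrived, anything unanswered can be forgotten ("implicit ack" / abort).
fn outstanding_batch_seqs(batch_seqs: &[u32], seen_seqs: &[u32], dummy_seq: u32) -> Vec<u32> {
    if batch_responses_complete(seen_seqs, dummy_seq) {
        Vec::new()
    } else {
        batch_seqs
            .iter()
            .copied()
            .filter(|s| !seen_seqs.contains(s))
            .collect()
    }
}

fn main() {
    let batch = [1, 2, 3]; // sequence numbers used inside the batch
    let dummy = 4;         // dummy request sent right after BATCH_END

    // Only an error for seq 2 arrived so far: keep waiting for 1 and 3.
    assert_eq!(outstanding_batch_seqs(&batch, &[2], dummy), vec![1, 3]);

    // The dummy was answered: nothing else is coming for the batch.
    assert!(outstanding_batch_seqs(&batch, &[2, 4], dummy).is_empty());
}
```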

@cathay4t
Collaborator

cathay4t commented May 7, 2022

@stbuehler This PR has been in draft mode for a while. Is it ready for review?

4 participants