
bpf, mm: Introduce __GFP_TRYLOCK #4765

Closed

Conversation

kernel-patches-daemon-bpf-rc[bot]

Pull request for series with
subject: bpf, mm: Introduce __GFP_TRYLOCK
version: 2
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=916186

@kernel-patches-daemon-bpf-rc

Upstream branch: 4d33dc1
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=916186
version: 2

Alexei Starovoitov added 5 commits December 10, 2024 09:03
Tracing BPF programs execute from tracepoints and kprobes where the running
context is unknown, but they need to request additional memory.
Prior workarounds used pre-allocated memory and BPF-specific
freelists to satisfy such allocation requests. Instead, introduce a
__GFP_TRYLOCK flag that makes the page allocator accessible from any context.
It relies on the percpu free list of pages that rmqueue_pcplist() should be
able to pop a page from. If that fails (due to IRQ re-entrancy or the list
being empty), try_alloc_pages() attempts to spin_trylock zone->lock
and refill the percpu freelist as usual.
A BPF program may execute with IRQs disabled, and zone->lock is a sleeping
lock on RT, so trylock is the only option.
In theory we could introduce a percpu reentrance counter and increment it
every time spin_lock_irqsave(&zone->lock, flags) is used,
but we cannot rely on it. Even if this CPU is not in the page_alloc path,
spin_lock_irqsave() is not safe, since a BPF prog might be called
from a tracepoint where preemption is disabled. So trylock only.

Note, free_page and memcg are not taught about __GFP_TRYLOCK yet.
The support comes in the next patches.

This is a first step towards supporting BPF requirements in SLUB
and getting rid of bpf_mem_alloc.
That goal was discussed at LSFMM: https://lwn.net/Articles/974138/

Signed-off-by: Alexei Starovoitov <[email protected]>
Introduce free_pages_nolock() that can free a page without taking locks.
It relies on trylock only and can be called from any context.

Signed-off-by: Alexei Starovoitov <[email protected]>
Similar to local_lock_irqsave() introduce local_trylock_irqsave().
It uses spin_trylock in PREEMPT_RT and always succeeds when !RT.
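A minimal model of that asymmetry, using a hypothetical `local_trylock()`/`local_unlock()` pair rather than the kernel's actual macros: on !RT acquiring a local lock cannot fail (it only disables preemption/IRQs on the current CPU), while a PREEMPT_RT build must go through a real trylock that may return failure.

```c
#include <stdbool.h>
#include <pthread.h>

/* Illustrative stand-in for local_trylock_irqsave() semantics. */
#ifdef PREEMPT_RT
/* RT: local locks are real sleeping locks, so only trylock is safe. */
static pthread_spinlock_t llock;
static bool llock_ready;
bool local_trylock(void) {
    if (!llock_ready) {
        pthread_spin_init(&llock, PTHREAD_PROCESS_PRIVATE);
        llock_ready = true;
    }
    return pthread_spin_trylock(&llock) == 0;
}
void local_unlock(void) { pthread_spin_unlock(&llock); }
#else
/* !RT: "taking" the lock just disables preemption/IRQs, which
 * cannot fail, so the trylock always succeeds. */
bool local_trylock(void) { return true; }
void local_unlock(void) { }
#endif
```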

Signed-off-by: Alexei Starovoitov <[email protected]>
Teach memcg to operate under __GFP_TRYLOCK conditions when
spinning locks cannot be used.
The end result is __memcg_kmem_charge_page() and
__memcg_kmem_uncharge_page() become lockless.
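The lockless charge/uncharge idea can be illustrated with a compare-and-swap loop over an atomic counter. This is a simplified sketch under assumed names (`charge_pages()`, `uncharge_pages()`, a single flat limit), not the memcg page_counter code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified lockless accounting: the usage counter is updated with
 * atomics, so charging never takes a spinlock and is safe from any
 * context. Hierarchy and overcommit handling are omitted. */

static atomic_long usage;
static const long limit = 8;

/* Charge n pages; fail (without blocking) if it would exceed the limit. */
bool charge_pages(long n) {
    long old = atomic_load(&usage);
    do {
        if (old + n > limit)
            return false;          /* over limit: fail, never wait */
    } while (!atomic_compare_exchange_weak(&usage, &old, old + n));
    return true;
}

void uncharge_pages(long n) {
    atomic_fetch_sub(&usage, n);
}
```

On CAS failure `old` is reloaded with the current value, so the limit check is re-evaluated before every retry.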

Signed-off-by: Alexei Starovoitov <[email protected]>
Unconditionally use __GFP_ACCOUNT in try_alloc_pages().
The caller is responsible for setting up memcg correctly.
All BPF memory accounting is memcg based.

Signed-off-by: Alexei Starovoitov <[email protected]>
@kernel-patches-daemon-bpf-rc

Upstream branch: 6e8ba49
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=916186
version: 2

Use try_alloc_pages() and free_pages_nolock()

Signed-off-by: Alexei Starovoitov <[email protected]>
@kernel-patches-daemon-bpf-rc

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=916186 expired. Closing PR.
