Implement in-place pod request scaling #12

Merged
merged 6 commits into from
Jun 2, 2024
14 changes: 12 additions & 2 deletions .github/workflows/ci.yml
@@ -32,7 +32,7 @@ jobs:
run: sudo --preserve-env make test

build:
-runs-on: ubuntu-latest
+runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v3

@@ -46,10 +46,20 @@ jobs:
with:
go-version: "1.22"

- name: Install protoc-gen-go
run: |
go install google.golang.org/protobuf/cmd/[email protected]
go install github.com/containerd/ttrpc/cmd/[email protected]

- uses: awalsh128/cache-apt-pkgs-action@v1
with:
packages: protobuf-compiler libprotobuf-dev
version: 1.0

- name: build ebpf image
run: make build-ebpf

-- name: generate ebpf
+- name: generate ttrpc and ebpf
run: make generate

- name: check for diff
9 changes: 8 additions & 1 deletion Makefile
@@ -80,9 +80,16 @@ CFLAGS := -O2 -g -Wall -Werror
# dependencies installed.
generate: export BPF_CLANG := $(CLANG)
generate: export BPF_CFLAGS := $(CFLAGS)
-generate:
+generate: ttrpc
docker run --rm -v $(PWD):/app:Z --user $(shell id -u):$(shell id -g) --env=BPF_CLANG="$(CLANG)" --env=BPF_CFLAGS="$(CFLAGS)" $(EBPF_IMAGE)

ttrpc:
go mod download
cd api/shim/v1; protoc --go_out=. --go_opt=paths=source_relative \
--ttrpc_out=. --plugin=protoc-gen-ttrpc=`which protoc-gen-go-ttrpc` \
--ttrpc_opt=paths=source_relative *.proto -I. \
-I $(shell go env GOMODCACHE)/github.com/prometheus/[email protected]

# to improve reproducibility of the bpf builds, we dump the vmlinux.h and
# store it compressed in git instead of dumping it during the build.
update-vmlinux:
31 changes: 28 additions & 3 deletions README.md
@@ -285,13 +285,38 @@ the shim otherwise. For example, loading eBPF programs can be quite memory
intensive so they have been moved from the shim to the manager to keep the
shim memory usage as minimal as possible.

-In addition to that it collects metrics from all the shim processes and
-exposes those metrics on an HTTP endpoint.
+These are the responsibilities of the manager:

- Loading eBPF programs that the shim(s) rely on.
- Collecting metrics from all shim processes and exposing them over HTTP for scraping.
- Subscribing to shim scaling events and adjusting Pod requests.

#### In-place Resource scaling (Experimental)

This makes use of the feature flag
[InPlacePodVerticalScaling](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources)
to automatically reduce the pod resource requests to a minimum on scale down
events and restore them on scale up. Once the Kubernetes feature flag is
enabled, in-place scaling must also be enabled with the manager flag
`-in-place-scaling=true`, and the node driver needs additional permissions to
patch pods. To deploy this, uncomment the `in-place-scaling` component in
`config/production/kustomization.yaml`. This adds the flag and the required
permissions when building the kustomization.
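
As a rough illustration of what adjusting pod requests could look like, the following Go sketch builds an RFC 6902 JSON Patch that replaces the CPU and memory requests of one container. It is not the manager's actual implementation: the names `patchOp` and `requestsPatch` and the values `1m` and `1Ki` are assumptions made for this example. It also assumes the container already declares both requests, since a `replace` operation fails on a missing path.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// patchOp is a single RFC 6902 JSON Patch operation.
// Hypothetical type, used only for this sketch.
type patchOp struct {
	Op    string `json:"op"`
	Path  string `json:"path"`
	Value string `json:"value"`
}

// requestsPatch builds a JSON Patch that sets the CPU and memory requests
// of the container at containerIndex in the pod spec.
// Hypothetical helper: it assumes both requests already exist on the container.
func requestsPatch(containerIndex int, cpu, memory string) ([]byte, error) {
	base := fmt.Sprintf("/spec/containers/%d/resources/requests", containerIndex)
	ops := []patchOp{
		{Op: "replace", Path: base + "/cpu", Value: cpu},
		{Op: "replace", Path: base + "/memory", Value: memory},
	}
	return json.Marshal(ops)
}

func main() {
	// Illustrative minimal values for a scale down event (assumed, not
	// taken from the project).
	patch, err := requestsPatch(0, "1m", "1Ki")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(patch))
}
```

The resulting bytes could be submitted as a JSON Patch through a Kubernetes client. How the patch has to be submitted depends on the Kubernetes version in use, so treat this only as a sketch of the payload.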

#### Flags

```
-metrics-addr=":8080" sets the address of the metrics server
-debug enables debug logging
-in-place-scaling=false enables in-place resource scaling, requires the InPlacePodVerticalScaling feature flag
```
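
For readers unfamiliar with Go's standard `flag` package, here is a minimal sketch of how flags with these names and defaults could be declared and parsed. The `config` struct and the `parseFlags` function are hypothetical names for this example and are not taken from the manager's source.

```go
package main

import (
	"flag"
	"fmt"
)

// config holds the parsed flag values. Hypothetical type for this sketch.
type config struct {
	metricsAddr    string
	debug          bool
	inPlaceScaling bool
}

// parseFlags parses args (without the program name) into a config,
// using the defaults listed in the README.
func parseFlags(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("manager", flag.ContinueOnError)
	fs.StringVar(&c.metricsAddr, "metrics-addr", ":8080", "sets the address of the metrics server")
	fs.BoolVar(&c.debug, "debug", false, "enables debug logging")
	fs.BoolVar(&c.inPlaceScaling, "in-place-scaling", false, "enable in-place resource scaling")
	err := fs.Parse(args)
	return c, err
}

func main() {
	c, err := parseFlags([]string{"-in-place-scaling=true", "-metrics-addr=:9090"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```

Using a dedicated `flag.FlagSet` instead of the global flag set keeps the parsing function easy to call from tests with different argument lists.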

## Metrics

The zeropod-node pod exposes metrics on `0.0.0.0:8080/metrics` in Prometheus
-format on each installed node. The following metrics are currently available:
+format on each installed node. The metrics address can be configured with the
+`-metrics-addr` flag. The following metrics are currently available:

```bash
# HELP zeropod_checkpoint_duration_seconds The duration of the last checkpoint in seconds.