SCIONLab next version initial update #158
Conversation
…to#4424) Bump google.golang.org/grpc from 1.57.0 to 1.57.2 due to a security patch. Skipping 1.57.1, because that contained a bug. From vulnerability alert "gRPC-Go HTTP/2 Rapid Reset vulnerability":

> Impact
>
> In affected releases of gRPC-Go, it is possible for an attacker to send HTTP/2 requests, cancel them, and send subsequent requests, which is valid by the HTTP/2 protocol, but would cause the gRPC-Go server to launch more concurrent method handlers than the configured maximum stream limit.
>
> Patches
>
> This vulnerability was addressed by #6703 and has been included in patch releases: 1.56.3, 1.57.1, 1.58.3. It is also included in the latest release, 1.59.0.
>
> Along with applying the patch, users should also ensure they are using the grpc.MaxConcurrentStreams server option to apply a limit to the server's resources used for any single connection.
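For illustration, applying the recommended server option looks roughly like this (the listen address and the limit of 100 are made-up values, not taken from this repository):

```go
package main

import (
	"log"
	"net"

	"google.golang.org/grpc"
)

func main() {
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatal(err)
	}
	// Cap the number of concurrently served streams per connection so that
	// rapid-reset floods cannot spawn unbounded handler goroutines.
	srv := grpc.NewServer(grpc.MaxConcurrentStreams(100))
	// ... register services here ...
	log.Fatal(srv.Serve(lis))
}
```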
Improve metrics for router packet processing:
* Classify packets by size bucket: log2(size).
* Classify packets by traffic type: incoming, outgoing, as-transit-in/out, br-transit.
* Add process-level on-CPU and sleep-time accounting.
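As an illustration of the size-bucket classification (function and label names are hypothetical, not the router's actual code), a log2 bucket can be computed cheaply with bit operations:

```go
package main

import (
	"fmt"
	"math/bits"
)

// sizeBucket returns the log2 bucket index of a packet size, so that
// e.g. sizes 64..127 fall in bucket 6 and 128..255 in bucket 7.
func sizeBucket(size int) int {
	if size <= 0 {
		return 0
	}
	return bits.Len(uint(size)) - 1
}

func main() {
	for _, s := range []int{64, 100, 128, 1500} {
		fmt.Printf("size=%d bucket=%d\n", s, sizeBucket(s))
	}
}
```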
Refactor the remaining targets that depended on @cgrindel_bazel_starlib//updatesrc. The repository has now been completely migrated to @aspect_bazel_lib//lib:write_source_files. During the refactor, we dropped the functionality to write the source files in rules_openapi. Instead, the caller side can decide how to write the source files. This makes the rules more composable and easier to use. The functionality has been moved to a macro in private/mgmtapi/api.bzl, which is used inside of scionproto/scion.
Drop the `go_embed_data` custom rules; they had been deprecated for a long time. Drop most of the custom `rules_openapi` bazel rules. Users are recommended to use their own invocations of the JS tools.
The interface label value of every metric instance had been lost in a refactoring in scionproto#4422.
The oapi-codegen libraries went through a reorganization; the repository for the oapi-codegen build tool was separated from the runtime libraries. In our normal code dependencies, the oapi-codegen build tool no longer shows up. Import it in a dummy .go file as an easy way to ensure that we have all the appropriate dependencies (via `go.mod`/`go_deps.bzl`), as sketched below. This leaves some clutter in the `go.mod` file, which is not aesthetically pleasing but is not a real issue due to dependency graph pruning. The alternative solution would have been to manage the dependencies of the `rules_openapi` bazel rules explicitly in a `rules_openapi_dependencies` macro. This turns out to be rather cumbersome and can lead to unexpected or even invalid dependency version combinations, because `rules_go` does not have a way to separate the dependencies of the build tools from the dependencies of the code. Aside: fix `cfg = "exec"` for the oapi-codegen build tool so that this tool is built for the environment of the builder, not the target platform.
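The dummy-import trick is the common "tools file" pattern; a minimal sketch (the file name, build tag, and exact import path are assumptions, not the actual upstream file):

```go
//go:build tools

// Package tools exists only to pin build-tool dependencies in go.mod;
// the build tag keeps it out of every binary.
package tools

import (
	// Blank import so `go mod tidy` keeps the oapi-codegen tool and its
	// dependencies in go.mod (path illustrative).
	_ "github.com/deepmap/oapi-codegen/cmd/oapi-codegen"
)
```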
Simplify the usage of the various docker-compose configurations by including the project name in the configuration file; this has been supported for a while now in docker-compose v2. This allows dropping the `-p`/`--project-name` from all `docker-compose` incantations. Also, streamline the docker-compose files generated by the topogen scripts; remove the explicit `container_name` configurations and drop all the explicit `scion_` prefixes -- managing these prefixes is docker-compose's job. Shut up the warnings on "SCION_EXPERIMENTAL_... variable is not set. Defaulting to a blank string." by using the variable expansion syntax to explicitly default to a blank string (see the sketch below). Also, drop the `docker compose` `--compatibility` flag. The compatibility flag affects which word separator is used in the container name, and it has long been deprecated. We don't usually rely on specific container names, so this should be fine. In some exceptional cases where expecting specific container names seems more practical (the containers for bazel-remote-cache and go-module-proxy), due to special casing in the CI scripts, container_name is set explicitly.
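For illustration, a minimal sketch of the two compose features relied upon here; the project name, service, and variable are made up, not taken from the actual topogen output:

```yaml
# docker-compose.yml (illustrative, not the generated file)
name: scion   # project name embedded in the file (compose v2); no -p/--project-name needed
services:
  control:
    image: scion/control:latest
    environment:
      # Defaulting explicitly to a blank string silences the
      # "variable is not set" warning.
      - SCION_EXPERIMENTAL_FOO=${SCION_EXPERIMENTAL_FOO:-}
```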
--always will substitute an abbreviated commit hash if there's no tag in sight. Better than failing with a cryptic message.
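For illustration (the exact invocation in the build scripts may differ):

```sh
# With no tag reachable, --always falls back to an abbreviated commit hash
# instead of exiting with "fatal: No names found, cannot describe anything."
git describe --tags --always
```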
…cionproto#4437) Enhance the end2end_integration programs as follows:
* end2end_integration has a slightly finer-grain role-based pair selection capability.
* end2end_integration outputs an extra result line that provides precise start/stop timestamps for the whole test.
* process_metrics exposes the number of cores used by Go.
* Add end2endblast. Similar to end2end but plays a game other than pingpong: packetblast. It keeps logging and tracing to a minimum. The server side responds to very few of the pings (1/256), just enough to prove that the circuit works.
* Add an integration test called router_benchmark, which:
  * relies on a specially crafted topology: router_bm.topo
  * uses end2end_integration with command end2endblast over that topology as a load test for the router
  * extracts and logs relevant performance metrics after the test
  * compares performance against a set of expectations if executed by CI.
As a benchmark program, this is only a first approximation. Extending the end2end suite provided a quick solution but isn't ideal: it uses too many cores at once, due to requiring 5 routers actively running for one of the tests. More work is in progress to replace this with a leaner setup where only one router is actually running.
Update the docstring to reflect the semantics: negative means undefined.
The `supervisorctl shutdown` command does not block until the supervisor or its child processes terminate. This could occasionally lead to situations where SCION processes were still running after the `scion.sh stop` command returned. In the CI system, which immediately proceeds to bundle up the log files, this led to `tar` reporting an error that the log file was still being written to. Fixed by invoking `supervisorctl stop all`, which does block, to terminate all child processes before `shutdown`; see the sketch below.
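A minimal sketch of the fixed stop sequence (configuration paths and any surrounding scion.sh logic elided):

```sh
# Blocks until all child processes have actually stopped...
supervisorctl stop all
# ...then shut down the supervisor itself (this call does not block on children).
supervisorctl shutdown
```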
Bazel and bazelisk support ARM64 by now. Lift the AMD64-only restriction and, while we're at it, update bazelisk to the latest version.
The default target remains unchanged but is renamed to `build-dev`. The target `build` now only builds the main SCION services.
When copying metric labels, we assumed that `append` always creates a copy of the label array. This is not necessarily the case: two metrics may end up sharing the same underlying array of labels, and a change to one of them also overwrites the labels in the other. A sketch of the pitfall follows.
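A minimal, self-contained demonstration of the aliasing bug and the fix (label values are illustrative):

```go
package main

import "fmt"

func main() {
	base := make([]string, 0, 4)
	base = append(base, "isd_as")

	// Bug: both appends reuse base's spare capacity, so a and b alias the
	// same backing array and the second append overwrites the first.
	a := append(base, "interface=1")
	b := append(base, "interface=2")
	fmt.Println(a[1], b[1]) // prints "interface=2 interface=2"

	// Fix: copy into a freshly allocated slice before appending.
	c := append(append(make([]string, 0, len(base)+1), base...), "interface=1")
	fmt.Println(c[1]) // "interface=1", independent backing array
}
```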
…scionproto#4453) I found a bug in the SCION daemon. If you create a SCION daemon connection and call one of the drkey endpoints without first setting the configuration value "drkey_level2_db.Connection" to something meaningful, the SCION daemon will encounter the following error and crash:

```
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xb82d33]
```

This bug can easily be reproduced by running a code snippet like this against a SCION daemon that has no drkey configured:

```go
sd, _ := daemon.NewService(daemonAddr).Connect(ctx)
sd.DRKeyGetASHostKey(ctx, drkey.ASHostMeta{})
```

The problem is that in the file "daemon/internal/servers/grpc.go", when the drkey endpoints are called, DaemonServer.DRKeyClient is nil if it is not configured as described above. Fixed by explicitly checking whether DRKey is available and erroring out if it is not.
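The fix amounts to a nil guard of roughly this shape; this standalone sketch mimics the structure with hypothetical types and messages, it is not the upstream code:

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// daemonServer mimics the relevant shape of the SCION daemon server:
// drkeyClient is nil when no DRKey database is configured.
type daemonServer struct {
	drkeyClient interface{ GetASHostKey(context.Context) (string, error) }
}

func (d *daemonServer) DRKeyGetASHostKey(ctx context.Context) (string, error) {
	// The fix: error out instead of dereferencing a nil client.
	if d.drkeyClient == nil {
		return "", errors.New("DRKey is not available (drkey_level2_db not configured)")
	}
	return d.drkeyClient.GetASHostKey(ctx)
}

func main() {
	d := &daemonServer{} // no DRKey configured
	if _, err := d.DRKeyGetASHostKey(context.Background()); err != nil {
		fmt.Println("error:", err) // a clean error instead of a nil-pointer panic
	}
}
```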
log.FromCtx could panic if a custom Logger was used and a span was present in the context. This commit allows the custom logger to implement the `WithOptions(...zap.Option) Logger` method so that the CallerSkip can still be applied. In case the logger can't be cast to anything, the caller skip is not applied, but we also no longer panic.
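The mechanism is an optional-interface check; a self-contained sketch (the `Logger` interface and helper are stand-ins for the package's actual types):

```go
package main

import (
	"fmt"

	"go.uber.org/zap"
)

// Logger stands in for the package's logger interface.
type Logger interface {
	Info(msg string)
}

// applyCallerSkip adds a caller-skip option when the custom logger
// implements WithOptions; otherwise it returns the logger unchanged
// instead of panicking on a failed type assertion.
func applyCallerSkip(l Logger) Logger {
	if o, ok := l.(interface{ WithOptions(...zap.Option) Logger }); ok {
		return o.WithOptions(zap.AddCallerSkip(1))
	}
	return l
}

type plainLogger struct{}

func (plainLogger) Info(msg string) { fmt.Println(msg) }

func main() {
	l := applyCallerSkip(plainLogger{}) // no WithOptions: returned unchanged
	l.Info("no panic, caller skip simply not applied")
}
```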
…nproto#4444) This is an alternative to router_benchmark, meant to eventually replace it. Its main feature is that it runs only a single router. As a result, its measured performance is far less dependent on the number of cores available on the test system. It is heavily inspired by the braccept test, in that:
* it relies on the test harness to create an isolated network and veth interfaces to connect the test driver to the router.
* it manufactures packets that are fed directly into the router's interfaces, and it optionally captures packets for verification in the same manner.
* it assumes a custom topology and cannot be usefully run on any other topology.
It differs from braccept in that:
* It does far less verification of the forwarded packets (it's braccept's job to do that).
* It is considerably simplified, mainly because of the lighter verifications.
* It relies on a simple address-assignment scheme so code maintainers don't have to keep tens of addresses in mind.
* It does not assume that the router runs in a container network; it could also talk to a real router if given the names and MAC addresses of the relevant interfaces. The test harness (or the user) is responsible for supplying the interface names and MAC addresses to the test driver.
* It does not require the test harness (or the user) to have any understanding of the topology (only to configure the router with it). The specifics of the captive network to be configured are supplied by the test driver.
* The test harness doesn't need the "pause" container anymore.
The similarity between this and braccept means that, in the future, we could re-converge them. Notably, the benefits of the simple addressing scheme would make the braccept test cases easier to maintain or expand. Collateral: the required Go version was bumped to 1.21. Fixes scionproto#4442
…to#4454) This test has been superseded by the router_newbenchmark test. One component is preserved for its usefulness as an experiment/debugging tool: the end2endblast component of the end2end suite.
…to#4455) Since the original "router_benchmark" has been deleted, the "new" prefix is no longer necessary.
…scionproto#4456) In the router: compute a better default for the number of processors, based on multiple observations: given enough streams to occupy all processors, performance degrades if there are as many processors as cores. Best performance is obtained by setting aside two cores for other tasks. Making this the default simplifies the benchmarking process by not having to configure an override for each type of target machine being tested (see the sketch below).
In the benchmark: run the test with many streams. This reflects reality and allows a very parallel router to perform at its best without distorting the performance of a less-parallel one. Make sure that the number of streams is likely to be a multiple of the number of cores, so the cores are loaded evenly in spite of there typically being fewer streams than in real life. (TBD: learn the router's internal processor count and just use that.)
As a side effect:
* Adjust CI's expectations for the benchmark.
* Add automatic retries (2) for failed CI tests.
* Allow retrying successful CI tests (helps investigating performance variations).
* Add cpu_info to the router_benchmark output (same reason).
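A sketch of the described default, assuming "setting aside two cores" translates to cores minus two with a floor of one (the actual config plumbing in the router is elided and the function name is hypothetical):

```go
package main

import (
	"fmt"
	"runtime"
)

// defaultNumProcessors sets aside two cores for other tasks, which the
// observations above identified as the sweet spot, while never dropping
// below one processor on small machines.
func defaultNumProcessors() int {
	n := runtime.NumCPU() - 2
	if n < 1 {
		n = 1
	}
	return n
}

func main() {
	fmt.Println("packet processors:", defaultNumProcessors())
}
```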
…cionproto#4457) testing: improved stream synthesis in the router benchmark. Changes:
* Update the flowID of packets (and a couple of checksums) on the fly, just before sending, as opposed to pre-constructing a different packet for each stream.
* As a result, the test cases are simplified; they no longer need to provide one packet per flowID.
* In passing, deleted the dumping of /proc/cpuinfo in the log. That has served its purpose.
* In the test harness, left in the form of comments a suggestion of core partitioning. Not sure how to apply that in general; this is what provides the best results on my machine.
* testing: simplify the checksum updates out of the blaster. Also enable core-partitioning to see how it behaves on the CI system.
* testing: add a check for proper saturation in the router benchmark. In passing: fixed a couple of function names (camel -> snake).
* testing: fix the expectation for droppage in the router benchmark; it was expressed in packets instead of %.
* testing: cosmetic change to the output of the expected drop ratio.
* testing: linter friendship.
* testing: include a CPU selection in the router benchmark. Also: fixed the broken prom query string.
* testing: please the linter.
* testing: log lscpu on the CI system.
* testing: fix bad string formatting.
* testing: improve core selection in the router benchmark. Added L2 cache criteria: the test harness will prefer configurations where either the cores share no L2 cache or all cores share the same one.
* testing: make the router benchmark test harness compatible with more variants of lscpu (hopefully).
* testing: log the CPU serial number in the router benchmark, as an attempt at detecting live migration.
* build: added dmidecode as a required package.
* build: complete the list of files that affect the list of packages to install. Three files were missing, most notably the pkgs.txt files, which are typically the ones modified when adding a dependency. As a result, new packages weren't getting installed.
* build: delete the use of dmidecode. This proved useless as a means of spotting migrations; the information doesn't even change from instance to instance.
* build: fix trailing blank line.
* router: reduce the benchmark's expectations for saturation. After fixing an arithmetic error, it seems we are not able to reliably cause the router to drop more than 1% of the packets (and sometimes not even that). Until we figure out how to reach higher levels, making sure that other PRs aren't blocked.
* testing: fixed bugs in the router_benchmark core selection code.
* testing: update the expectations of the router benchmark. After switching to non-overbooked AWS instances, performance has become much better and more repeatable.
Build debian packages for amd64, arm64, i386 and armel.
- There are separate packages for router, control, daemon, dispatcher, gateway and tools (scion and scion-pki).
- The packages include systemd unit files to run the services. For the daemon, dispatcher and gateway, one instance per host is supported by the systemd service, and a default configuration file is included. For router and control, multiple instances per host are supported, and as a consequence, no default configuration file is provided.
- Currently, there is no man page contained in the packages. We should be able to build these from our existing manuals, but it seems to require a significant amount of fiddling to get something useful.
Building the .deb packages uses bazel with `rules_pkg`. The target `//dist:deb_all` cross-builds packages for the default set of target platforms. Alternatively, the target `//dist:deb` builds (all) packages for the current target platform. This current platform can be set with the `--platforms` bazel option (see https://github.com/bazelbuild/rules_go#how-do-i-cross-compile for more details); see the sketch below.
- To increase reuse of build results while cross-building, some internal targets related to openapi forcibly ignore the target platform.
- The package version is based on the current git tag. `rules_pkg` can include this version _in_ the package metadata, but bazel _cannot_ spit out appropriately named package files. This is addressed by copying and renaming the package files after the build, in a make target `make dist-deb`.
Add installation documentation for the packages and the systemd units. As a side effect, slightly reorganize the build documentation, trying to give a simpler path to just building the binaries without installing the entire development setup.
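For illustration, the build flow described above might look like this (the platform label is an assumption; see the rules_go cross-compilation docs linked above):

```sh
# Cross-build .deb packages for the default set of target platforms:
bazel build //dist:deb_all

# Or build all packages for one explicitly selected platform
# (platform label illustrative):
bazel build --platforms=@io_bazel_rules_go//go/toolchain:linux_arm64 //dist:deb

# Copy and rename the package files so the file names carry the version:
make dist-deb
```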
Documenting that we've bumped the buildkite instance type from t3 to m6i. The main reason for the switch is that the "burstable" nature of the t*-instances makes them extremely unsuitable for running benchmarks. The m6i are also much faster, but not much more expensive. Also, the agent pool is now at most 6 agents. Spinning up 10 agents and having to warm up all the caches, and then scaling back in after a single build, seemed excessive.
Make the segment signer pluggable. This decouples the beacon extender from the trust signer implementation and allows different implementations to be plugged in.
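For illustration, such decoupling typically reduces to an interface of roughly this shape; all names here are hypothetical, and the upstream interface differs in its concrete types:

```go
package main

import (
	"context"
	"fmt"
)

// Signer is a hypothetical stand-in for a pluggable segment signer: the
// beacon extender depends only on this interface, not on the trust
// implementation behind it.
type Signer interface {
	Sign(ctx context.Context, msg []byte) ([]byte, error)
}

// extender signs beacon segments through whatever Signer was plugged in.
type extender struct {
	signer Signer
}

// nopSigner is a trivial alternative implementation, e.g. for tests.
type nopSigner struct{}

func (nopSigner) Sign(_ context.Context, msg []byte) ([]byte, error) {
	return msg, nil
}

func main() {
	e := extender{signer: nopSigner{}}
	out, _ := e.signer.Sign(context.Background(), []byte("segment"))
	fmt.Printf("%s\n", out)
}
```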
Reflect the changes introduced in scionproto#4376
Etags can be used so as not to transfer large prefix sets over and over again. Etags may or may not be supported by a particular gateway implementation. This change is fully forward- and backward-compatible: implementations that do not support etags should just ignore the field.
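Conceptually the exchange works like HTTP conditional requests; a hypothetical sketch of the idea (the types and field names are illustrative, not the gateway's actual API):

```go
package main

import "fmt"

type prefixRequest struct {
	etag string // empty on the first request, or if etags are unsupported
}

type prefixResponse struct {
	etag      string   // identifies the prefix set; empty if unsupported
	unchanged bool     // true: the client can reuse its cached set
	prefixes  []string // populated only when the set changed
}

// fetch simulates the server side: only re-send the prefix set when the
// client's etag no longer matches the current one.
func fetch(req prefixRequest, currentEtag string, set []string) prefixResponse {
	if req.etag != "" && req.etag == currentEtag {
		return prefixResponse{etag: currentEtag, unchanged: true}
	}
	return prefixResponse{etag: currentEtag, prefixes: set}
}

func main() {
	set := []string{"10.0.0.0/8", "192.168.0.0/16"}
	r1 := fetch(prefixRequest{}, "v1", set)
	fmt.Println(r1.unchanged, r1.prefixes) // full transfer the first time
	r2 := fetch(prefixRequest{etag: r1.etag}, "v1", set)
	fmt.Println(r2.unchanged, len(r2.prefixes)) // true 0: nothing re-sent
}
```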
* Add Carbon Intensity Information to beacons. First crude version, just to allow adding _some_ numbers. Note: the code paths were not merged with bandwidth; although carbon intensity is currently processed identically, changes to its representation are to be expected.
* Add Carbon Intensity to combinator, snet.Path and showpaths.
* Resolved a conflict in the daemon protobuf file by incrementing the field number. Recreated the generated protobuf file.
* Resolved conflicts in other places, due to the diff not matching the original, by adding the carbon intensity field to the upstream data structures/initializations again.
LGTM % the fact that my review was quite superficial. I hope it is still more helpful than not.
Makefile (the relevant hunk):

```diff
 lint-doc: lint-doc-mdlint

 lint-doc-mdlint:
 	$(info ==> $@)
-	@FILES=$$(find -type f -iname '*.md' -not -path "./rules_openapi/tools/node_modules/*" -not -path "./.github/**/*" | grep -vf tools/md/skipped); \
+	@FILES=$$(find -type f -iname '*.md' -not -path "./private/mgmtapi/tools/node_modules/*" -not -path "./.github/**/*" | grep -vf tools/md/skipped); \
```
Not exactly sure what this does, but the path `./private/mgmtapi/tools/node_modules/*` doesn't seem to exist. Is that expected?
acceptance/router_benchmark/cases/topo.go (quoted):

```go
//                |
//   AS3 (br3) ---+
//
// We're only executing and monitoring br1a. All the others are a fiction (except for the knowledge
```
"a fiction" -> "fictional" ?
pkg/private/processmetrics/processmetrics_linux.go (quoted):

```go
//
// The metrics we add serve to estimate the available cpu time; that is, the amount of CPU that the
// scheduler granted to the process, independently of whether it ran with it or slept on it. The
// goal is to measure what the overall performance of the process would be if it was never
```
"measure" -> "estimate"?
```go
// preempted. For example, this could be expressed as some_output/available_cpu_time.
//
// At a given time, a given thread is either running, runnable, or sleeping. When running it
// consumes exactly one core. When runnable, it is being deprived of exactly one core (because Go
```
"exactly one core" -> I guess this ignores shared caches and other resources which slows down neighbor cores, sometimes considerably, in effect one core "consumes" more than just its share of the machine? Also , I guess process may trigger CPU operations that may still be going on even if the process is preempted (in effect the process is using machine time without using core time)?
Anyway, depending on what the measurement is used for, these effects can probably be ignored.
pkg/private/xtest/graph/graph.go (quoted):

```go
carbonIntensity.Inter = make(map[common.IFIDType]uint64)
for ifid := range as.IFIDs {
	if ifid != outIF {
		carbonIntensity.Intra[common.IFIDType(ifid)] = g.CarbonIntensity(ifid, outIF)
```
So if `ifid != outIF && isPeer() == true`, the carbon intensity is added to both intra and inter?
proto/control_plane/v1/seg_extensions.proto (quoted):

```diff
@@ -40,6 +40,8 @@ message StaticInfoExtension {
   LatencyInfo latency = 1;
   // Approximate, maximum bandwidth for paths based on this ASEntry.
   BandwidthInfo bandwidth = 2;
+  // Aproximate carbon intensity.
```
"Aproximate" -> "Approximate"
acceptance/router_benchmark/brload/main.go (quoted):

```diff
@@ -0,0 +1,326 @@
+// Copyright 2023 SCION Association
```
Maybe in the future (not as part of this PR) we could consider a more compact (and machine-readable) license header: https://spdx.dev/learn/handling-license-info/
E.g. a one-liner: `/* SPDX-License-Identifier: Apache-2.0 */`
:lgtm:
Reviewable status: 0 of 220 files reviewed, 7 unresolved discussions / 0 of 1 LGTMs obtained / 0 of 1 approvals obtained
Makefile, line 149 at r1 (raw file):

Previously, tzaeschke (Tilmann) wrote…
> Not exactly sure what this does, but the path `./private/mgmtapi/tools/node_modules/*` doesn't seem to exist. Is that expected?

It seems that upstream `master` fails for the `lint` make target. Can you ask upstream?
acceptance/router_benchmark/brload/main.go, line 1 at r1 (raw file):

Previously, tzaeschke (Tilmann) wrote…
> Maybe in the future (not as part of this PR) we could consider a more compact (and machine-readable) license header: https://spdx.dev/learn/handling-license-info/ E.g. a one-liner: `/* SPDX-License-Identifier: Apache-2.0 */`

Can you open an issue upstream?
acceptance/router_benchmark/cases/topo.go, line 36 at r1 (raw file):

Previously, tzaeschke (Tilmann) wrote…
> "a fiction" -> "fictional"?

Upstream?
pkg/private/processmetrics/processmetrics_linux.go, line 21 at r1 (raw file):

Previously, tzaeschke (Tilmann) wrote…
> "measure" -> "estimate"?

Upstream?
pkg/private/processmetrics/processmetrics_linux.go, line 25 at r1 (raw file):

Previously, tzaeschke (Tilmann) wrote…
> "exactly one core" -> I guess this ignores shared caches and other resources that slow down neighboring cores, sometimes considerably; in effect one core "consumes" more than just its share of the machine? Also, I guess a process may trigger CPU operations that may still be going on even after the process is preempted (in effect the process is using machine time without using core time)? Anyway, depending on what the measurement is used for, these effects can probably be ignored.

Upstream?
pkg/private/xtest/graph/graph.go, line 654 at r1 (raw file):

Previously, tzaeschke (Tilmann) wrote…
> So if `ifid != outIF && isPeer() == true`, the carbon intensity is added to both intra and inter?

Yes. But the outIF will never be that of a peering link, because PCBs never exit through peering links.
proto/control_plane/v1/seg_extensions.proto, line 43 at r1 (raw file):

Previously, tzaeschke (Tilmann) wrote…
> "Aproximate" -> "Approximate"

Done.
Reviewed 219 of 220 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @matzf)
Following instructions on https://gitlab.inf.ethz.ch/PRV-PERRIG/scionlab/scionlab-docs/-/wikis/Biweekly-SCIONLab-Update
Notes:
TODO:
static-bin/scion-pki
scion-apps-builder (not necessary)