-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
de-duplication of key/value pairs #415
Conversation
This issue is currently awaiting triage. If klog contributors determine this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Recent golangci-lint warns about shadowing the print function.
1.21 enables the usage of generics and the slices package.
When testing with Go v1.23.0 in the PATH, `go test go_vet` panics. Bumping the tools dependency adds 1.23 as toolchain requirement and bumps the Go version. This is limited to the examples module and thus does not affect users of klog. panic: runtime error: invalid memory address or nil pointer dereference [recovered] panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x62d74f] goroutine 97 [running]: go/types.(*Checker).handleBailout(0xc0005bb340, 0xc0001afc88) /nvme/gopath/go-1.23.0/src/go/types/check.go:412 +0x88 panic({0x6e6c40?, 0x954490?}) /nvme/gopath/go-1.23.0/src/runtime/panic.go:785 +0x132 go/types.(*StdSizes).Sizeof(0x0, {0x7a5420, 0x957e40}) /nvme/gopath/go-1.23.0/src/go/types/sizes.go:229 +0x30f go/types.(*Config).sizeof(...) /nvme/gopath/go-1.23.0/src/go/types/sizes.go:334 go/types.representableConst.func1({0x7a5420?, 0x957e40?}) /nvme/gopath/go-1.23.0/src/go/types/const.go:77 +0x86 go/types.representableConst({0x7a7548, 0x94bb80}, 0xc0005bb340, 0x957e40, 0xc0001af588) /nvme/gopath/go-1.23.0/src/go/types/const.go:93 +0x173 go/types.(*Checker).representation(0xc0005bb340, 0xc00015be00, 0x957e40) /nvme/gopath/go-1.23.0/src/go/types/const.go:257 +0x65 go/types.(*Checker).implicitTypeAndValue(0xc0005bb340, 0xc00015be00, {0x7a5588, 0xc0000ec380}) /nvme/gopath/go-1.23.0/src/go/types/expr.go:377 +0x2d7 go/types.(*Checker).assignment(0xc0005bb340, 0xc00015be00, {0x7a5588, 0xc0000ec380}, {0x738a90, 0x14}) /nvme/gopath/go-1.23.0/src/go/types/assignments.go:70 +0x451 go/types.(*Checker).initConst(0xc0005bb340, 0xc000720a80, 0xc00015be00) /nvme/gopath/go-1.23.0/src/go/types/assignments.go:144 +0x1da go/types.(*Checker).constDecl(0xc0005bb340, 0xc000720a80, {0x7a6580, 0xc0002828a0}, {0x7a6580, 0xc0002828c0}, 0x0) /nvme/gopath/go-1.23.0/src/go/types/decl.go:482 +0x2e8 go/types.(*Checker).objDecl(0xc0005bb340, {0x7a9e78, 0xc000720a80}, 0x0) /nvme/gopath/go-1.23.0/src/go/types/decl.go:185 +0xa09 go/types.(*Checker).packageObjects(0xc0005bb340) /nvme/gopath/go-1.23.0/src/go/types/resolver.go:714 +0x454 go/types.(*Checker).checkFiles(0xc0005bb340, {0xc00016cb10, 0x3, 0x3}) /nvme/gopath/go-1.23.0/src/go/types/check.go:467 +0x15a go/types.(*Checker).Files(0xc0001b2000?, {0xc00016cb10?, 0xc000474120?, 0x6?}) /nvme/gopath/go-1.23.0/src/go/types/check.go:430 +0x75 golang.org/x/tools/go/packages.(*loader).loadPackage(0xc0001b2000, 0xc000432780) /nvme/gopath/pkg/mod/golang.org/x/[email protected]/go/packages/packages.go:961 +0x70f golang.org/x/tools/go/packages.(*loader).loadRecursive.func1() /nvme/gopath/pkg/mod/golang.org/x/[email protected]/go/packages/packages.go:799 +0x1a9 sync.(*Once).doSlow(0x0?, 0x0?) /nvme/gopath/go-1.23.0/src/sync/once.go:76 +0xb4 sync.(*Once).Do(...) /nvme/gopath/go-1.23.0/src/sync/once.go:67 golang.org/x/tools/go/packages.(*loader).loadRecursive(0x0?, 0x0?) /nvme/gopath/pkg/mod/golang.org/x/[email protected]/go/packages/packages.go:787 +0x3b golang.org/x/tools/go/packages.(*loader).loadRecursive.func1.1(0x0?) /nvme/gopath/pkg/mod/golang.org/x/[email protected]/go/packages/packages.go:794 +0x26 created by golang.org/x/tools/go/packages.(*loader).loadRecursive.func1 in goroutine 96 /nvme/gopath/pkg/mod/golang.org/x/[email protected]/go/packages/packages.go:793 +0x94 FAIL k8s.io/klog/examples/go_vet 0.438s FAIL
bdddf29
to
ab1b250
Compare
This should be ready for a review. @mengjiao-liu, @harshanarayana, who has time? |
Checking @pohly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I debugged the log outputs, and these tests were passed.
Just a non-blocking comment, LGTM otherwise!
test/output.go
Outdated
t.Errorf("Output mismatch. Expected:\n%s\nActual:\n%s\n", expectedWithPlaceholder, actual) | ||
t.Errorf("Output mismatch.\n\nExpected with placeholders:\n%s\nExpected without placeholds:\n%s\nActual:\n%s\n", expectedWithPlaceholder, expected, actual) | ||
} else { | ||
t.Errorf("Output mismatch. klog:\n%s\nExpected:\n%s\nActual:\n%s\n", test.expectedOutput, expectedWithPlaceholder, actual) | ||
t.Errorf("Output mismatch. klog:\n%s\n\nExpected with placeholders:\n%s\nExpected without placeholders:\n%s\nActual:\n%s\n", test.expectedOutput, expectedWithPlaceholder, expected, actual) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the code in lines 844-850 also be changed? Their output is similar.
Lines 844 to 850 in 7f02688
if actual != expected { | |
if expectedWithPlaceholder == test.output { | |
t.Errorf("Output mismatch. Expected:\n%s\nActual:\n%s\n", expectedWithPlaceholder, actual) | |
} else { | |
t.Errorf("Output mismatch. klog:\n%s\nExpected:\n%s\nActual:\n%s\n", test.output, expectedWithPlaceholder, actual) | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that makes sense. Changed.
Printing the actual expected string makes it easier to run a diff, something that we cannot do here automatically because we cannot depend on testify or go-cmp.
Span and trace ID are potentially going to be duplicated a lot.
MergeKVs is about to become a lot simpler. Removing the tests and benchmarks simplifies the before/after performance comparison.
Previously, de-duplication was limited such that keys passed to a log call as parameters overwrote keys passed earlier to WithValues. Using the same key in different WithValues calls or multiple times in a WithValues call or the parameters led to duplicates in the output. Now all key/value pairs get checked for duplicates. The last one wins and is printed at the place where it was added. A potential improvement would be to keep it at the place where it first appeared if the value is the same because then different log entries are more likely to have it at the beginning of their key/value pairs because it was added by a common WithValues call. Because the new implementation avoids memory allocations, the performance impact is minimal. Some cases even become faster. goos: linux goarch: amd64 pkg: k8s.io/klog/examples/benchmarks cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz │ before │ after1 │ │ sec/op │ sec/op vs base │ Output/klog/fail-verbosity-non-standard-struct-key-check/objects/0-36 385.5n ± 9% 358.9n ± 5% -6.91% (p=0.041 n=6) Output/klog/fail-verbosity-non-standard-struct-key-check/objects/10-36 374.3n ± 4% 357.3n ± 5% -4.55% (p=0.015 n=6) Output/klog/fail-verbosity-non-standard-struct-key-check/objects/20-36 388.4n ± 3% 359.2n ± 4% -7.53% (p=0.002 n=6) ... Output/zapr/fail-verbosity-non-standard-int-key-check/objects/90-36 380.9n ± 4% 379.4n ± 2% ~ (p=0.937 n=6) Output/zapr/fail-verbosity-non-standard-int-key-check/objects/100-36 375.1n ± 3% 381.9n ± 4% ~ (p=0.093 n=6) geomean 3.461µ 3.442µ -0.55% pkg: k8s.io/klog/examples/output_test ZaprOutput/regular_error_types_as_value-36 3.270µ ± 10% 3.530µ ± 9% ~ (p=0.132 n=6) ZaprOutput/struct_values-36 3.932µ ± 11% 4.107µ ± 10% ~ (p=1.000 n=6) ZaprOutput/klog.Format-36 4.276µ ± 12% 4.439µ ± 7% ~ (p=0.132 n=6) ZaprOutput/regular_error_types_when_using_logr.Error-36 2.790µ ± 7% 2.834µ ± 15% ~ (p=0.699 n=6) ZaprOutput/map_values-36 5.802µ ± 9% 5.829µ ± 3% ~ (p=1.000 n=6) ZaprOutput/log_with_name_and_values-36 4.051µ ± 4% 3.864µ ± 6% -4.63% (p=0.015 n=6) ... geomean 5.581µ 5.646µ +1.16% pkg: k8s.io/klog/v2 Header-36 1.339µ ± 5% 1.311µ ± 6% ~ (p=0.513 n=6) ... KlogOutput/klog.Format-36 4.207µ ± 10% 4.667µ ± 3% +10.95% (p=0.009 n=6) KlogOutput/cyclic_list-36 151.3µ ± 1% 151.1µ ± 1% ~ (p=0.818 n=6) KlogOutput/override_single_value-36 2.308µ ± 9% 3.501µ ± 15% +51.66% (p=0.002 n=6) KlogOutput/multiple_WithValues-36 10.91µ ± 9% 12.46µ ± 7% +14.16% (p=0.004 n=6) ... KlogOutput/duplicates-36 21.34µ ± 2% 22.52µ ± 7% ~ (p=0.065 n=6) ... geomean 252.6n 260.6n +3.18% pkg: k8s.io/klog/v2/textlogger TextloggerOutput/html_characters-36 1.982µ ± 12% 1.978µ ± 6% ~ (p=0.937 n=6) TextloggerOutput/map_values-36 3.215µ ± 6% 3.429µ ± 4% +6.67% (p=0.041 n=6) ... TextloggerOutput/mixed_duplicates-36 19.33µ ± 5% 19.59µ ± 7% ~ (p=0.818 n=6) ... TextloggerOutput/print_duplicate_keys_in_arguments-36 2.067µ ± 7% 2.388µ ± 6% +15.53% (p=0.002 n=6) ... TextloggerOutput/duplicates-36 16.80µ ± 11% 17.06µ ± 4% ~ (p=0.310 n=6) ... geomean 2.798µ 2.860µ +2.24%
Suppose the same key/value pair gets logged multiple times: WithValues("sameKey", 1).WithValues("newKey", 2).WithValues("sameKey", 1) Previously, the de-duplication code would have emitted it near the end: "hello world" newKey=2 sameKey=1 Now, it gets emitted at the place where it first appeared: "hello world" sameKey=1 newKey=2 This makes log entries more consistent when some repeat the key/value pair and others don't. Performance is about the same as before. goos: linux goarch: amd64 pkg: k8s.io/klog/examples/benchmarks cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz │ after1 │ after2 │ │ sec/op │ sec/op vs base │ ... geomean 3.442µ 3.445µ +0.09% pkg: k8s.io/klog/examples/output_test geomean 5.646µ 5.631µ -0.26% pkg: k8s.io/klog/v2 geomean 260.6n 260.5n -0.07% pkg: k8s.io/klog/v2/textlogger TextloggerOutput/odd_WithValues-36 6.956µ ± 4% 7.263µ ± 4% +4.42% (p=0.041 n=6) ... TextloggerOutput/log_with_multiple_names_and_values-36 3.353µ ± 4% 3.172µ ± 2% -5.40% (p=0.002 n=6) ... TextloggerOutput/duplicates-36 17.06µ ± 4% 17.16µ ± 9% ~ (p=0.937 n=6) ... TextloggerOutput/mixed_duplicates-36 19.59µ ± 7% 20.35µ ± 5% ~ (p=0.065 n=6) ... TextloggerOutput/print_duplicate_keys_in_arguments-36 2.388µ ± 6% 2.218µ ± 7% -7.12% (p=0.026 n=6) ... geomean 2.860µ 2.849µ -0.38%
This is a follow-up which removes several functions which don't provide enough additional value anymore. This wasn't done earlier to keep the previous commits simple.
/lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/cc @dims
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: harshanarayana, mengjiao-liu, pohly The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/assign @dims For approval - I don't want to self-approve this. Please lift the hold if you agree to merge this. |
/hold cancel thanks @pohly @harshanarayana |
Thx! |
What this PR does / why we need it:
Previously, de-duplication was limited such that keys passed to a log call as
parameters overwrote keys passed earlier to WithValues. Using the same key in
different WithValues calls or multiple times in a WithValues call or the
parameters led to duplicates in the output.
Now all key/value pairs get checked for duplicates. The last one wins and is
printed at the place where it was added. If a second occurrence is the same as
a prior one, the first one is kept and the second is discarded.
This is partly motivated by the concern that adding OpenTelemetry will lead to
duplicate IDs (cc @dashpole), partly by prior discussions around instrumentation
not being able to avoid duplicates when using WithValue when the instrumentation
is happening in unrelated packages (cc @sbueringer, if I remember correctly).
Special notes for your reviewer:
Because the new implementation avoids memory allocations does very little extra work
when there is nothing which needs to be de-duplicated, the performance
impact is minimal. Some cases even become faster. See commit messages for
benchmark results.
Release note: