Benchmarking paper #26
base: main
Conversation
I feel like what's written now still has this "I need to use these weird workarounds because of this list of issues that I'm now aware of" aura that I wanted to avoid. Yes, in the current version the reader will probably understand the reasons behind these, since they are well explained, but we're still not supplying a simple, coherent view of testing/counting that doesn't depend on the reader considering these minuscule implementation details of our lib (which causes insecurity in people).
For example: I've never run into issues benchmarking with our library. Why? Because I usually benchmark complete processes between multiple parties (say, a blind signing protocol). I just run the protocol and serialize/deserialize whenever a party would send/receive (basically an actual run, just without opening a socket). In this scenario, there is no need to consider secret precomputation state or to artificially call `computeSync()` on anything. It all just happens as it would in an actual application. As soon as the signer gets their verification bit, that's it. As soon as the receiver's signature object is there, it's done. No issue.
Why do we keep this view from people? I feel like one paragraph à la "you can of course benchmark your full running application with mock sockets, and then disregard everything we explain about synthetic benchmarks here" would make lots of people immediately go "ah. Okay. So then let's simply do that".
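Roughly, a sketch of such a "full run" benchmark (just an illustration: `SignatureScheme`, `SigningKey`, `VerificationKey`, `PlainText` and the variables are hypothetical stand-ins for whatever construction is being benchmarked; only the `getRepresentation()`/`restoreSignature()` pattern is the library's usual serialization idiom, imports omitted):

```java
// Hypothetical sketch: benchmark one full "sign, send, verify" run.
// All scheme-related type and variable names below are stand-ins, not a prescribed API.
long benchmarkOneRun(SignatureScheme scheme, SigningKey sk, VerificationKey vk, PlainText message) {
    long start = System.nanoTime();

    // Signer side: sign the message and "send" the signature,
    // i.e. serialize it exactly as it would go over a real socket.
    Representation wireData = scheme.sign(message, sk).getRepresentation();

    // Verifier side: "receive" (deserialize) the signature and verify it.
    Signature received = scheme.restoreSignature(wireData);
    boolean ok = scheme.verify(message, received, vk);

    long end = System.nanoTime();
    if (!ok)
        throw new IllegalStateException("Verification failed");
    // As soon as the verifier has its bit, the run is complete -
    // no artificial computeSync() calls needed.
    return end - start;
}
```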
docs/benchmarking.md
Outdated
This is because non-blocking computation can lead to race conditions when printing the result of tracking the group operations, i.e. the computation has not been performed yet when the data is printed.
So make sure to always call `compute()` on every `DebugGroupElement` before accessing any counter data.
`DebugGroup` does use lazy evaluation, meaning that you need to ensure all lazy computations have finished before retrieving the tracked results.
One way to do this is to call `computeSync()` on all operations.
I mean... don't call it on all operations. Just the values you are interested in, not intermediate ones.
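Something like this, say (a sketch: the group elements and exponents are made up, `debugGroup` is assumed to be a `DebugGroup` as in the docs example, and the counter accessor at the end is just a placeholder for however counters are read in the new bucket system):

```java
// Intermediate values stay lazy; they get evaluated automatically as
// dependencies of the results we actually force.
GroupElement lhs = g.pow(x).op(h.pow(y));
GroupElement rhs = commitment.op(g.pow(z));

// Force evaluation only of the values we are interested in...
lhs.computeSync();
rhs.computeSync();

// ...and only then read the tracked counter data.
System.out.println(debugGroup.formatCounterData()); // placeholder counter accessor
```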
```java
import org.cryptimeleon.math.structures.groups.debug.DebugGroup;

// Instantiate the debug group with a name and its size
DebugGroup debugGroup = new DebugGroup("DG1", 1000000);
```
One remark here: to get useful results, the (debug) group size should be consistent with whatever (actual) group you want to count for. Otherwise we get wrong results (e.g., if the group is small, exponentiations also take far fewer group operations).
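Concretely, something like this might be worth showing (a sketch; it assumes `DebugGroup` also accepts a `BigInteger` order, and the 256-bit prime merely stands in for the order of whatever group you actually target):

```java
import java.math.BigInteger;
import java.security.SecureRandom;

import org.cryptimeleon.math.structures.groups.debug.DebugGroup;

// Pick a debug group order of the same bit length as the order of the group
// you actually want to count operations for (e.g. roughly 256 bits for a
// pairing source group), so that exponentiations cost a realistic number
// of group operations.
BigInteger order = BigInteger.probablePrime(256, new SecureRandom());
DebugGroup debugGroup = new DebugGroup("DG1", order); // assumes a (String, BigInteger) constructor
```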
docs/benchmarking.md
Outdated
### Mutability

Another problem is the mutability of the `Signature` object.
I don't know whether there's a better word for it. Like ... the value represented by the object never changes, so it's not mutable in that sense. There are just some helper values that may be added, which don't influence the value, only the performance characteristics.
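As a concrete instance of what I mean (sketch only; whether this exact method exists in the current API should be double-checked, and `group`/`exponent` are made up):

```java
import java.math.BigInteger;

// The element's value stays exactly the same; the precomputation call only
// attaches internal helper data (e.g. window tables) that speeds up later
// exponentiations. (Sketch - exact method name to be double-checked.)
GroupElement g = group.getUniformlyRandomElement();
g.precomputePow();                          // adds helper data, does not change the value
BigInteger exponent = BigInteger.valueOf(123456789);
GroupElement y = g.pow(exponent);           // potentially faster now, same result as before
```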
docs/benchmarking.md
Outdated
// algorithm faster than it would be without, we recreate the objects
// without precomputations.
signature = scheme.restoreSignature(signatureRepr);
verificationKey = scheme.restoreVerificationKey(verifyKeyRepr);
This could actually be discussed; it highly depends on what you want to benchmark. If you want to verify multiple signatures under the same key that you keep in memory, you should just use the same verificationKey object every time.
The current benchmark is "I get a list of unknown signatures/verification keys and need to verify them all".
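To make the two variants concrete (a sketch reusing `scheme` and `verifyKeyRepr` from the snippet above, plus a hypothetical list `signatureReprs` of serialized signatures and a message `msg`; the `verify` call is schematic):

```java
// Variant A (current benchmark): "verify a list of unknown signatures /
// verification keys" - restore fresh objects each iteration, so no
// precomputation carries over between iterations.
for (Representation sigRepr : signatureReprs) {
    Signature sig = scheme.restoreSignature(sigRepr);
    VerificationKey freshKey = scheme.restoreVerificationKey(verifyKeyRepr);
    scheme.verify(msg, sig, freshKey);
}

// Variant B: "verify many signatures under one key kept in memory" -
// reuse the same verification key object, so any precomputation it
// accumulates is reused, just like in a real application.
VerificationKey longLivedKey = scheme.restoreVerificationKey(verifyKeyRepr);
for (Representation sigRepr : signatureReprs) {
    Signature sig = scheme.restoreSignature(sigRepr);
    scheme.verify(msg, sig, longLivedKey);
}
```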
Just to give my 40 cents on this: why don't we just show straightforward benchmarking (like in @JanBobolz's example) and then add a section "Checking for problems: your numbers seem a bit off"? In that section we go into the things one has to consider, so the minuscule implementation details are explained there. The advantage of this approach is that the "normal" use case can concentrate on answering the question "how can I benchmark my scheme", and further details are postponed to the extra section.
These considerations all rely on the user being able to decide whether the problems are relevant to them, which they cannot do if they are not educated on what exactly the issues are. How do you even differentiate whether a benchmark is "synthetic" or not? Is any benchmark involving serialization automatically safe? Do we just say "if you are benchmarking a complete application, then these issues don't apply to you" (and what is a complete application?)? I don't want to give the reader a free ticket to just ignore our advice and (potentially) obtain inaccurate numbers.
I'm not arguing that we shouldn't tell people what the issues are. I'm saying we shouldn't leave people with "these are some weird issues" and "here's how to work around them with serialization, because internally this clears caches, verification results, etc.". For me, this does not convey a good intuition of how to write my tests. I guess blindly serializing and deserializing everything will do it?! Or is that too much? When do I need to work around things and when do I not? I'm suggesting that we tell people clearly what kinds of assumptions they must not make (e.g., …). I feel like this conveys the solution to the issue more clearly, i.e. it (hopefully) equips the reader with a state of mind where they can check their assumptions and design correct test cases. Or am I just thinking weirdly here?
Rewrites the benchmarking page partially to address #21.
Also contains a rewrite of the group operation counting section to address the new bucket system.