
Benchmarking paper #26

Open · wants to merge 7 commits into main
Conversation

@rheitjoh (Member) commented Jun 30, 2021

Partially rewrites the benchmarking page to address #21.

Also contains a rewrite of the group operation counting section to address the new bucket system.

@rheitjoh requested a review from JanBobolz June 30, 2021 14:47
@JanBobolz (Member) left a comment

I feel like what's written now still has this "I need to use these weird workarounds because of this list of issues that I'm now aware of" aura that I wanted to avoid. Yes, in the current version, the reader will probably understand the reasons behind these, since they are well explained, but we're still not supplying a simple coherent view on testing/counting that doesn't depend on the reader considering these minuscule implementation details of our lib (which causes insecurity in people).

For example: I've never run into issues benchmarking with our library. Why? Because I usually benchmark complete processes between multiple parties (say, a blind signing protocol). I just run the protocol and I serialize/deserialize whenever a party would send/receive (basically, an actual run, just without opening a socket). In this scenario, there is no need to consider secret precomputation state or artificially call computeSync() on stuff. It's all just happening as it would in an actual application. As soon as the signer gets their verification bit, that's it. As soon as the receiver's signature object is there, it's done. No issue.
Why do we keep this view from people? I feel like it's one paragraph à la "you can of course benchmark your full running application with mock sockets, and then disregard everything we explain about synthetic benchmarks here" and lots of people would immediately go "ah. Okay. So then let's simply do that".
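(For illustration, a rough Java sketch of that "full run" style of benchmark. The craco package paths and the exact sign/verify signatures here are assumptions and should be checked against the actual library; the point is only where serialization and timing happen.)

```java
import org.cryptimeleon.craco.common.plaintexts.PlainText;
import org.cryptimeleon.craco.sig.Signature;
import org.cryptimeleon.craco.sig.SignatureScheme;
import org.cryptimeleon.craco.sig.SigningKey;
import org.cryptimeleon.craco.sig.VerificationKey;
import org.cryptimeleon.math.serialization.Representation;

public class FullRunBenchmark {
    /**
     * Times one complete "signer -> network -> verifier" run.
     * Serialization happens exactly where a real application would send data,
     * so there is no need for manual compute()/computeSync() calls.
     */
    public static long timeOneRun(SignatureScheme scheme, SigningKey sk,
                                  VerificationKey vk, PlainText message) {
        long start = System.nanoTime();

        // Signer side: create the signature and serialize it for "sending".
        Signature signature = scheme.sign(message, sk);
        Representation wireData = signature.getRepresentation();

        // Verifier side: "receive" the representation, deserialize, verify.
        Signature received = scheme.restoreSignature(wireData);
        boolean valid = scheme.verify(message, received, vk);

        long elapsed = System.nanoTime() - start;
        if (!valid)
            throw new IllegalStateException("benchmark run produced an invalid signature");
        return elapsed;
    }
}
```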

This is because non-blocking computation can lead to race conditions when printing the tracked group operation data, i.e. the computation may not have finished yet when the data is printed.
So make sure to always call `compute()` on every `DebugGroupElement` before accessing any counter data.
`DebugGroup` uses lazy evaluation, meaning that you need to ensure all lazy computations have finished before retrieving the tracked results.
One way to do this is to call `computeSync()` on all operations.
Member

I mean... don't call it on all operations. Just the values you are interested in, not intermediate ones.
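(A minimal sketch of that advice: intermediate values are left to the lazy evaluation machinery, and only the final result gets a blocking `computeSync()` before the counters are read. The counter-printing call `formatCounterData()` is an assumption about the `DebugGroup` API and may be named differently.)

```java
import java.math.BigInteger;

import org.cryptimeleon.math.structures.groups.GroupElement;
import org.cryptimeleon.math.structures.groups.debug.DebugGroup;

DebugGroup debugGroup = new DebugGroup("DG1", 1000000);
GroupElement g = debugGroup.getUniformlyRandomNonNeutral();
GroupElement h = debugGroup.getUniformlyRandomNonNeutral();

// Intermediate values: no compute calls needed, lazy evaluation takes care of them.
GroupElement intermediate = g.op(h);
GroupElement result = intermediate.pow(BigInteger.valueOf(42)).op(g.inv());

// Only the value we actually care about is forced to finish computing
// before any counter data is accessed.
result.computeSync();

System.out.println(debugGroup.formatCounterData()); // assumed counter-printing method
```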


```java
import org.cryptimeleon.math.structures.groups.debug.DebugGroup;

// Instantiate the debug group with a name and its size
DebugGroup debugGroup = new DebugGroup("DG1", 1000000);
```
Member

One remark here: to get useful results, the (debug) group size should be consistent with whatever (actual) group you want to count for. Otherwise, we get wrong results (e.g., if the group is small, exponentiations also take way fewer group ops).
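(Sketch: assuming `DebugGroup` also accepts a `BigInteger` group size in its constructor, one would pick an order with a realistic bit length, e.g. a 256-bit prime roughly matching the target pairing group, instead of a toy value like 1000000.)

```java
import java.math.BigInteger;
import java.util.Random;

import org.cryptimeleon.math.structures.groups.debug.DebugGroup;

// Use a group order of realistic bit length (here a 256-bit prime) so that
// exponentiations cost a realistic number of group operations.
BigInteger realisticOrder = BigInteger.probablePrime(256, new Random(42));
DebugGroup debugGroup = new DebugGroup("DG1", realisticOrder);
```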


### Mutability

Another problem is the mutability of the `Signature` object.
Member

I don't know whether there's a better word for it. Like ... the value represented by the object never changes. It's not mutable in this sense. There are just some helper values that may be added that don't influence the value, just performance characteristics.

// Since precomputations left over from a previous run could make the
// algorithm faster than it would be without, we recreate the objects
// without precomputations.
signature = scheme.restoreSignature(signatureRepr);
verificationKey = scheme.restoreVerificationKey(verifyKeyRepr);
Member

This could actually be discussed. This highly depends on what you want to benchmark. If you want to verify multiple signatures on the same key that you keep in memory, you should just use the same verificationKey object every time.

The current benchmark is "I get a list of unknown signatures/verification keys and need to verify them all".
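(For illustration, the two benchmark variants could be sketched like this; as above, the craco method signatures are assumptions.)

```java
import java.util.List;

import org.cryptimeleon.craco.common.plaintexts.PlainText;
import org.cryptimeleon.craco.sig.Signature;
import org.cryptimeleon.craco.sig.SignatureScheme;
import org.cryptimeleon.craco.sig.VerificationKey;
import org.cryptimeleon.math.serialization.Representation;

public class VerifyBenchmarks {
    // Scenario A: many signatures under one key that stays in memory.
    // The same verificationKey object is reused, so any precomputation state
    // it accumulates is part of the scenario being measured.
    static void verifyManyUnderOneKey(SignatureScheme scheme, VerificationKey verificationKey,
                                      List<Representation> signatureReprs, PlainText message) {
        for (Representation signatureRepr : signatureReprs) {
            Signature signature = scheme.restoreSignature(signatureRepr);
            scheme.verify(message, signature, verificationKey);
        }
    }

    // Scenario B: a list of unknown signatures and verification keys.
    // Both objects are restored from their representations every time, so no
    // precomputation state carries over between verifications.
    static void verifyUnknownPairs(SignatureScheme scheme, List<Representation> signatureReprs,
                                   List<Representation> verifyKeyReprs, PlainText message) {
        for (int i = 0; i < signatureReprs.size(); i++) {
            Signature signature = scheme.restoreSignature(signatureReprs.get(i));
            VerificationKey verificationKey = scheme.restoreVerificationKey(verifyKeyReprs.get(i));
            scheme.verify(message, signature, verificationKey);
        }
    }
}
```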

@feidens (Member) commented Jul 1, 2021

> I feel like what's written now still has this "I need to use these weird workarounds because of this list of issues that I'm now aware of" aura that I wanted to avoid. Yes, in the current version, the reader will probably understand the reasons behind these, since they are well explained, but we're still not supplying a simple coherent view on testing/counting that doesn't depend on the reader considering these minuscule implementation details of our lib (which causes insecurity in people).
>
> For example: I've never run into issues benchmarking with our library. Why? Because I usually benchmark complete processes between multiple parties (say, a blind signing protocol). I just run the protocol and I serialize/deserialize whenever a party would send/receive (basically, an actual run, just without opening a socket). In this scenario, there is no need to consider secret precomputation state or artificially call computeSync() on stuff. It's all just happening as it would in an actual application. As soon as the signer gets their verification bit, that's it. As soon as the receiver's signature object is there, it's done. No issue.
> Why do we keep this view from people? I feel like it's one paragraph à la "you can of course benchmark your full running application with mock sockets, and then disregard everything we explain about synthetic benchmarks here" and lots of people would immediately go "ah. Okay. So then let's simply do that".

Just to give my 40 cents on this: Why are we not just showing straightforward benchmarking (like in @JanBobolz's example) and then adding a section "Checking for Problems: Your numbers seem a bit off"? In this section we go into the things one has to consider, so the minuscule implementation details are explained there. The advantage of this approach is that the "normal" use case can concentrate on answering the question "how can I benchmark my scheme" and further details are postponed to the extra section.

@rheitjoh (Member, Author)

> For example: I've never run into issues benchmarking with our library. Why? Because I usually benchmark complete processes between multiple parties (say, a blind signing protocol). I just run the protocol and I serialize/deserialize whenever a party would send/receive (basically, an actual run, just without opening a socket). In this scenario, there is no need to consider secret precomputation state or artificially call computeSync() on stuff. It's all just happening as it would in an actual application. As soon as the signer gets their verification bit, that's it. As soon as the receiver's signature object is there, it's done. No issue.
> Why do we keep this view from people? I feel like it's one paragraph à la "you can of course benchmark your full running application with mock sockets, and then disregard everything we explain about synthetic benchmarks here" and lots of people would immediately go "ah. Okay. So then let's simply do that".

These considerations all rely on the user being able to decide whether the problems are relevant to them, which they cannot do if they are not educated on what exactly the issues are. How do you even differentiate whether a benchmark is "synthetic" or not? Is any benchmark involving serialization automatically safe? Do we just say "If you are benchmarking a complete application, then these issues don't apply to you" (and what is a complete application?)? I don't want to give the reader a free ticket to just ignore our advice and (potentially) obtain inaccurate numbers.

@JanBobolz (Member)

I'm not arguing not to tell people what the issues are. I'm saying we shouldn't leave people with "these are some weird issues" and "here's how to work around it with serialization because internally, this clears caches, verification results, etc.". For me, this does not convey a good intuition of how to write my tests. I guess blindly serializing and deserializing everything will do it?! Or is that too much? When do I need to work around and when do I not?

I'm suggesting that we tell people clearly what kinds of assumptions they must not make (e.g., that .op()/.sign()/... returns only after the result is fully finished computing or that a signature object doesn't change state after verify(), etc.).
But then I'd also tell them "look, this makes sense because you're not actually interested in measuring how fast sign() returns. You're interested in measuring how long it takes to get a serialized byte string signature that you can write to network. So design your benchmarks to simulate an actual use-case". And "You're not actually interested in measuring how long it takes to verify the same signature object a hundred times. You want to know how long it takes between receiving a byte string until you get the boolean whether or not that was a valid signature".

I feel like this conveys the solution to the issue more clearly, i.e. it (hopefully) equips the reader with a state of mind where they can check their assumptions and design correct test cases. Or am I just thinking weirdly here?

@JanBobolz self-assigned this Aug 19, 2021
@JanBobolz changed the base branch from develop to main October 21, 2021 07:20