Benchmarking paper #26
base: main
Conversation
I feel like what's written now still has this "I need to use these weird workarounds because of this list of issues that I'm now aware of" aura that I wanted to avoid. Yes, in the current version the reader will probably understand the reasons behind these, since they are well explained, but we're still not supplying a simple, coherent view of testing/counting that doesn't depend on the reader considering these minuscule implementation details of our lib (which causes insecurity in people).
For example: I've never run into issues benchmarking with our library. Why? Because I usually benchmark complete processes between multiple parties (say, a blind signing protocol). I just run the protocol and serialize/deserialize whenever a party would send/receive (basically an actual run, just without opening a socket). In this scenario, there is no need to consider secret precomputation state or to artificially call `computeSync()` on anything. It all just happens as it would in an actual application. As soon as the signer gets their verification bit, that's it. As soon as the receiver's signature object is there, it's done. No issue.
Why do we keep this view from people? I feel like one paragraph à la "you can of course benchmark your full running application with mock sockets, and then disregard everything we explain about synthetic benchmarks here" would make lots of people immediately go "ah. Okay. So then let's simply do that".
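Roughly, a sketch of such a "full run" benchmark (just an illustration: `SignatureScheme`, `SigningKey`, `VerificationKey`, `PlainText` and the variables are hypothetical stand-ins for whatever construction is being benchmarked; only the `getRepresentation()`/`restoreSignature()` pattern is the library's usual serialization idiom, imports omitted):

```java
// Hypothetical sketch: benchmark one full "sign, send, verify" run.
// All scheme-related type and variable names below are stand-ins, not a prescribed API.
long benchmarkOneRun(SignatureScheme scheme, SigningKey sk, VerificationKey vk, PlainText message) {
    long start = System.nanoTime();

    // Signer side: sign the message and "send" the signature,
    // i.e. serialize it exactly as it would go over a real socket.
    Representation wireData = scheme.sign(message, sk).getRepresentation();

    // Verifier side: "receive" (deserialize) the signature and verify it.
    Signature received = scheme.restoreSignature(wireData);
    boolean ok = scheme.verify(message, received, vk);

    long end = System.nanoTime();
    if (!ok)
        throw new IllegalStateException("Verification failed");
    // As soon as the verifier has its bit, the run is complete -
    // no artificial computeSync() calls needed.
    return end - start;
}
```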
docs/benchmarking.md
Outdated
This is because non-blocking computation can lead to race conditions when printing the result of tracking the group operations, i.e. the computation has not been performed yet when the data is printed.
So make sure to always call `compute()` on every `DebugGroupElement` before accessing any counter data.
`DebugGroup` does use lazy evaluation, meaning that you need to ensure all lazy computations have finished before retrieving the tracked results.
One way to do this is to call `computeSync()` on all operations.
I mean... don't call it on all operations. Just the values you are interested in, not intermediate ones.
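Something like this, say (a sketch: the group elements and exponents are made up, `debugGroup` is assumed to be a `DebugGroup` as in the docs example, and the counter accessor at the end is just a placeholder for however counters are read in the new bucket system):

```java
// Intermediate values stay lazy; they get evaluated automatically as
// dependencies of the results we actually force.
GroupElement lhs = g.pow(x).op(h.pow(y));
GroupElement rhs = commitment.op(g.pow(z));

// Force evaluation only of the values we are interested in...
lhs.computeSync();
rhs.computeSync();

// ...and only then read the tracked counter data.
System.out.println(debugGroup.formatCounterData()); // placeholder counter accessor
```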
```java
import org.cryptimeleon.math.structures.groups.debug.DebugGroup;

// Instantiate the debug group with a name and its size
DebugGroup debugGroup = new DebugGroup("DG1", 1000000);
```
One remark here: to get useful results, the (debug) group size should be consistent with whatever (actual) group you want to count for. Otherwise we get wrong results (e.g., if the group is small, exponentiations also take far fewer group operations).
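Concretely, something like this might be worth showing (a sketch; it assumes `DebugGroup` also accepts a `BigInteger` order, and the 256-bit prime merely stands in for the order of whatever group you actually target):

```java
import java.math.BigInteger;
import java.security.SecureRandom;

import org.cryptimeleon.math.structures.groups.debug.DebugGroup;

// Pick a debug group order of the same bit length as the order of the group
// you actually want to count operations for (e.g. roughly 256 bits for a
// pairing source group), so that exponentiations cost a realistic number
// of group operations.
BigInteger order = BigInteger.probablePrime(256, new SecureRandom());
DebugGroup debugGroup = new DebugGroup("DG1", order); // assumes a (String, BigInteger) constructor
```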
docs/benchmarking.md
Outdated
### Mutability

Another problem is the mutability of the `Signature` object.
I don't know whether there's a better word for it. Like ... the value represented by the object never changes, so it's not mutable in that sense. There are just some helper values that may be added, which don't influence the value, only the performance characteristics.
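As a concrete instance of what I mean (sketch only; whether this exact method exists in the current API should be double-checked, and `group`/`exponent` are made up):

```java
import java.math.BigInteger;

// The element's value stays exactly the same; the precomputation call only
// attaches internal helper data (e.g. window tables) that speeds up later
// exponentiations. (Sketch - exact method name to be double-checked.)
GroupElement g = group.getUniformlyRandomElement();
g.precomputePow();                          // adds helper data, does not change the value
BigInteger exponent = BigInteger.valueOf(123456789);
GroupElement y = g.pow(exponent);           // potentially faster now, same result as before
```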
docs/benchmarking.md
Outdated
// algorithm faster than it would be without, we recreate the objects
// without precomputations.
signature = scheme.restoreSignature(signatureRepr);
verificationKey = scheme.restoreVerificationKey(verifyKeyRepr);
This could actually be discussed; it highly depends on what you want to benchmark. If you want to verify multiple signatures under the same key that you keep in memory, you should just use the same verificationKey object every time.
The current benchmark is "I get a list of unknown signatures/verification keys and need to verify them all".
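To make the two variants concrete (a sketch reusing `scheme` and `verifyKeyRepr` from the snippet above, plus a hypothetical list `signatureReprs` of serialized signatures and a message `msg`; the `verify` call is schematic):

```java
// Variant A (current benchmark): "verify a list of unknown signatures /
// verification keys" - restore fresh objects each iteration, so no
// precomputation carries over between iterations.
for (Representation sigRepr : signatureReprs) {
    Signature sig = scheme.restoreSignature(sigRepr);
    VerificationKey freshKey = scheme.restoreVerificationKey(verifyKeyRepr);
    scheme.verify(msg, sig, freshKey);
}

// Variant B: "verify many signatures under one key kept in memory" -
// reuse the same verification key object, so any precomputation it
// accumulates is reused, just like in a real application.
VerificationKey longLivedKey = scheme.restoreVerificationKey(verifyKeyRepr);
for (Representation sigRepr : signatureReprs) {
    Signature sig = scheme.restoreSignature(sigRepr);
    scheme.verify(msg, sig, longLivedKey);
}
```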
Just to give my 40 cents on this: why don't we just show straightforward benchmarking (like in @JanBobolz's example) and then add a section "Checking for problems: your numbers seem a bit off"? In that section we go into the things one has to consider, so the minuscule implementation details are explained there. The advantage of this approach is that the "normal" use case can concentrate on answering the question "how can I benchmark my scheme", and further details are postponed to the extra section.
These considerations all rely on the user being able to decide whether the problems are relevant to them, which they cannot do if they are not educated on what exactly the issues are. How do you even differentiate whether a benchmark is "synthetic" or not? Is any benchmark involving serialization automatically safe? Do we just say "if you are benchmarking a complete application, then these issues don't apply to you" (and what is a complete application?)? I don't want to give the reader a free ticket to just ignore our advice and (potentially) obtain inaccurate numbers.
I'm not arguing that we shouldn't tell people what the issues are. I'm saying we shouldn't leave people with "these are some weird issues" and "here's how to work around them with serialization, because internally this clears caches, verification results, etc.". For me, this does not convey a good intuition of how to write my tests. I guess blindly serializing and deserializing everything will do it?! Or is that too much? When do I need to work around things and when do I not? I'm suggesting that we tell people clearly what kinds of assumptions they must not make (e.g., …). I feel like this conveys the solution to the issue more clearly, i.e. it (hopefully) equips the reader with a state of mind where they can check their assumptions and design correct test cases. Or am I just thinking weirdly here?
Rewrites the benchmarking page partially to address #21.
Also contains a rewrite of the group operation counting section to address the new bucket system.