New add_crc pack function #7

Open. Wants to merge 12 commits into base: master.

rurban commented Jun 29, 2012

Depends on pull request msgpack/msgpack#114.


Reini Urban added 2 commits June 28, 2012 15:11
Use the unused reserved msgpack type 0xc6 for a crc32, stored as a uint32 at the end of the buffer.
See the msgpack/msgpack#114 pull request.

Optional CRC tag for basic security and detection of data corruption. If pack provides a CRC checksum,
unpack should compute the checksum of the buffer, compare it against the given CRC, and report an error on mismatch.
The CRC tag must be the last tag in the buffer. The CRC checksum does not cover itself
or its type tag.
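
The trailer layout described in this commit message could be sketched in C roughly as follows. This is only an illustration, not the code from xs-src/pack.c: the helper names `append_crc` and `check_crc` are hypothetical, the big-endian byte order of the uint32 is an assumption (it matches how msgpack stores other integers), and the checksum is a plain bitwise CRC-32 rather than the inlined macro version the PR uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC_TAG         0xc6u  /* reserved msgpack type byte named in the commit message */
#define CRC_TRAILER_LEN 5u     /* 1 tag byte + 4 checksum bytes */

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t crc32_bitwise(const unsigned char *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical helper: append tag + checksum after `len` payload bytes.
 * The checksum covers only the payload, not the tag or itself.
 * Returns the new buffer length. */
static size_t append_crc(unsigned char *buf, size_t len, size_t cap)
{
    assert(cap >= len && cap - len >= CRC_TRAILER_LEN);
    uint32_t crc = crc32_bitwise(buf, len);
    buf[len]     = (unsigned char)CRC_TAG;
    buf[len + 1] = (unsigned char)(crc >> 24);  /* assumed big-endian */
    buf[len + 2] = (unsigned char)(crc >> 16);
    buf[len + 3] = (unsigned char)(crc >> 8);
    buf[len + 4] = (unsigned char)crc;
    return len + CRC_TRAILER_LEN;
}

/* Hypothetical helper: 1 = checksum matches, 0 = mismatch,
 * -1 = no CRC tag at the expected trailer position. */
static int check_crc(const unsigned char *buf, size_t len)
{
    if (len < CRC_TRAILER_LEN || buf[len - CRC_TRAILER_LEN] != CRC_TAG)
        return -1;
    size_t payload = len - CRC_TRAILER_LEN;
    uint32_t stored = ((uint32_t)buf[payload + 1] << 24)
                    | ((uint32_t)buf[payload + 2] << 16)
                    | ((uint32_t)buf[payload + 3] << 8)
                    |  (uint32_t)buf[payload + 4];
    return stored == crc32_bitwise(buf, payload);
}
```

A caller would pack as usual, call `append_crc` on the finished buffer, and have the receiver call `check_crc` before unpacking. Note that this sketch only looks at the byte five positions from the end; a real unpacker would recognize the tag while walking the stream.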

rurban commented Jul 2, 2012

Performance loss from adding a separate add_crc call: 15%,
which is still much faster than anything else.

I also tried a CRC32 lookup-table implementation, which was 20% slower than packing without a CRC and 5% slower than the current inlined version.

-- serialize
JSON::XS: 2.32
Data::MessagePack: 0.46_01
Storable: 2.30
Benchmark: running json, mp, mp_crc, storable for at least 1 CPU seconds...
      json:  2 wallclock secs ( 1.12 usr +  0.00 sys =  1.12 CPU) @ 139635.71/s (n=156392)
        mp:  2 wallclock secs ( 1.03 usr +  0.00 sys =  1.03 CPU) @ 196495.15/s (n=202390)
    mp_crc:  3 wallclock secs ( 1.03 usr +  0.00 sys =  1.03 CPU) @ 167020.39/s (n=172031)
  storable:  5 wallclock secs ( 1.07 usr +  0.00 sys =  1.07 CPU) @ 89320.56/s (n=95573)
             Rate storable     json   mp_crc       mp
storable  89321/s       --     -36%     -47%     -55%
json     139636/s      56%       --     -16%     -29%
mp_crc   167020/s      87%      20%       --     -15%
mp       196495/s     120%      41%      18%       --
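
For readers comparing the "15%" figure with the 18% in the table: both come from the same two rates, read in opposite directions. A small sketch of the arithmetic, using a hypothetical helper `relative_percent` (not part of the Benchmark module) and the rates printed above:

```c
#include <assert.h>

/* Hypothetical helper: how much faster (positive) or slower (negative)
 * `rate` is than `baseline`, in percent. */
static double relative_percent(double rate, double baseline)
{
    assert(baseline > 0.0);
    return (rate / baseline - 1.0) * 100.0;
}

/* Rates taken from the benchmark output above (iterations per second). */
static const double RATE_MP     = 196495.15;
static const double RATE_MP_CRC = 167020.39;
```

`relative_percent(RATE_MP_CRC, RATE_MP)` is about -15, the loss quoted in the comment, while `relative_percent(RATE_MP, RATE_MP_CRC)` is about 17.6, which the table rounds to 18%.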

benchmark/serialize.pl:
    cmpthese timethese(
        -1 => {
            json     => sub { JSON::encode_json($a) },
            storable => sub { Storable::freeze($a) },
            mp       => sub { Data::MessagePack->pack($a) },
            mp_crc   => sub { my $p = Data::MessagePack->pack($a); Data::MessagePack->add_crc($p); },
        }
    );

lookup table:
      json:  5 wallclock secs ( 1.10 usr +  0.01 sys =  1.11 CPU) @ 129152.25/s (n=143359)
        mp:  3 wallclock secs ( 1.03 usr +  0.00 sys =  1.03 CPU) @ 196495.15/s (n=202390)
    mp_crc:  5 wallclock secs ( 0.99 usr +  0.01 sys =  1.00 CPU) @ 156392.00/s (n=156392)
  storable:  4 wallclock secs ( 1.07 usr +  0.03 sys =  1.10 CPU) @ 86884.55/s (n=95573)
             Rate storable     json   mp_crc       mp
storable  86885/s       --     -33%     -44%     -56%
json     129152/s      49%       --     -17%     -34%
mp_crc   156392/s      80%      21%       --     -20%
mp       196495/s     126%      52%      26%       --
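
The lookup-table code that was benchmarked here is not shown in this thread. For reference, a textbook table-driven CRC-32 looks roughly like the sketch below; the names `crc32_build_table` and `crc32_lookup` are illustrative, and this is not claimed to be the variant that was measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];
static int crc_table_ready = 0;

/* Precompute the CRC of every possible byte value
 * (reflected polynomial 0xEDB88320). */
static void crc32_build_table(void)
{
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = n;
        for (int bit = 0; bit < 8; bit++)
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        crc_table[n] = c;
    }
    crc_table_ready = 1;
}

/* Table-driven CRC-32: one table lookup per input byte
 * instead of eight shift/xor steps. */
static uint32_t crc32_lookup(const unsigned char *p, size_t len)
{
    assert(p != NULL || len == 0);
    if (!crc_table_ready)
        crc32_build_table();
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++)
        crc = crc_table[(crc ^ p[i]) & 0xFFu] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;
}
```

The table costs 1 KB and is built lazily on first use. Whether it beats an inlined computation depends on buffer size and call overhead; in the numbers above it did not.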

Reini Urban added 10 commits July 20, 2012 10:46
Separate simple crc32.c (byte advance).
It is about 5% slower than the inlined macro version.
With zlib the CRC overhead is only 2% for small buffers.
Previously it was 10%.

with -lz:
$ pb benchmark/serialize.pl; pb benchmark/deserialize.pl
-- serialize
JSON::XS: 2.32
Data::MessagePack: 0.46_01
Storable: 2.35
Benchmark: running json, mp, mp_crc, storable for at least 1 CPU seconds...
      json:  1 wallclock secs ( 1.06 usr +  0.03 sys =  1.09 CPU) @ 136436.70/s (n=148716)
        mp:  1 wallclock secs ( 1.02 usr +  0.01 sys =  1.03 CPU) @ 185578.64/s (n=191146)
    mp_crc:  0 wallclock secs ( 1.06 usr +  0.00 sys =  1.06 CPU) @ 162293.40/s (n=172031)
  storable:  1 wallclock secs ( 1.05 usr +  0.00 sys =  1.05 CPU) @ 91021.90/s (n=95573)
             Rate storable     json   mp_crc       mp
storable  91022/s       --     -33%     -44%     -51%
json     136437/s      50%       --     -16%     -26%
mp_crc   162293/s      78%      19%       --     -13%
mp       185579/s     104%      36%      14%       --
-- deserialize
JSON::XS: 2.32
Data::MessagePack: 0.46_01
Storable: 2.35
Benchmark: running json, mp, mp_crc, storable for at least 1 CPU seconds...
      json:  1 wallclock secs ( 1.03 usr +  0.01 sys =  1.04 CPU) @ 97302.88/s (n=101195)
        mp:  1 wallclock secs ( 1.12 usr +  0.01 sys =  1.13 CPU) @ 126866.37/s (n=143359)
    mp_crc:  1 wallclock secs ( 1.06 usr +  0.00 sys =  1.06 CPU) @ 124841.51/s (n=132332)
  storable:  1 wallclock secs ( 1.07 usr +  0.00 sys =  1.07 CPU) @ 80387.85/s (n=86015)
             Rate storable     json   mp_crc       mp
storable  80388/s       --     -17%     -36%     -37%
json      97303/s      21%       --     -22%     -23%
mp_crc   124842/s      55%      28%       --      -2%
mp       126866/s      58%      30%       2%       --

Compare this to the old implementation (results below), which had 10% overhead,
and 18% with packing. That packing difference comes only from the Perl
XS function call, not from the CRC itself.

$ pb benchmark/deserialize.pl
-- deserialize
JSON::XS: 2.32
Data::MessagePack: 0.46_01
Storable: 2.35
Benchmark: running json, mp, mp_crc, storable for at least 1 CPU seconds...
      json:  1 wallclock secs ( 1.09 usr +  0.00 sys =  1.09 CPU) @ 98642.20/s (n=107520)
        mp:  1 wallclock secs ( 1.11 usr +  0.00 sys =  1.11 CPU) @ 129153.15/s (n=143360)
    mp_crc:  1 wallclock secs ( 1.05 usr +  0.00 sys =  1.05 CPU) @ 117027.62/s (n=122879)
  storable:  1 wallclock secs ( 1.08 usr +  0.00 sys =  1.08 CPU) @ 79643.52/s (n=86015)
             Rate storable     json   mp_crc       mp
storable  79644/s       --     -19%     -32%     -38%
json      98642/s      24%       --     -16%     -24%
mp_crc   117028/s      47%      19%       --      -9%
mp       129153/s      62%      31%      10%       --
Fixed conflicts:
	Makefile.PL
	lib/Data/MessagePack.pm
	xs-src/pack.c
Properly cast x->cur