Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test Rust optimization when target-features has "allow-non-simd" and the CPU flag is set #188

Closed
amanjeev opened this issue May 2, 2021 · 17 comments

Comments

@amanjeev
Copy link

amanjeev commented May 2, 2021

Summary

To enable SIMD one has to do these things

  1. In Cargo.toml set the target-features
simd-json = { version = "0.4", features = ["allow-non-simd", "known-key", "serde_impl"] }
  1. Pass CPU flag either directly to rustc or via adding the following in .cargo/config
[build]
rustflags = "-C target-cpu=native"

But sometimes, despite adding the above, and especially with allow-non-simd in the features, the result is still not SIMD compatible code.

To do

Write tests that help identify if the rustc optimization is prevented by modifying Cargo.toml

simd-json = { version = "0.4", features = ["known-key", "serde_impl"] }
@Licenser
Copy link
Member

Licenser commented May 3, 2021

I tried this and my hunch here was wrong, at least as far as that I could not detect performance changes when adding the allow-non-simd flag :(

@amanjeev
Copy link
Author

amanjeev commented May 3, 2021

@Licenser oh so close this then? I feel like this is likely my code.

Non SIMD code

#[derive(Debug, Serialize, Deserialize)]
struct SIMDExample<'sin> {
    id: u64,
    #[serde(borrow)]
    id_str: &'sin str,
}

fn main() {
    let data_file = File::open("/home/aj/deploy/rust/zcapjson/data/fake_data.json").unwrap();
    let reader = BufReader::new(data_file);

    for line in reader.lines() {
        //println!("{}", line.unwrap());
        let row: &mut str = &mut line.unwrap();
        let row: SIMDExample = serde_json::from_str(row).unwrap();
        match row.id {
            2807149942735425369 => println!("look ma! a match! - {}", row.id_str),
            _ => println!("No match yet"),
        }
    }
}

SIMD code


#[derive(Debug, Serialize, Deserialize)]
struct SIMDExample<'sin> {
    id: u64,
    #[serde(borrow)]
    id_str: &'sin str,
}

fn main() {
    let data_file = File::open("/home/aj/deploy/rust/zcapjson/data/fake_data.json").unwrap();
    let reader = BufReader::new(data_file);

    for line in reader.lines() {
        //println!("{}", line.unwrap());
        let row: &mut str = &mut line.unwrap();
        let row: SIMDExample = simd_json::serde::from_str(row).unwrap();
        match row.id {
            2807149942735425369 => println!("look ma! a match! - {}", row.id_str),
            _ => println!("No match yet"),
        }
    }
}

As you can see the only difference is the line let row: SIMDExample = simd_json::serde::from_str(row).unwrap();. Maybe my thinking is wrong on how this is supposed to work.

@Licenser
Copy link
Member

Licenser commented May 4, 2021

lets keep it open until we find the solution :) , even if it's not the issue I first thought it is, it's not any less of an issue for you.

Could it be related to #189?

@amanjeev
Copy link
Author

amanjeev commented May 4, 2021

That did not work for me. Also, I tried with new version of simd_json_derive but somehow I need to impl serde deserialize for my type

error[E0277]: the trait bound `SIMDExample<'_>: serde::de::Deserialize<'_>` is not satisfied
  --> src/main.rs:20:32
   |
20 |         let row: SIMDExample = simd_json::serde::from_str(row).unwrap();
   |                                ^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `serde::de::Deserialize<'_>` is not implemented for `SIMDExample<'_>`
   | 
  ::: /home/aj/.cargo/registry/src/github.com-1ecc6299db9ec823/simd-json-0.4.4/src/serde.rs:75:8
   |
75 |     T: Deserialize<'a>,
   |        --------------- required by this bound in `simd_json::from_str`

@Licenser
Copy link
Member

Licenser commented May 6, 2021

Sorry for the wrong suggestion and late reply.

the derive offers direct function <Struct>::from_str() (I didn't mention that before since I forgot -.-)

for example:

https://github.com/simd-lite/simd-json-derive/blob/main/tests/struct.rs#L74

@amanjeev
Copy link
Author

amanjeev commented May 9, 2021

So strange, because of this #[serde(borrow)] https://github.com/simd-lite/simd-json-derive/blob/main/tests/struct.rs#L78 in my code fails to build.

error: proc-macro derive panicked
 --> src/main.rs:7:10
  |
7 | #[derive(simd_json_derive::Serialize, simd_json_derive::Deserialize,PartialEq, Debug)]
  |          ^^^^
  |
  = help: message: failed to parse attributes: Error("unexpected attribute `borrow`")

I have a feeling I am not doing something right. The same code works for you but for me it throws an error if #[serde(borrow)] is included in the struct. It compiles fine without #[serde(borrow)] but then the SIMD code is still slower.

https://github.com/amanjeev/simdjson-playground/blob/simd-search/src/main.rs

@Licenser
Copy link
Member

You did nothing wrong, I was just an silly and forgot to package/release the -int crate 🤦 I forget it every time :( I downloaded your example and with cargo update; cargo run --release it compiles and runs (so I couldn't test the performance as I don't have the data file)

@amanjeev
Copy link
Author

phew! I should have checked, I did look at the -int but did not check whether it was updated in this repo. Sorry and thank you so much!

Cool, it compiles and runs fine but the SIMD version is still slower. Which makes me think that may be my code sucks. As in, I am using the SIMD library incorrectly.

The large JSON file looks like this in structure. I can upload it somewhere if you want. It is not a valid json but each line is an object.

tail /home/aj/deploy/rust/zcapjson/data/fake_data.json
{"id":2783400491575463740,"id_str":"yL8dXstRORnjzdLxBpW5NKqdL8zRZ0vg3MJ2bcpqq0XHapKWgjQKvcii8gkBK8IwJIcQLoW8O0VRVjcgaVrtii5itUtxzrIXGTYdFVHz2yeMnmwWEMWbtVPPuiWiH6usEGNHuAEAhyQB23HftgzgiK"}
{"id":2790026148644442355,"id_str":"R0T5ieKvhWZPtHp5MoSP6I3wtS4cotwC7NO5OVJq7i9LU5ByBakep3EoD8TVUMAzF66l2cWk4wlXsk4WCjughRWrA1EFYsfjDd0JW1RJ4yggzBK8N8G4wEqHO0uDVvUVQgwAENBQFtCQLa1kSH4slZ"}
{"id":2805742567851871689,"id_str":"auLPbW8wp8WiV4cR6Vwf2o82EsrlyvqQRTGPDI8LfKMCka9i0fV85ZLIAeS2gHfLKFOFpzui6dV4twfqPFmJNQ5qLk2LC8GdDD6orLGUgmVe2l4oJVbVpyKWozaVHaM1B0MqLulutM7ZAuvYBAf2Ju"}
{"id":2783313630156869072,"id_str":"0hUODXm9494UYj6wte7jQFh7Lu3abNYA8nSllYrlSLxMrt3zKSvbhWYqZeWrQI2uVzr41joTAiKg3R60xsBVytqfBZAhPYwFXBnW8hQpcaIrPdVjT0vmL4qyDNK6YK49VFvzRwZCk1PuyrtIMurZ95"}
{"id":2794118530923024318,"id_str":"VUSQ0iAapdpzWcfEoLqG0iMCqglqUdHVSQ5SXURrfB1bA8jBFbkoqavgfoFRdB2VDiMx02gzks6TMnDxo7KvAtV7YoaYek0PQitNnyHfKbLW1coBT1QcudSuUHzBBqgQ3kQ21ztOmauLxZXa6EKR79"}
{"id":2793357668876603087,"id_str":"u9a9dup6NR6F1UDM6dOGBW2KXQaUzs6yYNYO4CFHwTUp7YxgoEpMbsie9Iq6ube5o4qJNHmcbZgXgdG5yY1qOKBlGK6vmUxfancS2sa7bAq6Nkay249YNojpOT1GPpdOECNNOb92wI0h84MxEewzXd"}
{"id":2794105886539304589,"id_str":"I8XOj0r7FW8dXIISGbRV7FO2uFFknSXhqqy4LVbfipq163iYPiRMt3V1WRfgHOkJ1KzObamHZksA5ZRDyHoA0XjOLbSe8e47fiVpW3ImPoJix6Ukf5kkfg8duXLu9Qw5eWCPOZHCCwRda6GjXZQX0O"}
{"id":2792425283016248611,"id_str":"pvzSFkceSXaJP8G3CxHnN2KfWEUmrpDrekhM5HLpMsQfRd2PT14NeAKCtOG6sf6KSfZ5yaQbmDUAWOT2gNEKi9xeGXFCvpwFlHgKMxyrwkciho29uO1pAUrlhIW803NJ5HuxT3VnooPlfTu9wNhN39"}
{"id":6341068275337711363,"id_str":"QIkE4pd8uU4hCUjaEl0a4NzIPWE89cHlMb1LIIck87LU1TGjmOA72utEUeOMjOKCffAJ4UeBG5wg0j6HWIoNRtj9qcDfDOHkevoWKXawmr4jrtt7K1e2QeVTujgw6SkrsDH1SqosWwQ4SvHXtmHakv"}
{"id":2807149942735425369,"id_str":"yZAGT31M2aaQF0XgyegrhPhucoYDoY4lMTy7ciB6l0fecLTU82PD1V4ODmdqZV4TkRoR7YilpV3QtGhu8YxxuaRATwgibJtCO7UJrxy1I8tEdRWZ2ihlg9ZIVRxpZ4Kb0im37Z6sns4GqwnFUG4p4V"}

@Licenser
Copy link
Member

You couldn't have caught it :) the crate wasn't published no chance to catch that from your end, I'll give that a try tomorrow I can generate a file like that :D

@Licenser
Copy link
Member

I got this working with example data, but even with 10k entries, the things end so fast that it's hard to measure anything for me :( that said since it prints on every entry I'm pretty sure that for this code the time spend in println! will dominate any benchmark, followed by IO times.

How are you measuring 'faster' or 'slower' exactly?

@Licenser
Copy link
Member

Based on thoughts, this is going to be the optimal code:

fn main() {
    let data_file = File::open("data/fake_data.json").unwrap();
    let mut reader = BufReader::new(data_file);
    let mut data = Vec::with_capacity(1024);
    let mut string_buffer = Vec::with_capacity(2048);
    let mut input_buffer = AlignedBuf::with_capacity(1024);
    while reader.read_until(b'\n', &mut data).unwrap() > 0 {
        let  row =
            SIMDExample::from_slice_with_buffers(&mut data, &mut input_buffer, &mut string_buffer)
                .unwrap();

        if row.id == 2807149942735425369 {
            println!("look ma! a match! - {}", row.id_str)
        }
        data.clear();
    }
}

It:

  • avoids allocations by using preallocated buffers
  • avoids the rather expensive lines() call
  • takes advantage of simd_jsons faster utf8 validation by skipping it and reading the file as bytes not strings

amanjeev added a commit to amanjeev/simdjson-playground that referenced this issue May 12, 2021
amanjeev added a commit to amanjeev/simdjson-playground that referenced this issue May 12, 2021
These changes make the cod faster by

* avoids allocations by using preallocated buffers
* avoids the rather expensive lines() call
* takes advantage of simd_jsons faster utf8 validation by skipping
  it and reading the file as bytes not strings

simd-lite/simd-json#188 (comment)
@amanjeev
Copy link
Author

Your code is so slick that it cut about 4 seconds off of the 5GB file I am using for this experiment. However, the non-SIMD version is still faster. For the sake of "similarity" I changed the non-SIMD code to match the style of your faster SIMD code. Pasting it here inline

    let data_file = File::open("data/fake_data.json").unwrap();
    let mut reader = BufReader::new(data_file);

    let mut data = Vec::with_capacity(1024);

    while reader.read_until(b'\n', &mut data).unwrap() > 0 {
        let row: SIMDExample = serde_json::from_slice(data.as_mut_slice()).unwrap();
        if row.id == 2807149942735425369 {
            println!("look ma! a match! - {}", row.id_str);
        }
        data.clear();
    }

The hyperfine results are as follows (I hope I am reading them correctly, if not please forgive me)

SIMD run

  Time (mean ± σ):      7.370 s ±  0.146 s    [User: 6.920 s, System: 0.448 s]
  Range (min … max):    7.058 s …  7.496 s    10 runs

Non SIMD run

  Time (mean ± σ):      5.977 s ±  0.109 s    [User: 5.503 s, System: 0.472 s]
  Range (min … max):    5.773 s …  6.080 s    10 runs

What do you think I should do to proceed?

SIMD search compilation has this flag https://github.com/amanjeev/simdjson-playground/blob/simd-search/.cargo/config and these features https://github.com/amanjeev/simdjson-playground/blob/simd-search/Cargo.toml#L9

@hkratz
Copy link
Contributor

hkratz commented May 15, 2021

Calling simd-json on individual lines is really not a good usecase for it. Ideally we should have special support for newline-delimited JSON (NDJSON) like upstream.

@Licenser
Copy link
Member

Ja the lines are fairly short too the advantages are a lot smaller (sometimes detrimental) as there is an initial cost to pay for filling the registers, doing multiple runs etc. can overshadow the performance gain for very small payloads.

NDJSON would be incredibly cool (especially if we manage to realize in a streaming fashion / as an iterator), sadly so far no use case has presented it self that justified the time 😭 but the want is there :D

@amanjeev
Copy link
Author

amanjeev commented May 15, 2021

Calling simd-json on individual lines is really not a good usecase for it. Ideally we should have special support for newline-delimited JSON (NDJSON) like upstream.

Yes! Thank you! I was unable to enunciate or explain. I did think something was off, as in, naturally there is more work being done per line in SIMD version than normal Serde version. I wrote this earlier xD -

Which makes me think that may be my code sucks. As in, I am using the SIMD library incorrectly.

Do you think we can repurpose this ticket or open a new one for the upstream alignment? Also, who will be doing that? That is, I am assuming that's a good way for me to achieve this SIMD test with my JSON.

edit: they missed the chance in calling the thing parsimony?

@Licenser
Copy link
Member

Absolutely both repurposing the ticket or opening a new one is fine :) and documenting that the need is now there us a really good way to improve simd-json

@amanjeev
Copy link
Author

#194 has been opened where this work needs to be done. Closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants