You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
0.3.0 was released today, and I was surprised by the zip file size:
File name
0.2.0
0.3.0
i686-pc-windows-msvc
1.74 MB
3.36 MB
x86_64-apple-darwin
1.7 MB
3.41 MB
x86_64-pc-windows-msvc
1.74 MB
3.36 MB
Clearly something has changed!
Note that these are compressed sizes reported on the GitHub releases page. The rest of this analysis will focus on uncompressed executable sizes. For reference, the size of the uncompressed x86_64 Windows exe is 4,410,880 bytes for 0.2.0, and 7,805,952 bytes for 0.3.0.
My first suspect was ureq, added in #11. This crate handles HTTP requests. I ran some high-level analysis with cargo-bloat and was able to verify that #11 increased the executable size by about 1.3 MB and added about 800 functions (after optimizations including inlining and dead code elimination). None of these new functions show up in the top-20 list provided by cargo-bloat, meaning they are all less than 23 KB.
The dependency tree for ureq points to some things that can likely be removed or replaced with a smaller alternative:
Number of dependencies is not a great proxy for contribution to executable size, but rustls, serde, and url looks like the heaviest transitive dependencies. We need TLS, but ureq can use either rustls or native-tls under the hood, so we might get a smaller exe just by switching? That's one theory to test.
On the other hand, serde_json is not necessary at all! I use it for conveniently parsing the JSON response from the GitHub API. But JSON parsing can be done with other crates like the aptly-named json, which has zero additional dependencies (beyond std): https://crates.io/crates/json/0.12.4/dependencies Switching JSON parsing to the json crate may or may not reduce overall exe size. That's another theory to test. (Aside: serde is used by other crates, so getting rid of all of it is highly unlikely.)
We need to handle URLs, so I guess the url crate is strictly necessary. For my particular use-case, I don't need to care about IDNA at all. Maybe there is something I could do to trick ureq into not calling into the url crate, and LTO will trim all the useless IDNA stuff. That's theory three.
But that doesn't account for all 3,395,072 bytes added between the two releases.
My second suspicion was something added when updating dependencies in #28. cargo-bloat pointed out an additional 1.8 MB from about 550 new functions added by this specific PR. Combined, that's roughly 3.1 MB added to the exe size from two PRs. That falls into the ballpark.
This time, there are two differences of note in the dependency tree. The first is rfd (which handles dialog windows).
Could windows-rs be responsible? It doesn't seem like it, since that crate is almost entirely a flat hierarchy of leaf functions. In my experience, these kinds of bloat issues tend to be caused by deeply nested functions with code paths that will never run: rust-console/cargo-n64@8cd9123
The next suspect was ... ureq again! A minor update added support for gzip, which includes a small tree of its own with `flate2:
gzip is totally optional (both in the HTTP spec, and as a feature flag supported by ureq) so we can just disable this and see how much it saves. That's the fourth (and final?) theory for where the bloat comes from.
Based on the above analysis, I have four things to check. I'm not planning to test these combinatorically, just sequentially. (For instance, step 3 is additive with steps 1 and 2; I don't intend to test what steps 1 and 3 look like while skipping step 2.) For each step, I keep the result that produces the smaller exe. Keep in mind, that may end up being exactly what we have in 0.3.0 if all four steps show that the status quo is optimal.
The basic plan is:
Replace rustls with native-tls.
Replace serde_json with json.
Find a way to trim IDNA from URL parsing.
Disable gzip support.
I'll start with a baseline, building the 0.3.0 tag as-is.
$ cargo build --release && ls -l target/release/cartunes.exe
[...snip...]
-rwxr-xr-x 2 jay 197609 7769088 Jan 24 20:19 target/release/cartunes.exe*
The file size is a little different than what's in the zip because I'm locally building with a recent nightly compiler (our CI uses the latest stable compiler) and I'm also using a different linker and build flags between my local machine and CI.
From this baseline, replace rustls with native-tls:
-rwxr-xr-x 2 jay 197609 7010304 Jan 24 21:36 target/release/cartunes.exe*
That satisfies my curiosity for whether native-tls produces smaller code than rustls. Yes, it does. About 750 KB worth of savings in my case.
Next, replace serde_json with json:
-rwxr-xr-x 2 jay 197609 7001088 Jan 24 21:40 target/release/cartunes.exe*
This theory only gave us a marginal win. About 9 KB. Well within the noise threshold, really. That result doesn't justify the additional code to maintain:
Abandoned patch
diff --git a/Cargo.toml b/Cargo.toml
index 50e062f..812ea78 100644
--- a/Cargo.toml+++ b/Cargo.toml@@ -25,6 +25,7 @@ epaint = { version = "0.16", default-features = false, features = ["single_threa
font-loader = "0.11"
human-sort = "0.2"
hotwatch = "0.4"
+json = "0.12"
kuchiki = "0.8"
log = "0.4"
native-tls = "0.2"
@@ -34,11 +35,10 @@ pollster = "0.2"
raw-window-handle = "0.4"
rfd = "0.6"
semver = "1.0"
-serde = { version = "1.0", features = ["derive"] }
thiserror = "1.0"
toml_edit = "0.13"
unicode-segmentation = "1.7"
-ureq = { version = "2.3", default-features = false, features = ["gzip", "json", "native-tls"] }+ureq = { version = "2.3", default-features = false, features = ["gzip", "native-tls"] }
walkdir = "2.3"
webbrowser = "0.5"
wgpu = "0.12"
diff --git a/src/updates.rs b/src/updates.rs
index 0c7fde2..65ed2a6 100644
--- a/src/updates.rs+++ b/src/updates.rs@@ -8,7 +8,6 @@ use crate::framework::UserEvent;
use crate::timer::Timer;
use log::error;
use semver::Version;
-use serde::Deserialize;
use std::any::Any;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::sync::Arc;
@@ -39,6 +38,22 @@ pub(crate) enum Error {
Persist(#[from] PersistError),
}
+/// All the ways in which parsing the JSON response body can fail.+#[derive(Debug, Error)]+pub(crate) enum JsonParseError {+ /// Invalid name.+ #[error("Invalid release name")]+ Name,++ /// Invalid body.+ #[error("Invalid release body")]+ Body,++ /// Invalid html_url.+ #[error("Invalid release URL")]+ Url,+}+
/// How often to check for updates.
#[derive(Debug, Copy, Clone, Eq, PartialEq)]
pub(crate) enum UpdateFrequency {
@@ -75,7 +90,7 @@ struct UpdateCheckerThread {
}
/// Parsed API response body.
-#[derive(Debug, Deserialize)]+#[derive(Debug)]
pub(crate) struct ReleaseBody {
name: String,
body: String,
@@ -264,7 +279,7 @@ impl UpdateCheckerThread {
};
// Parse the response
- let body: ReleaseBody = match res.into_json() {+ let body_str = match res.into_string() {
Ok(body) => body,
Err(error) => {
error!("HTTP response error: {:?}", error);
@@ -272,6 +287,22 @@ impl UpdateCheckerThread {
}
};
+ let body_json = match json::parse(&body_str) {+ Ok(body) => body,+ Err(error) => {+ error!("JSON parse error: {:?}", error);+ return self.duration;+ }+ };++ let body: ReleaseBody = match body_json.try_into() {+ Ok(body) => body,+ Err(error) => {+ error!("JSON parse error: {:?}", error);+ return self.duration;+ }+ };+
// Parse the version in the response
let version = match Version::parse(&body.name) {
Ok(version) => version,
@@ -317,3 +348,31 @@ impl UpdateCheckerThread {
}
}
}
++impl TryFrom<json::JsonValue> for ReleaseBody {+ type Error = JsonParseError;++ fn try_from(mut value: json::JsonValue) -> Result<Self, Self::Error> {+ let name = value+ .remove("name")+ .as_str()+ .ok_or(Self::Error::Name)?+ .to_string();+ let body = value+ .remove("body")+ .as_str()+ .ok_or(Self::Error::Body)?+ .to_string();+ let html_url = value+ .remove("html_url")+ .as_str()+ .ok_or(Self::Error::Url)?+ .to_string();++ Ok(Self {+ name,+ body,+ html_url,+ })+ }+}
For this reason, I will continue using serde_json.
Third, find a way to trim IDNA from URL parsing. There is a merged but as-yet unreleased PR for url that makes idna optional: servo/rust-url#728
Patching ureq locally to disable this, we get:
-rwxr-xr-x 2 jay 197609 6682112 Jan 24 21:45 target/release/cartunes.exe*
Saving 328,192 bytes over step 1. This is roughly in line with observations from servo/rust-url#727. Unfortunately, until url is released with this feature flag, I have to eat the >300 KB bloat.
The final theory I had was that gzip was unnecessarily bloating the executable. Since I'm not going to be using results from steps 2 or 3, this is the "last chance" for a big win. I still have about 2.3 MB completely unaccounted for in these savings figures. Let's disable gzip and hope for the best:
-rwxr-xr-x 2 jay 197609 6953984 Jan 24 21:55 target/release/cartunes.exe*
That saves 56,320 bytes over step 1. It's a modest win, but not nearly what I was looking for.
Now I need to continue digging for the cause of those extra megabytes. Using cargo-bloat's JSON output and a little bash-fu with jql points out two other potential crates to look at:
std added 340 functions since the 0.2.0 release
inplace_it added 322 functions since the 0.2.0 release
That's odd, why did std usage grow so much? Unfortunately, build-std didn't seem to help, plus I ran into issues with panic = "abort" when I tried: rust-lang/cargo#7359 The additional std usage is responsible for about 124 KB of executable size. This is most likely from sockets and threads, but that's just a guess.
0.3.0 was released today, and I was surprised by the zip file size:
0.2.0
0.3.0
i686-pc-windows-msvc
x86_64-apple-darwin
x86_64-pc-windows-msvc
Clearly something has changed!
Note that these are compressed sizes reported on the GitHub releases page. The rest of this analysis will focus on uncompressed executable sizes. For reference, the size of the uncompressed
x86_64
Windows exe is 4,410,880 bytes for 0.2.0, and 7,805,952 bytes for 0.3.0.My first suspect was
ureq
, added in #11. This crate handles HTTP requests. I ran some high-level analysis withcargo-bloat
and was able to verify that #11 increased the executable size by about 1.3 MB and added about 800 functions (after optimizations including inlining and dead code elimination). None of these new functions show up in the top-20 list provided bycargo-bloat
, meaning they are all less than 23 KB.The dependency tree for
ureq
points to some things that can likely be removed or replaced with a smaller alternative:ureq
v2.3.0Number of dependencies is not a great proxy for contribution to executable size, but
rustls
,serde
, andurl
looks like the heaviest transitive dependencies. We need TLS, butureq
can use eitherrustls
ornative-tls
under the hood, so we might get a smaller exe just by switching? That's one theory to test.On the other hand,
serde_json
is not necessary at all! I use it for conveniently parsing the JSON response from the GitHub API. But JSON parsing can be done with other crates like the aptly-namedjson
, which has zero additional dependencies (beyondstd
): https://crates.io/crates/json/0.12.4/dependencies Switching JSON parsing to thejson
crate may or may not reduce overall exe size. That's another theory to test. (Aside:serde
is used by other crates, so getting rid of all of it is highly unlikely.)We need to handle URLs, so I guess the
url
crate is strictly necessary. For my particular use-case, I don't need to care about IDNA at all. Maybe there is something I could do to trickureq
into not calling into theurl
crate, and LTO will trim all the useless IDNA stuff. That's theory three.But that doesn't account for all 3,395,072 bytes added between the two releases.
My second suspicion was something added when updating dependencies in #28.
cargo-bloat
pointed out an additional 1.8 MB from about 550 new functions added by this specific PR. Combined, that's roughly 3.1 MB added to the exe size from two PRs. That falls into the ballpark.This time, there are two differences of note in the dependency tree. The first is
rfd
(which handles dialog windows).Used in 0.2.0:
rfd
v0.5.1Used in 0.3.0:
rfd
v0.6.4Could
windows-rs
be responsible? It doesn't seem like it, since that crate is almost entirely a flat hierarchy of leaf functions. In my experience, these kinds of bloat issues tend to be caused by deeply nested functions with code paths that will never run: rust-console/cargo-n64@8cd9123The next suspect was ...
ureq
again! A minor update added support for gzip, which includes a small tree of its own with `flate2:ureq
v2.4.0gzip
is totally optional (both in the HTTP spec, and as a feature flag supported byureq
) so we can just disable this and see how much it saves. That's the fourth (and final?) theory for where the bloat comes from.Based on the above analysis, I have four things to check. I'm not planning to test these combinatorically, just sequentially. (For instance, step 3 is additive with steps 1 and 2; I don't intend to test what steps 1 and 3 look like while skipping step 2.) For each step, I keep the result that produces the smaller exe. Keep in mind, that may end up being exactly what we have in 0.3.0 if all four steps show that the status quo is optimal.
The basic plan is:
rustls
withnative-tls
.serde_json
withjson
.I'll start with a baseline, building the
0.3.0
tag as-is.The file size is a little different than what's in the zip because I'm locally building with a recent nightly compiler (our CI uses the latest stable compiler) and I'm also using a different linker and build flags between my local machine and CI.
From this baseline, replace
rustls
withnative-tls
:That satisfies my curiosity for whether
native-tls
produces smaller code thanrustls
. Yes, it does. About 750 KB worth of savings in my case.Next, replace
serde_json
withjson
:This theory only gave us a marginal win. About 9 KB. Well within the noise threshold, really. That result doesn't justify the additional code to maintain:
Abandoned patch
For this reason, I will continue using
serde_json
.Third, find a way to trim IDNA from URL parsing. There is a merged but as-yet unreleased PR for
url
that makesidna
optional: servo/rust-url#728Patching
ureq
locally to disable this, we get:Saving 328,192 bytes over step 1. This is roughly in line with observations from servo/rust-url#727. Unfortunately, until
url
is released with this feature flag, I have to eat the >300 KB bloat.The final theory I had was that gzip was unnecessarily bloating the executable. Since I'm not going to be using results from steps 2 or 3, this is the "last chance" for a big win. I still have about 2.3 MB completely unaccounted for in these savings figures. Let's disable gzip and hope for the best:
That saves 56,320 bytes over step 1. It's a modest win, but not nearly what I was looking for.
Now I need to continue digging for the cause of those extra megabytes. Using
cargo-bloat
's JSON output and a little bash-fu withjql
points out two other potential crates to look at:std
added 340 functions since the 0.2.0 releaseinplace_it
added 322 functions since the 0.2.0 releaseThat's odd, why did
std
usage grow so much? Unfortunately,build-std
didn't seem to help, plus I ran into issues withpanic = "abort"
when I tried: rust-lang/cargo#7359 The additionalstd
usage is responsible for about 124 KB of executable size. This is most likely from sockets and threads, but that's just a guess.I have no idea what
inplace_it
is:It's used by
wgpu
. 🤷 And it appears to contribute directly to about 32 KB of executable size.At this point, I still don't know what the last 2 MB is from. That's a crazy number to not stick out like a sore thumb.
The text was updated successfully, but these errors were encountered: