[Feature] DQN compatibility with compile #2571

Open
wants to merge 41 commits into
base: gh/vmoens/41/base

@vmoens (Contributor) commented Nov 15, 2024

[ghstack-poisoned]

pytorch-bot bot commented Nov 15, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2571

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 2 New Failures, 17 Unrelated Failures

As of commit 5245398 with merge base 7d7cd95:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed, but this was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed, but the failures were also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Nov 15, 2024
vmoens added a commit that referenced this pull request Nov 15, 2024
ghstack-source-id: 3d2b4d32e61eae7ef867057b4bcc4ba82d8118f7
Pull Request resolved: #2571
[ghstack-poisoned]

github-actions bot commented Dec 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}33$. Worsened: $\large\color{#d91a1a}6$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.4430s 0.4373s 2.2870 Ops/s 2.2456 Ops/s $\color{#35bf28}+1.84\%$
test_transformed 0.6275s 0.6235s 1.6038 Ops/s 1.5865 Ops/s $\color{#35bf28}+1.09\%$
test_serial 1.3845s 1.3752s 0.7272 Ops/s 0.7220 Ops/s $\color{#35bf28}+0.71\%$
test_parallel 1.3043s 1.2997s 0.7694 Ops/s 0.7540 Ops/s $\color{#35bf28}+2.04\%$
test_step_mdp_speed[True-True-True-True-True] 0.2168ms 29.6038μs 33.7795 KOps/s 33.4190 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[True-True-True-True-False] 45.2340μs 17.5854μs 56.8655 KOps/s 56.9625 KOps/s $\color{#d91a1a}-0.17\%$
test_step_mdp_speed[True-True-True-False-True] 59.8010μs 16.6733μs 59.9762 KOps/s 59.3334 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[True-True-True-False-False] 40.7960μs 9.8840μs 101.1734 KOps/s 101.5589 KOps/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[True-True-False-True-True] 88.1130μs 31.5993μs 31.6463 KOps/s 30.9196 KOps/s $\color{#35bf28}+2.35\%$
test_step_mdp_speed[True-True-False-True-False] 60.6820μs 19.5256μs 51.2147 KOps/s 50.8824 KOps/s $\color{#35bf28}+0.65\%$
test_step_mdp_speed[True-True-False-False-True] 54.6110μs 18.4718μs 54.1365 KOps/s 53.1607 KOps/s $\color{#35bf28}+1.84\%$
test_step_mdp_speed[True-True-False-False-False] 42.2680μs 11.6085μs 86.1436 KOps/s 85.3609 KOps/s $\color{#35bf28}+0.92\%$
test_step_mdp_speed[True-False-True-True-True] 74.1680μs 33.6585μs 29.7101 KOps/s 29.1461 KOps/s $\color{#35bf28}+1.94\%$
test_step_mdp_speed[True-False-True-True-False] 53.6700μs 21.1064μs 47.3790 KOps/s 46.9015 KOps/s $\color{#35bf28}+1.02\%$
test_step_mdp_speed[True-False-True-False-True] 57.5270μs 18.2872μs 54.6830 KOps/s 53.4541 KOps/s $\color{#35bf28}+2.30\%$
test_step_mdp_speed[True-False-True-False-False] 39.4940μs 11.5632μs 86.4812 KOps/s 86.0385 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[True-False-False-True-True] 92.7430μs 35.1269μs 28.4682 KOps/s 28.3374 KOps/s $\color{#35bf28}+0.46\%$
test_step_mdp_speed[True-False-False-True-False] 55.0620μs 22.8023μs 43.8551 KOps/s 43.5157 KOps/s $\color{#35bf28}+0.78\%$
test_step_mdp_speed[True-False-False-False-True] 57.1770μs 20.1695μs 49.5799 KOps/s 49.1637 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[True-False-False-False-False] 50.7740μs 13.2936μs 75.2242 KOps/s 74.8616 KOps/s $\color{#35bf28}+0.48\%$
test_step_mdp_speed[False-True-True-True-True] 0.1100ms 33.3419μs 29.9923 KOps/s 29.0048 KOps/s $\color{#35bf28}+3.40\%$
test_step_mdp_speed[False-True-True-True-False] 51.0640μs 21.2410μs 47.0787 KOps/s 46.2450 KOps/s $\color{#35bf28}+1.80\%$
test_step_mdp_speed[False-True-True-False-True] 66.7240μs 21.0710μs 47.4585 KOps/s 46.6524 KOps/s $\color{#35bf28}+1.73\%$
test_step_mdp_speed[False-True-True-False-False] 40.9560μs 12.9375μs 77.2945 KOps/s 77.8083 KOps/s $\color{#d91a1a}-0.66\%$
test_step_mdp_speed[False-True-False-True-True] 80.2190μs 34.9861μs 28.5828 KOps/s 28.3419 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[False-True-False-True-False] 56.2240μs 22.9275μs 43.6157 KOps/s 43.5309 KOps/s $\color{#35bf28}+0.19\%$
test_step_mdp_speed[False-True-False-False-True] 2.8883ms 22.5082μs 44.4282 KOps/s 43.4272 KOps/s $\color{#35bf28}+2.31\%$
test_step_mdp_speed[False-True-False-False-False] 38.6220μs 14.5723μs 68.6234 KOps/s 68.0223 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[False-False-True-True-True] 69.0980μs 36.5996μs 27.3227 KOps/s 26.2719 KOps/s $\color{#35bf28}+4.00\%$
test_step_mdp_speed[False-False-True-True-False] 68.1670μs 24.5988μs 40.6524 KOps/s 39.7556 KOps/s $\color{#35bf28}+2.26\%$
test_step_mdp_speed[False-False-True-False-True] 51.1250μs 22.4580μs 44.5276 KOps/s 43.9270 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[False-False-True-False-False] 41.9880μs 14.5368μs 68.7909 KOps/s 68.3462 KOps/s $\color{#35bf28}+0.65\%$
test_step_mdp_speed[False-False-False-True-True] 75.0390μs 38.0290μs 26.2957 KOps/s 25.3470 KOps/s $\color{#35bf28}+3.74\%$
test_step_mdp_speed[False-False-False-True-False] 59.5000μs 26.3377μs 37.9684 KOps/s 37.9539 KOps/s $\color{#35bf28}+0.04\%$
test_step_mdp_speed[False-False-False-False-True] 57.6670μs 23.8993μs 41.8423 KOps/s 41.0447 KOps/s $\color{#35bf28}+1.94\%$
test_step_mdp_speed[False-False-False-False-False] 39.9540μs 16.0779μs 62.1971 KOps/s 61.7301 KOps/s $\color{#35bf28}+0.76\%$
test_values[generalized_advantage_estimate-True-True] 12.3539ms 9.6157ms 103.9961 Ops/s 102.7479 Ops/s $\color{#35bf28}+1.21\%$
test_values[vec_generalized_advantage_estimate-True-True] 35.4525ms 33.5698ms 29.7887 Ops/s 28.0231 Ops/s $\textbf{\color{#35bf28}+6.30\%}$
test_values[td0_return_estimate-False-False] 0.2500ms 0.1951ms 5.1248 KOps/s 5.2743 KOps/s $\color{#d91a1a}-2.83\%$
test_values[td1_return_estimate-False-False] 27.1187ms 24.5673ms 40.7045 Ops/s 42.4020 Ops/s $\color{#d91a1a}-4.00\%$
test_values[vec_td1_return_estimate-False-False] 35.8240ms 33.7225ms 29.6538 Ops/s 27.7647 Ops/s $\textbf{\color{#35bf28}+6.80\%}$
test_values[td_lambda_return_estimate-True-False] 37.5328ms 35.0542ms 28.5273 Ops/s 29.6252 Ops/s $\color{#d91a1a}-3.71\%$
test_values[vec_td_lambda_return_estimate-True-False] 46.2665ms 34.2032ms 29.2370 Ops/s 28.0591 Ops/s $\color{#35bf28}+4.20\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 11.9112ms 8.2739ms 120.8624 Ops/s 121.3651 Ops/s $\color{#d91a1a}-0.41\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.0297ms 1.8994ms 526.4858 Ops/s 504.5151 Ops/s $\color{#35bf28}+4.35\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.6432ms 0.3678ms 2.7188 KOps/s 2.8098 KOps/s $\color{#d91a1a}-3.24\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 50.8100ms 45.9279ms 21.7733 Ops/s 20.2531 Ops/s $\textbf{\color{#35bf28}+7.51\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.8849ms 3.1178ms 320.7356 Ops/s 323.0843 Ops/s $\color{#d91a1a}-0.73\%$
test_dqn_speed[False-None] 6.2223ms 1.4340ms 697.3355 Ops/s 714.3151 Ops/s $\color{#d91a1a}-2.38\%$
test_dqn_speed[False-backward] 1.9686ms 1.9165ms 521.7869 Ops/s 538.6910 Ops/s $\color{#d91a1a}-3.14\%$
test_dqn_speed[True-None] 0.6141ms 0.4626ms 2.1615 KOps/s 2.1489 KOps/s $\color{#35bf28}+0.59\%$
test_dqn_speed[True-backward] 0.9591ms 0.9115ms 1.0970 KOps/s 1.1108 KOps/s $\color{#d91a1a}-1.24\%$
test_dqn_speed[reduce-overhead-None] 0.6747ms 0.4647ms 2.1518 KOps/s 2.1305 KOps/s $\color{#35bf28}+1.00\%$
test_dqn_speed[reduce-overhead-backward] 0.9792ms 0.9177ms 1.0897 KOps/s 1.1136 KOps/s $\color{#d91a1a}-2.15\%$
test_ddpg_speed[False-None] 3.6993ms 2.9272ms 341.6244 Ops/s 347.5716 Ops/s $\color{#d91a1a}-1.71\%$
test_ddpg_speed[False-backward] 4.1819ms 4.0353ms 247.8149 Ops/s 250.5479 Ops/s $\color{#d91a1a}-1.09\%$
test_ddpg_speed[True-None] 2.0567ms 1.0084ms 991.6845 Ops/s 998.2760 Ops/s $\color{#d91a1a}-0.66\%$
test_ddpg_speed[True-backward] 1.9367ms 1.8834ms 530.9607 Ops/s 515.8387 Ops/s $\color{#35bf28}+2.93\%$
test_ddpg_speed[reduce-overhead-None] 1.2776ms 0.9982ms 1.0018 KOps/s 998.6684 Ops/s $\color{#35bf28}+0.32\%$
test_ddpg_speed[reduce-overhead-backward] 2.4937ms 1.9372ms 516.2093 Ops/s 514.3183 Ops/s $\color{#35bf28}+0.37\%$
test_sac_speed[False-None] 9.2332ms 8.1704ms 122.3923 Ops/s 119.1083 Ops/s $\color{#35bf28}+2.76\%$
test_sac_speed[False-backward] 11.3866ms 10.9107ms 91.6529 Ops/s 90.1336 Ops/s $\color{#35bf28}+1.69\%$
test_sac_speed[True-None] 2.1277ms 1.8383ms 543.9830 Ops/s 535.2898 Ops/s $\color{#35bf28}+1.62\%$
test_sac_speed[True-backward] 3.6703ms 3.5447ms 282.1136 Ops/s 277.4054 Ops/s $\color{#35bf28}+1.70\%$
test_sac_speed[reduce-overhead-None] 2.1719ms 1.8416ms 543.0138 Ops/s 539.4662 Ops/s $\color{#35bf28}+0.66\%$
test_sac_speed[reduce-overhead-backward] 3.6021ms 3.5117ms 284.7617 Ops/s 271.3017 Ops/s $\color{#35bf28}+4.96\%$
test_redq_speed[False-None] 0.2361s 16.0566ms 62.2796 Ops/s 71.7884 Ops/s $\textbf{\color{#d91a1a}-13.25\%}$
test_redq_speed[False-backward] 23.3291ms 22.3895ms 44.6638 Ops/s 43.2739 Ops/s $\color{#35bf28}+3.21\%$
test_redq_speed[True-None] 5.2742ms 4.5251ms 220.9878 Ops/s 204.6947 Ops/s $\textbf{\color{#35bf28}+7.96\%}$
test_redq_speed[True-backward] 12.6001ms 12.0163ms 83.2201 Ops/s 76.9362 Ops/s $\textbf{\color{#35bf28}+8.17\%}$
test_redq_speed[reduce-overhead-None] 5.5500ms 4.7752ms 209.4152 Ops/s 182.5120 Ops/s $\textbf{\color{#35bf28}+14.74\%}$
test_redq_speed[reduce-overhead-backward] 13.5676ms 12.3105ms 81.2315 Ops/s 77.6452 Ops/s $\color{#35bf28}+4.62\%$
test_redq_deprec_speed[False-None] 14.5295ms 12.9345ms 77.3127 Ops/s 73.2156 Ops/s $\textbf{\color{#35bf28}+5.60\%}$
test_redq_deprec_speed[False-backward] 21.1034ms 18.6035ms 53.7533 Ops/s 50.1873 Ops/s $\textbf{\color{#35bf28}+7.11\%}$
test_redq_deprec_speed[True-None] 4.5988ms 3.6577ms 273.3965 Ops/s 266.2215 Ops/s $\color{#35bf28}+2.70\%$
test_redq_deprec_speed[True-backward] 8.3622ms 7.9867ms 125.2081 Ops/s 106.1319 Ops/s $\textbf{\color{#35bf28}+17.97\%}$
test_redq_deprec_speed[reduce-overhead-None] 4.3609ms 3.5644ms 280.5496 Ops/s 240.2511 Ops/s $\textbf{\color{#35bf28}+16.77\%}$
test_redq_deprec_speed[reduce-overhead-backward] 9.4407ms 8.0533ms 124.1729 Ops/s 105.7145 Ops/s $\textbf{\color{#35bf28}+17.46\%}$
test_td3_speed[False-None] 8.4540ms 8.0606ms 124.0596 Ops/s 117.7312 Ops/s $\textbf{\color{#35bf28}+5.38\%}$
test_td3_speed[False-backward] 10.6580ms 10.4386ms 95.7985 Ops/s 89.9592 Ops/s $\textbf{\color{#35bf28}+6.49\%}$
test_td3_speed[True-None] 1.8982ms 1.7084ms 585.3530 Ops/s 561.4223 Ops/s $\color{#35bf28}+4.26\%$
test_td3_speed[True-backward] 3.3833ms 3.3102ms 302.0961 Ops/s 285.5901 Ops/s $\textbf{\color{#35bf28}+5.78\%}$
test_td3_speed[reduce-overhead-None] 1.9334ms 1.7020ms 587.5604 Ops/s 560.0219 Ops/s $\color{#35bf28}+4.92\%$
test_td3_speed[reduce-overhead-backward] 3.5044ms 3.3983ms 294.2645 Ops/s 283.2067 Ops/s $\color{#35bf28}+3.90\%$
test_cql_speed[False-None] 40.3555ms 36.5778ms 27.3390 Ops/s 26.0955 Ops/s $\color{#35bf28}+4.76\%$
test_cql_speed[False-backward] 48.2095ms 46.5093ms 21.5011 Ops/s 20.8680 Ops/s $\color{#35bf28}+3.03\%$
test_cql_speed[True-None] 16.9814ms 15.9977ms 62.5090 Ops/s 62.1562 Ops/s $\color{#35bf28}+0.57\%$
test_cql_speed[True-backward] 23.7576ms 22.8380ms 43.7867 Ops/s 43.7065 Ops/s $\color{#35bf28}+0.18\%$
test_cql_speed[reduce-overhead-None] 17.0644ms 15.5926ms 64.1331 Ops/s 62.7079 Ops/s $\color{#35bf28}+2.27\%$
test_cql_speed[reduce-overhead-backward] 23.9537ms 22.3244ms 44.7939 Ops/s 42.9562 Ops/s $\color{#35bf28}+4.28\%$
test_a2c_speed[False-None] 8.6752ms 7.1790ms 139.2944 Ops/s 132.3854 Ops/s $\textbf{\color{#35bf28}+5.22\%}$
test_a2c_speed[False-backward] 15.4408ms 14.5748ms 68.6118 Ops/s 67.8965 Ops/s $\color{#35bf28}+1.05\%$
test_a2c_speed[True-None] 5.6299ms 4.2037ms 237.8857 Ops/s 221.4804 Ops/s $\textbf{\color{#35bf28}+7.41\%}$
test_a2c_speed[True-backward] 11.8959ms 11.3967ms 87.7447 Ops/s 87.7654 Ops/s $\color{#d91a1a}-0.02\%$
test_a2c_speed[reduce-overhead-None] 5.8742ms 4.3749ms 228.5785 Ops/s 218.2081 Ops/s $\color{#35bf28}+4.75\%$
test_a2c_speed[reduce-overhead-backward] 11.6318ms 11.0307ms 90.6559 Ops/s 88.2868 Ops/s $\color{#35bf28}+2.68\%$
test_ppo_speed[False-None] 8.5399ms 7.4648ms 133.9628 Ops/s 127.7556 Ops/s $\color{#35bf28}+4.86\%$
test_ppo_speed[False-backward] 15.7289ms 14.9259ms 66.9977 Ops/s 64.7933 Ops/s $\color{#35bf28}+3.40\%$
test_ppo_speed[True-None] 4.4382ms 3.7373ms 267.5732 Ops/s 262.6545 Ops/s $\color{#35bf28}+1.87\%$
test_ppo_speed[True-backward] 10.6812ms 9.7800ms 102.2496 Ops/s 97.5458 Ops/s $\color{#35bf28}+4.82\%$
test_ppo_speed[reduce-overhead-None] 4.6837ms 3.7311ms 268.0156 Ops/s 250.9733 Ops/s $\textbf{\color{#35bf28}+6.79\%}$
test_ppo_speed[reduce-overhead-backward] 10.4627ms 9.7031ms 103.0594 Ops/s 97.1219 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_reinforce_speed[False-None] 7.4701ms 6.7640ms 147.8425 Ops/s 143.7353 Ops/s $\color{#35bf28}+2.86\%$
test_reinforce_speed[False-backward] 10.5673ms 10.0875ms 99.1325 Ops/s 95.2817 Ops/s $\color{#35bf28}+4.04\%$
test_reinforce_speed[True-None] 4.0625ms 2.6446ms 378.1267 Ops/s 351.2102 Ops/s $\textbf{\color{#35bf28}+7.66\%}$
test_reinforce_speed[True-backward] 9.3885ms 8.7101ms 114.8092 Ops/s 104.4437 Ops/s $\textbf{\color{#35bf28}+9.92\%}$
test_reinforce_speed[reduce-overhead-None] 2.9420ms 2.6382ms 379.0501 Ops/s 352.0362 Ops/s $\textbf{\color{#35bf28}+7.67\%}$
test_reinforce_speed[reduce-overhead-backward] 9.4454ms 8.8852ms 112.5463 Ops/s 105.8189 Ops/s $\textbf{\color{#35bf28}+6.36\%}$
test_iql_speed[False-None] 33.1813ms 32.1550ms 31.0993 Ops/s 29.7539 Ops/s $\color{#35bf28}+4.52\%$
test_iql_speed[False-backward] 46.5300ms 45.0688ms 22.1883 Ops/s 20.9693 Ops/s $\textbf{\color{#35bf28}+5.81\%}$
test_iql_speed[True-None] 11.4727ms 10.9476ms 91.3442 Ops/s 87.5401 Ops/s $\color{#35bf28}+4.35\%$
test_iql_speed[True-backward] 23.3975ms 21.8649ms 45.7354 Ops/s 44.4142 Ops/s $\color{#35bf28}+2.97\%$
test_iql_speed[reduce-overhead-None] 12.0949ms 11.0556ms 90.4518 Ops/s 88.4991 Ops/s $\color{#35bf28}+2.21\%$
test_iql_speed[reduce-overhead-backward] 22.2461ms 21.3671ms 46.8009 Ops/s 43.8283 Ops/s $\textbf{\color{#35bf28}+6.78\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.3471ms 4.9742ms 201.0393 Ops/s 192.0292 Ops/s $\color{#35bf28}+4.69\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1842ms 0.5172ms 1.9336 KOps/s 1.9448 KOps/s $\color{#d91a1a}-0.58\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8433ms 0.4911ms 2.0364 KOps/s 2.0178 KOps/s $\color{#35bf28}+0.92\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.6469ms 4.7674ms 209.7583 Ops/s 192.9673 Ops/s $\textbf{\color{#35bf28}+8.70\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.3540s 0.7669ms 1.3040 KOps/s 1.9595 KOps/s $\textbf{\color{#d91a1a}-33.45\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8723ms 0.4768ms 2.0971 KOps/s 2.0609 KOps/s $\color{#35bf28}+1.76\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.4592ms 1.6449ms 607.9235 Ops/s 601.3071 Ops/s $\color{#35bf28}+1.10\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.1128ms 1.5820ms 632.1010 Ops/s 620.0360 Ops/s $\color{#35bf28}+1.95\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.8549ms 4.8088ms 207.9506 Ops/s 184.3013 Ops/s $\textbf{\color{#35bf28}+12.83\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0939ms 0.6440ms 1.5527 KOps/s 1.4635 KOps/s $\textbf{\color{#35bf28}+6.09\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9196ms 0.6211ms 1.6100 KOps/s 1.5479 KOps/s $\color{#35bf28}+4.02\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.9779ms 4.7329ms 211.2876 Ops/s 197.5699 Ops/s $\textbf{\color{#35bf28}+6.94\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9304ms 0.5182ms 1.9299 KOps/s 1.8503 KOps/s $\color{#35bf28}+4.30\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8567ms 0.5007ms 1.9973 KOps/s 1.9998 KOps/s $\color{#d91a1a}-0.12\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.5371ms 4.8920ms 204.4168 Ops/s 197.1713 Ops/s $\color{#35bf28}+3.67\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8231ms 0.5113ms 1.9559 KOps/s 1.8933 KOps/s $\color{#35bf28}+3.31\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7850ms 0.4989ms 2.0043 KOps/s 2.0228 KOps/s $\color{#d91a1a}-0.91\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.1943ms 4.9938ms 200.2475 Ops/s 187.6227 Ops/s $\textbf{\color{#35bf28}+6.73\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.2606ms 0.6682ms 1.4966 KOps/s 1.4817 KOps/s $\color{#35bf28}+1.00\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9167ms 0.6345ms 1.5760 KOps/s 1.5776 KOps/s $\color{#d91a1a}-0.10\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4265s 12.7211ms 78.6093 Ops/s 241.3768 Ops/s $\textbf{\color{#d91a1a}-67.43\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.3918ms 2.3406ms 427.2407 Ops/s 440.2653 Ops/s $\color{#d91a1a}-2.96\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.3062ms 1.3787ms 725.3115 Ops/s 791.3374 Ops/s $\textbf{\color{#d91a1a}-8.34\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.8234ms 4.3469ms 230.0468 Ops/s 248.1994 Ops/s $\textbf{\color{#d91a1a}-7.31\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 7.7058ms 2.3555ms 424.5294 Ops/s 386.9022 Ops/s $\textbf{\color{#35bf28}+9.73\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 4.6686ms 1.3219ms 756.4936 Ops/s 741.7564 Ops/s $\color{#35bf28}+1.99\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4314s 13.0843ms 76.4276 Ops/s 32.7094 Ops/s $\textbf{\color{#35bf28}+133.66\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 11.5828ms 2.5147ms 397.6649 Ops/s 434.4869 Ops/s $\textbf{\color{#d91a1a}-8.47\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.1483ms 1.4271ms 700.7137 Ops/s 644.5104 Ops/s $\textbf{\color{#35bf28}+8.72\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 18.6097ms 11.3733ms 87.9252 Ops/s 82.9516 Ops/s $\textbf{\color{#35bf28}+6.00\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.0717ms 15.3780ms 65.0282 Ops/s 64.8512 Ops/s $\color{#35bf28}+0.27\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.3698ms 19.8092ms 50.4815 Ops/s 48.9131 Ops/s $\color{#35bf28}+3.21\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.2034ms 15.5693ms 64.2290 Ops/s 65.1643 Ops/s $\color{#d91a1a}-1.44\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.5693ms 19.8638ms 50.3428 Ops/s 48.8117 Ops/s $\color{#35bf28}+3.14\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.7702ms 16.8438ms 59.3692 Ops/s 59.5751 Ops/s $\color{#d91a1a}-0.35\%$
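
For readers checking the table above by hand, here is a minimal sketch of how the Change column appears to be derived. It assumes Change is the relative difference between the Ops column and the Ops on Repo HEAD column, which matches the rows spot-checked here. The 5% cutoff used for the Improved/Worsened classification is inferred from which rows are bolded in the table, not taken from the benchmark tooling, and the helper names `relative_change` and `classify` are hypothetical.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput of this PR relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the bolded rows."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "neutral"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table, all in Ops/s.
rows = {
    "test_simple": (2.2870, 2.2456),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (78.6093, 241.3768),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (76.4276, 32.7094),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Rows that mix units (for example KOps/s against Ops/s) would need converting to a common unit before applying the same formula.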


github-actions bot commented Dec 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}25$. Worsened: $\large\color{#d91a1a}10$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.7545s 0.7473s 1.3381 Ops/s 1.3412 Ops/s $\color{#d91a1a}-0.23\%$
test_transformed 1.0014s 1.0002s 0.9998 Ops/s 0.9982 Ops/s $\color{#35bf28}+0.16\%$
test_serial 2.1277s 2.1267s 0.4702 Ops/s 0.4682 Ops/s $\color{#35bf28}+0.43\%$
test_parallel 2.0126s 1.9977s 0.5006 Ops/s 0.5051 Ops/s $\color{#d91a1a}-0.89\%$
test_step_mdp_speed[True-True-True-True-True] 0.1638ms 38.9984μs 25.6421 KOps/s 26.0615 KOps/s $\color{#d91a1a}-1.61\%$
test_step_mdp_speed[True-True-True-True-False] 0.2171ms 22.9351μs 43.6013 KOps/s 44.3107 KOps/s $\color{#d91a1a}-1.60\%$
test_step_mdp_speed[True-True-True-False-True] 50.2120μs 22.4710μs 44.5019 KOps/s 46.5785 KOps/s $\color{#d91a1a}-4.46\%$
test_step_mdp_speed[True-True-True-False-False] 47.6220μs 12.9779μs 77.0541 KOps/s 78.4476 KOps/s $\color{#d91a1a}-1.78\%$
test_step_mdp_speed[True-True-False-True-True] 75.7530μs 41.8730μs 23.8818 KOps/s 23.9639 KOps/s $\color{#d91a1a}-0.34\%$
test_step_mdp_speed[True-True-False-True-False] 91.2340μs 25.2530μs 39.5993 KOps/s 40.0114 KOps/s $\color{#d91a1a}-1.03\%$
test_step_mdp_speed[True-True-False-False-True] 0.2130ms 24.4523μs 40.8959 KOps/s 42.2588 KOps/s $\color{#d91a1a}-3.23\%$
test_step_mdp_speed[True-True-False-False-False] 76.1630μs 15.1787μs 65.8818 KOps/s 66.7695 KOps/s $\color{#d91a1a}-1.33\%$
test_step_mdp_speed[True-False-True-True-True] 0.2390ms 45.2140μs 22.1171 KOps/s 22.5654 KOps/s $\color{#d91a1a}-1.99\%$
test_step_mdp_speed[True-False-True-True-False] 0.1413ms 27.4322μs 36.4536 KOps/s 36.9694 KOps/s $\color{#d91a1a}-1.40\%$
test_step_mdp_speed[True-False-True-False-True] 97.3350μs 24.2936μs 41.1631 KOps/s 41.8172 KOps/s $\color{#d91a1a}-1.56\%$
test_step_mdp_speed[True-False-True-False-False] 45.5620μs 15.1163μs 66.1539 KOps/s 67.6031 KOps/s $\color{#d91a1a}-2.14\%$
test_step_mdp_speed[True-False-False-True-True] 81.8630μs 47.3318μs 21.1274 KOps/s 21.9222 KOps/s $\color{#d91a1a}-3.63\%$
test_step_mdp_speed[True-False-False-True-False] 61.8030μs 29.4113μs 34.0005 KOps/s 34.4497 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[True-False-False-False-True] 88.8640μs 26.8502μs 37.2437 KOps/s 38.5808 KOps/s $\color{#d91a1a}-3.47\%$
test_step_mdp_speed[True-False-False-False-False] 81.1040μs 17.2707μs 57.9015 KOps/s 59.0057 KOps/s $\color{#d91a1a}-1.87\%$
test_step_mdp_speed[False-True-True-True-True] 72.2530μs 45.1303μs 22.1580 KOps/s 22.6793 KOps/s $\color{#d91a1a}-2.30\%$
test_step_mdp_speed[False-True-True-True-False] 0.1002ms 27.2688μs 36.6720 KOps/s 36.9798 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[False-True-True-False-True] 67.8340μs 28.1524μs 35.5210 KOps/s 36.1208 KOps/s $\color{#d91a1a}-1.66\%$
test_step_mdp_speed[False-True-True-False-False] 56.5320μs 16.8743μs 59.2618 KOps/s 61.6668 KOps/s $\color{#d91a1a}-3.90\%$
test_step_mdp_speed[False-True-False-True-True] 77.2840μs 46.2971μs 21.5996 KOps/s 21.4804 KOps/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[False-True-False-True-False] 62.3930μs 29.5189μs 33.8766 KOps/s 34.4261 KOps/s $\color{#d91a1a}-1.60\%$
test_step_mdp_speed[False-True-False-False-True] 3.2147ms 30.3987μs 32.8962 KOps/s 33.7644 KOps/s $\color{#d91a1a}-2.57\%$
test_step_mdp_speed[False-True-False-False-False] 0.1064ms 19.2875μs 51.8470 KOps/s 54.6689 KOps/s $\textbf{\color{#d91a1a}-5.16\%}$
test_step_mdp_speed[False-False-True-True-True] 97.3650μs 49.3682μs 20.2560 KOps/s 20.8826 KOps/s $\color{#d91a1a}-3.00\%$
test_step_mdp_speed[False-False-True-True-False] 0.2384ms 31.8119μs 31.4348 KOps/s 31.8123 KOps/s $\color{#d91a1a}-1.19\%$
test_step_mdp_speed[False-False-True-False-True] 94.1240μs 30.0113μs 33.3208 KOps/s 33.3028 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[False-False-True-False-False] 0.1432ms 19.2190μs 52.0319 KOps/s 54.3768 KOps/s $\color{#d91a1a}-4.31\%$
test_step_mdp_speed[False-False-False-True-True] 80.4340μs 50.7390μs 19.7087 KOps/s 20.0387 KOps/s $\color{#d91a1a}-1.65\%$
test_step_mdp_speed[False-False-False-True-False] 67.1930μs 34.0466μs 29.3715 KOps/s 30.1697 KOps/s $\color{#d91a1a}-2.65\%$
test_step_mdp_speed[False-False-False-False-True] 58.9420μs 31.7657μs 31.4805 KOps/s 31.6966 KOps/s $\color{#d91a1a}-0.68\%$
test_step_mdp_speed[False-False-False-False-False] 0.2085ms 20.8388μs 47.9875 KOps/s 49.2955 KOps/s $\color{#d91a1a}-2.65\%$
test_values[generalized_advantage_estimate-True-True] 25.3260ms 24.4590ms 40.8848 Ops/s 41.0310 Ops/s $\color{#d91a1a}-0.36\%$
test_values[vec_generalized_advantage_estimate-True-True] 97.8402ms 2.8487ms 351.0389 Ops/s 322.7811 Ops/s $\textbf{\color{#35bf28}+8.75\%}$
test_values[td0_return_estimate-False-False] 0.1129ms 79.3295μs 12.6057 KOps/s 12.5993 KOps/s $\color{#35bf28}+0.05\%$
test_values[td1_return_estimate-False-False] 54.5204ms 54.1909ms 18.4533 Ops/s 18.4234 Ops/s $\color{#35bf28}+0.16\%$
test_values[vec_td1_return_estimate-False-False] 1.3779ms 1.0770ms 928.4905 Ops/s 930.4743 Ops/s $\color{#d91a1a}-0.21\%$
test_values[td_lambda_return_estimate-True-False] 86.4458ms 86.0959ms 11.6150 Ops/s 11.5964 Ops/s $\color{#35bf28}+0.16\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.5181ms 1.0778ms 927.8008 Ops/s 930.6151 Ops/s $\color{#d91a1a}-0.30\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.6839ms 24.3407ms 41.0835 Ops/s 41.3766 Ops/s $\color{#d91a1a}-0.71\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0559ms 0.7424ms 1.3470 KOps/s 1.3358 KOps/s $\color{#35bf28}+0.84\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7990ms 0.6597ms 1.5159 KOps/s 1.5056 KOps/s $\color{#35bf28}+0.69\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6455ms 1.4741ms 678.3998 Ops/s 675.3520 Ops/s $\color{#35bf28}+0.45\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8275ms 0.6740ms 1.4836 KOps/s 1.4647 KOps/s $\color{#35bf28}+1.29\%$
test_dqn_speed[False-None] 7.0490ms 1.5082ms 663.0246 Ops/s 671.7257 Ops/s $\color{#d91a1a}-1.30\%$
test_dqn_speed[False-backward] 2.8430ms 2.1065ms 474.7183 Ops/s 474.9814 Ops/s $\color{#d91a1a}-0.06\%$
test_dqn_speed[True-None] 0.6994ms 0.5340ms 1.8726 KOps/s 1.8620 KOps/s $\color{#35bf28}+0.57\%$
test_dqn_speed[True-backward] 1.3281ms 1.1842ms 844.4322 Ops/s 828.7974 Ops/s $\color{#35bf28}+1.89\%$
test_dqn_speed[reduce-overhead-None] 0.7487ms 0.5524ms 1.8104 KOps/s 1.7157 KOps/s $\textbf{\color{#35bf28}+5.52\%}$
test_dqn_speed[reduce-overhead-backward] 1.1399ms 1.0528ms 949.8132 Ops/s 932.4933 Ops/s $\color{#35bf28}+1.86\%$
test_ddpg_speed[False-None] 3.1452ms 2.8221ms 354.3503 Ops/s 349.2218 Ops/s $\color{#35bf28}+1.47\%$
test_ddpg_speed[False-backward] 4.5741ms 4.1403ms 241.5271 Ops/s 240.3507 Ops/s $\color{#35bf28}+0.49\%$
test_ddpg_speed[True-None] 1.2999ms 1.0967ms 911.8660 Ops/s 890.7113 Ops/s $\color{#35bf28}+2.38\%$
test_ddpg_speed[True-backward] 2.4126ms 2.2549ms 443.4805 Ops/s 434.7419 Ops/s $\color{#35bf28}+2.01\%$
test_ddpg_speed[reduce-overhead-None] 1.4423ms 1.1212ms 891.9046 Ops/s 917.7089 Ops/s $\color{#d91a1a}-2.81\%$
test_ddpg_speed[reduce-overhead-backward] 2.0090ms 1.7402ms 574.6581 Ops/s 561.8859 Ops/s $\color{#35bf28}+2.27\%$
test_sac_speed[False-None] 8.5476ms 7.9657ms 125.5375 Ops/s 125.6320 Ops/s $\color{#d91a1a}-0.08\%$
test_sac_speed[False-backward] 11.4693ms 10.9737ms 91.1273 Ops/s 90.7214 Ops/s $\color{#35bf28}+0.45\%$
test_sac_speed[True-None] 1.7658ms 1.5203ms 657.7556 Ops/s 650.1322 Ops/s $\color{#35bf28}+1.17\%$
test_sac_speed[True-backward] 3.3591ms 3.2001ms 312.4915 Ops/s 310.7809 Ops/s $\color{#35bf28}+0.55\%$
test_sac_speed[reduce-overhead-None] 22.9407ms 12.7141ms 78.6531 Ops/s 78.4324 Ops/s $\color{#35bf28}+0.28\%$
test_sac_speed[reduce-overhead-backward] 1.4450ms 1.3208ms 757.1044 Ops/s 750.1743 Ops/s $\color{#35bf28}+0.92\%$
test_redq_speed[False-None] 8.2228ms 7.4288ms 134.6106 Ops/s 133.2521 Ops/s $\color{#35bf28}+1.02\%$
test_redq_speed[False-backward] 11.7239ms 11.0863ms 90.2018 Ops/s 89.5695 Ops/s $\color{#35bf28}+0.71\%$
test_redq_speed[True-None] 2.1468ms 1.9740ms 506.5942 Ops/s 501.2243 Ops/s $\color{#35bf28}+1.07\%$
test_redq_speed[True-backward] 3.9140ms 3.6308ms 275.4207 Ops/s 259.5181 Ops/s $\textbf{\color{#35bf28}+6.13\%}$
test_redq_speed[reduce-overhead-None] 2.3407ms 1.9958ms 501.0586 Ops/s 503.0953 Ops/s $\color{#d91a1a}-0.40\%$
test_redq_speed[reduce-overhead-backward] 3.8198ms 3.6351ms 275.0934 Ops/s 259.4489 Ops/s $\textbf{\color{#35bf28}+6.03\%}$
test_redq_deprec_speed[False-None] 9.5228ms 8.9228ms 112.0721 Ops/s 111.4147 Ops/s $\color{#35bf28}+0.59\%$
test_redq_deprec_speed[False-backward] 12.4235ms 11.7820ms 84.8752 Ops/s 82.9984 Ops/s $\color{#35bf28}+2.26\%$
test_redq_deprec_speed[True-None] 2.7574ms 2.3411ms 427.1532 Ops/s 409.8217 Ops/s $\color{#35bf28}+4.23\%$
test_redq_deprec_speed[True-backward] 4.1117ms 3.9645ms 252.2383 Ops/s 251.2186 Ops/s $\color{#35bf28}+0.41\%$
test_redq_deprec_speed[reduce-overhead-None] 2.5937ms 2.3150ms 431.9642 Ops/s 414.7241 Ops/s $\color{#35bf28}+4.16\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.2465ms 3.9445ms 253.5195 Ops/s 251.5891 Ops/s $\color{#35bf28}+0.77\%$
test_td3_speed[False-None] 8.0558ms 7.7988ms 128.2253 Ops/s 120.1291 Ops/s $\textbf{\color{#35bf28}+6.74\%}$
test_td3_speed[False-backward] 10.4787ms 9.9550ms 100.4518 Ops/s 97.2761 Ops/s $\color{#35bf28}+3.26\%$
test_td3_speed[True-None] 1.6294ms 1.5652ms 638.8910 Ops/s 644.4257 Ops/s $\color{#d91a1a}-0.86\%$
test_td3_speed[True-backward] 3.2457ms 3.0760ms 325.1008 Ops/s 320.8061 Ops/s $\color{#35bf28}+1.34\%$
test_td3_speed[reduce-overhead-None] 50.2582ms 25.7672ms 38.8090 Ops/s 36.5964 Ops/s $\textbf{\color{#35bf28}+6.05\%}$
test_td3_speed[reduce-overhead-backward] 1.3326ms 1.2631ms 791.7261 Ops/s 771.0539 Ops/s $\color{#35bf28}+2.68\%$
test_cql_speed[False-None] 17.2604ms 16.5362ms 60.4732 Ops/s 59.6099 Ops/s $\color{#35bf28}+1.45\%$
test_cql_speed[False-backward] 22.4267ms 21.5157ms 46.4777 Ops/s 46.3152 Ops/s $\color{#35bf28}+0.35\%$
test_cql_speed[True-None] 3.1997ms 2.9208ms 342.3683 Ops/s 339.4887 Ops/s $\color{#35bf28}+0.85\%$
test_cql_speed[True-backward] 5.4467ms 5.0228ms 199.0938 Ops/s 189.6585 Ops/s $\color{#35bf28}+4.97\%$
test_cql_speed[reduce-overhead-None] 22.0707ms 13.2019ms 75.7467 Ops/s 75.1557 Ops/s $\color{#35bf28}+0.79\%$
test_cql_speed[reduce-overhead-backward] 1.6168ms 1.4877ms 672.1585 Ops/s 599.1869 Ops/s $\textbf{\color{#35bf28}+12.18\%}$
test_a2c_speed[False-None] 3.4183ms 3.1508ms 317.3804 Ops/s 314.4321 Ops/s $\color{#35bf28}+0.94\%$
test_a2c_speed[False-backward] 6.6175ms 5.9308ms 168.6116 Ops/s 159.7476 Ops/s $\textbf{\color{#35bf28}+5.55\%}$
test_a2c_speed[True-None] 1.1905ms 0.9939ms 1.0061 KOps/s 992.3137 Ops/s $\color{#35bf28}+1.39\%$
test_a2c_speed[True-backward] 2.8734ms 2.5758ms 388.2325 Ops/s 382.0803 Ops/s $\color{#35bf28}+1.61\%$
test_a2c_speed[reduce-overhead-None] 21.5286ms 11.6512ms 85.8280 Ops/s 86.0437 Ops/s $\color{#d91a1a}-0.25\%$
test_a2c_speed[reduce-overhead-backward] 1.1341ms 0.9597ms 1.0420 KOps/s 884.5994 Ops/s $\textbf{\color{#35bf28}+17.79\%}$
test_ppo_speed[False-None] 3.9117ms 3.6326ms 275.2834 Ops/s 275.1829 Ops/s $\color{#35bf28}+0.04\%$
test_ppo_speed[False-backward] 7.0685ms 6.6630ms 150.0828 Ops/s 144.7064 Ops/s $\color{#35bf28}+3.72\%$
test_ppo_speed[True-None] 1.1081ms 0.9481ms 1.0547 KOps/s 1.0384 KOps/s $\color{#35bf28}+1.57\%$
test_ppo_speed[True-backward] 2.6379ms 2.5050ms 399.2015 Ops/s 367.8101 Ops/s $\textbf{\color{#35bf28}+8.53\%}$
test_ppo_speed[reduce-overhead-None] 0.7500ms 0.5045ms 1.9820 KOps/s 1.8335 KOps/s $\textbf{\color{#35bf28}+8.10\%}$
test_ppo_speed[reduce-overhead-backward] 1.5204ms 1.1118ms 899.4734 Ops/s 887.3658 Ops/s $\color{#35bf28}+1.36\%$
test_reinforce_speed[False-None] 2.7601ms 2.3447ms 426.4910 Ops/s 445.6189 Ops/s $\color{#d91a1a}-4.29\%$
test_reinforce_speed[False-backward] 3.9350ms 3.3825ms 295.6433 Ops/s 301.0021 Ops/s $\color{#d91a1a}-1.78\%$
test_reinforce_speed[True-None] 1.3371ms 0.8708ms 1.1484 KOps/s 1.2032 KOps/s $\color{#d91a1a}-4.56\%$
test_reinforce_speed[True-backward] 2.5383ms 2.3760ms 420.8745 Ops/s 388.3101 Ops/s $\textbf{\color{#35bf28}+8.39\%}$
test_reinforce_speed[reduce-overhead-None] 21.9314ms 11.6229ms 86.0367 Ops/s 87.4099 Ops/s $\color{#d91a1a}-1.57\%$
test_reinforce_speed[reduce-overhead-backward] 1.0752ms 1.0212ms 979.2838 Ops/s 846.9812 Ops/s $\textbf{\color{#35bf28}+15.62\%}$
test_iql_speed[False-None] 9.6263ms 9.1303ms 109.5253 Ops/s 109.7271 Ops/s $\color{#d91a1a}-0.18\%$
test_iql_speed[False-backward] 13.5300ms 12.6989ms 78.7471 Ops/s 76.8456 Ops/s $\color{#35bf28}+2.47\%$
test_iql_speed[True-None] 2.0329ms 1.7398ms 574.7876 Ops/s 575.3268 Ops/s $\color{#d91a1a}-0.09\%$
test_iql_speed[True-backward] 4.5131ms 4.1834ms 239.0426 Ops/s 224.0594 Ops/s $\textbf{\color{#35bf28}+6.69\%}$
test_iql_speed[reduce-overhead-None] 21.2884ms 11.7918ms 84.8048 Ops/s 86.8807 Ops/s $\color{#d91a1a}-2.39\%$
test_iql_speed[reduce-overhead-backward] 1.9955ms 1.5862ms 630.4401 Ops/s 713.7234 Ops/s $\textbf{\color{#d91a1a}-11.67\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.8324ms 6.4029ms 156.1787 Ops/s 153.5472 Ops/s $\color{#35bf28}+1.71\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6214ms 0.3258ms 3.0697 KOps/s 2.9021 KOps/s $\textbf{\color{#35bf28}+5.78\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6190ms 0.3037ms 3.2931 KOps/s 3.0582 KOps/s $\textbf{\color{#35bf28}+7.68\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6145ms 6.1400ms 162.8672 Ops/s 160.4302 Ops/s $\color{#35bf28}+1.52\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.3425ms 0.2957ms 3.3818 KOps/s 3.8441 KOps/s $\textbf{\color{#d91a1a}-12.03\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5493ms 0.2841ms 3.5198 KOps/s 3.1987 KOps/s $\textbf{\color{#35bf28}+10.04\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.8071ms 1.3623ms 734.0544 Ops/s 686.7430 Ops/s $\textbf{\color{#35bf28}+6.89\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5808ms 1.1862ms 843.0003 Ops/s 794.4225 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6218ms 6.3018ms 158.6856 Ops/s 155.9338 Ops/s $\color{#35bf28}+1.76\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.9785ms 0.4181ms 2.3919 KOps/s 2.0608 KOps/s $\textbf{\color{#35bf28}+16.07\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8471ms 0.3971ms 2.5180 KOps/s 2.1125 KOps/s $\textbf{\color{#35bf28}+19.20\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.5919ms 6.1732ms 161.9900 Ops/s 159.7558 Ops/s $\color{#35bf28}+1.40\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8253ms 0.3419ms 2.9246 KOps/s 3.2434 KOps/s $\textbf{\color{#d91a1a}-9.83\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7114ms 0.3055ms 3.2731 KOps/s 3.4275 KOps/s $\color{#d91a1a}-4.51\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6909ms 6.1410ms 162.8404 Ops/s 160.6272 Ops/s $\color{#35bf28}+1.38\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.6548ms 0.3530ms 2.8330 KOps/s 3.3918 KOps/s $\textbf{\color{#d91a1a}-16.47\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5776ms 0.3338ms 2.9961 KOps/s 3.6152 KOps/s $\textbf{\color{#d91a1a}-17.12\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.7009ms 6.3211ms 158.2000 Ops/s 156.1455 Ops/s $\color{#35bf28}+1.32\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1842ms 0.4721ms 2.1183 KOps/s 2.2451 KOps/s $\textbf{\color{#d91a1a}-5.65\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8744ms 0.4428ms 2.2586 KOps/s 2.3636 KOps/s $\color{#d91a1a}-4.44\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9369ms 5.1993ms 192.3349 Ops/s 192.5948 Ops/s $\color{#d91a1a}-0.13\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 11.0550ms 1.9456ms 513.9807 Ops/s 436.8520 Ops/s $\textbf{\color{#35bf28}+17.66\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.0270ms 1.2222ms 818.2025 Ops/s 849.1843 Ops/s $\color{#d91a1a}-3.65\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5128s 15.4200ms 64.8507 Ops/s 191.0531 Ops/s $\textbf{\color{#d91a1a}-66.06\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 11.4681ms 2.0683ms 483.4884 Ops/s 425.1356 Ops/s $\textbf{\color{#35bf28}+13.73\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.1391ms 1.1515ms 868.4156 Ops/s 869.9385 Ops/s $\color{#d91a1a}-0.18\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 9.0351ms 5.5116ms 181.4353 Ops/s 32.8087 Ops/s $\textbf{\color{#35bf28}+453.01\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 6.9313ms 2.1930ms 455.9878 Ops/s 529.4166 Ops/s $\textbf{\color{#d91a1a}-13.87\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 9.1194ms 1.4621ms 683.9656 Ops/s 837.1296 Ops/s $\textbf{\color{#d91a1a}-18.30\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.8966ms 13.2722ms 75.3457 Ops/s 73.7222 Ops/s $\color{#35bf28}+2.20\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.0746ms 17.8104ms 56.1470 Ops/s 58.7583 Ops/s $\color{#d91a1a}-4.44\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.0572ms 17.5637ms 56.9355 Ops/s 54.1100 Ops/s $\textbf{\color{#35bf28}+5.22\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.1075ms 18.0263ms 55.4746 Ops/s 57.4858 Ops/s $\color{#d91a1a}-3.50\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.1566ms 17.4704ms 57.2398 Ops/s 55.2055 Ops/s $\color{#35bf28}+3.68\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.0708ms 19.2432ms 51.9663 Ops/s 53.4462 Ops/s $\color{#d91a1a}-2.77\%$
