[Feature] LLMHashingEnv #2635

Merged
merged 3 commits into from
Dec 12, 2024

[ghstack-poisoned]

pytorch-bot bot commented Dec 6, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2635

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures, 19 Unrelated Failures

As of commit d1e0fd7 with merge base 19dfefc:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions bot commented Dec 6, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 10. Worsened: 13.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.4293s 0.4274s 2.3397 Ops/s 2.1464 Ops/s $\textbf{\color{#35bf28}+9.01\%}$
test_transformed 0.6107s 0.6069s 1.6478 Ops/s 1.5615 Ops/s $\textbf{\color{#35bf28}+5.53\%}$
test_serial 1.3698s 1.3534s 0.7389 Ops/s 0.7205 Ops/s $\color{#35bf28}+2.55\%$
test_parallel 1.3975s 1.3176s 0.7590 Ops/s 0.7529 Ops/s $\color{#35bf28}+0.80\%$
test_step_mdp_speed[True-True-True-True-True] 0.2453ms 29.4041μs 34.0089 KOps/s 34.3182 KOps/s $\color{#d91a1a}-0.90\%$
test_step_mdp_speed[True-True-True-True-False] 58.1790μs 17.8700μs 55.9597 KOps/s 58.8413 KOps/s $\color{#d91a1a}-4.90\%$
test_step_mdp_speed[True-True-True-False-True] 58.4390μs 16.9482μs 59.0033 KOps/s 61.2901 KOps/s $\color{#d91a1a}-3.73\%$
test_step_mdp_speed[True-True-True-False-False] 62.6270μs 10.0255μs 99.7456 KOps/s 103.7469 KOps/s $\color{#d91a1a}-3.86\%$
test_step_mdp_speed[True-True-False-True-True] 98.1820μs 31.7596μs 31.4866 KOps/s 32.2813 KOps/s $\color{#d91a1a}-2.46\%$
test_step_mdp_speed[True-True-False-True-False] 70.2310μs 19.5610μs 51.1221 KOps/s 53.1363 KOps/s $\color{#d91a1a}-3.79\%$
test_step_mdp_speed[True-True-False-False-True] 86.4510μs 18.6026μs 53.7560 KOps/s 54.8865 KOps/s $\color{#d91a1a}-2.06\%$
test_step_mdp_speed[True-True-False-False-False] 47.0380μs 11.9215μs 83.8819 KOps/s 91.1197 KOps/s $\textbf{\color{#d91a1a}-7.94\%}$
test_step_mdp_speed[True-False-True-True-True] 96.5910μs 33.3807μs 29.9575 KOps/s 30.8543 KOps/s $\color{#d91a1a}-2.91\%$
test_step_mdp_speed[True-False-True-True-False] 74.8300μs 21.4437μs 46.6338 KOps/s 48.6703 KOps/s $\color{#d91a1a}-4.18\%$
test_step_mdp_speed[True-False-True-False-True] 55.1330μs 18.4313μs 54.2556 KOps/s 53.8318 KOps/s $\color{#35bf28}+0.79\%$
test_step_mdp_speed[True-False-True-False-False] 53.2790μs 11.7397μs 85.1811 KOps/s 88.2466 KOps/s $\color{#d91a1a}-3.47\%$
test_step_mdp_speed[True-False-False-True-True] 88.3750μs 35.3066μs 28.3233 KOps/s 29.1503 KOps/s $\color{#d91a1a}-2.84\%$
test_step_mdp_speed[True-False-False-True-False] 74.2090μs 23.0994μs 43.2912 KOps/s 45.2342 KOps/s $\color{#d91a1a}-4.30\%$
test_step_mdp_speed[True-False-False-False-True] 54.3210μs 20.3424μs 49.1584 KOps/s 51.1366 KOps/s $\color{#d91a1a}-3.87\%$
test_step_mdp_speed[True-False-False-False-False] 60.0020μs 13.3688μs 74.8008 KOps/s 77.1184 KOps/s $\color{#d91a1a}-3.01\%$
test_step_mdp_speed[False-True-True-True-True] 84.0470μs 33.5220μs 29.8312 KOps/s 30.6529 KOps/s $\color{#d91a1a}-2.68\%$
test_step_mdp_speed[False-True-True-True-False] 59.9920μs 21.3802μs 46.7721 KOps/s 48.3851 KOps/s $\color{#d91a1a}-3.33\%$
test_step_mdp_speed[False-True-True-False-True] 74.0080μs 20.9799μs 47.6646 KOps/s 46.0557 KOps/s $\color{#35bf28}+3.49\%$
test_step_mdp_speed[False-True-True-False-False] 50.2740μs 13.1456μs 76.0709 KOps/s 78.3876 KOps/s $\color{#d91a1a}-2.96\%$
test_step_mdp_speed[False-True-False-True-True] 96.6100μs 35.3776μs 28.2665 KOps/s 28.9257 KOps/s $\color{#d91a1a}-2.28\%$
test_step_mdp_speed[False-True-False-True-False] 63.9800μs 23.1921μs 43.1181 KOps/s 44.5868 KOps/s $\color{#d91a1a}-3.29\%$
test_step_mdp_speed[False-True-False-False-True] 2.7554ms 22.8782μs 43.7097 KOps/s 44.0822 KOps/s $\color{#d91a1a}-0.84\%$
test_step_mdp_speed[False-True-False-False-False] 46.9580μs 14.7948μs 67.5912 KOps/s 69.3249 KOps/s $\color{#d91a1a}-2.50\%$
test_step_mdp_speed[False-False-True-True-True] 98.0330μs 36.8379μs 27.1459 KOps/s 27.4641 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[False-False-True-True-False] 92.7030μs 24.9146μs 40.1371 KOps/s 41.3142 KOps/s $\color{#d91a1a}-2.85\%$
test_step_mdp_speed[False-False-True-False-True] 66.7640μs 22.7675μs 43.9223 KOps/s 45.0401 KOps/s $\color{#d91a1a}-2.48\%$
test_step_mdp_speed[False-False-True-False-False] 67.2650μs 14.9357μs 66.9536 KOps/s 70.6513 KOps/s $\textbf{\color{#d91a1a}-5.23\%}$
test_step_mdp_speed[False-False-False-True-True] 90.8990μs 38.3890μs 26.0491 KOps/s 26.9909 KOps/s $\color{#d91a1a}-3.49\%$
test_step_mdp_speed[False-False-False-True-False] 75.4610μs 26.3901μs 37.8930 KOps/s 39.4317 KOps/s $\color{#d91a1a}-3.90\%$
test_step_mdp_speed[False-False-False-False-True] 0.2507ms 24.3885μs 41.0030 KOps/s 42.7605 KOps/s $\color{#d91a1a}-4.11\%$
test_step_mdp_speed[False-False-False-False-False] 73.3970μs 16.3081μs 61.3194 KOps/s 63.4853 KOps/s $\color{#d91a1a}-3.41\%$
test_values[generalized_advantage_estimate-True-True] 12.9392ms 9.8733ms 101.2831 Ops/s 102.9156 Ops/s $\color{#d91a1a}-1.59\%$
test_values[vec_generalized_advantage_estimate-True-True] 40.2488ms 37.8010ms 26.4543 Ops/s 29.6622 Ops/s $\textbf{\color{#d91a1a}-10.81\%}$
test_values[td0_return_estimate-False-False] 0.2481ms 0.1906ms 5.2465 KOps/s 5.1856 KOps/s $\color{#35bf28}+1.17\%$
test_values[td1_return_estimate-False-False] 39.2774ms 24.8909ms 40.1753 Ops/s 40.8272 Ops/s $\color{#d91a1a}-1.60\%$
test_values[vec_td1_return_estimate-False-False] 53.6423ms 39.5712ms 25.2709 Ops/s 29.4554 Ops/s $\textbf{\color{#d91a1a}-14.21\%}$
test_values[td_lambda_return_estimate-True-False] 39.3021ms 35.3148ms 28.3167 Ops/s 28.3007 Ops/s $\color{#35bf28}+0.06\%$
test_values[vec_td_lambda_return_estimate-True-False] 41.6244ms 39.0140ms 25.6318 Ops/s 29.0593 Ops/s $\textbf{\color{#d91a1a}-11.79\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.3302ms 8.2630ms 121.0215 Ops/s 120.6337 Ops/s $\color{#35bf28}+0.32\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.6042ms 2.0223ms 494.4941 Ops/s 481.0939 Ops/s $\color{#35bf28}+2.79\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4568ms 0.3634ms 2.7517 KOps/s 2.6335 KOps/s $\color{#35bf28}+4.49\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 53.2154ms 46.7767ms 21.3782 Ops/s 24.7610 Ops/s $\textbf{\color{#d91a1a}-13.66\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.1284ms 3.0955ms 323.0458 Ops/s 327.8036 Ops/s $\color{#d91a1a}-1.45\%$
test_dqn_speed[False-None] 5.4823ms 1.4038ms 712.3466 Ops/s 705.7367 Ops/s $\color{#35bf28}+0.94\%$
test_dqn_speed[False-backward] 1.9185ms 1.8766ms 532.8913 Ops/s 527.5060 Ops/s $\color{#35bf28}+1.02\%$
test_dqn_speed[True-None] 0.6267ms 0.4711ms 2.1225 KOps/s 2.1305 KOps/s $\color{#d91a1a}-0.37\%$
test_dqn_speed[True-backward] 0.9850ms 0.9045ms 1.1056 KOps/s 1.0816 KOps/s $\color{#35bf28}+2.22\%$
test_dqn_speed[reduce-overhead-None] 0.6520ms 0.4690ms 2.1321 KOps/s 2.1212 KOps/s $\color{#35bf28}+0.51\%$
test_dqn_speed[reduce-overhead-backward] 0.9673ms 0.9061ms 1.1036 KOps/s 1.1167 KOps/s $\color{#d91a1a}-1.17\%$
test_ddpg_speed[False-None] 3.2035ms 2.8812ms 347.0793 Ops/s 344.1536 Ops/s $\color{#35bf28}+0.85\%$
test_ddpg_speed[False-backward] 5.0634ms 4.0908ms 244.4499 Ops/s 247.0286 Ops/s $\color{#d91a1a}-1.04\%$
test_ddpg_speed[True-None] 2.4003ms 1.0088ms 991.2820 Ops/s 997.6309 Ops/s $\color{#d91a1a}-0.64\%$
test_ddpg_speed[True-backward] 2.0744ms 1.9388ms 515.7714 Ops/s 510.1453 Ops/s $\color{#35bf28}+1.10\%$
test_ddpg_speed[reduce-overhead-None] 1.2530ms 1.0078ms 992.2593 Ops/s 990.7509 Ops/s $\color{#35bf28}+0.15\%$
test_ddpg_speed[reduce-overhead-backward] 2.5429ms 1.9880ms 503.0225 Ops/s 515.0125 Ops/s $\color{#d91a1a}-2.33\%$
test_sac_speed[False-None] 8.9290ms 8.1228ms 123.1103 Ops/s 121.8574 Ops/s $\color{#35bf28}+1.03\%$
test_sac_speed[False-backward] 11.3565ms 10.8713ms 91.9852 Ops/s 91.7298 Ops/s $\color{#35bf28}+0.28\%$
test_sac_speed[True-None] 2.2878ms 1.8502ms 540.4878 Ops/s 541.5834 Ops/s $\color{#d91a1a}-0.20\%$
test_sac_speed[True-backward] 3.6881ms 3.5825ms 279.1316 Ops/s 282.3620 Ops/s $\color{#d91a1a}-1.14\%$
test_sac_speed[reduce-overhead-None] 2.2992ms 1.8630ms 536.7653 Ops/s 543.3067 Ops/s $\color{#d91a1a}-1.20\%$
test_sac_speed[reduce-overhead-backward] 4.0900ms 3.6639ms 272.9367 Ops/s 277.4908 Ops/s $\color{#d91a1a}-1.64\%$
test_redq_speed[False-None] 24.2282ms 15.5803ms 64.1837 Ops/s 77.2455 Ops/s $\textbf{\color{#d91a1a}-16.91\%}$
test_redq_speed[False-backward] 24.4693ms 23.0568ms 43.3712 Ops/s 43.9762 Ops/s $\color{#d91a1a}-1.38\%$
test_redq_speed[True-None] 6.3015ms 5.3061ms 188.4613 Ops/s 183.0689 Ops/s $\color{#35bf28}+2.95\%$
test_redq_speed[True-backward] 13.7124ms 12.6952ms 78.7700 Ops/s 78.4830 Ops/s $\color{#35bf28}+0.37\%$
test_redq_speed[reduce-overhead-None] 5.4326ms 4.8705ms 205.3170 Ops/s 192.0714 Ops/s $\textbf{\color{#35bf28}+6.90\%}$
test_redq_speed[reduce-overhead-backward] 12.6991ms 12.3260ms 81.1296 Ops/s 75.1531 Ops/s $\textbf{\color{#35bf28}+7.95\%}$
test_redq_deprec_speed[False-None] 15.5235ms 13.0771ms 76.4697 Ops/s 69.3450 Ops/s $\textbf{\color{#35bf28}+10.27\%}$
test_redq_deprec_speed[False-backward] 19.6575ms 18.8329ms 53.0986 Ops/s 49.0934 Ops/s $\textbf{\color{#35bf28}+8.16\%}$
test_redq_deprec_speed[True-None] 4.4083ms 3.8683ms 258.5111 Ops/s 238.8370 Ops/s $\textbf{\color{#35bf28}+8.24\%}$
test_redq_deprec_speed[True-backward] 9.2375ms 8.6010ms 116.2649 Ops/s 122.7380 Ops/s $\textbf{\color{#d91a1a}-5.27\%}$
test_redq_deprec_speed[reduce-overhead-None] 5.9599ms 3.6793ms 271.7907 Ops/s 265.2500 Ops/s $\color{#35bf28}+2.47\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.0998ms 8.8952ms 112.4202 Ops/s 119.5666 Ops/s $\textbf{\color{#d91a1a}-5.98\%}$
test_td3_speed[False-None] 34.4181ms 8.3788ms 119.3493 Ops/s 121.8291 Ops/s $\color{#d91a1a}-2.04\%$
test_td3_speed[False-backward] 17.8921ms 11.3685ms 87.9623 Ops/s 94.1671 Ops/s $\textbf{\color{#d91a1a}-6.59\%}$
test_td3_speed[True-None] 2.0258ms 1.7219ms 580.7606 Ops/s 577.3963 Ops/s $\color{#35bf28}+0.58\%$
test_td3_speed[True-backward] 3.4023ms 3.3231ms 300.9195 Ops/s 301.5034 Ops/s $\color{#d91a1a}-0.19\%$
test_td3_speed[reduce-overhead-None] 1.9857ms 1.6949ms 589.9897 Ops/s 579.3735 Ops/s $\color{#35bf28}+1.83\%$
test_td3_speed[reduce-overhead-backward] 3.4281ms 3.3472ms 298.7547 Ops/s 296.1242 Ops/s $\color{#35bf28}+0.89\%$
test_cql_speed[False-None] 37.9218ms 35.9809ms 27.7925 Ops/s 27.6248 Ops/s $\color{#35bf28}+0.61\%$
test_cql_speed[False-backward] 51.4422ms 46.9764ms 21.2873 Ops/s 21.6003 Ops/s $\color{#d91a1a}-1.45\%$
test_cql_speed[True-None] 16.8362ms 15.5911ms 64.1392 Ops/s 64.1623 Ops/s $\color{#d91a1a}-0.04\%$
test_cql_speed[True-backward] 23.7425ms 22.5477ms 44.3504 Ops/s 44.0563 Ops/s $\color{#35bf28}+0.67\%$
test_cql_speed[reduce-overhead-None] 28.0417ms 16.0146ms 62.4430 Ops/s 63.7679 Ops/s $\color{#d91a1a}-2.08\%$
test_cql_speed[reduce-overhead-backward] 23.8513ms 22.5359ms 44.3736 Ops/s 44.6564 Ops/s $\color{#d91a1a}-0.63\%$
test_a2c_speed[False-None] 9.3570ms 7.2169ms 138.5644 Ops/s 135.3468 Ops/s $\color{#35bf28}+2.38\%$
test_a2c_speed[False-backward] 15.2821ms 14.4616ms 69.1487 Ops/s 64.3953 Ops/s $\textbf{\color{#35bf28}+7.38\%}$
test_a2c_speed[True-None] 4.6803ms 4.2555ms 234.9909 Ops/s 237.5377 Ops/s $\color{#d91a1a}-1.07\%$
test_a2c_speed[True-backward] 11.4865ms 10.8574ms 92.1027 Ops/s 93.0499 Ops/s $\color{#d91a1a}-1.02\%$
test_a2c_speed[reduce-overhead-None] 4.7616ms 4.2102ms 237.5159 Ops/s 237.4584 Ops/s $\color{#35bf28}+0.02\%$
test_a2c_speed[reduce-overhead-backward] 11.7626ms 10.6402ms 93.9835 Ops/s 93.0494 Ops/s $\color{#35bf28}+1.00\%$
test_ppo_speed[False-None] 9.9389ms 7.4858ms 133.5855 Ops/s 131.3729 Ops/s $\color{#35bf28}+1.68\%$
test_ppo_speed[False-backward] 16.8204ms 15.1164ms 66.1533 Ops/s 65.8487 Ops/s $\color{#35bf28}+0.46\%$
test_ppo_speed[True-None] 4.5655ms 3.7583ms 266.0756 Ops/s 264.2443 Ops/s $\color{#35bf28}+0.69\%$
test_ppo_speed[True-backward] 11.6471ms 9.9130ms 100.8775 Ops/s 103.4676 Ops/s $\color{#d91a1a}-2.50\%$
test_ppo_speed[reduce-overhead-None] 4.1145ms 3.7225ms 268.6399 Ops/s 270.6626 Ops/s $\color{#d91a1a}-0.75\%$
test_ppo_speed[reduce-overhead-backward] 10.8096ms 10.0354ms 99.6468 Ops/s 103.3832 Ops/s $\color{#d91a1a}-3.61\%$
test_reinforce_speed[False-None] 9.0297ms 6.7563ms 148.0100 Ops/s 151.2859 Ops/s $\color{#d91a1a}-2.17\%$
test_reinforce_speed[False-backward] 11.4885ms 10.3923ms 96.2252 Ops/s 100.8023 Ops/s $\color{#d91a1a}-4.54\%$
test_reinforce_speed[True-None] 3.7114ms 2.6872ms 372.1358 Ops/s 378.1403 Ops/s $\color{#d91a1a}-1.59\%$
test_reinforce_speed[True-backward] 9.6919ms 9.0355ms 110.6742 Ops/s 117.5246 Ops/s $\textbf{\color{#d91a1a}-5.83\%}$
test_reinforce_speed[reduce-overhead-None] 3.4917ms 2.6739ms 373.9806 Ops/s 373.8124 Ops/s $\color{#35bf28}+0.05\%$
test_reinforce_speed[reduce-overhead-backward] 9.8998ms 8.9660ms 111.5330 Ops/s 116.9823 Ops/s $\color{#d91a1a}-4.66\%$
test_iql_speed[False-None] 34.1182ms 32.8660ms 30.4266 Ops/s 30.9148 Ops/s $\color{#d91a1a}-1.58\%$
test_iql_speed[False-backward] 48.9133ms 46.3509ms 21.5746 Ops/s 22.1002 Ops/s $\color{#d91a1a}-2.38\%$
test_iql_speed[True-None] 11.6785ms 10.9150ms 91.6174 Ops/s 89.9641 Ops/s $\color{#35bf28}+1.84\%$
test_iql_speed[True-backward] 22.4272ms 21.7299ms 46.0194 Ops/s 44.2391 Ops/s $\color{#35bf28}+4.02\%$
test_iql_speed[reduce-overhead-None] 12.5557ms 10.8655ms 92.0343 Ops/s 91.1535 Ops/s $\color{#35bf28}+0.97\%$
test_iql_speed[reduce-overhead-backward] 24.3155ms 22.3786ms 44.6855 Ops/s 46.4231 Ops/s $\color{#d91a1a}-3.74\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.5210ms 5.0749ms 197.0488 Ops/s 200.0007 Ops/s $\color{#d91a1a}-1.48\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.2375ms 0.5267ms 1.8985 KOps/s 1.9442 KOps/s $\color{#d91a1a}-2.35\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7023ms 0.4976ms 2.0096 KOps/s 2.0303 KOps/s $\color{#d91a1a}-1.02\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.5253ms 5.0163ms 199.3483 Ops/s 210.7910 Ops/s $\textbf{\color{#d91a1a}-5.43\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1658ms 0.5110ms 1.9569 KOps/s 2.0022 KOps/s $\color{#d91a1a}-2.26\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7143ms 0.4931ms 2.0281 KOps/s 2.0857 KOps/s $\color{#d91a1a}-2.76\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.2524ms 1.6441ms 608.2390 Ops/s 585.1100 Ops/s $\color{#35bf28}+3.95\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.0929ms 1.5938ms 627.4131 Ops/s 626.8419 Ops/s $\color{#35bf28}+0.09\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.2059ms 5.1294ms 194.9536 Ops/s 203.1650 Ops/s $\color{#d91a1a}-4.04\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.0095ms 0.6499ms 1.5387 KOps/s 1.5217 KOps/s $\color{#35bf28}+1.11\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9004ms 0.6227ms 1.6060 KOps/s 1.4854 KOps/s $\textbf{\color{#35bf28}+8.12\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.5604ms 4.9400ms 202.4295 Ops/s 205.6632 Ops/s $\color{#d91a1a}-1.57\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.5480ms 0.5311ms 1.8828 KOps/s 1.9026 KOps/s $\color{#d91a1a}-1.04\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7293ms 0.5033ms 1.9868 KOps/s 1.9740 KOps/s $\color{#35bf28}+0.65\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.2834ms 4.9692ms 201.2396 Ops/s 202.1235 Ops/s $\color{#d91a1a}-0.44\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1402ms 0.5147ms 1.9430 KOps/s 1.9302 KOps/s $\color{#35bf28}+0.66\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7204ms 0.4979ms 2.0082 KOps/s 2.0003 KOps/s $\color{#35bf28}+0.40\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 8.0755ms 5.3451ms 187.0860 Ops/s 191.0675 Ops/s $\color{#d91a1a}-2.08\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9190ms 0.6676ms 1.4980 KOps/s 1.4832 KOps/s $\color{#35bf28}+0.99\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 8.1034ms 0.6487ms 1.5416 KOps/s 1.5772 KOps/s $\color{#d91a1a}-2.26\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.9185ms 4.3711ms 228.7743 Ops/s 228.7906 Ops/s $-0.01\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.1792ms 2.3890ms 418.5824 Ops/s 431.6551 Ops/s $\color{#d91a1a}-3.03\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 5.9139ms 1.3432ms 744.5168 Ops/s 723.2314 Ops/s $\color{#35bf28}+2.94\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.9968ms 4.2852ms 233.3631 Ops/s 233.1737 Ops/s $\color{#35bf28}+0.08\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.4616s 11.5159ms 86.8367 Ops/s 431.7381 Ops/s $\textbf{\color{#d91a1a}-79.89\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 4.7894ms 1.3502ms 740.6119 Ops/s 719.9707 Ops/s $\color{#35bf28}+2.87\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.2198ms 4.4671ms 223.8566 Ops/s 226.6534 Ops/s $\color{#d91a1a}-1.23\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 11.2092ms 2.5072ms 398.8443 Ops/s 374.9257 Ops/s $\textbf{\color{#35bf28}+6.38\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.1063ms 1.3874ms 720.7532 Ops/s 699.9554 Ops/s $\color{#35bf28}+2.97\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.8413ms 11.4305ms 87.4849 Ops/s 86.5020 Ops/s $\color{#35bf28}+1.14\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.8610ms 14.8312ms 67.4255 Ops/s 67.6686 Ops/s $\color{#d91a1a}-0.36\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.6059ms 20.0956ms 49.7623 Ops/s 49.9562 Ops/s $\color{#d91a1a}-0.39\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 15.2252ms 14.7921ms 67.6035 Ops/s 67.2833 Ops/s $\color{#35bf28}+0.48\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.2730ms 20.0429ms 49.8931 Ops/s 50.1252 Ops/s $\color{#d91a1a}-0.46\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 16.0053ms 15.7316ms 63.5664 Ops/s 63.0868 Ops/s $\color{#35bf28}+0.76\%$
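For readers checking the numbers, the columns in the CPU table above appear to be related as follows: `Ops` looks like the reciprocal of `Mean`, and `Change` looks like the relative difference between `Ops` and `Ops on Repo HEAD`. The sketch below reproduces two rows under that assumption. The function names are hypothetical, the formulas are inferred from the rows rather than taken from the benchmark workflow, and the 5% cutoff for bold entries is a guess based on which rows are bold.

```python
# Sketch only: these formulas are inferred from the table rows, not from the
# benchmark workflow's source. All function names here are hypothetical.


def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime given in seconds."""
    return 1.0 / mean_seconds


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR's Ops relative to Ops on Repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def is_flagged(change_pct: float, threshold_pct: float = 5.0) -> bool:
    """Assumed rule: a row counts as improved/worsened at or beyond 5%."""
    return abs(change_pct) >= threshold_pct


# test_simple: Mean 0.4274s, Ops 2.3397, Ops on Repo HEAD 2.1464
print(f"{ops_per_second(0.4274):.4f} Ops/s")        # 2.3397 Ops/s
print(f"{relative_change(2.3397, 2.1464):+.2f}%")   # +9.01%

# test_rb_populate[...LazyMemmapStorage-SamplerWithoutReplacement-400]
print(f"{relative_change(86.8367, 431.7381):+.2f}%")  # -79.89%
```

Because the table values are rounded, recomputed percentages can differ from the reported ones in the last digit. Under the assumed 5% cutoff, the bold rows in the CPU table match the summary counts (10 improved, 13 worsened).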

github-actions bot commented Dec 6, 2024

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: 22. Worsened: 8.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.7595s 0.7535s 1.3271 Ops/s 1.3216 Ops/s $\color{#35bf28}+0.42\%$
test_transformed 1.1059s 1.0219s 0.9786 Ops/s 1.0149 Ops/s $\color{#d91a1a}-3.58\%$
test_serial 2.2524s 2.1732s 0.4601 Ops/s 0.4739 Ops/s $\color{#d91a1a}-2.90\%$
test_parallel 2.0494s 1.9906s 0.5024 Ops/s 0.5107 Ops/s $\color{#d91a1a}-1.63\%$
test_step_mdp_speed[True-True-True-True-True] 0.1364ms 39.2174μs 25.4989 KOps/s 24.9495 KOps/s $\color{#35bf28}+2.20\%$
test_step_mdp_speed[True-True-True-True-False] 51.9920μs 22.8960μs 43.6757 KOps/s 43.8333 KOps/s $\color{#d91a1a}-0.36\%$
test_step_mdp_speed[True-True-True-False-True] 56.1330μs 22.0217μs 45.4098 KOps/s 45.8127 KOps/s $\color{#d91a1a}-0.88\%$
test_step_mdp_speed[True-True-True-False-False] 39.9820μs 12.9745μs 77.0742 KOps/s 77.3029 KOps/s $\color{#d91a1a}-0.30\%$
test_step_mdp_speed[True-True-False-True-True] 76.2740μs 42.4424μs 23.5613 KOps/s 23.5292 KOps/s $\color{#35bf28}+0.14\%$
test_step_mdp_speed[True-True-False-True-False] 57.9320μs 24.8521μs 40.2381 KOps/s 40.5276 KOps/s $\color{#d91a1a}-0.71\%$
test_step_mdp_speed[True-True-False-False-True] 53.5920μs 24.4538μs 40.8935 KOps/s 40.5268 KOps/s $\color{#35bf28}+0.90\%$
test_step_mdp_speed[True-True-False-False-False] 44.6220μs 14.9418μs 66.9263 KOps/s 66.7366 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[True-False-True-True-True] 78.8340μs 44.1289μs 22.6609 KOps/s 22.6940 KOps/s $\color{#d91a1a}-0.15\%$
test_step_mdp_speed[True-False-True-True-False] 59.1630μs 27.3375μs 36.5797 KOps/s 36.5515 KOps/s $\color{#35bf28}+0.08\%$
test_step_mdp_speed[True-False-True-False-True] 51.4320μs 23.9687μs 41.7212 KOps/s 41.6511 KOps/s $\color{#35bf28}+0.17\%$
test_step_mdp_speed[True-False-True-False-False] 40.8720μs 14.8302μs 67.4298 KOps/s 66.7825 KOps/s $\color{#35bf28}+0.97\%$
test_step_mdp_speed[True-False-False-True-True] 89.3940μs 46.5420μs 21.4860 KOps/s 21.2748 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[True-False-False-True-False] 0.1843ms 29.1577μs 34.2962 KOps/s 34.1053 KOps/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[True-False-False-False-True] 0.2027ms 25.8105μs 38.7439 KOps/s 38.0674 KOps/s $\color{#35bf28}+1.78\%$
test_step_mdp_speed[True-False-False-False-False] 44.3020μs 16.8686μs 59.2816 KOps/s 58.5139 KOps/s $\color{#35bf28}+1.31\%$
test_step_mdp_speed[False-True-True-True-True] 78.6640μs 44.5038μs 22.4700 KOps/s 22.0686 KOps/s $\color{#35bf28}+1.82\%$
test_step_mdp_speed[False-True-True-True-False] 54.0330μs 27.1357μs 36.8518 KOps/s 37.0012 KOps/s $\color{#d91a1a}-0.40\%$
test_step_mdp_speed[False-True-True-False-True] 63.2530μs 28.5321μs 35.0483 KOps/s 35.5513 KOps/s $\color{#d91a1a}-1.42\%$
test_step_mdp_speed[False-True-True-False-False] 44.7520μs 16.5694μs 60.3520 KOps/s 60.7593 KOps/s $\color{#d91a1a}-0.67\%$
test_step_mdp_speed[False-True-False-True-True] 88.9940μs 47.2030μs 21.1851 KOps/s 21.6909 KOps/s $\color{#d91a1a}-2.33\%$
test_step_mdp_speed[False-True-False-True-False] 57.5420μs 28.9377μs 34.5570 KOps/s 34.3816 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[False-True-False-False-True] 3.2872ms 30.2452μs 33.0631 KOps/s 32.9302 KOps/s $\color{#35bf28}+0.40\%$
test_step_mdp_speed[False-True-False-False-False] 45.4420μs 18.7776μs 53.2548 KOps/s 53.4564 KOps/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[False-False-True-True-True] 79.9530μs 48.8349μs 20.4772 KOps/s 20.3009 KOps/s $\color{#35bf28}+0.87\%$
test_step_mdp_speed[False-False-True-True-False] 55.9430μs 31.3965μs 31.8507 KOps/s 31.8260 KOps/s $\color{#35bf28}+0.08\%$
test_step_mdp_speed[False-False-True-False-True] 61.4630μs 29.6778μs 33.6952 KOps/s 32.5132 KOps/s $\color{#35bf28}+3.64\%$
test_step_mdp_speed[False-False-True-False-False] 49.6320μs 18.8085μs 53.1674 KOps/s 53.5438 KOps/s $\color{#d91a1a}-0.70\%$
test_step_mdp_speed[False-False-False-True-True] 84.5740μs 50.1854μs 19.9261 KOps/s 20.2820 KOps/s $\color{#d91a1a}-1.75\%$
test_step_mdp_speed[False-False-False-True-False] 68.1230μs 33.3439μs 29.9905 KOps/s 30.1275 KOps/s $\color{#d91a1a}-0.45\%$
test_step_mdp_speed[False-False-False-False-True] 61.1830μs 31.5052μs 31.7408 KOps/s 31.4009 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[False-False-False-False-False] 63.7730μs 20.4257μs 48.9580 KOps/s 49.4253 KOps/s $\color{#d91a1a}-0.95\%$
test_values[generalized_advantage_estimate-True-True] 24.4746ms 23.8468ms 41.9344 Ops/s 40.4872 Ops/s $\color{#35bf28}+3.57\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1048s 2.9793ms 335.6507 Ops/s 324.1579 Ops/s $\color{#35bf28}+3.55\%$
test_values[td0_return_estimate-False-False] 0.1024ms 77.4778μs 12.9069 KOps/s 12.6155 KOps/s $\color{#35bf28}+2.31\%$
test_values[td1_return_estimate-False-False] 53.8052ms 53.4146ms 18.7215 Ops/s 18.5861 Ops/s $\color{#35bf28}+0.73\%$
test_values[vec_td1_return_estimate-False-False] 1.3278ms 1.0628ms 940.9183 Ops/s 936.3655 Ops/s $\color{#35bf28}+0.49\%$
test_values[td_lambda_return_estimate-True-False] 84.9535ms 84.3089ms 11.8611 Ops/s 12.0127 Ops/s $\color{#d91a1a}-1.26\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3697ms 1.0588ms 944.4621 Ops/s 942.9772 Ops/s $\color{#35bf28}+0.16\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.0632ms 23.7030ms 42.1887 Ops/s 42.5911 Ops/s $\color{#d91a1a}-0.94\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0460ms 0.7358ms 1.3591 KOps/s 1.3562 KOps/s $\color{#35bf28}+0.21\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7444ms 0.6479ms 1.5435 KOps/s 1.5396 KOps/s $\color{#35bf28}+0.25\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5185ms 1.4602ms 684.8270 Ops/s 685.1856 Ops/s $\color{#d91a1a}-0.05\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7197ms 0.6628ms 1.5087 KOps/s 1.5107 KOps/s $\color{#d91a1a}-0.13\%$
test_dqn_speed[False-None] 6.9567ms 1.4910ms 670.7102 Ops/s 675.9039 Ops/s $\color{#d91a1a}-0.77\%$
test_dqn_speed[False-backward] 2.2595ms 2.1176ms 472.2265 Ops/s 478.2150 Ops/s $\color{#d91a1a}-1.25\%$
test_dqn_speed[True-None] 0.6400ms 0.5302ms 1.8859 KOps/s 1.8100 KOps/s $\color{#35bf28}+4.19\%$
test_dqn_speed[True-backward] 1.2543ms 1.1034ms 906.3106 Ops/s 824.2524 Ops/s $\textbf{\color{#35bf28}+9.96\%}$
test_dqn_speed[reduce-overhead-None] 0.6239ms 0.5600ms 1.7857 KOps/s 1.8080 KOps/s $\color{#d91a1a}-1.23\%$
test_dqn_speed[reduce-overhead-backward] 0.9909ms 0.9593ms 1.0425 KOps/s 1.0362 KOps/s $\color{#35bf28}+0.60\%$
test_ddpg_speed[False-None] 3.2363ms 2.8366ms 352.5390 Ops/s 352.7984 Ops/s $\color{#d91a1a}-0.07\%$
test_ddpg_speed[False-backward] 4.4384ms 4.0641ms 246.0542 Ops/s 245.6501 Ops/s $\color{#35bf28}+0.16\%$
test_ddpg_speed[True-None] 1.2002ms 1.0567ms 946.3137 Ops/s 938.8772 Ops/s $\color{#35bf28}+0.79\%$
test_ddpg_speed[True-backward] 2.3049ms 2.1960ms 455.3714 Ops/s 436.0359 Ops/s $\color{#35bf28}+4.43\%$
test_ddpg_speed[reduce-overhead-None] 1.1880ms 1.0791ms 926.6975 Ops/s 925.4730 Ops/s $\color{#35bf28}+0.13\%$
test_ddpg_speed[reduce-overhead-backward] 1.6900ms 1.6299ms 613.5389 Ops/s 565.5155 Ops/s $\textbf{\color{#35bf28}+8.49\%}$
test_sac_speed[False-None] 8.4620ms 7.9461ms 125.8477 Ops/s 124.9072 Ops/s $\color{#35bf28}+0.75\%$
test_sac_speed[False-backward] 11.3519ms 10.8849ms 91.8701 Ops/s 89.2635 Ops/s $\color{#35bf28}+2.92\%$
test_sac_speed[True-None] 1.7829ms 1.5966ms 626.3455 Ops/s 655.4662 Ops/s $\color{#d91a1a}-4.44\%$
test_sac_speed[True-backward] 3.4518ms 3.3808ms 295.7890 Ops/s 311.3651 Ops/s $\textbf{\color{#d91a1a}-5.00\%}$
test_sac_speed[reduce-overhead-None] 22.8292ms 12.5846ms 79.4619 Ops/s 78.8887 Ops/s $\color{#35bf28}+0.73\%$
test_sac_speed[reduce-overhead-backward] 1.5612ms 1.4896ms 671.3149 Ops/s 747.4053 Ops/s $\textbf{\color{#d91a1a}-10.18\%}$
test_redq_speed[False-None] 8.3263ms 7.4327ms 134.5412 Ops/s 133.2895 Ops/s $\color{#35bf28}+0.94\%$
test_redq_speed[False-backward] 12.7007ms 11.5429ms 86.6335 Ops/s 88.4352 Ops/s $\color{#d91a1a}-2.04\%$
test_redq_speed[True-None] 2.2319ms 1.9612ms 509.8793 Ops/s 487.2856 Ops/s $\color{#35bf28}+4.64\%$
test_redq_speed[True-backward] 3.8709ms 3.7904ms 263.8249 Ops/s 272.7592 Ops/s $\color{#d91a1a}-3.28\%$
test_redq_speed[reduce-overhead-None] 2.1319ms 1.9941ms 501.4702 Ops/s 503.8718 Ops/s $\color{#d91a1a}-0.48\%$
test_redq_speed[reduce-overhead-backward] 3.6853ms 3.6118ms 276.8675 Ops/s 272.0763 Ops/s $\color{#35bf28}+1.76\%$
test_redq_deprec_speed[False-None] 9.4729ms 8.9068ms 112.2736 Ops/s 110.2283 Ops/s $\color{#35bf28}+1.86\%$
test_redq_deprec_speed[False-backward] 12.1906ms 11.8681ms 84.2592 Ops/s 82.3782 Ops/s $\color{#35bf28}+2.28\%$
test_redq_deprec_speed[True-None] 2.3623ms 2.2840ms 437.8237 Ops/s 415.1565 Ops/s $\textbf{\color{#35bf28}+5.46\%}$
test_redq_deprec_speed[True-backward] 4.5930ms 4.1156ms 242.9767 Ops/s 252.4227 Ops/s $\color{#d91a1a}-3.74\%$
test_redq_deprec_speed[reduce-overhead-None] 2.3989ms 2.2728ms 439.9932 Ops/s 429.0022 Ops/s $\color{#35bf28}+2.56\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.1703ms 3.9651ms 252.1988 Ops/s 252.8053 Ops/s $\color{#d91a1a}-0.24\%$
test_td3_speed[False-None] 8.1338ms 8.0395ms 124.3861 Ops/s 127.7692 Ops/s $\color{#d91a1a}-2.65\%$
test_td3_speed[False-backward] 11.1238ms 10.2858ms 97.2214 Ops/s 98.5106 Ops/s $\color{#d91a1a}-1.31\%$
test_td3_speed[True-None] 1.6799ms 1.6167ms 618.5403 Ops/s 641.6929 Ops/s $\color{#d91a1a}-3.61\%$
test_td3_speed[True-backward] 3.3792ms 3.1337ms 319.1094 Ops/s 306.7596 Ops/s $\color{#35bf28}+4.03\%$
test_td3_speed[reduce-overhead-None] 49.3171ms 25.3755ms 39.4081 Ops/s 37.7845 Ops/s $\color{#35bf28}+4.30\%$
test_td3_speed[reduce-overhead-backward] 1.3494ms 1.2900ms 775.2000 Ops/s 699.9929 Ops/s $\textbf{\color{#35bf28}+10.74\%}$
test_cql_speed[False-None] 16.5911ms 16.0277ms 62.3918 Ops/s 62.2890 Ops/s $\color{#35bf28}+0.17\%$
test_cql_speed[False-backward] 21.6451ms 21.0492ms 47.5078 Ops/s 46.4365 Ops/s $\color{#35bf28}+2.31\%$
test_cql_speed[True-None] 3.1185ms 2.8932ms 345.6342 Ops/s 343.8339 Ops/s $\color{#35bf28}+0.52\%$
test_cql_speed[True-backward] 5.3660ms 5.0275ms 198.9052 Ops/s 191.5597 Ops/s $\color{#35bf28}+3.83\%$
test_cql_speed[reduce-overhead-None] 21.8038ms 13.1566ms 76.0076 Ops/s 75.6159 Ops/s $\color{#35bf28}+0.52\%$
test_cql_speed[reduce-overhead-backward] 1.7572ms 1.4963ms 668.3294 Ops/s 626.6543 Ops/s $\textbf{\color{#35bf28}+6.65\%}$
test_a2c_speed[False-None] 3.6854ms 3.2701ms 305.8026 Ops/s 312.6729 Ops/s $\color{#d91a1a}-2.20\%$
test_a2c_speed[False-backward] 6.6154ms 6.0663ms 164.8453 Ops/s 164.3344 Ops/s $\color{#35bf28}+0.31\%$
test_a2c_speed[True-None] 1.4453ms 1.0217ms 978.7432 Ops/s 995.4825 Ops/s $\color{#d91a1a}-1.68\%$
test_a2c_speed[True-backward] 3.0629ms 2.6276ms 380.5816 Ops/s 384.1390 Ops/s $\color{#d91a1a}-0.93\%$
test_a2c_speed[reduce-overhead-None] 0.3995s 12.2174ms 81.8505 Ops/s 87.0966 Ops/s $\textbf{\color{#d91a1a}-6.02\%}$
test_a2c_speed[reduce-overhead-backward] 1.0449ms 0.9900ms 1.0101 KOps/s 887.8034 Ops/s $\textbf{\color{#35bf28}+13.78\%}$
test_ppo_speed[False-None] 3.8127ms 3.6218ms 276.1088 Ops/s 274.4370 Ops/s $\color{#35bf28}+0.61\%$
test_ppo_speed[False-backward] 7.3772ms 6.7735ms 147.6349 Ops/s 142.8931 Ops/s $\color{#35bf28}+3.32\%$
test_ppo_speed[True-None] 1.0914ms 0.9288ms 1.0766 KOps/s 1.0200 KOps/s $\textbf{\color{#35bf28}+5.55\%}$
test_ppo_speed[True-backward] 2.6598ms 2.5979ms 384.9256 Ops/s 369.5127 Ops/s $\color{#35bf28}+4.17\%$
test_ppo_speed[reduce-overhead-None] 0.6406ms 0.4905ms 2.0387 KOps/s 1.9229 KOps/s $\textbf{\color{#35bf28}+6.02\%}$
test_ppo_speed[reduce-overhead-backward] 1.0073ms 0.9723ms 1.0285 KOps/s 888.3888 Ops/s $\textbf{\color{#35bf28}+15.77\%}$
test_reinforce_speed[False-None] 2.3917ms 2.2300ms 448.4282 Ops/s 441.3705 Ops/s $\color{#35bf28}+1.60\%$
test_reinforce_speed[False-backward] 3.7034ms 3.2358ms 309.0425 Ops/s 298.7062 Ops/s $\color{#35bf28}+3.46\%$
test_reinforce_speed[True-None] 0.9912ms 0.8256ms 1.2112 KOps/s 1.2087 KOps/s $\color{#35bf28}+0.21\%$
test_reinforce_speed[True-backward] 2.6369ms 2.4547ms 407.3892 Ops/s 389.7926 Ops/s $\color{#35bf28}+4.51\%$
test_reinforce_speed[reduce-overhead-None] 22.4168ms 11.6751ms 85.6523 Ops/s 88.7020 Ops/s $\color{#d91a1a}-3.44\%$
test_reinforce_speed[reduce-overhead-backward] 1.1252ms 1.0596ms 943.7080 Ops/s 837.9325 Ops/s $\textbf{\color{#35bf28}+12.62\%}$
test_iql_speed[False-None] 9.6566ms 9.1190ms 109.6617 Ops/s 109.0851 Ops/s $\color{#35bf28}+0.53\%$
test_iql_speed[False-backward] 13.3032ms 12.8074ms 78.0796 Ops/s 75.9918 Ops/s $\color{#35bf28}+2.75\%$
test_iql_speed[True-None] 1.9414ms 1.7799ms 561.8183 Ops/s 549.0949 Ops/s $\color{#35bf28}+2.32\%$
test_iql_speed[True-backward] 4.7107ms 4.2198ms 236.9793 Ops/s 225.2713 Ops/s $\textbf{\color{#35bf28}+5.20\%}$
test_iql_speed[reduce-overhead-None] 20.2491ms 11.4922ms 87.0156 Ops/s 87.8699 Ops/s $\color{#d91a1a}-0.97\%$
test_iql_speed[reduce-overhead-backward] 1.4821ms 1.4307ms 698.9699 Ops/s 707.3366 Ops/s $\color{#d91a1a}-1.18\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9146ms 6.4348ms 155.4061 Ops/s 153.7841 Ops/s $\color{#35bf28}+1.05\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5384ms 0.3025ms 3.3060 KOps/s 3.3377 KOps/s $\color{#d91a1a}-0.95\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4616ms 0.2551ms 3.9194 KOps/s 2.9920 KOps/s $\textbf{\color{#35bf28}+30.99\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5382ms 6.1487ms 162.6364 Ops/s 160.8076 Ops/s $\color{#35bf28}+1.14\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9997ms 0.2974ms 3.3622 KOps/s 2.6745 KOps/s $\textbf{\color{#35bf28}+25.71\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5434ms 0.2975ms 3.3613 KOps/s 2.9832 KOps/s $\textbf{\color{#35bf28}+12.67\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5487ms 1.2872ms 776.9062 Ops/s 750.1860 Ops/s $\color{#35bf28}+3.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5091ms 1.2168ms 821.8011 Ops/s 745.1304 Ops/s $\textbf{\color{#35bf28}+10.29\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4858ms 6.3259ms 158.0813 Ops/s 155.8708 Ops/s $\color{#35bf28}+1.42\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0751ms 0.5076ms 1.9700 KOps/s 2.3093 KOps/s $\textbf{\color{#d91a1a}-14.69\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6164ms 0.4299ms 2.3259 KOps/s 2.4810 KOps/s $\textbf{\color{#d91a1a}-6.25\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3565ms 6.1810ms 161.7864 Ops/s 161.0523 Ops/s $\color{#35bf28}+0.46\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6926ms 0.3085ms 3.2411 KOps/s 2.8405 KOps/s $\textbf{\color{#35bf28}+14.10\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5677ms 0.2917ms 3.4285 KOps/s 2.6305 KOps/s $\textbf{\color{#35bf28}+30.33\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4304ms 6.1260ms 163.2390 Ops/s 162.6323 Ops/s $\color{#35bf28}+0.37\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.5043ms 0.2630ms 3.8018 KOps/s 3.3566 KOps/s $\textbf{\color{#35bf28}+13.26\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 7.4040ms 0.2514ms 3.9783 KOps/s 3.6969 KOps/s $\textbf{\color{#35bf28}+7.61\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5290ms 6.3243ms 158.1212 Ops/s 155.7100 Ops/s $\color{#35bf28}+1.55\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1112ms 0.4523ms 2.2110 KOps/s 2.1275 KOps/s $\color{#35bf28}+3.92\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6477ms 0.4339ms 2.3048 KOps/s 2.0909 KOps/s $\textbf{\color{#35bf28}+10.23\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0270ms 5.3018ms 188.6146 Ops/s 189.8153 Ops/s $\color{#d91a1a}-0.63\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.5261ms 2.0460ms 488.7589 Ops/s 511.1268 Ops/s $\color{#d91a1a}-4.38\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.0264ms 1.2618ms 792.4880 Ops/s 855.4674 Ops/s $\textbf{\color{#d91a1a}-7.36\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4930s 15.1056ms 66.2007 Ops/s 191.2707 Ops/s $\textbf{\color{#d91a1a}-65.39\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.8125ms 2.1020ms 475.7332 Ops/s 439.8791 Ops/s $\textbf{\color{#35bf28}+8.15\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.9593ms 1.2030ms 831.2663 Ops/s 859.1095 Ops/s $\color{#d91a1a}-3.24\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.5004ms 5.5484ms 180.2318 Ops/s 33.0338 Ops/s $\textbf{\color{#35bf28}+445.60\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 6.3133ms 2.2064ms 453.2168 Ops/s 448.4489 Ops/s $\color{#35bf28}+1.06\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 10.3909ms 1.4763ms 677.3643 Ops/s 759.1449 Ops/s $\textbf{\color{#d91a1a}-10.77\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.7105ms 13.3207ms 75.0712 Ops/s 74.6019 Ops/s $\color{#35bf28}+0.63\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.1583ms 16.7116ms 59.8387 Ops/s 60.2322 Ops/s $\color{#d91a1a}-0.65\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.7191ms 17.8225ms 56.1088 Ops/s 55.0080 Ops/s $\color{#35bf28}+2.00\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.8343ms 16.9302ms 59.0661 Ops/s 59.3303 Ops/s $\color{#d91a1a}-0.45\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.0764ms 17.4657ms 57.2550 Ops/s 55.2037 Ops/s $\color{#35bf28}+3.72\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.8329ms 18.1076ms 55.2255 Ops/s 54.8791 Ops/s $\color{#35bf28}+0.63\%$
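
The Change column above appears to be the relative difference between the "Ops" value measured on this PR and the "Ops on Repo HEAD" value, after converting KOps/s to Ops/s. That reading is inferred from the rows themselves, not from the benchmark bot's source, so treat the sketch below as an assumption. The helper names `parse_ops` and `relative_change` are hypothetical.

```python
# Multipliers for the two throughput units that appear in the table.
_UNIT = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text):
    """Convert a cell such as '1.0101 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * _UNIT[unit]


def relative_change(ops_pr, ops_head):
    """Percentage change of the PR throughput relative to repo HEAD (assumed formula)."""
    pr = parse_ops(ops_pr)
    head = parse_ops(ops_head)
    return (pr - head) / head * 100.0


# Values copied from the test_a2c_speed[reduce-overhead-backward] row.
print(f"{relative_change('1.0101 KOps/s', '887.8034 Ops/s'):+.2f}%")
```

Because the table rounds throughput to four decimals, recomputed percentages can differ from the reported ones in the last digit.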

@vmoens vmoens merged commit d1e0fd7 into gh/vmoens/49/base Dec 12, 2024
4 checks passed
vmoens added a commit that referenced this pull request Dec 12, 2024
ghstack-source-id: d1a20ecd023008683cf18cf9e694340cfdbdac8a
Pull Request resolved: #2635
@vmoens vmoens deleted the gh/vmoens/49/head branch December 12, 2024 21:32