[Feature] GAIL compatibility with compile #2573

Open
wants to merge 42 commits into base: gh/vmoens/42/base


vmoens (Contributor) commented Nov 18, 2024

[ghstack-poisoned]

pytorch-bot bot commented Nov 18, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2573

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 2 New Failures, 17 Unrelated Failures

As of commit 89b1cb3 with merge base 7d7cd95 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Nov 18, 2024
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: 23e5f83b36fa8dd316b9a85953a0dff60c90320a
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: 8b2b9cb450641cb9e855f37d22caae086ad6ad14
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: 27762629c9043e6bb706315aaa0a7d7e6a272dfb
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: 97697f31e76537d8f5f5540c3499cf9d5c49e04e
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: 54896135cc041eca718a1734659c05ca0b5fa8b1
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: d5155889db337bde6cc118e1ee918486be12c66a
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: c96cc493baeaa148a70a880eaa0e3fbc0d9e7528
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: 6b301b3e622e2d05311a781165fdf1446ee4b137
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: 91a1512c4683c0f3f51ae14b651ae0e2b914161a
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: b10bdc3af590db76e073555dfc9eaab669890bf9
Pull Request resolved: #2573
[ghstack-poisoned]

github-actions bot commented Dec 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}32$. Worsened: $\large\color{#d91a1a}1$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_simple 0.4294s 0.4275s 2.3392 Ops/s 2.1860 Ops/s $\textbf{\color{#35bf28}+7.01\%}$
test_transformed 0.6091s 0.6073s 1.6467 Ops/s 1.5756 Ops/s $\color{#35bf28}+4.51\%$
test_serial 1.3479s 1.3457s 0.7431 Ops/s 0.7260 Ops/s $\color{#35bf28}+2.36\%$
test_parallel 1.2977s 1.2851s 0.7781 Ops/s 0.7600 Ops/s $\color{#35bf28}+2.38\%$
test_step_mdp_speed[True-True-True-True-True] 0.1448ms 29.7584μs 33.6040 KOps/s 33.0722 KOps/s $\color{#35bf28}+1.61\%$
test_step_mdp_speed[True-True-True-True-False] 68.8760μs 17.5917μs 56.8451 KOps/s 56.6333 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[True-True-True-False-True] 49.3230μs 17.0760μs 58.5617 KOps/s 57.1507 KOps/s $\color{#35bf28}+2.47\%$
test_step_mdp_speed[True-True-True-False-False] 58.5790μs 10.1902μs 98.1339 KOps/s 98.8111 KOps/s $\color{#d91a1a}-0.69\%$
test_step_mdp_speed[True-True-False-True-True] 94.4690μs 32.2025μs 31.0535 KOps/s 30.6921 KOps/s $\color{#35bf28}+1.18\%$
test_step_mdp_speed[True-True-False-True-False] 0.6599ms 19.3762μs 51.6098 KOps/s 50.2118 KOps/s $\color{#35bf28}+2.78\%$
test_step_mdp_speed[True-True-False-False-True] 55.6940μs 18.8883μs 52.9428 KOps/s 52.2729 KOps/s $\color{#35bf28}+1.28\%$
test_step_mdp_speed[True-True-False-False-False] 54.8420μs 11.9929μs 83.3826 KOps/s 83.7320 KOps/s $\color{#d91a1a}-0.42\%$
test_step_mdp_speed[True-False-True-True-True] 87.6540μs 34.0116μs 29.4017 KOps/s 29.3043 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[True-False-True-True-False] 72.5760μs 21.4855μs 46.5429 KOps/s 46.7924 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[True-False-True-False-True] 58.1890μs 19.1078μs 52.3348 KOps/s 51.6601 KOps/s $\color{#35bf28}+1.31\%$
test_step_mdp_speed[True-False-True-False-False] 64.2710μs 12.0638μs 82.8923 KOps/s 84.2642 KOps/s $\color{#d91a1a}-1.63\%$
test_step_mdp_speed[True-False-False-True-True] 0.1002ms 35.6334μs 28.0635 KOps/s 28.4675 KOps/s $\color{#d91a1a}-1.42\%$
test_step_mdp_speed[True-False-False-True-False] 74.0570μs 23.0569μs 43.3710 KOps/s 43.1500 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[True-False-False-False-True] 69.1490μs 20.6177μs 48.5020 KOps/s 47.9998 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[True-False-False-False-False] 39.2040μs 13.7232μs 72.8693 KOps/s 75.0947 KOps/s $\color{#d91a1a}-2.96\%$
test_step_mdp_speed[False-True-True-True-True] 91.3510μs 34.6331μs 28.8741 KOps/s 29.3130 KOps/s $\color{#d91a1a}-1.50\%$
test_step_mdp_speed[False-True-True-True-False] 65.2320μs 21.2921μs 46.9657 KOps/s 47.0062 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[False-True-True-False-True] 76.5040μs 21.6153μs 46.2634 KOps/s 46.2391 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[False-True-True-False-False] 71.6470μs 13.1534μs 76.0261 KOps/s 76.3712 KOps/s $\color{#d91a1a}-0.45\%$
test_step_mdp_speed[False-True-False-True-True] 90.8880μs 36.0145μs 27.7666 KOps/s 28.2069 KOps/s $\color{#d91a1a}-1.56\%$
test_step_mdp_speed[False-True-False-True-False] 51.8270μs 23.1193μs 43.2539 KOps/s 43.4644 KOps/s $\color{#d91a1a}-0.48\%$
test_step_mdp_speed[False-True-False-False-True] 2.7117ms 23.0402μs 43.4024 KOps/s 42.7800 KOps/s $\color{#35bf28}+1.45\%$
test_step_mdp_speed[False-True-False-False-False] 76.4050μs 14.8018μs 67.5594 KOps/s 67.5906 KOps/s $\color{#d91a1a}-0.05\%$
test_step_mdp_speed[False-False-True-True-True] 0.6882ms 37.7148μs 26.5148 KOps/s 26.6812 KOps/s $\color{#d91a1a}-0.62\%$
test_step_mdp_speed[False-False-True-True-False] 74.8280μs 24.9916μs 40.0135 KOps/s 40.4226 KOps/s $\color{#d91a1a}-1.01\%$
test_step_mdp_speed[False-False-True-False-True] 90.1210μs 23.9088μs 41.8255 KOps/s 43.1634 KOps/s $\color{#d91a1a}-3.10\%$
test_step_mdp_speed[False-False-True-False-False] 41.0680μs 14.6897μs 68.0751 KOps/s 67.5063 KOps/s $\color{#35bf28}+0.84\%$
test_step_mdp_speed[False-False-False-True-True] 77.9160μs 38.5927μs 25.9116 KOps/s 26.0967 KOps/s $\color{#d91a1a}-0.71\%$
test_step_mdp_speed[False-False-False-True-False] 84.9590μs 26.4330μs 37.8314 KOps/s 38.2912 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[False-False-False-False-True] 46.8570μs 24.5790μs 40.6851 KOps/s 40.9872 KOps/s $\color{#d91a1a}-0.74\%$
test_step_mdp_speed[False-False-False-False-False] 51.6060μs 16.4972μs 60.6165 KOps/s 60.4002 KOps/s $\color{#35bf28}+0.36\%$
test_values[generalized_advantage_estimate-True-True] 11.1641ms 9.7285ms 102.7907 Ops/s 106.2357 Ops/s $\color{#d91a1a}-3.24\%$
test_values[vec_generalized_advantage_estimate-True-True] 37.4310ms 33.9611ms 29.4455 Ops/s 27.3703 Ops/s $\textbf{\color{#35bf28}+7.58\%}$
test_values[td0_return_estimate-False-False] 0.2728ms 0.1967ms 5.0831 KOps/s 5.1777 KOps/s $\color{#d91a1a}-1.83\%$
test_values[td1_return_estimate-False-False] 26.6285ms 24.1414ms 41.4226 Ops/s 41.4819 Ops/s $\color{#d91a1a}-0.14\%$
test_values[vec_td1_return_estimate-False-False] 36.3679ms 34.4423ms 29.0341 Ops/s 27.5687 Ops/s $\textbf{\color{#35bf28}+5.32\%}$
test_values[td_lambda_return_estimate-True-False] 38.5746ms 34.8413ms 28.7016 Ops/s 28.8625 Ops/s $\color{#d91a1a}-0.56\%$
test_values[vec_td_lambda_return_estimate-True-False] 37.5391ms 34.6388ms 28.8693 Ops/s 27.4197 Ops/s $\textbf{\color{#35bf28}+5.29\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.5924ms 8.3833ms 119.2844 Ops/s 120.3637 Ops/s $\color{#d91a1a}-0.90\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.5956ms 1.9543ms 511.6844 Ops/s 511.1998 Ops/s $\color{#35bf28}+0.09\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4419ms 0.3582ms 2.7916 KOps/s 2.5603 KOps/s $\textbf{\color{#35bf28}+9.04\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 44.9901ms 44.0737ms 22.6893 Ops/s 21.6953 Ops/s $\color{#35bf28}+4.58\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.0664ms 3.0187ms 331.2681 Ops/s 330.3532 Ops/s $\color{#35bf28}+0.28\%$
test_dqn_speed[False-None] 5.8097ms 1.3955ms 716.5790 Ops/s 706.4529 Ops/s $\color{#35bf28}+1.43\%$
test_dqn_speed[False-backward] 1.9831ms 1.8706ms 534.5823 Ops/s 525.6144 Ops/s $\color{#35bf28}+1.71\%$
test_dqn_speed[True-None] 0.5980ms 0.4628ms 2.1609 KOps/s 2.1072 KOps/s $\color{#35bf28}+2.55\%$
test_dqn_speed[True-backward] 0.9538ms 0.9031ms 1.1072 KOps/s 1.0942 KOps/s $\color{#35bf28}+1.19\%$
test_dqn_speed[reduce-overhead-None] 0.6264ms 0.4677ms 2.1381 KOps/s 2.1223 KOps/s $\color{#35bf28}+0.74\%$
test_dqn_speed[reduce-overhead-backward] 1.0190ms 0.9129ms 1.0954 KOps/s 1.0839 KOps/s $\color{#35bf28}+1.06\%$
test_ddpg_speed[False-None] 3.4733ms 2.8536ms 350.4324 Ops/s 338.2906 Ops/s $\color{#35bf28}+3.59\%$
test_ddpg_speed[False-backward] 4.8133ms 4.0102ms 249.3660 Ops/s 244.1916 Ops/s $\color{#35bf28}+2.12\%$
test_ddpg_speed[True-None] 1.1778ms 0.9986ms 1.0014 KOps/s 994.3095 Ops/s $\color{#35bf28}+0.71\%$
test_ddpg_speed[True-backward] 2.0051ms 1.9158ms 521.9676 Ops/s 516.5318 Ops/s $\color{#35bf28}+1.05\%$
test_ddpg_speed[reduce-overhead-None] 1.2195ms 1.0007ms 999.3056 Ops/s 995.4383 Ops/s $\color{#35bf28}+0.39\%$
test_ddpg_speed[reduce-overhead-backward] 2.0021ms 1.9173ms 521.5608 Ops/s 512.6742 Ops/s $\color{#35bf28}+1.73\%$
test_sac_speed[False-None] 8.5870ms 7.9099ms 126.4244 Ops/s 120.0494 Ops/s $\textbf{\color{#35bf28}+5.31\%}$
test_sac_speed[False-backward] 11.1580ms 10.6696ms 93.7239 Ops/s 86.6741 Ops/s $\textbf{\color{#35bf28}+8.13\%}$
test_sac_speed[True-None] 2.3730ms 1.8629ms 536.7899 Ops/s 544.7185 Ops/s $\color{#d91a1a}-1.46\%$
test_sac_speed[True-backward] 5.5824ms 3.6522ms 273.8038 Ops/s 279.8325 Ops/s $\color{#d91a1a}-2.15\%$
test_sac_speed[reduce-overhead-None] 2.1820ms 1.8362ms 544.6092 Ops/s 536.0110 Ops/s $\color{#35bf28}+1.60\%$
test_sac_speed[reduce-overhead-backward] 3.6779ms 3.5475ms 281.8867 Ops/s 277.2410 Ops/s $\color{#35bf28}+1.68\%$
test_redq_speed[False-None] 14.6235ms 12.9736ms 77.0799 Ops/s 66.8179 Ops/s $\textbf{\color{#35bf28}+15.36\%}$
test_redq_speed[False-backward] 23.7650ms 22.2265ms 44.9913 Ops/s 43.6432 Ops/s $\color{#35bf28}+3.09\%$
test_redq_speed[True-None] 5.8580ms 4.7472ms 210.6523 Ops/s 206.1879 Ops/s $\color{#35bf28}+2.17\%$
test_redq_speed[True-backward] 12.4633ms 11.9591ms 83.6184 Ops/s 79.8886 Ops/s $\color{#35bf28}+4.67\%$
test_redq_speed[reduce-overhead-None] 5.2854ms 4.6913ms 213.1610 Ops/s 200.6356 Ops/s $\textbf{\color{#35bf28}+6.24\%}$
test_redq_speed[reduce-overhead-backward] 12.5035ms 12.1718ms 82.1571 Ops/s 79.1189 Ops/s $\color{#35bf28}+3.84\%$
test_redq_deprec_speed[False-None] 14.2665ms 12.8368ms 77.9011 Ops/s 73.5217 Ops/s $\textbf{\color{#35bf28}+5.96\%}$
test_redq_deprec_speed[False-backward] 19.0717ms 18.5236ms 53.9852 Ops/s 50.9835 Ops/s $\textbf{\color{#35bf28}+5.89\%}$
test_redq_deprec_speed[True-None] 4.3964ms 3.5929ms 278.3232 Ops/s 271.7238 Ops/s $\color{#35bf28}+2.43\%$
test_redq_deprec_speed[True-backward] 9.3192ms 8.1251ms 123.0760 Ops/s 120.5422 Ops/s $\color{#35bf28}+2.10\%$
test_redq_deprec_speed[reduce-overhead-None] 4.1191ms 3.5803ms 279.3082 Ops/s 275.7375 Ops/s $\color{#35bf28}+1.29\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.1812ms 8.1823ms 122.2146 Ops/s 119.9127 Ops/s $\color{#35bf28}+1.92\%$
test_td3_speed[False-None] 8.7029ms 7.8564ms 127.2854 Ops/s 120.4002 Ops/s $\textbf{\color{#35bf28}+5.72\%}$
test_td3_speed[False-backward] 12.0601ms 10.2406ms 97.6502 Ops/s 91.0630 Ops/s $\textbf{\color{#35bf28}+7.23\%}$
test_td3_speed[True-None] 1.8773ms 1.6943ms 590.2107 Ops/s 560.0179 Ops/s $\textbf{\color{#35bf28}+5.39\%}$
test_td3_speed[True-backward] 3.4316ms 3.3561ms 297.9658 Ops/s 289.1033 Ops/s $\color{#35bf28}+3.07\%$
test_td3_speed[reduce-overhead-None] 1.8269ms 1.6960ms 589.6390 Ops/s 569.7081 Ops/s $\color{#35bf28}+3.50\%$
test_td3_speed[reduce-overhead-backward] 4.3958ms 3.3781ms 296.0206 Ops/s 287.6074 Ops/s $\color{#35bf28}+2.93\%$
test_cql_speed[False-None] 38.1578ms 36.1022ms 27.6992 Ops/s 26.3009 Ops/s $\textbf{\color{#35bf28}+5.32\%}$
test_cql_speed[False-backward] 50.2149ms 46.4684ms 21.5200 Ops/s 20.6703 Ops/s $\color{#35bf28}+4.11\%$
test_cql_speed[True-None] 16.7201ms 15.6352ms 63.9582 Ops/s 61.8425 Ops/s $\color{#35bf28}+3.42\%$
test_cql_speed[True-backward] 23.1971ms 22.1343ms 45.1788 Ops/s 43.6965 Ops/s $\color{#35bf28}+3.39\%$
test_cql_speed[reduce-overhead-None] 16.8663ms 15.5847ms 64.1653 Ops/s 61.0230 Ops/s $\textbf{\color{#35bf28}+5.15\%}$
test_cql_speed[reduce-overhead-backward] 23.3960ms 22.4991ms 44.4463 Ops/s 42.6565 Ops/s $\color{#35bf28}+4.20\%$
test_a2c_speed[False-None] 8.1339ms 7.1607ms 139.6510 Ops/s 133.5915 Ops/s $\color{#35bf28}+4.54\%$
test_a2c_speed[False-backward] 15.7244ms 14.4008ms 69.4404 Ops/s 67.3112 Ops/s $\color{#35bf28}+3.16\%$
test_a2c_speed[True-None] 4.9984ms 4.1446ms 241.2757 Ops/s 235.3181 Ops/s $\color{#35bf28}+2.53\%$
test_a2c_speed[True-backward] 11.5112ms 10.6627ms 93.7849 Ops/s 90.2467 Ops/s $\color{#35bf28}+3.92\%$
test_a2c_speed[reduce-overhead-None] 4.7427ms 4.1901ms 238.6556 Ops/s 232.7620 Ops/s $\color{#35bf28}+2.53\%$
test_a2c_speed[reduce-overhead-backward] 11.3546ms 10.7813ms 92.7529 Ops/s 90.1142 Ops/s $\color{#35bf28}+2.93\%$
test_ppo_speed[False-None] 9.6637ms 7.4168ms 134.8298 Ops/s 125.9443 Ops/s $\textbf{\color{#35bf28}+7.06\%}$
test_ppo_speed[False-backward] 14.7020ms 14.4321ms 69.2898 Ops/s 63.3672 Ops/s $\textbf{\color{#35bf28}+9.35\%}$
test_ppo_speed[True-None] 3.9665ms 3.6647ms 272.8723 Ops/s 267.6343 Ops/s $\color{#35bf28}+1.96\%$
test_ppo_speed[True-backward] 10.4448ms 9.5716ms 104.4754 Ops/s 103.1113 Ops/s $\color{#35bf28}+1.32\%$
test_ppo_speed[reduce-overhead-None] 4.0185ms 3.7266ms 268.3439 Ops/s 265.4521 Ops/s $\color{#35bf28}+1.09\%$
test_ppo_speed[reduce-overhead-backward] 11.1040ms 9.6159ms 103.9940 Ops/s 103.5022 Ops/s $\color{#35bf28}+0.48\%$
test_reinforce_speed[False-None] 7.2387ms 6.4562ms 154.8888 Ops/s 147.0845 Ops/s $\textbf{\color{#35bf28}+5.31\%}$
test_reinforce_speed[False-backward] 10.2366ms 9.7295ms 102.7802 Ops/s 96.0782 Ops/s $\textbf{\color{#35bf28}+6.98\%}$
test_reinforce_speed[True-None] 3.3011ms 2.6269ms 380.6802 Ops/s 362.7096 Ops/s $\color{#35bf28}+4.95\%$
test_reinforce_speed[True-backward] 9.3897ms 8.5523ms 116.9272 Ops/s 112.8698 Ops/s $\color{#35bf28}+3.59\%$
test_reinforce_speed[reduce-overhead-None] 2.9489ms 2.6076ms 383.4905 Ops/s 369.4254 Ops/s $\color{#35bf28}+3.81\%$
test_reinforce_speed[reduce-overhead-backward] 9.3542ms 8.5983ms 116.3025 Ops/s 114.9538 Ops/s $\color{#35bf28}+1.17\%$
test_iql_speed[False-None] 33.7355ms 32.3012ms 30.9586 Ops/s 29.7989 Ops/s $\color{#35bf28}+3.89\%$
test_iql_speed[False-backward] 50.9099ms 45.5377ms 21.9598 Ops/s 21.1806 Ops/s $\color{#35bf28}+3.68\%$
test_iql_speed[True-None] 11.6993ms 10.7696ms 92.8539 Ops/s 91.0539 Ops/s $\color{#35bf28}+1.98\%$
test_iql_speed[True-backward] 23.0171ms 21.9182ms 45.6241 Ops/s 44.6308 Ops/s $\color{#35bf28}+2.23\%$
test_iql_speed[reduce-overhead-None] 12.2890ms 10.7703ms 92.8477 Ops/s 91.8663 Ops/s $\color{#35bf28}+1.07\%$
test_iql_speed[reduce-overhead-backward] 23.1708ms 22.0123ms 45.4291 Ops/s 45.1703 Ops/s $\color{#35bf28}+0.57\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.3241ms 4.9624ms 201.5172 Ops/s 191.3289 Ops/s $\textbf{\color{#35bf28}+5.33\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0979ms 0.5216ms 1.9173 KOps/s 1.8693 KOps/s $\color{#35bf28}+2.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 1.0024ms 0.4940ms 2.0242 KOps/s 1.9896 KOps/s $\color{#35bf28}+1.74\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.1218ms 4.7755ms 209.4037 Ops/s 202.4341 Ops/s $\color{#35bf28}+3.44\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.6783ms 0.5002ms 1.9992 KOps/s 1.9668 KOps/s $\color{#35bf28}+1.65\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7604ms 0.4787ms 2.0889 KOps/s 2.0482 KOps/s $\color{#35bf28}+1.98\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.4266ms 1.6325ms 612.5398 Ops/s 594.2975 Ops/s $\color{#35bf28}+3.07\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.1880ms 1.5803ms 632.7848 Ops/s 614.4043 Ops/s $\color{#35bf28}+2.99\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.5448ms 4.9323ms 202.7458 Ops/s 193.1999 Ops/s $\color{#35bf28}+4.94\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.3692ms 0.6523ms 1.5330 KOps/s 619.4944 Ops/s $\textbf{\color{#35bf28}+147.46\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9024ms 0.6265ms 1.5963 KOps/s 1.5404 KOps/s $\color{#35bf28}+3.62\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.1785ms 4.8315ms 206.9729 Ops/s 198.0431 Ops/s $\color{#35bf28}+4.51\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1494ms 0.5262ms 1.9005 KOps/s 1.8938 KOps/s $\color{#35bf28}+0.35\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7144ms 0.4917ms 2.0336 KOps/s 1.9569 KOps/s $\color{#35bf28}+3.92\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.6027ms 4.7880ms 208.8570 Ops/s 197.9318 Ops/s $\textbf{\color{#35bf28}+5.52\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0218ms 0.5083ms 1.9674 KOps/s 1.9564 KOps/s $\color{#35bf28}+0.56\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7708ms 0.4811ms 2.0787 KOps/s 2.0088 KOps/s $\color{#35bf28}+3.48\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.2935ms 4.8593ms 205.7889 Ops/s 192.5801 Ops/s $\textbf{\color{#35bf28}+6.86\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1260ms 0.6552ms 1.5263 KOps/s 1.5135 KOps/s $\color{#35bf28}+0.85\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8857ms 0.6275ms 1.5935 KOps/s 1.5692 KOps/s $\color{#35bf28}+1.55\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.5519ms 4.2069ms 237.7041 Ops/s 223.4613 Ops/s $\textbf{\color{#35bf28}+6.37\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.8937ms 2.3246ms 430.1746 Ops/s 430.7123 Ops/s $\color{#d91a1a}-0.12\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.7285ms 1.2220ms 818.3243 Ops/s 714.0654 Ops/s $\textbf{\color{#35bf28}+14.60\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.9830ms 4.2529ms 235.1317 Ops/s 241.7367 Ops/s $\color{#d91a1a}-2.73\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.4335s 10.8735ms 91.9666 Ops/s 437.1489 Ops/s $\textbf{\color{#d91a1a}-78.96\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.5427ms 1.3257ms 754.3152 Ops/s 705.7823 Ops/s $\textbf{\color{#35bf28}+6.88\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.7708ms 4.4199ms 226.2498 Ops/s 231.3648 Ops/s $\color{#d91a1a}-2.21\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.0107ms 2.4429ms 409.3478 Ops/s 403.8212 Ops/s $\color{#35bf28}+1.37\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.8449ms 1.4913ms 670.5550 Ops/s 648.2821 Ops/s $\color{#35bf28}+3.44\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.9462ms 11.1494ms 89.6911 Ops/s 83.3385 Ops/s $\textbf{\color{#35bf28}+7.62\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.7533ms 14.9738ms 66.7835 Ops/s 65.0119 Ops/s $\color{#35bf28}+2.73\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.5031ms 19.9557ms 50.1110 Ops/s 47.4506 Ops/s $\textbf{\color{#35bf28}+5.61\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.2077ms 15.1464ms 66.0225 Ops/s 62.4071 Ops/s $\textbf{\color{#35bf28}+5.79\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.4121ms 19.8607ms 50.3507 Ops/s 46.8136 Ops/s $\textbf{\color{#35bf28}+7.56\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.7180ms 16.4165ms 60.9142 Ops/s 57.7861 Ops/s $\textbf{\color{#35bf28}+5.41\%}$
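The Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns, and the bolded rows appear to be the ones counted in the "Improved" and "Worsened" totals. The sketch below reproduces three rows from the CPU table under those assumptions. The helper names `percent_change` and `classify` are hypothetical, and the 5% threshold is inferred from which rows are bolded, not taken from the benchmark tooling.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption, not a documented value."""
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_simple": (2.3392, 2.1860),
    "test_transformed": (1.6467, 1.5756),
    "test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400]": (91.9666, 437.1489),
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions, `test_transformed` (+4.51%) stays below the threshold, which is consistent with it not being bolded in the table, while the `test_rb_populate` row at -78.96% is the single CPU entry counted as worsened.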


github-actions bot commented Dec 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}12$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_simple 0.7590s 0.7579s 1.3195 Ops/s 1.3176 Ops/s $\color{#35bf28}+0.14\%$
test_transformed 1.0200s 1.0162s 0.9841 Ops/s 0.9868 Ops/s $\color{#d91a1a}-0.27\%$
test_serial 2.1890s 2.1766s 0.4594 Ops/s 0.4595 Ops/s $\color{#d91a1a}-0.01\%$
test_parallel 2.0201s 1.9786s 0.5054 Ops/s 0.4993 Ops/s $\color{#35bf28}+1.22\%$
test_step_mdp_speed[True-True-True-True-True] 0.2350ms 40.1070μs 24.9333 KOps/s 26.0994 KOps/s $\color{#d91a1a}-4.47\%$
test_step_mdp_speed[True-True-True-True-False] 58.4410μs 22.7002μs 44.0526 KOps/s 43.8117 KOps/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[True-True-True-False-True] 49.9210μs 21.5707μs 46.3592 KOps/s 47.4319 KOps/s $\color{#d91a1a}-2.26\%$
test_step_mdp_speed[True-True-True-False-False] 43.3600μs 12.8176μs 78.0179 KOps/s 79.1934 KOps/s $\color{#d91a1a}-1.48\%$
test_step_mdp_speed[True-True-False-True-True] 82.2510μs 43.7979μs 22.8321 KOps/s 23.6283 KOps/s $\color{#d91a1a}-3.37\%$
test_step_mdp_speed[True-True-False-True-False] 50.8700μs 24.9502μs 40.0798 KOps/s 40.6299 KOps/s $\color{#d91a1a}-1.35\%$
test_step_mdp_speed[True-True-False-False-True] 48.5910μs 23.5855μs 42.3989 KOps/s 41.6820 KOps/s $\color{#35bf28}+1.72\%$
test_step_mdp_speed[True-True-False-False-False] 47.5800μs 15.0225μs 66.5666 KOps/s 67.2326 KOps/s $\color{#d91a1a}-0.99\%$
test_step_mdp_speed[True-False-True-True-True] 75.3320μs 44.8112μs 22.3158 KOps/s 22.7373 KOps/s $\color{#d91a1a}-1.85\%$
test_step_mdp_speed[True-False-True-True-False] 58.6410μs 27.4639μs 36.4114 KOps/s 36.8501 KOps/s $\color{#d91a1a}-1.19\%$
test_step_mdp_speed[True-False-True-False-True] 53.9910μs 23.9874μs 41.6885 KOps/s 41.1995 KOps/s $\color{#35bf28}+1.19\%$
test_step_mdp_speed[True-False-True-False-False] 42.7910μs 14.9058μs 67.0878 KOps/s 66.7456 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[True-False-False-True-True] 84.7810μs 47.0801μs 21.2404 KOps/s 21.3689 KOps/s $\color{#d91a1a}-0.60\%$
test_step_mdp_speed[True-False-False-True-False] 62.9120μs 29.1703μs 34.2814 KOps/s 34.2382 KOps/s $\color{#35bf28}+0.13\%$
test_step_mdp_speed[True-False-False-False-True] 53.7110μs 25.6340μs 39.0106 KOps/s 38.4809 KOps/s $\color{#35bf28}+1.38\%$
test_step_mdp_speed[True-False-False-False-False] 41.1710μs 16.9092μs 59.1395 KOps/s 58.9951 KOps/s $\color{#35bf28}+0.24\%$
test_step_mdp_speed[False-True-True-True-True] 72.1710μs 44.5940μs 22.4245 KOps/s 22.7408 KOps/s $\color{#d91a1a}-1.39\%$
test_step_mdp_speed[False-True-True-True-False] 64.7110μs 27.3862μs 36.5147 KOps/s 37.0983 KOps/s $\color{#d91a1a}-1.57\%$
test_step_mdp_speed[False-True-True-False-True] 55.8810μs 27.7607μs 36.0221 KOps/s 35.6942 KOps/s $\color{#35bf28}+0.92\%$
test_step_mdp_speed[False-True-True-False-False] 44.5000μs 16.6659μs 60.0028 KOps/s 60.1466 KOps/s $\color{#d91a1a}-0.24\%$
test_step_mdp_speed[False-True-False-True-True] 81.9220μs 47.0906μs 21.2356 KOps/s 21.6971 KOps/s $\color{#d91a1a}-2.13\%$
test_step_mdp_speed[False-True-False-True-False] 60.8810μs 29.3886μs 34.0268 KOps/s 34.3901 KOps/s $\color{#d91a1a}-1.06\%$
test_step_mdp_speed[False-True-False-False-True] 3.2949ms 30.7042μs 32.5689 KOps/s 33.8964 KOps/s $\color{#d91a1a}-3.92\%$
test_step_mdp_speed[False-True-False-False-False] 48.1010μs 18.6836μs 53.5228 KOps/s 54.2199 KOps/s $\color{#d91a1a}-1.29\%$
test_step_mdp_speed[False-False-True-True-True] 83.4420μs 49.2890μs 20.2885 KOps/s 20.4995 KOps/s $\color{#d91a1a}-1.03\%$
test_step_mdp_speed[False-False-True-True-False] 72.5510μs 31.5668μs 31.6788 KOps/s 31.9407 KOps/s $\color{#d91a1a}-0.82\%$
test_step_mdp_speed[False-False-True-False-True] 58.4210μs 29.8321μs 33.5209 KOps/s 33.5370 KOps/s $\color{#d91a1a}-0.05\%$
test_step_mdp_speed[False-False-True-False-False] 43.8010μs 18.3134μs 54.6050 KOps/s 54.2776 KOps/s $\color{#35bf28}+0.60\%$
test_step_mdp_speed[False-False-False-True-True] 87.7510μs 49.5212μs 20.1934 KOps/s 19.9448 KOps/s $\color{#35bf28}+1.25\%$
test_step_mdp_speed[False-False-False-True-False] 65.3110μs 33.6235μs 29.7411 KOps/s 30.0908 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[False-False-False-False-True] 95.4920μs 31.0677μs 32.1878 KOps/s 31.7674 KOps/s $\color{#35bf28}+1.32\%$
test_step_mdp_speed[False-False-False-False-False] 46.3410μs 20.1108μs 49.7245 KOps/s 48.9125 KOps/s $\color{#35bf28}+1.66\%$
test_values[generalized_advantage_estimate-True-True] 26.3509ms 26.0352ms 38.4095 Ops/s 38.5682 Ops/s $\color{#d91a1a}-0.41\%$
test_values[vec_generalized_advantage_estimate-True-True] 99.1799ms 2.8981ms 345.0500 Ops/s 350.2050 Ops/s $\color{#d91a1a}-1.47\%$
test_values[td0_return_estimate-False-False] 0.1070ms 83.5809μs 11.9645 KOps/s 11.9239 KOps/s $\color{#35bf28}+0.34\%$
test_values[td1_return_estimate-False-False] 57.7867ms 57.4347ms 17.4111 Ops/s 16.5474 Ops/s $\textbf{\color{#35bf28}+5.22\%}$
test_values[vec_td1_return_estimate-False-False] 1.2877ms 1.1060ms 904.1483 Ops/s 903.8383 Ops/s $\color{#35bf28}+0.03\%$
test_values[td_lambda_return_estimate-True-False] 92.1625ms 91.5705ms 10.9205 Ops/s 10.5320 Ops/s $\color{#35bf28}+3.69\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2868ms 1.1057ms 904.3644 Ops/s 902.2682 Ops/s $\color{#35bf28}+0.23\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.8988ms 25.7700ms 38.8048 Ops/s 38.7478 Ops/s $\color{#35bf28}+0.15\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0649ms 0.7777ms 1.2859 KOps/s 1.2875 KOps/s $\color{#d91a1a}-0.13\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7807ms 0.6929ms 1.4433 KOps/s 1.4010 KOps/s $\color{#35bf28}+3.01\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5418ms 1.5028ms 665.4384 Ops/s 661.9477 Ops/s $\color{#35bf28}+0.53\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7442ms 0.7078ms 1.4128 KOps/s 1.4173 KOps/s $\color{#d91a1a}-0.32\%$
test_dqn_speed[False-None] 6.9979ms 1.5347ms 651.6073 Ops/s 648.2946 Ops/s $\color{#35bf28}+0.51\%$
test_dqn_speed[False-backward] 2.4044ms 2.1834ms 458.0041 Ops/s 455.6640 Ops/s $\color{#35bf28}+0.51\%$
test_dqn_speed[True-None] 0.6628ms 0.5458ms 1.8322 KOps/s 1.7915 KOps/s $\color{#35bf28}+2.27\%$
test_dqn_speed[True-backward] 1.3187ms 1.1982ms 834.5957 Ops/s 889.0687 Ops/s $\textbf{\color{#d91a1a}-6.13\%}$
test_dqn_speed[reduce-overhead-None] 0.6903ms 0.5620ms 1.7792 KOps/s 1.7730 KOps/s $\color{#35bf28}+0.35\%$
test_dqn_speed[reduce-overhead-backward] 1.1474ms 1.0716ms 933.1858 Ops/s 1.0190 KOps/s $\textbf{\color{#d91a1a}-8.42\%}$
test_ddpg_speed[False-None] 3.1571ms 2.8682ms 348.6563 Ops/s 341.2273 Ops/s $\color{#35bf28}+2.18\%$
test_ddpg_speed[False-backward] 4.7328ms 4.3201ms 231.4757 Ops/s 234.1948 Ops/s $\color{#d91a1a}-1.16\%$
test_ddpg_speed[True-None] 1.1478ms 1.0710ms 933.7490 Ops/s 914.0321 Ops/s $\color{#35bf28}+2.16\%$
test_ddpg_speed[True-backward] 2.3909ms 2.2887ms 436.9330 Ops/s 451.1683 Ops/s $\color{#d91a1a}-3.16\%$
test_ddpg_speed[reduce-overhead-None] 1.1491ms 1.0865ms 920.3669 Ops/s 894.4201 Ops/s $\color{#35bf28}+2.90\%$
test_ddpg_speed[reduce-overhead-backward] 1.8190ms 1.7772ms 562.6980 Ops/s 592.4124 Ops/s $\textbf{\color{#d91a1a}-5.02\%}$
test_sac_speed[False-None] 8.5626ms 8.1608ms 122.5372 Ops/s 119.6726 Ops/s $\color{#35bf28}+2.39\%$
test_sac_speed[False-backward] 11.9087ms 11.5730ms 86.4080 Ops/s 86.6397 Ops/s $\color{#d91a1a}-0.27\%$
test_sac_speed[True-None] 1.6098ms 1.5354ms 651.2806 Ops/s 636.4644 Ops/s $\color{#35bf28}+2.33\%$
test_sac_speed[True-backward] 3.6082ms 3.4607ms 288.9624 Ops/s 288.0807 Ops/s $\color{#35bf28}+0.31\%$
test_sac_speed[reduce-overhead-None] 23.2736ms 12.6716ms 78.9168 Ops/s 79.4595 Ops/s $\color{#d91a1a}-0.68\%$
test_sac_speed[reduce-overhead-backward] 1.6247ms 1.5322ms 652.6351 Ops/s 727.3046 Ops/s $\textbf{\color{#d91a1a}-10.27\%}$
test_redq_speed[False-None] 8.3761ms 7.6302ms 131.0586 Ops/s 129.2029 Ops/s $\color{#35bf28}+1.44\%$
test_redq_speed[False-backward] 12.7987ms 12.0408ms 83.0507 Ops/s 84.8015 Ops/s $\color{#d91a1a}-2.06\%$
test_redq_speed[True-None] 2.4492ms 2.0256ms 493.6721 Ops/s 488.1670 Ops/s $\color{#35bf28}+1.13\%$
test_redq_speed[True-backward] 3.9869ms 3.8988ms 256.4893 Ops/s 251.0632 Ops/s $\color{#35bf28}+2.16\%$
test_redq_speed[reduce-overhead-None] 2.1775ms 2.0195ms 495.1789 Ops/s 480.8679 Ops/s $\color{#35bf28}+2.98\%$
test_redq_speed[reduce-overhead-backward] 3.9306ms 3.8826ms 257.5597 Ops/s 252.2800 Ops/s $\color{#35bf28}+2.09\%$
test_redq_deprec_speed[False-None] 9.8453ms 9.1800ms 108.9327 Ops/s 107.2812 Ops/s $\color{#35bf28}+1.54\%$
test_redq_deprec_speed[False-backward] 13.0817ms 12.6336ms 79.1538 Ops/s 78.3151 Ops/s $\color{#35bf28}+1.07\%$
test_redq_deprec_speed[True-None] 2.5475ms 2.3780ms 420.5144 Ops/s 400.4044 Ops/s $\textbf{\color{#35bf28}+5.02\%}$
test_redq_deprec_speed[True-backward] 4.6846ms 4.3034ms 232.3766 Ops/s 232.7393 Ops/s $\color{#d91a1a}-0.16\%$
test_redq_deprec_speed[reduce-overhead-None] 2.4901ms 2.3573ms 424.2229 Ops/s 422.4504 Ops/s $\color{#35bf28}+0.42\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.6526ms 4.2227ms 236.8161 Ops/s 233.5428 Ops/s $\color{#35bf28}+1.40\%$
test_td3_speed[False-None] 8.1273ms 8.0581ms 124.0994 Ops/s 123.3458 Ops/s $\color{#35bf28}+0.61\%$
test_td3_speed[False-backward] 11.1847ms 10.6969ms 93.4846 Ops/s 93.0706 Ops/s $\color{#35bf28}+0.44\%$
test_td3_speed[True-None] 1.5915ms 1.5698ms 637.0383 Ops/s 630.6439 Ops/s $\color{#35bf28}+1.01\%$
test_td3_speed[True-backward] 3.3776ms 3.2910ms 303.8586 Ops/s 298.2489 Ops/s $\color{#35bf28}+1.88\%$
test_td3_speed[reduce-overhead-None] 83.5933ms 26.2096ms 38.1540 Ops/s 36.5839 Ops/s $\color{#35bf28}+4.29\%$
test_td3_speed[reduce-overhead-backward] 1.5245ms 1.4726ms 679.0598 Ops/s 671.4759 Ops/s $\color{#35bf28}+1.13\%$
test_cql_speed[False-None] 18.1938ms 17.1294ms 58.3791 Ops/s 58.1846 Ops/s $\color{#35bf28}+0.33\%$
test_cql_speed[False-backward] 23.2852ms 22.8587ms 43.7471 Ops/s 43.5532 Ops/s $\color{#35bf28}+0.45\%$
test_cql_speed[True-None] 3.2016ms 2.9718ms 336.4982 Ops/s 330.8135 Ops/s $\color{#35bf28}+1.72\%$
test_cql_speed[True-backward] 5.7100ms 5.1714ms 193.3716 Ops/s 180.2786 Ops/s $\textbf{\color{#35bf28}+7.26\%}$
test_cql_speed[reduce-overhead-None] 21.5571ms 13.2691ms 75.3631 Ops/s 75.2052 Ops/s $\color{#35bf28}+0.21\%$
test_cql_speed[reduce-overhead-backward] 1.7033ms 1.6295ms 613.6873 Ops/s 648.0894 Ops/s $\textbf{\color{#d91a1a}-5.31\%}$
test_a2c_speed[False-None] 3.4986ms 3.2559ms 307.1385 Ops/s 300.7368 Ops/s $\color{#35bf28}+2.13\%$
test_a2c_speed[False-backward] 6.6527ms 6.5623ms 152.3863 Ops/s 155.7769 Ops/s $\color{#d91a1a}-2.18\%$
test_a2c_speed[True-None] 1.0851ms 1.0050ms 995.0314 Ops/s 977.6672 Ops/s $\color{#35bf28}+1.78\%$
test_a2c_speed[True-backward] 2.8458ms 2.7861ms 358.9265 Ops/s 377.1633 Ops/s $\color{#d91a1a}-4.84\%$
test_a2c_speed[reduce-overhead-None] 22.0859ms 11.7857ms 84.8485 Ops/s 85.9233 Ops/s $\color{#d91a1a}-1.25\%$
test_a2c_speed[reduce-overhead-backward] 1.0238ms 0.9875ms 1.0127 KOps/s 865.0753 Ops/s $\textbf{\color{#35bf28}+17.06\%}$
test_ppo_speed[False-None] 3.8953ms 3.7551ms 266.3076 Ops/s 265.8597 Ops/s $\color{#35bf28}+0.17\%$
test_ppo_speed[False-backward] 7.5177ms 7.0814ms 141.2155 Ops/s 136.5124 Ops/s $\color{#35bf28}+3.45\%$
test_ppo_speed[True-None] 1.0157ms 0.9558ms 1.0462 KOps/s 1.0369 KOps/s $\color{#35bf28}+0.90\%$
test_ppo_speed[True-backward] 2.6956ms 2.5583ms 390.8895 Ops/s 362.2505 Ops/s $\textbf{\color{#35bf28}+7.91\%}$
test_ppo_speed[reduce-overhead-None] 0.7213ms 0.5068ms 1.9733 KOps/s 1.8896 KOps/s $\color{#35bf28}+4.43\%$
test_ppo_speed[reduce-overhead-backward] 1.0388ms 0.9748ms 1.0258 KOps/s 987.2128 Ops/s $\color{#35bf28}+3.91\%$
test_reinforce_speed[False-None] 2.3945ms 2.2929ms 436.1346 Ops/s 433.9124 Ops/s $\color{#35bf28}+0.51\%$
test_reinforce_speed[False-backward] 3.3781ms 3.3242ms 300.8207 Ops/s 299.1817 Ops/s $\color{#35bf28}+0.55\%$
test_reinforce_speed[True-None] 0.8977ms 0.8265ms 1.2099 KOps/s 1.1936 KOps/s $\color{#35bf28}+1.37\%$
test_reinforce_speed[True-backward] 2.4668ms 2.4089ms 415.1207 Ops/s 404.3841 Ops/s $\color{#35bf28}+2.66\%$
test_reinforce_speed[reduce-overhead-None] 22.8408ms 11.9314ms 83.8124 Ops/s 86.5930 Ops/s $\color{#d91a1a}-3.21\%$
test_reinforce_speed[reduce-overhead-backward] 1.0821ms 1.0468ms 955.2849 Ops/s 933.7450 Ops/s $\color{#35bf28}+2.31\%$
test_iql_speed[False-None] 9.8674ms 9.3813ms 106.5946 Ops/s 106.3634 Ops/s $\color{#35bf28}+0.22\%$
test_iql_speed[False-backward] 13.9793ms 13.2509ms 75.4668 Ops/s 75.4307 Ops/s $\color{#35bf28}+0.05\%$
test_iql_speed[True-None] 1.8535ms 1.7581ms 568.7897 Ops/s 571.7109 Ops/s $\color{#d91a1a}-0.51\%$
test_iql_speed[True-backward] 4.8434ms 4.2821ms 233.5279 Ops/s 222.6310 Ops/s $\color{#35bf28}+4.89\%$
test_iql_speed[reduce-overhead-None] 15.5454ms 9.1839ms 108.8867 Ops/s 109.4627 Ops/s $\color{#d91a1a}-0.53\%$
test_iql_speed[reduce-overhead-backward] 1.5201ms 1.4550ms 687.2867 Ops/s 621.0704 Ops/s $\textbf{\color{#35bf28}+10.66\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9890ms 6.4769ms 154.3946 Ops/s 153.5254 Ops/s $\color{#35bf28}+0.57\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5556ms 0.3186ms 3.1384 KOps/s 3.2143 KOps/s $\color{#d91a1a}-2.36\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4934ms 0.2759ms 3.6239 KOps/s 3.4609 KOps/s $\color{#35bf28}+4.71\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4656ms 6.2273ms 160.5820 Ops/s 159.7361 Ops/s $\color{#35bf28}+0.53\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0671ms 0.3259ms 3.0682 KOps/s 3.1973 KOps/s $\color{#d91a1a}-4.04\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6223ms 0.2814ms 3.5539 KOps/s 3.3001 KOps/s $\textbf{\color{#35bf28}+7.69\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6513ms 1.3795ms 724.9027 Ops/s 704.3296 Ops/s $\color{#35bf28}+2.92\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4675ms 1.2183ms 820.8363 Ops/s 735.0557 Ops/s $\textbf{\color{#35bf28}+11.67\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5313ms 6.4221ms 155.7122 Ops/s 155.6641 Ops/s $\color{#35bf28}+0.03\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0926ms 0.4776ms 2.0940 KOps/s 2.1306 KOps/s $\color{#d91a1a}-1.72\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7131ms 0.4451ms 2.2466 KOps/s 2.1620 KOps/s $\color{#35bf28}+3.91\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4118ms 6.2141ms 160.9247 Ops/s 159.6071 Ops/s $\color{#35bf28}+0.83\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9500ms 0.2985ms 3.3501 KOps/s 2.7882 KOps/s $\textbf{\color{#35bf28}+20.15\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5818ms 0.3149ms 3.1754 KOps/s 3.0161 KOps/s $\textbf{\color{#35bf28}+5.28\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4309ms 6.2024ms 161.2280 Ops/s 160.5385 Ops/s $\color{#35bf28}+0.43\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.7040ms 0.3893ms 2.5685 KOps/s 3.0945 KOps/s $\textbf{\color{#d91a1a}-17.00\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5473ms 0.3756ms 2.6625 KOps/s 2.9892 KOps/s $\textbf{\color{#d91a1a}-10.93\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5736ms 6.3628ms 157.1643 Ops/s 155.2838 Ops/s $\color{#35bf28}+1.21\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2773ms 0.4468ms 2.2383 KOps/s 2.0741 KOps/s $\textbf{\color{#35bf28}+7.92\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7977ms 0.4283ms 2.3346 KOps/s 2.1580 KOps/s $\textbf{\color{#35bf28}+8.18\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9500ms 5.3445ms 187.1079 Ops/s 188.7045 Ops/s $\color{#d91a1a}-0.85\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.2368ms 2.1483ms 465.4871 Ops/s 442.3306 Ops/s $\textbf{\color{#35bf28}+5.24\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.0890ms 1.2974ms 770.7700 Ops/s 824.8489 Ops/s $\textbf{\color{#d91a1a}-6.56\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4980s 15.2767ms 65.4593 Ops/s 190.9824 Ops/s $\textbf{\color{#d91a1a}-65.72\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 11.9377ms 2.1016ms 475.8368 Ops/s 435.9890 Ops/s $\textbf{\color{#35bf28}+9.14\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.5421ms 1.2056ms 829.4422 Ops/s 873.5654 Ops/s $\textbf{\color{#d91a1a}-5.05\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 9.6547ms 5.6812ms 176.0202 Ops/s 33.0924 Ops/s $\textbf{\color{#35bf28}+431.90\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.1574ms 2.2321ms 448.0120 Ops/s 529.5429 Ops/s $\textbf{\color{#d91a1a}-15.40\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 9.4891ms 1.5141ms 660.4777 Ops/s 818.5595 Ops/s $\textbf{\color{#d91a1a}-19.31\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.9901ms 13.4446ms 74.3795 Ops/s 75.4549 Ops/s $\color{#d91a1a}-1.43\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.1830ms 17.6333ms 56.7110 Ops/s 57.2685 Ops/s $\color{#d91a1a}-0.97\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.3375ms 17.8541ms 56.0094 Ops/s 54.6722 Ops/s $\color{#35bf28}+2.45\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.1344ms 17.8219ms 56.1109 Ops/s 56.2496 Ops/s $\color{#d91a1a}-0.25\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.2925ms 17.5093ms 57.1125 Ops/s 55.6863 Ops/s $\color{#35bf28}+2.56\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.3965ms 19.5238ms 51.2196 Ops/s 51.9532 Ops/s $\color{#d91a1a}-1.41\%$
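
For readers checking the last column: it appears to be the relative difference between the two throughput columns. The sketch below assumes the first Ops/s value in each row is this PR and the second is the baseline. That reading is inferred from the numbers, not stated in the rows themselves, and `relative_change` is a hypothetical helper, not part of the benchmark tooling.

```python
def relative_change(pr_ops: float, base_ops: float) -> float:
    """Percent change of the PR throughput relative to the baseline.

    Assumption: pr_ops is the first Ops/s column and base_ops the second.
    """
    return (pr_ops - base_ops) / base_ops * 100.0


# Ops/s values copied from three rows of the table above.
rows = {
    "test_sac_speed[reduce-overhead-None]": (78.9168, 79.4595),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": (65.4593, 190.9824),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (176.0202, 33.0924),
}

for name, (pr_ops, base_ops) in rows.items():
    print(f"{name}: {relative_change(pr_ops, base_ops):+.2f}%")
```

Under that assumption the computed values agree with the reported -0.68%, -65.72% and +431.90% to within rounding. The two large swings both come from `test_rb_populate` rows whose maximum time is far above the mean, so they may reflect run-to-run noise rather than a real change.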

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 13, 2024
ghstack-source-id: 31fd5f20572517a6ae7c0ca424e33da07d372484
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 13, 2024
ghstack-source-id: 3c5d992cfe6b0eb706685c47665040c4a0176b06
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 13, 2024
ghstack-source-id: 419a8cb4afadc19be83fd84f62ab0b36c5cf7ed2
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 13, 2024
ghstack-source-id: 63d3b51b49ac158811c503c3602dc47045e28b43
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 13, 2024
ghstack-source-id: 3174e74b719be17e30ebca0756b0501ca10c4f48
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 13, 2024
ghstack-source-id: 8aa6d4dd3873757f3512504c91cf8b4f32abb7bf
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 13, 2024
ghstack-source-id: 65fbf85328c0a15f3ef8f8f3baf80c7b834662bc
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 13, 2024
ghstack-source-id: d4d9a6fa7cf00628109347a6207880debe3465bd
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 13, 2024
ghstack-source-id: 09afd596bfec465951670418ee09515362c3fa22
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 14, 2024
ghstack-source-id: 5fadce953ca929deb9c7aef25b497a227a0bb0a0
Pull Request resolved: #2573
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 14, 2024
ghstack-source-id: fc6cbd5d0ad6ff536022cb859d2718bb8c32d5d1
Pull Request resolved: #2573
[ghstack-poisoned]