This repo contains two things:

- `tests-scripts/run_tests.py`: A load-testing script that compares Streamlit's handling of concurrent users against how Python handles concurrent tasks when Streamlit is not involved. This script runs lots of experiments and writes the results to `data/`.
- `result-browser/streamlit_app.py`: A Streamlit app that reads the results stored in the `data/` folder and shows them in an easy-to-use UI.
If you just want to look at the results of these load tests, you can find them at:
https://load-test.streamlit.app
To run the load tests yourself:

- Go to the right folder: `$ cd load-tests`
- Install requirements: `$ pip install -r requirements.txt`
- Start the tests: `$ python test.py`
- Go fly a kite. The results will be ready in ~24-48h.
The test script will save files of type `.rowjson` in `data/`. Those are your test results.
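If you want to poke at the raw results without the result browser, a minimal reader might look like the sketch below. The `.rowjson` format is an assumption here (one JSON object per line, guessed from the name); check the result browser's code for the actual loading logic:

```python
import json
from pathlib import Path

# Assumption: ".rowjson" = one JSON object per line (like JSON Lines).
rows = []
for path in Path("data").glob("*.rowjson"):
    with path.open() as f:
        rows.extend(json.loads(line) for line in f if line.strip())

print(f"Loaded {len(rows)} result rows")
```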
To browse the results locally:

- Go to the right folder: `$ cd result-browser`
- Install requirements: `$ pip install -r requirements.txt`
- Start the app: `$ streamlit run result_browser.py`
Here's how the tests work:

1. Let's define a "task" as:

   - Drawing 100 strings (`num_stuff_to_draw` variable)
   - Followed by performing between 1M and 1B multiplications of random numbers, either as plain numbers or as matrices (`num_multiplications` and `computation` variables)
   - ...where you can optionally sleep for 0-0.1s after every few multiplications (`sleep_time_between_multiplications`).

   (A minimal sketch of such a task appears after this list.)
2. Then, for each type of test described in item 3 below, we run the task above 1-50 times concurrently (`num_users`). Tasks are either started all at once or spread out over a 1s interval (`user_arrival_style`). (The thread sketch after this list shows one way to implement the spread-out arrival style.)

3. We do this for each of 5 test types:
   - `streamlit_test.py :: run_base_test()`

     This starts a Streamlit app that runs the task above every time a user opens a tab and presses "Run test", and records how long it takes. Then it opens a headless browser via Playwright with a new tab pointing to the Streamlit app for each of the 1-50 concurrent users we'd like to simulate. Once every tab loads, the test clicks "Run test" either all at once or spread out over 1s. (A Playwright sketch of this driver appears after this list.)
   - `streamlit_test.py :: run_processpool_test()`

     This is the same as the test above, but here the Streamlit app runs the task inside a `ProcessPoolExecutor`, so the heavy lifting is done outside the Streamlit server process, in a whole separate Python process (1 process per task). This simulates the best practice for Streamlit apps that perform heavy, non-cacheable computations. (See the `ProcessPoolExecutor` sketch after this list.)
   - `threads_test.py :: run_test()`

     This runs a `for` loop that starts 1-50 new threads, each running the task from item 1 above. When each thread is done, it records how long the task ran. (See the thread sketch after this list.)
   - `processpool_test.py :: run_test()`

     This runs a `for` loop that starts 1-50 new processes running the task from item 1 above with a `ProcessPoolExecutor`. When each process is done, it records how long it ran. (The same `ProcessPoolExecutor` sketch after this list applies here.)
   - `multiprocess_test.py :: run_test()`

     This runs a `for` loop that starts 1-50 new processes running the task from item 1 above with `multiprocessing.Process()`. When each process is done, it records how long it ran. (See the `multiprocessing` sketch after this list.)

     (NOTE: We don't include the data for this last test in our official results because they match the `ProcessPoolExecutor` results and therefore just make the official results harder to read.)
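To make the setup concrete, here is a minimal sketch of what such a task could look like. Only the parameter names (`num_stuff_to_draw`, `num_multiplications`, `computation`, `sleep_time_between_multiplications`) come from the description in item 1; the function name, matrix shapes, and batching are illustrative assumptions, not the actual test code:

```python
import random
import time

import numpy as np


def run_task(
    num_stuff_to_draw=100,
    num_multiplications=1_000_000,  # 1M-1B in the real experiments
    computation="numbers",          # "numbers" or "matrices"
    sleep_time_between_multiplications=0.0,  # 0-0.1s in the real experiments
):
    # Step 1: draw a bunch of strings. In the Streamlit tests this would be
    # st.write() calls; here we just build the strings.
    for i in range(num_stuff_to_draw):
        _ = f"String number {i}"

    # Step 2: burn CPU multiplying random numbers, either as plain floats
    # or as matrices, optionally sleeping every so often.
    if computation == "numbers":
        for i in range(num_multiplications):
            _ = random.random() * random.random()
            if sleep_time_between_multiplications and i % 10_000 == 0:
                time.sleep(sleep_time_between_multiplications)
    else:
        a = np.random.rand(100, 100)
        b = np.random.rand(100, 100)
        # Each 100x100 matrix product is 100*100*100 = 1M multiplications.
        for _ in range(max(1, num_multiplications // 1_000_000)):
            _ = a @ b
            if sleep_time_between_multiplications:
                time.sleep(sleep_time_between_multiplications)
```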
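Next, a rough sketch of the thread-based test (`threads_test.py :: run_test()`), including one way to implement `user_arrival_style` by spreading thread starts over a ~1s interval. The `"spread_out"` string, function names, and other details are assumptions:

```python
import threading
import time

from task_sketch import run_task  # hypothetical module holding the task above


def run_test(num_users=10, user_arrival_style="all_at_once"):
    durations = []

    def timed_task():
        start = time.monotonic()
        run_task()
        # list.append is atomic in CPython, so no lock is needed here.
        durations.append(time.monotonic() - start)

    threads = []
    for _ in range(num_users):
        if user_arrival_style == "spread_out":
            # Spread the thread starts over a ~1s interval.
            time.sleep(1.0 / num_users)
        t = threading.Thread(target=timed_task)
        t.start()
        threads.append(t)

    for t in threads:
        t.join()

    return durations
```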
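For the Streamlit tests, the driver side could look roughly like this Playwright sketch. The function name, URL, and button-locating strategy are assumptions; the real script also records timings and supports spreading the clicks over 1s:

```python
from playwright.sync_api import sync_playwright


def run_browser_test(num_users=10, app_url="http://localhost:8501"):
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)

        # Open one tab per simulated user, all pointing at the Streamlit app.
        pages = []
        for _ in range(num_users):
            page = browser.new_page()
            page.goto(app_url)
            pages.append(page)

        # Once every tab has loaded, click "Run test" in each of them.
        for page in pages:
            page.get_by_text("Run test").click()

        browser.close()
```

Streamlit serves on port 8501 by default, hence the URL.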
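Both `streamlit_test.py :: run_processpool_test()` and `processpool_test.py :: run_test()` revolve around the same `ProcessPoolExecutor` pattern, sketched below. It assumes the task lives in an importable module (the `task_sketch` import is hypothetical):

```python
import time
from concurrent.futures import ProcessPoolExecutor

from task_sketch import run_task  # hypothetical module holding the task above


def timed_task():
    # Runs inside the worker process and returns the task's duration.
    # Must live at module top level so it can be pickled for the workers.
    start = time.monotonic()
    run_task()
    return time.monotonic() - start


def run_test(num_users=10):
    # One worker per simulated user, so tasks can truly run in parallel.
    with ProcessPoolExecutor(max_workers=num_users) as pool:
        futures = [pool.submit(timed_task) for _ in range(num_users)]
        return [f.result() for f in futures]


if __name__ == "__main__":
    # The __main__ guard matters: worker processes re-import this module.
    print(run_test(num_users=4))
```

In the Streamlit variant, the app's script would do the `pool.submit(...)` itself, so the heavy computation never runs inside the Streamlit server process.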
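Finally, a sketch of the `multiprocessing.Process()` variant, using a `Queue` to ship each task's duration back to the parent process (names are again assumptions):

```python
import multiprocessing as mp
import time

from task_sketch import run_task  # hypothetical module holding the task above


def timed_task(queue):
    start = time.monotonic()
    run_task()
    # Ship the duration back to the parent process.
    queue.put(time.monotonic() - start)


def run_test(num_users=10):
    queue = mp.Queue()
    processes = [
        mp.Process(target=timed_task, args=(queue,)) for _ in range(num_users)
    ]
    for p in processes:
        p.start()

    # Drain the queue before joining to avoid blocking on a full pipe buffer.
    durations = [queue.get() for _ in processes]

    for p in processes:
        p.join()

    return durations


if __name__ == "__main__":
    print(run_test(num_users=4))
```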