add doc to explain multithreading #1154
Open
jokester wants to merge 10 commits into streamlit:main from jokester:doc/custom-threading
+177 −0
Changes from 6 commits
Commits (10):
- 279d1fb add doc (jokester)
- e52865b wip (jokester)
- d15c200 update (jokester)
- 14e14b2 update (jokester)
- 5522f92 update (jokester)
- d90b801 update (jokester)
- 3186923 revise (jokester)
- e940376 update menu (jokester)
- 7a37a4a update (jokester)
- 038c27e revise (jokester)
@@ -0,0 +1,167 @@
---
title: Threading in Streamlit
slug: /develop/concepts/architecture/threading
---

# Threading in Streamlit

Building with Streamlit may feel like magic, but underneath it is still plain Python. Threads can therefore still improve the performance and responsiveness of a Streamlit app. Starting your own threads can be tricky, however. This guide explains how to use threads correctly in Streamlit.

Before reading on, review [architecture](/develop/concepts/architecture/architecture) and [session-state](/develop/concepts/architecture/session-state).

## Threads created by Streamlit

A `streamlit run` process creates 2 types of threads:

- Main thread: runs the web (HTTP + WebSocket) server
- Script thread: runs page code when triggered (by a page view or a UI interaction)

The following oversimplified, deliberately inaccurate illustration shows how Streamlit creates these threads:

```py
# Illustrative pseudocode only: `streamlit.somewhere` is not a real module.
from threading import Thread
from streamlit.somewhere import WebSocketServer, ScriptRunContext


# created once per process, runs on the main thread
class StreamlitServer(WebSocketServer):
    def on_websocket_connection(self, conn):
        # assume each connection is bound to exactly one session
        session = Session()
        conn.on_page_run_message(lambda message: session.on_page_run_message(conn, message))


# created for each session
class Session:
    def on_page_run_message(self, conn, message):
        script_thread = ScriptThread(conn=conn, page_file=message.page_to_run, session=self)
        # attach the context object;
        # inside the script thread it can be read with getattr(current_thread(), "secret..")
        setattr(script_thread, "secret_runner_context", ScriptRunContext(self))
        script_thread.start()


# created for each page run
class ScriptThread(Thread):
    def __init__(self, conn, page_file, session):
        super().__init__()
        self.conn = conn
        self.page_file = page_file
        self.session = session

    def run(self):
        with open(self.page_file) as f:
            page_code = f.read()
        # stand-in for "run the page and collect the resulting UI state"
        ui_state = eval(page_code)
        self.conn.send_ui_state(ui_state)
        # on the other end of the WebSocket connection,
        # the frontend receives the state and updates the UI


StreamlitServer().listen()
```

## `missing ScriptRunContext!` or `streamlit.errors.NoSessionContext`

If you are reading this page, chances are that you have already seen one of these messages.

Many Streamlit APIs, including `st.session_state` and several built-in widgets, expect to run on a script thread. Such APIs typically depend on internal state that is kept per session or per page run.

In the happy path, such code finds the `ScriptRunContext` object attached to the current thread (as in the illustrative code above). When it cannot find one, it issues these warnings or errors.

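The lookup described above can be sketched with the standard library alone. This is a minimal illustration and not Streamlit's actual implementation: `CTX_ATTR`, `NoContextError`, `attach_ctx`, `current_ctx`, and `write` are hypothetical names, and the context is a plain dict. It shows why the same function succeeds on a thread that carries a context and fails on one that does not.

```python
import threading

# Hypothetical attribute name; Streamlit's real internals differ.
CTX_ATTR = "_demo_run_ctx"


class NoContextError(RuntimeError):
    """Raised when the calling thread carries no context (hypothetical)."""


def attach_ctx(thread, ctx):
    """Attach a context object to a thread before it starts."""
    setattr(thread, CTX_ATTR, ctx)
    return thread


def current_ctx():
    """Return the context attached to the calling thread, or None."""
    return getattr(threading.current_thread(), CTX_ATTR, None)


def write(message):
    """Stand-in for a session-aware API: it needs a context to know where to write."""
    ctx = current_ctx()
    if ctx is None:
        raise NoContextError("missing context on " + threading.current_thread().name)
    ctx["output"].append(message)


def try_write(message, errors):
    """Call write() and record the error instead of crashing the thread."""
    try:
        write(message)
    except NoContextError as exc:
        errors.append(str(exc))


session = {"output": []}
errors = []

with_ctx = attach_ctx(threading.Thread(target=try_write, args=("hello", errors)), session)
without_ctx = threading.Thread(target=try_write, args=("lost", errors), name="bare-thread")

for t in (with_ctx, without_ctx):
    t.start()
for t in (with_ctx, without_ctx):
    t.join()

print(session["output"])  # messages written by the thread that had a context
print(errors)             # errors recorded by the thread that had none
```

The thread started without an attached context has no way to tell which session it belongs to, which is roughly the situation a custom thread is in when it calls a session-aware Streamlit API.
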
## Custom threads

An effective way to reduce delays is to create threads and let them work concurrently. This works especially well for I/O-heavy operations such as remote queries or data loading.

For the reasons described above, however, interacting with Streamlit code from your own thread can be quirky. This section introduces 2 patterns that let different threads work together.

Note: these are patterns rather than complete solutions. Treat them as starting points. For example, you could extend pattern 1 to use a `concurrent.futures.ThreadPoolExecutor` thread pool.

### 1. Only call Streamlit code from the script thread

Python's `threading` module provides ways to start a thread, wait for it to finish, and collect its result. If custom threads are kept isolated from Streamlit APIs, everything works as expected.

In the following example page, `main` runs on the script thread and creates 2 custom `WorkerThread` instances. After the worker threads have run concurrently, `main` collects their results and updates the UI.

```py
import streamlit as st
import time
from threading import Thread


class WorkerThread(Thread):
    def __init__(self, delay):
        super().__init__()
        self.delay = delay
        self.return_value = None

    def run(self):
        # runs in custom thread, touches no Streamlit APIs
        start_time = time.time()
        time.sleep(self.delay)
        end_time = time.time()
        self.return_value = f"start: {start_time}, end: {end_time}"


st.header("t1")
result_1 = st.empty()
st.header("t2")
result_2 = st.empty()


def main():
    t1 = WorkerThread(5)
    t2 = WorkerThread(5)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    # main() runs in script thread, and can safely call Streamlit APIs
    result_1.write(t1.return_value)
    result_2.write(t2.return_value)


main()
```

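The same pattern can be extended to a thread pool, as the note at the start of this section suggests. The following is a minimal sketch using only the standard library; `slow_task` and `run_tasks` are hypothetical names, and the short delays are chosen only to keep the example quick. The worker function touches no Streamlit APIs, so the results can be handed to Streamlit calls once control is back on the script thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_task(delay):
    """Runs on a pool thread and touches no Streamlit APIs."""
    start_time = time.time()
    time.sleep(delay)
    end_time = time.time()
    return {"delay": delay, "start": start_time, "end": end_time}


def run_tasks(delays):
    """Run all tasks concurrently and return their results in input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # list(pool.map(...)) blocks until every result is ready,
        # which plays the role of join() in the example above
        return list(pool.map(slow_task, delays))


results = run_tasks([0.2, 0.2])

# In a Streamlit page, this is the point (back on the script thread)
# where the results would be passed to calls such as st.write().
for result in results:
    print(result)
```

Leaving the `with` block also shuts the pool down and waits for its threads, so no worker outlives the function that created it.
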
### 2. Expose the context object to a custom thread

Alternatively, you can give a custom thread access to the `ScriptRunContext` attached to the script thread. This pattern is also used by standard Streamlit widgets like [st.spinner](https://github.com/streamlit/streamlit/blob/develop/lib/streamlit/elements/spinner.py).

**Caution:** This may not work with all Streamlit code. The previous pattern is safer in this respect.

**Caution:** `get_script_run_ctx` is meant to be called from a script thread, not from the main thread or a custom thread.

**Caution:** When using this pattern, ensure that a custom thread using `ScriptRunContext` does not outlive the script thread. A leaked `ScriptRunContext` may cause subtle bugs.

In the following example page, a custom thread with a `ScriptRunContext` attached can write to an `st.empty()` placeholder without a warning. (Remove a call to `add_script_run_ctx()` and you will see a `streamlit.errors.NoSessionContext` error.)

```py
import streamlit as st
from streamlit.runtime.scriptrunner import add_script_run_ctx, get_script_run_ctx
import time
from threading import Thread


class WorkerThread(Thread):
    def __init__(self, delay, target):
        super().__init__()
        self.delay = delay
        self.target = target

    def run(self):
        # runs in custom thread, but can call Streamlit APIs
        start_time = time.time()
        time.sleep(self.delay)
        end_time = time.time()
        self.target.write(f"start: {start_time}, end: {end_time}")


st.header("t1")
result_1 = st.empty()
st.header("t2")
result_2 = st.empty()


def main():
    t1 = WorkerThread(5, result_1)
    t2 = WorkerThread(5, result_2)
    # obtain the ScriptRunContext of the current script thread, and assign to worker threads
    add_script_run_ctx(t1, get_script_run_ctx())
    add_script_run_ctx(t2, get_script_run_ctx())
    t1.start()
    t2.start()
    t1.join()
    t2.join()


main()
```
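
One way to honor the caution about thread lifetime is to give each worker a stop signal and to join it before the code that created it returns. The sketch below uses only the standard library; `StoppableWorker`, `stop_requested`, and `stop()` are hypothetical names and not part of Streamlit. In a real page, the worker would also carry a `ScriptRunContext` as in the example above.

```python
import threading


class StoppableWorker(threading.Thread):
    def __init__(self, interval):
        super().__init__()
        self.interval = interval
        self.ticks = 0
        self.stop_requested = threading.Event()

    def run(self):
        # Event.wait() returns True once stop() has been called, ending the loop;
        # otherwise it times out after `interval` seconds and the loop continues
        while not self.stop_requested.wait(self.interval):
            self.ticks += 1

    def stop(self):
        self.stop_requested.set()


worker = StoppableWorker(interval=0.01)
worker.start()
try:
    # the creating thread does its own work here
    threading.Event().wait(0.1)
finally:
    # always stop and join, so the worker cannot outlive its creator
    worker.stop()
    worker.join()

print(worker.is_alive())
```

Putting `stop()` and `join()` in a `finally` block means the worker is cleaned up even if the creating code raises an exception.
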
Though I saw people are already going this way (in various GH issues), I'm not really sure about this pattern:
- it exposes internal object to page writers
- it is less guaranteed. I don't know enough to say the probability where a `ScriptRunContext` suffices. The other pattern just looks "safer" to me because it assumes less from Streamlit.
Officially, adding your own threads isn't supported. We don't want to include unsupported patterns in the main concepts area. This will likely go out as a Knowledge Base (KB) article. There is a subset of this information that can live in concepts, but it will need to be very carefully separated. For now, we can move this page to the KB so that we can get it published faster and I'll likely follow up with moving some of it back into concepts. (I still haven't read through any of it yet, so I still expect a few weeks before I get to this. Just a heads up about what will likely happen.)
In the longer run, the plan is to properly support multithreading and async tasks, but that work isn't currently on the schedule this quarter or next, so it wouldn't likely happen until next year at the earliest.
That makes sense. I don't really want to promote the hack either.
When you come back, feel free to change or move things or ask me to do so 👍🏽