kaedroho changed the title from "Thread safety?" to "Lots of database connections created by transcribe_documents?" on Jul 23, 2018
Just had a quick browse of the code and noticed that it uses asyncio to create background threads which fetch/extract text from documents.
Is it likely that Django would start handling the next request before the background thread has finished running? If the text extraction and the new request used the same database connection at the same time, this could cause issues, as database connections are not thread-safe.
EDIT: Looks like Django has this covered: https://github.com/django/django/blob/master/django/db/utils.py#L142
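For anyone else reading: as far as I can tell, the linked code keeps connections in thread-local storage, so each thread gets its own connection instead of sharing one. Here is a rough sketch of that idea in plain Python. `FakeConnection`, `get_connection` and `connections_seen_by` are made-up names for illustration, not Django's actual code.

```python
import threading


class FakeConnection:
    """Hypothetical stand-in for a database connection (not a Django class)."""

    def __init__(self):
        # Record which thread created this connection.
        self.owner = threading.get_ident()


# Attributes set on a threading.local object are only visible to the
# thread that set them.
_local = threading.local()


def get_connection():
    """Return the calling thread's connection, creating it on first use."""
    if not hasattr(_local, "conn"):
        _local.conn = FakeConnection()
    return _local.conn


def connections_seen_by(num_threads):
    """Start num_threads workers and collect the connection each one gets."""
    seen = []
    lock = threading.Lock()

    def worker():
        conn = get_connection()
        with lock:
            seen.append(conn)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

The flip side is that every worker thread that touches the database ends up with its own connection, which is where the next concern comes from.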
This might cause another issue: asyncio's default executor is a thread pool of 5 * num_cpus workers, which might open too many connections for some users (e.g. on shared hosting). Maybe we should add a "concurrency" parameter to the "transcribe_documents" command that lets the user limit the number of worker threads? (You can set this by passing your own executor, with max_workers set, to run_in_executor.)
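A minimal sketch of what I mean, assuming the work is dispatched with `loop.run_in_executor`. `extract_text`, `transcribe_all` and `run` are hypothetical names, and the sleep stands in for the real fetch/extract step. The `tracker` dict is only there to show that the number of jobs running at once never exceeds the limit. Note that the default pool size depends on the Python version, so an explicit limit is more predictable either way.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def extract_text(doc_id, tracker):
    """Hypothetical blocking job standing in for fetching/extracting one document."""
    with tracker["lock"]:
        tracker["active"] += 1
        tracker["peak"] = max(tracker["peak"], tracker["active"])
    time.sleep(0.05)  # pretend to do slow I/O
    with tracker["lock"]:
        tracker["active"] -= 1
    return doc_id


async def transcribe_all(doc_ids, concurrency):
    """Run extract_text for every document on at most `concurrency` threads."""
    tracker = {"lock": threading.Lock(), "active": 0, "peak": 0}
    loop = asyncio.get_running_loop()
    # max_workers caps the worker threads, and so the number of
    # per-thread database connections that can be open at once.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [
            loop.run_in_executor(pool, extract_text, doc_id, tracker)
            for doc_id in doc_ids
        ]
        results = await asyncio.gather(*futures)
    return results, tracker["peak"]


def run(doc_ids, concurrency):
    """Synchronous entry point, e.g. callable from a management command."""
    return asyncio.run(transcribe_all(doc_ids, concurrency))
```

In the management command, the limit could come from an option registered in `add_arguments`, something like `parser.add_argument("--concurrency", type=int)`, and be passed through to `run`. The option name is only a suggestion.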