
Revert threads usage; "add executor to validate_payload" #676

Closed
wants to merge 1 commit

Conversation

@astrand (Contributor) commented Nov 14, 2024

run_in_executor() is in practice just a fancy name for creating threads. However, this library is based on asyncio, not threads. Introducing threads into an application has some serious consequences. For example, it is no longer safe to use os.fork() (or subprocess with preexec_fn). Thus, with threads, this library can be incompatible with many existing applications, preventing updates.
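(For illustration, a minimal standalone script, not part of ocpp, showing that the default executor behind run_in_executor() is a thread pool whose worker stays alive after the call:)

import asyncio
import threading
import time

async def main() -> None:
    loop = asyncio.get_running_loop()
    print("threads before:", threading.active_count())  # typically 1
    # None selects the loop's default executor, a concurrent.futures.ThreadPoolExecutor.
    await loop.run_in_executor(None, time.sleep, 0.1)
    print("threads after:", threading.active_count())   # > 1: the pool worker thread remains

asyncio.run(main())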

@astrand (Contributor, Author) commented Nov 14, 2024

Also note that version is currently 2.0.0-rc.2 - introducing threads between two release candidates does not make sense at all. Such a change would rather mandate a major version increase, like 3.Y.Z.

@jainmohit2001 (Collaborator) commented:

Hi @astrand, correct me if I am wrong.
Python doesn't support non-blocking file I/O. The only way to do this is by using threads. Even aiofiles does this by delegating the work to a thread pool. I understand your concern about introducing threads in an async Python application. If we were tasked with introducing non-blocking file I/O, we would have to work with threads or processes either way.
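(For context, this is roughly the pattern aiofiles follows internally; the helper names below are purely illustrative and are not part of ocpp or aiofiles:)

import asyncio

def _read_blocking(path: str) -> str:
    # Ordinary blocking file IO.
    with open(path, encoding="utf-8") as f:
        return f.read()

async def read_file(path: str) -> str:
    loop = asyncio.get_running_loop()
    # Delegate the blocking call to the default thread pool executor.
    return await loop.run_in_executor(None, _read_blocking, path)

# usage: text = asyncio.run(read_file("example.txt"))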

@drc38 (Contributor) commented Nov 14, 2024

> run_in_executor() is in practice just a fancy name for creating threads. However, this library is based on asyncio, not threads. Introducing threads into an application has some serious consequences. For example, it is no longer safe to use os.fork() (or subprocess with preexec_fn). Thus, with threads, this library can be incompatible with many existing applications, preventing updates.

Hi @astrand, can you explain where the issue arises in using the documented asyncio method for dealing with blocking IO? https://docs.python.org/3/library/asyncio-dev.html

@astrand (Contributor, Author) commented Nov 14, 2024

> Hi @astrand, correct me if I am wrong. Python doesn't support non-blocking file I/O. The only way to do this is by using threads. Even aiofiles does this by delegating the work to a thread pool. I understand your concern about introducing threads in an async Python application. If we were tasked with introducing non-blocking file I/O, we would have to work with threads or processes either way.

Python is a language that ideally should support all concurrency mechanisms supported by the platform. In particular, it should support Linux mechanisms such as subprocesses and select-based IO, without using threads.

As I mentioned, I think the problem we are trying to solve needs to be better described. open() typically does not block.
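(As an illustration of select-based IO without threads, here is a minimal asyncio sketch, assuming a POSIX platform, that waits for a pipe to become readable via loop.add_reader(); none of this is ocpp code:)

import asyncio
import os

async def main() -> None:
    loop = asyncio.get_running_loop()
    r, w = os.pipe()
    os.set_blocking(r, False)

    done = loop.create_future()

    def on_readable() -> None:
        loop.remove_reader(r)
        done.set_result(os.read(r, 1024))

    loop.add_reader(r, on_readable)   # readiness via the event loop's selector, no threads
    os.write(w, b"hello")
    print(await done)                 # b'hello'
    os.close(r)
    os.close(w)

asyncio.run(main())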

@astrand (Contributor, Author) commented Nov 14, 2024

> run_in_executor() is in practice just a fancy name for creating threads. However, this library is based on asyncio, not threads. Introducing threads into an application has some serious consequences. For example, it is no longer safe to use os.fork() (or subprocess with preexec_fn). Thus, with threads, this library can be incompatible with many existing applications, preventing updates.

> Hi @astrand, can you explain where the issue arises in using the documented asyncio method for dealing with blocking IO? https://docs.python.org/3/library/asyncio-dev.html

Just because an API is documented does not mean there are no issues with it. See https://docs.python.org/3/library/os.html:

> Even in code that appears to work, it has never been safe to mix threading with os.fork() on POSIX platforms.

Also note that the asyncio example is a CPU-bound calculation, not a simple open().

There's a reason for not being able to use normal file descriptors with UNIX/Linux select() - it is typically not necessary, since the local file system is considered fast IO.
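(As a small aside, Linux's epoll, which asyncio's default selector uses on Linux, refuses regular files outright; a sketch, assuming Linux:)

import selectors
import tempfile

sel = selectors.DefaultSelector()  # EpollSelector on Linux
with tempfile.TemporaryFile() as f:
    try:
        sel.register(f, selectors.EVENT_READ)
    except PermissionError as exc:
        # epoll_ctl() returns EPERM for regular files: they are treated as
        # always "ready" and are not meant to be multiplexed.
        print("epoll refuses regular files:", exc)
sel.close()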

@jainmohit2001 (Collaborator) commented:

Can we get a sample repro that would help us understand how os.fork() or any other call might break validate_payload?

@astrand (Contributor, Author) commented Nov 14, 2024

> Can we get a sample repro that would help us understand how os.fork() or any other call might break validate_payload?

It is the other way around. The use of run_in_executor() prevents using os.fork() and preexec_fn. With this patch to the examples:

+++ b/examples/v16/central_system.py
@@ -26,6 +26,14 @@ class ChargePoint(cp):
     def on_boot_notification(
         self, charge_point_vendor: str, charge_point_model: str, **kwargs
     ):
+        import os
+        # If there are threads, os.fork() on Python 3.12 will raise
+        # DeprecationWarning, and cannot use subprocess.Popen with
+        # preexec_fn
+        child = os.fork()
+        if child == 0:
+            # Child
+            os._exit(0)
         return call_result.BootNotificationPayload(
             current_time=datetime.utcnow().isoformat(),
             interval=10,

You will get:
/home/astrand/tmp/ocpp/examples/v16/central_system.py:33: DeprecationWarning: This process (pid=732646) is multi-threaded, use of fork() may lead to deadlocks in the child.
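(For completeness, a standalone repro of the same warning without ocpp, assuming Python 3.12+ on a POSIX system:)

import os
import threading
import time
import warnings

warnings.simplefilter("always", DeprecationWarning)  # make sure the warning is shown

# A lingering worker thread, similar to what run_in_executor()'s thread pool leaves behind.
t = threading.Thread(target=time.sleep, args=(2,))
t.start()

pid = os.fork()   # Python 3.12+: DeprecationWarning, process is multi-threaded
if pid == 0:
    os._exit(0)   # child exits immediately
os.waitpid(pid, 0)
t.join()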

@drc38 (Contributor) commented Nov 14, 2024

Thanks @astrand, here is an interesting discussion by the Python dev team on the issue with using os.fork(): https://discuss.python.org/t/concerns-regarding-deprecation-of-fork-with-alive-threads/33555/14

@drc38 (Contributor) commented Nov 15, 2024

> Thanks @astrand, here is an interesting discussion by the Python dev team on the issue with using os.fork(): https://discuss.python.org/t/concerns-regarding-deprecation-of-fork-with-alive-threads/33555/14

I see Python 3.13 uses spawn as the default start method for multiprocessing; I assume this is because of the issue with fork.
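(The default start method varies by platform and Python version; a small standalone sketch to check it and force spawn, with nothing here specific to ocpp:)

import multiprocessing as mp

def worker() -> None:
    print("child is running")

if __name__ == "__main__":
    print("default start method:", mp.get_start_method())
    ctx = mp.get_context("spawn")   # request spawn explicitly, regardless of the default
    p = ctx.Process(target=worker)
    p.start()
    p.join()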

@astrand (Contributor, Author) commented Nov 17, 2024

Thanks for the links. This is a tricky issue, and I don't think upstream Python is paying enough attention to it.

  • As pointed out above, there are issues with using both threads and fork().
  • Most large UNIX applications need to call other executables, so subprocesses and fork() must be used.
    This means that the only solution in such cases that allows fine-grained control over subprocesses is to avoid threads. Python seems to go in the opposite direction and is more or less forcing the use of threads. This is unfortunate.

In any case, I investigated the issue with the validator. It is not the open() call that takes time, it is this call in messages.py:

validator.validate(message.payload)

On my test CS, the call takes up to 150 ms with some fairly normal OCPP communication. I guess this could be a problem in some cases. For reference, the default value of loop.slow_callback_duration is 100 ms. With the product I am working on, however, we have a separate process for time-critical code, so we prefer that the validation is synchronous.
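(Slow synchronous steps like this can be surfaced with asyncio's debug mode; a standalone sketch, with the 0.15 s sleep standing in for the validator call:)

import asyncio
import time

async def main() -> None:
    loop = asyncio.get_running_loop()
    loop.slow_callback_duration = 0.1   # seconds; 0.1 is also the default
    time.sleep(0.15)                    # stand-in for a slow validator.validate() call
    # When this step of the coroutine ends, the loop logs something like:
    # "Executing <Task ...> took 0.150 seconds"

asyncio.run(main(), debug=True)         # debug mode enables slow-callback reporting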

If jsonschema had an API which allowed incremental validation, the problem could be solved by yielding to the event loop during the validation. But apparently it does not.

Anyhow, it should be possible to disable the use of threads for the validation, if this feature is to be kept at all. At least two solutions are possible:

  • Move the run_in_executor call to a common method like async_validate_payload in messages.py (Don't Repeat Yourself - it is currently called in 4 different places!). Then we can add a configuration mechanism to enable/disable the use of run_in_executor, for example a module variable that indicates whether run_in_executor should be used. Another idea is to extend _skip_schema_validation so that it can have a third value (in addition to True and False) which indicates that async validation should be used. One argument for this is that the time it takes to do the validation depends on which message you are dealing with. (A rough sketch of this first idea follows below, after the executor example.)

  • Another solution, outside the ocpp library, is to change the default executor to a synchronous one. Again, it is strange that upstream Python does not provide this, but it is easy to create one:

import asyncio
from concurrent.futures import Future, ThreadPoolExecutor


class SyncExecutor(ThreadPoolExecutor):
    """Executor that runs submitted callables synchronously, without worker threads."""

    def __init__(self, *args, **kwargs):
        super().__init__()

    def submit(self, fun, *args, **kwargs):
        # Run the callable immediately in the calling thread and return an
        # already-resolved Future, so no pool worker thread is ever started.
        future = Future()
        try:
            future.set_result(fun(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future


loop = asyncio.get_running_loop()
loop.set_default_executor(SyncExecutor())

We could perhaps even include this in our example code.
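(And for reference, a rough sketch of the first idea above; async_validate_payload and USE_EXECUTOR_FOR_VALIDATION are invented names for illustration, not existing ocpp API:)

import asyncio

# Module-level switch that an application could flip to keep validation synchronous.
USE_EXECUTOR_FOR_VALIDATION = True


async def async_validate_payload(validator, payload):
    """Single place that decides whether schema validation is off-loaded."""
    if USE_EXECUTOR_FOR_VALIDATION:
        loop = asyncio.get_running_loop()
        # Off-load the potentially slow jsonschema call to the default executor.
        await loop.run_in_executor(None, validator.validate, payload)
    else:
        # Run synchronously on the event loop (may block for ~100-150 ms).
        validator.validate(payload)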

@drc38 (Contributor) commented Nov 17, 2024

Thanks @astrand, I'd be comfortable with the first solution you've proposed.

@astrand (Contributor, Author) commented Nov 18, 2024

New suggested solution at #678.

@astrand closed this Nov 18, 2024

jainmohit2001 pushed a commit that referenced this pull request on Dec 7, 2024:
See discussion at #676

---------

Co-authored-by: Peter Astrand <[email protected]>
Co-authored-by: Patrick Roelke <[email protected]>