loop._thread_id == threading.current_thread().ident
I shall raise it with Andrew Godwin to discuss. Thanks!
Any other “you’re holding it wrong” pointers very welcome!
Thanks again.
Technically, you can create a loop in one thread and run it in another.
asyncio
doesn’t forbid it strictly but discourages.
I don’t know how many code will be broken we raise this restriction (but pretty sure some people did weird things already).
Maybe we can start with raising a warning though.
The reality is: passing a loop instance between threads brings you to the land where dragons live. It is dangerous but you might survive.
My concern with using loop._thread_id
for Django’s async_unsafe()
(which I presume is or will potentially be used in a large number of places) is that it will be relying on an internal attribute, meaning it could be changed or broken at any point in time without any guarantees. Also, it’s highly specific to BaseEventLoop
, if a user or another library substitutes their own event loop that lacks a loop._thread_id
, it won’t work.
If there is a legitimate use case for accessing loop._thread_id
, which it seems like there might be based on this topic, we could consider implementing a public @property to access it through, such as loop.thread_id
; or a public getter method, such as loop.get_thread_id()
.
Technically, you can create a loop in one thread and run it in another.
asyncio
doesn’t forbid it strictly but discourages.
I don’t know how many code will be broken we raise this restriction (but pretty sure some people did weird things already).
Maybe we can start with raising a warning though.
I’m curious as to how this warning would be best implemented though. IIRC, the only thread that’s actually tracked in the loop is the one that it’s ran in (in loop.run_forever()
), which is when loop._thread_id
is assigned. We’d likely need a separate attribute that’s set in new_event_loop()
that tracks what thread the loop was created in.
Exactly as you proposed.
We can assign self._creator_thread_id
(select the name) to the current thread on the loop creation (in BaseEventLoop.__init__
).
run_forever
can compare the actual thread id with stored on the creation stage and raise a warning if they don’t match.
I think you might want get_running_loop
(might not be the exact spelling). That lets you fetch the loop that the current code is running under (if any),
Do note though that get_running_loop()
only works within a coroutine, and that it raises a RuntimeError
when there isn’t one running. For fetching the event loop outside of a coroutine (or creating new one if there isn’t one in the main thread), you can use get_event_loop()
.
Exactly as you proposed.
We can assign self._creator_thread_id
(select the name) to the current thread on the loop creation (in BaseEventLoop.__init__
).
run_forever
can compare the actual thread id with stored on the creation stage and raise a warning if they don’t match.
Sounds good, I can open a bpo issue later today to discuss it further (if this is something we want to consider implementing).
aeros:
IIRC, the only thread that’s actually tracked in the loop is the one that it’s ran in (in loop.run_forever()
), which is when loop._thread_id
is assigned. We’d likely need a separate attribute that’s set in new_event_loop()
that tracks what thread the loop was created in.
Just to clarify, our need was/is exactly for the thread the loop is running in, regardless of where it was created.
carltongibson:
Just to clarify, our need was/is exactly for the thread the loop is running in , regardless of where it was created.
The above was in reference to @asvetlov’s idea for raising a warning when an event loop is ran in a separate thread from where it was created. From my understanding, this was just to serve as an advisory warning to caution users about creating a loop in one thread and running it in another, since using the same loop across different threads isn’t particularly safe.
For the purposes of determining which thread a loop is running from, I was suggesting a public property or getter for loop._thread_id
(which only tracks what thread the loop is running in).
My question: how is loop.get_thread_id()
widespread?
I think if Daphne creates a loop and controls its lifetime it also can know what loop belongs to what thread.
Adding a new API always means a maintenance cost, even if the cost is low. Also, the loop class can be provided by third-party libraries, adding a method to the loop means that all third-parties should implement it as well.
Don’t get me wrong: the new method is possible but I’d like to see strong motivation why is it necessary. “Nice to have” motivation is not enough for this case, sorry.
asvetlov:
Adding a new API always means a maintenance cost, even if the cost is low. Also, the loop class can be provided by third-party libraries, adding a method to the loop means that all third-parties should implement it as well.
Don’t get me wrong: the new method is possible but I’d like to see strong motivation why is it necessary. “Nice to have” motivation is not enough for this case, sorry.
I agree. That’s primarily why I worded it as “there might be a decent argument for adding a public means of accessing this attribute…”, to prompt the discussion around the practical use cases for a loop.get_thread_id()
rather than making a formal proposal myself. I figured that someone might have a realistic need for it based on the post, but I personally don’t.
Also, regarding the bpo issue for the warning we discussed earlier: I’ve recently been a bit additionally occupied with some job opportunities, so my time has been more limited than usual. I still plan on opening one though in the near future. I would be glad to help implement the warning, assuming we’re all in agreement and there’s no rush to add it.
Hi all. Thanks for the interesting replies. I leave it to your judgement whether new API is warranted.
I will try get_running_loop()
. Given Kyle’s comment, I’m not sure if it’ll solve the need. I need to test it, and it’s 3.7+ but maybe.
I think the task that async_unsafe()
is trying to achieve is a reasonable use-case. (Catch cases were you really should have used a ThreadPoolExecutor
, or a different API inside a coroutine.)
The Daphne case, where this came up, is a distraction: the issue isn’t Daphne specific…
Yes, now we’re creating a new event loop we can make sure we control what thread it’s used on, but…
We were just naively calling get_event_loop()
in the natural place, and found this com up. I suspect other folks will do the same. We’ve seen the same issue pop up with Django 3.0 and Jupyter. If you call get_event_loop()
before threading you’re going to hit this kind of issue… — maybe users should handle it better, but it’s a question of how to know to do that, other than by experience?
Thanks again for the discussion: really very helpful.
Happy New Year!
Looking at async_unsafe
sources I can say that get_running_loop()
is exactly what are you looking for.
It exists starting from Python 3.7, agree – but any new API can be added into Python 3.9 only which is much worse for your case.
Regarding get_event_loop()
functionality: @yselivanov expressed an idea of deprecation the function.
The proposal was too early for Python 3.8 but I support the deprecation for upcoming Pythons and even for 3.9 maybe.