Garbage collection problem with generators
Haochuan Guo
guohaochuan at gmail.com
Fri Dec 23 07:39:17 EST 2016
Hi, everyone
I'm building an HTTP long-polling client for our company's discovery service,
and something weird happened in the following code:
```python
while True:
    try:
        r = requests.get("url", stream=True, timeout=3)
        for data in r.iter_lines():
            processing_data...
    except TimeoutException:
        time.sleep(10)
```
When I deliberately time out the request and then check the connections
with `lsof -p process`, I see *two active connections* (ESTABLISHED)
instead of one. After digging around, it turns out this might not be a
problem with `requests` at all, but with garbage collection of generators.
So I wrote this script to demonstrate the problem:
https://gist.github.com/wooparadog/766f8007d4ef1227f283f1b040f102ef
Function `A.a` returns a generator that raises an exception. In function
`b`, I build a new instance of `A` and iterate over the exception-raising
generator. In the exception handler, I close the generator, delete it,
delete the `A` instance, call `gc.collect()`, and then do the whole
process all over again.
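For anyone who can't open the gist, here is a rough sketch of the structure described above. Only the names `A.a` and `b` come from the description; the `RuntimeError`, the `rounds` parameter, and the bounded loop are assumptions for illustration, and the actual gist may differ (it also runs under `gevent`, which is left out here).

```python
import gc


class A(object):
    def a(self):
        # Generator that fails on its first step (exception type is assumed).
        raise RuntimeError("boom")
        yield  # unreachable; its presence makes this function a generator


def b(rounds=3):
    # Bounded loop for illustration; the description repeats the process.
    for _ in range(rounds):
        obj = A()
        gen = obj.a()
        try:
            for _item in gen:
                pass
        except RuntimeError:
            # Close and drop every reference we hold, then force a collection.
            gen.close()
            del gen
            del obj
            gc.collect()


b()
```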
There's another greenlet checking the `A` instances by using
`gc.get_objects()`. It turns out there are always two `A` instances.
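The check itself can be sketched roughly as follows. The helper name `count_instances` is hypothetical, and a plain function stands in for the greenlet; it only counts objects the collector currently tracks.

```python
import gc


class A(object):
    pass


def count_instances(cls):
    # Walk every object tracked by the collector and count exact instances.
    return sum(1 for o in gc.get_objects() if type(o) is cls)


kept = [A(), A()]
print(count_instances(A))  # counted while both instances are referenced

del kept
gc.collect()
print(count_instances(A))  # counted again after dropping the references
```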
This is reproducible with Python 2.7, but not with Python 3.5. I've also
tried `thread` instead of `gevent`, and it still happens. I'm guessing it's
related to garbage collection of generators.
Did I bump into a Python 2 bug? Or am I simply wrong about how to close
generators?
Thanks