[Python-Dev] PEP 3148 ready for pronouncement
Brian Quinlan
brian at sweetapp.com
Mon May 24 11:36:50 CEST 2010
On May 24, 2010, at 5:16 AM, Glyph Lefkowitz wrote:
>
> On May 23, 2010, at 2:37 AM, Brian Quinlan wrote:
>
>> On May 23, 2010, at 2:44 PM, Glyph Lefkowitz wrote:
>>
>>>
>>> On May 22, 2010, at 8:47 PM, Brian Quinlan wrote:
>>>
>>>> Jesse, the designated pronouncer for this PEP, has decided to
>>>> keep discussion open for a few more days.
>>>>
>>>> So fire away!
>>>
>>> As you wish!
>>
>> I retract my request ;-)
>
> May you get what you wish for, may you find what you are seeking :).
>
>>> The PEP should be consistent in its usage of terminology about
>>> callables. It alternately calls them "callables", "functions",
>>> and "functions or methods". It would be nice to clean this up and
>>> be consistent about what can be called where. I personally like
>>> "callables".
>>
>> Did you find the terminology confusing? If not then I propose not
>> changing it.
>
> Yes, actually. Whenever I see references to the multiprocessing
> module, I picture a giant "HERE BE (serialization) DRAGONS" sign.
> When I saw that some things were documented as being "functions", I
> thought that maybe there was intended to be a restriction like the
> "these can only be top-level functions so they're easy for different
> executors to locate and serialize". I didn't realize that the
> intent was "arbitrary callables" until I carefully re-read the
> document and noticed that the terminology was inconsistent.
ProcessPoolExecutor has the same serialization perils that
multiprocessing does. My original plan was to link to the
multiprocessing docs to explain them, but I couldn't find those perils
documented there.
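To make the peril concrete, here is a minimal sketch (using the
concurrent.futures spelling the module eventually took in the stdlib;
the function and names are illustrative, not from the PEP):

```python
from concurrent.futures import ProcessPoolExecutor

def square(x):
    # Must be a module-level function: ProcessPoolExecutor pickles the
    # callable and its arguments to ship them to a worker process, so a
    # lambda or nested function here would fail to serialize.
    return x * x

def run():
    with ProcessPoolExecutor(max_workers=2) as executor:
        # map() returns results in input order.
        return list(executor.map(square, [1, 2, 3]))

if __name__ == '__main__':
    print(run())  # [1, 4, 9]
```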
>> But changing it in the user docs is probably a good idea. I like
>> "callables" too.
>
> Great. Still, users will inevitably find the PEP and use it as
> documentation too.
>
>>> The execution context of callable code is not made clear.
>>> Implicitly, submit() or map() would run the code in threads or
>>> processes as defined by the executor, but that's not spelled out
>>> clearly.
>
> Any response to this bit? Did I miss something in the PEP?
Yes, the execution context is Executor-dependent. The sections under
ProcessPoolExecutor and ThreadPoolExecutor spell this out, I think.
>>> More relevant to my own interests, the execution context of the
>>> callables passed to add_done_callback and remove_done_callback is
>>> left almost completely to the imagination. If I'm reading the
>>> sample implementation correctly, <http://code.google.com/p/pythonfutures/source/browse/branches/feedback/python3/futures/process.py#241>,
>>> it looks like in the multiprocessing implementation, the done
>>> callbacks are invoked in a random local thread. The fact that
>>> they are passed the future itself *sort* of implies that this is
>>> the case, but the multiprocessing module plays fast and loose with
>>> object identity all over the place, so it would be good to be
>>> explicit and say that it's *not* a pickled copy of the future
>>> sitting in some arbitrary process (or even on some arbitrary
>>> machine).
>>
>> The callbacks will always be called in a thread other than the main
>> thread in the process that created the executor. Is that a strong
>> enough contract?
>
> Sure. Really, almost any contract would work, it just needs to be
> spelled out. It might be nice to know whether the thread invoking
> the callbacks is a daemon thread or not, but I suppose it's not
> strictly necessary.
Your concern is that the thread will be killed when the interpreter
exits? It won't be.
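A sketch of that contract (again with the eventual stdlib spelling).
Which thread runs the callback is exactly the detail under discussion,
so the example relies only on two things: the callback receives the
very same Future object that submit() returned, and it has run by the
time the executor shuts down:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

calls = []

def record(future):
    # Receives the identical Future object that submit() returned,
    # not a pickled copy, in whatever thread the executor chooses.
    calls.append((future.result(), threading.current_thread().name))

with ThreadPoolExecutor(max_workers=1) as executor:
    f = executor.submit(pow, 2, 10)
    f.add_done_callback(record)

# Exiting the with-block waits for the workers, so by this point the
# callback has been invoked.
print(calls[0][0])  # 1024
```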
>>> This is really minor, I know, but why does it say "NOTE: This
>>> method can be used to create adapters from Futures to Twisted
>>> Deferreds"? First of all, what's the deal with "NOTE"; it's the
>>> only "NOTE" in the whole PEP, and it doesn't seem to add
>>> anything. This sentence would read exactly the same if that word
>>> were deleted. Without more clarity on the required execution
>>> context of the callbacks, this claim might not actually be true
>>> anyway; Deferred callbacks can only be invoked in the main reactor
>>> thread in Twisted. But even if it is perfectly possible, why
>>> leave so much of the adapter implementation up to the
>>> imagination? If it's important enough to mention, why not have a
>>> reference to such an adapter in the reference Futures
>>> implementation, since it *should* be fairly trivial to write?
>>
>> I'm a bit surprised that this doesn't allow for better
>> interoperability with Deferreds given this discussion:
>
>> <discussion snipped>
>
> I did not communicate that well. As implemented, it's quite
> possible to implement a translation layer which turns a Future into
> a Deferred. What I meant by that comment was that the specification
> in the PEP was too loose to be sure that such a layer would work
> with arbitrary executors.
>
> For what it's worth, the Deferred translator would look like this,
> if you want to include it in the PEP (untested though, you may want
> to run it first):
>
> from twisted.internet.defer import Deferred
> from twisted.internet.reactor import callFromThread
>
> def future2deferred(future):
>     d = Deferred()
>     def invoke_deferred():
>         try:
>             result = future.result()
>         except:
>             d.errback()
>         else:
>             d.callback(result)
>     def done_callback(same_future):
>         callFromThread(invoke_deferred)
>     future.add_done_callback(done_callback)
>     return d
>
> This does beg the question of what the traceback will look like in
> that except: block though. I guess the multi-threaded executor will
> use python3 exception chaining so Deferred should be able to show a
> sane traceback in case of an error, but what about exceptions in
> other processes?
>
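On the traceback question: future.result() re-raises the callable's
exception in whatever thread calls it, so the except: block above sees
the original exception locally. A small sketch of that re-raising
behaviour (stdlib spelling; names are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def boom():
    # Raised in the worker thread, but re-raised wherever result()
    # is called.
    raise ValueError("from the worker")

with ThreadPoolExecutor(max_workers=1) as executor:
    f = executor.submit(boom)

try:
    f.result()  # re-raises the worker's exception in this thread
except ValueError as e:
    caught = str(e)

print(caught)  # from the worker
```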
>>> I suggest having add_done_callback, implementing it with a
>>> list so that callbacks are always invoked in the order that
>>> they're added, and getting rid of remove_done_callback.
>>
>> Sounds good to me!
>
> Great! :-)
>
>>> futures._base.Executor isn't exposed publicly, but it needs to
>>> be. The PEP kinda makes it sound like it is ("Executor is an
>>> abstract class..."). Plus, a third-party library wanting to
>>> implement an executor of its own shouldn't have to copy and paste
>>> the implementation of Executor.map.
>>
>> That was a bug that I've fixed. Thanks!
>
> Double-great!
>
>>> One minor suggestion on the "internal future methods" bit -
>>> something I wish we'd done with Deferreds was to put 'callback()'
>>> and 'addCallbacks()' on separate objects, so that it was very
>>> explicit whether you were on the emitting side of a Deferred or
>>> the consuming side. That seems to be the case with these internal
>>> methods - they are not so much "internal" as they are for the
>>> producer of the Future (whether a unit test or executor) so you
>>> might want to put them on a different object that it's easy for
>>> the thing creating a Future() to get at but hard for any
>>> subsequent application code to fiddle with by accident. Off the
>>> top of my head, I suggest naming it "Invoker()". A good way to do
>>> this would be to have an Invoker class which can't be instantiated
>>> (raises an exception from __init__ or somesuch), then a
>>> Future.create() method which returns an Invoker, which itself has
>>> a '.future' attribute.
>
> No reaction on this part? I think you'll wish you did this in a
> couple of years when you start bumping into application code that
> calls "set_result" :).
My reactions are mixed ;-)
Your proposal is to add a level of indirection to make it harder for
people to call implementation methods. The downside is that it makes
it a bit harder to write tests and Executors. I also can't see a big
problem in letting people call set_result in client code, though it is
documented as being only for Executor implementations and tests.
On the implementation side, I don't see why an Invoker needs a
reference to the future. Each Invoker could own one Future. A
reference to the Invoker is kept by the Executor, and its future is
returned to the client, i.e.:
class Invoker(object):
    def __init__(self):
        """Should only be called by Executor implementations."""
        self.future = Future()

    def set_running_or_notify_cancel(self):
        # Messes with self.future's internals
        ...

    def set_result(self, result):
        # Messes with self.future's internals
        ...

    def set_exception(self, exception):
        # Messes with self.future's internals
        ...
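A runnable toy of that arrangement, borrowing the stdlib Future for
the internals (the Invoker class itself is the proposal under
discussion, not an existing API):

```python
from concurrent.futures import Future

class Invoker(object):
    """Producer-side handle; only executors and tests should make one."""

    def __init__(self):
        self.future = Future()

    def set_running_or_notify_cancel(self):
        return self.future.set_running_or_notify_cancel()

    def set_result(self, result):
        self.future.set_result(result)

    def set_exception(self, exception):
        self.future.set_exception(exception)

# The executor keeps the Invoker; only invoker.future reaches the
# client, so application code never sees the mutating methods.
invoker = Invoker()
f = invoker.future
invoker.set_running_or_notify_cancel()
invoker.set_result(42)
print(f.result())  # 42
```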
Cheers,
Brian
>>> Finally, why isn't this just a module on PyPI? It doesn't seem
>>> like there's any particular benefit to making this a stdlib module
>>> and going through the whole PEP process - except maybe to prompt
>>> feedback like this :).
>>
>> We've already had this discussion before. Could you explain why
>> this module should *not* be in the stdlib e.g. does it have
>> significantly less utility than other modules in stdlib? Is it
>> significantly higher risk? etc?
>
> You've convinced me, mainly because I noticed later on in the
> discussion that it *has* been released to pypi for several months,
> and does have a bunch of downloads. It doesn't have quite the
> popularity I'd personally like to see for stdlib modules, but it's
> not like you didn't try, and you do (sort of) have a point about
> small modules being hard to get adoption. I'm sorry that this, my
> least interesting point in my opinion, is what has seen the most
> discussion so far.
>
> I'd appreciate it if you could do a release to pypi with the
> bugfixes you mentioned here, to make sure that the released version
> is consistent with what eventually gets into Python.
>
> Oh, and one final nitpick: <http://www.rfc-editor.org/rfc/rfc2606.txt>
> says you really should not put real domain names into
> your "web crawl example", especially not "some-made-up-domain.com".
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/brian%40sweetapp.com