[Python-Dev] Accepting PEP 3154 for 3.4?
solipsis at pitrou.net
Tue Nov 19 00:10:14 CET 2013
Ok, how about merging the two sub-threads :-)
On Mon, 18 Nov 2013 16:44:59 -0600
Tim Peters <tim.peters at gmail.com> wrote:
> > You can't know how much space the pickle will take until the pickling
> > ends, though, which makes it difficult to decide whether you want to
> > emit a PREFETCH opcode or not.
> Ah, of course. Presumably the outgoing pickle stream is first stored
> in some memory buffer, right? If pickling completes before the buffer
> is first flushed, then you know exactly how large the entire pickle
> is. If "it's small" (say, < 100 bytes), don't write out the PREFETCH
> part. Else do.
That's true. We could also have a SMALLPREFETCH opcode with a one-byte
length to still get the benefits of prefetching.
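A minimal sketch of the buffer-then-decide idea discussed above, assuming the whole pickle is held in memory before anything is written (a real pickler would only buffer up to its first flush). The opcode byte values, the `frame` and `dump_framed` helpers, and the 100-byte threshold are all hypothetical and chosen for illustration; they are not taken from the PEP or the current patch.

```python
import io
import pickle
import struct

# Hypothetical opcode bytes, picked only for this sketch; they are not
# real pickle opcodes.
PREFETCH = b"\xfe"        # followed by an 8-byte little-endian length
SMALLPREFETCH = b"\xfd"   # followed by a 1-byte length
SMALL_LIMIT = 100         # the "it's small" threshold from the example above


def frame(payload, use_small_opcode=False):
    """Prefix payload with a length header, or leave it bare if tiny."""
    n = len(payload)
    if use_small_opcode and n < 256:
        # One-byte length: small pickles still get a prefetch hint.
        return SMALLPREFETCH + struct.pack("<B", n) + payload
    if n < SMALL_LIMIT:
        # Small pickle and no short opcode: skip the header entirely.
        return payload
    return PREFETCH + struct.pack("<Q", n) + payload


def dump_framed(obj, out, use_small_opcode=False):
    """Pickle obj into a memory buffer, then write it with or without a header."""
    buf = io.BytesIO()
    pickle.dump(obj, buf, protocol=3)
    out.write(frame(buf.getvalue(), use_small_opcode))
```

With these assumptions a tiny pickle costs either zero or two extra bytes, instead of the nine that an unconditional opcode plus 8-byte length would add.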
> > Well, yes: much better memory usage for large pickles.
> > Some people use pickles to store huge data, which was the motivation to
> > add the 8-byte-size opcodes after all.
> We'd have the same advantage _if_ it were feasible to know the entire
> size up front. I understand now that it's not feasible.
AFAICT, it would only be possible by doing two-pass pickling, which
would also slow it down massively.
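For illustration, two-pass pickling could look roughly like the sketch below: the first pass pickles into a sink that only counts bytes, the second pass writes the real stream after an 8-byte length. `ByteCounter` and `dump_two_pass` are hypothetical names, and the sketch assumes the object pickles to the same bytes both times (nothing mutates it in between). Every object gets serialized twice, which is where the slowdown comes from.

```python
import io
import pickle
import struct


class ByteCounter:
    """Write-only sink that discards data and records how much it saw."""

    def __init__(self):
        self.count = 0

    def write(self, data):
        self.count += len(data)
        return len(data)


def dump_two_pass(obj, out, protocol=3):
    """Write an 8-byte total length, then the pickle itself."""
    counter = ByteCounter()
    pickle.dump(obj, counter, protocol=protocol)   # pass 1: measure only
    out.write(struct.pack("<Q", counter.count))    # size known up front
    pickle.dump(obj, out, protocol=protocol)       # pass 2: real output
```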
> A long-running process can legitimately put billions of items on work
> queues, far more than could ever fit in RAM simultaneously. Comparing
> this to PyObject overhead makes no sense to me. Neither does the line
> of argument "there are several kinds of overheads, so making this
> overhead worse too doesn't matter".
Well, it's a question of cost / benefit: does it make sense to optimize
something that will be dwarfed by other factors in real-world use?
> When possible, we should strive not to add overheads that don't repay
> their costs. For small pickles, an 8-byte size field doesn't appear
> to buy anything. But I appreciate that it costs implementation effort
> to avoid producing it in these cases.
I share the concern, although I still don't think the "ocean of tiny
pickles" is a reasonable use case :-)
That said, assuming you think this is important (do you?), we're left
with the following constraints:
- it would be nice to have this PEP in 3.4
- 3.4 beta1 and feature freeze are in approximately one week
- switching to the PREFETCH scheme requires some non-trivial work on the
  current patch, work done by either Alexandre or me (but I already
  have pathlib (PEP 428) on my plate, so it'll have to be Alexandre) -
  unless you want to do it, of course?
What do you think?