[Python-Dev] Accepting PEP 3154 for 3.4?
Tim Peters
tim.peters at gmail.com
Mon Nov 18 23:18:21 CET 2013
[Tim]
>> But I wonder why it isn't done with a new framing opcode instead (say,
>> FRAME followed by 8-byte count). I suppose that would be like the
>> "prefetch" idea, except that framing opcodes would be mandatory
>> (instead of optional) in proto 4. Why I initially like that:
>>
>> - Uniform decoding loop ("the next thing" _always_ starts with an opcode).
> But it's not actually uniform. A frame isn't a normal opcode, it's a
> large section of bytes that contains potentially many opcodes.
>
> The framing layer is really below the opcode layer, so it also makes sense
> to implement it like that.
That makes sense to me.
> (I also tried to implement Serhiy's PREFETCH idea, but it didn't bring
> any actual simplification)
But it has a different kind of advantage: PREFETCH was optional. As
Guido said, it's annoying to bloat the size of small pickles (which
may, although individually small, occur in great numbers) by 8 bytes
each. There's really no point to framing small chunks of data, right?
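To put a rough number on that bloat, here is a minimal sketch that compares an 8-byte count with the size of a few pickles. The helper name `framing_overhead` and the sample objects are made up for illustration; protocol 3 is used because it has no framing of its own, and the one extra byte an opcode would cost is ignored.

```python
import pickle

FRAME_COUNT_BYTES = 8  # size of the count discussed in the thread


def framing_overhead(obj, protocol=3):
    """Return (pickle size, relative cost of adding an 8-byte count)."""
    size = len(pickle.dumps(obj, protocol=protocol))
    return size, FRAME_COUNT_BYTES / size


# Illustrative samples only: a tiny int, a short string, a larger list.
for sample in (1, "spam", list(range(1000))):
    size, overhead = framing_overhead(sample)
    print(f"{type(sample).__name__}: {size} bytes, +{overhead:.0%} for the count")
```

For very small objects the count can be larger than the pickle itself, while for large ones it is lost in the noise.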
Which leads to another idea: after the PROTO opcode, there is, or is
not, an optional PREFETCH opcode with an 8-byte argument. If the
PREFETCH opcode exists, then it gives the number of bytes up to and
including the pickle's STOP opcode. So there's exactly 0 or 1
PREFETCH opcodes per pickle.
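A minimal sketch of what that layout might look like, under several assumptions: PREFETCH is not a real pickle opcode, so the byte value `0xfe` is picked here only because no existing opcode uses it; the count is taken to be little-endian and to start right after the 8-byte argument; and the input is assumed to hold exactly one pickle. The helper names `add_prefetch` and `strip_prefetch` are hypothetical.

```python
import pickle
import struct

PROTO = b"\x80"     # real opcode: starts every protocol 2+ pickle
STOP = b"."         # real opcode: ends every pickle
PREFETCH = b"\xfe"  # HYPOTHETICAL opcode byte, chosen only for illustration


def add_prefetch(data):
    """Insert PREFETCH + 8-byte count right after PROTO <version>."""
    if data[:1] != PROTO:
        raise ValueError("not a protocol 2+ pickle")
    header, rest = data[:2], data[2:]
    # The count covers everything after the argument, including STOP.
    return header + PREFETCH + struct.pack("<Q", len(rest)) + rest


def strip_prefetch(data):
    """Return a plain pickle, verifying the count when PREFETCH is present."""
    if data[:1] != PROTO:
        raise ValueError("not a protocol 2+ pickle")
    header, rest = data[:2], data[2:]
    if rest[:1] != PREFETCH:
        return data  # PREFETCH is optional: 0 or 1 per pickle
    if len(rest) < 9:
        raise ValueError("truncated PREFETCH argument")
    (count,) = struct.unpack("<Q", rest[1:9])
    body = rest[9:]
    if len(body) != count or body[-1:] != STOP:
        raise ValueError("byte count does not land on STOP")
    return header + body


framed = add_prefetch(pickle.dumps({"a": 1}, protocol=3))
print(pickle.loads(strip_prefetch(framed)))
```

A reader that sees the count can fetch the whole pickle in one read, and a pickle without PREFETCH passes through untouched.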
Is there an advantage to spraying multiple 8-byte "frame counts"
throughout a pickle stream? 8 bytes is surely enough to specify the
size of any single pickle for half a generation ;-) to come.
>> When slinging 8-byte counts, _some_ sanity-checking seems like a good idea ;-)
> I don't know. It's not much worse (for denial of service opportunities)
> than a 4-byte count, which already exists in earlier protocols.
I'm not thinking of DOS at all, just general sanity as data objects
get larger & larger. Pickles have almost no internal checks now. But
I've seen my share of corrupted pickles! About the only thing that
catches them early is hitting a byte that isn't a legitimate pickle
opcode. That _used_ to be a much stronger check than it is now,
because the 8-bit opcode space was sparsely populated at first. But,
over time, more and more opcodes get added, so the chance of mistaking
a garbage byte for a legit opcode has increased correspondingly.
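The growth of the opcode space can be estimated from the opcode table in the standard `pickletools` module. This is only a rough sketch: it treats a garbage byte as uniformly random, and the helper name `opcode_density` is made up here.

```python
import pickletools


def opcode_density(max_proto):
    """Fraction of the 256 byte values used as opcodes in protocols <= max_proto."""
    codes = {op.code for op in pickletools.opcodes if op.proto <= max_proto}
    return len(codes) / 256


# pickletools.opcodes records the protocol that introduced each opcode.
highest = max(op.proto for op in pickletools.opcodes)
for proto in range(highest + 1):
    print(f"protocol <= {proto}: {opcode_density(proto):.1%} of byte values are opcodes")
```

The printed fractions depend on the Python version running the script, since later releases define more protocols.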
A PREFETCH opcode with a "bytes until STOP" makes for a decent bad ;-)
sanity check too ;-)