[Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)

Nathaniel Smith njs at pobox.com
Wed Oct 12 02:31:04 EDT 2016


On Tue, Oct 11, 2016 at 10:53 PM, Serhiy Storchaka <storchaka at gmail.com> wrote:
> On 12.10.16 08:03, Nathaniel Smith wrote:
>>
>> On Tue, Oct 11, 2016 at 9:08 PM, INADA Naoki <songofacandy at gmail.com>
>> wrote:
>>>
>>> From Python 3.4, bytearray is good solution for I/O buffer, thanks to
>>> #19087 [1].
>>> Actually, asyncio uses bytearray as I/O buffer often.
>>
>>
>> Whoa what?! This is awesome, I had no idea that bytearray had O(1)
>> deletes at the front. I literally reimplemented this myself on type of
>> bytearray for some 3.5-only code recently because I assumed bytearray
>> had the same asymptotics as list, and AFAICT this is totally
>> undocumented. Shouldn't we write this down somewhere? Maybe here? ->
>> https://docs.python.org/3/library/functions.html#bytearray
>
>
> I afraid this is CPython implementation detail (like string concatenation
> optimization). Other implementations can have O(N) deletes at the front of
> bytearray.

Well, it shouldn't be :-).

The problem with the string concatenation optimization is that to
work, it requires both the use of refcounting GC and that you get
lucky with how the underlying malloc implementation happened to lay
things out in memory. Obviously it shouldn't be part of the language
spec.

But amortized O(1) deletes from the front of bytearray are totally
different, and more like amortized O(1) appends to list: there are
important use cases[1] that simply cannot be implemented without some
feature like this, and putting the implementation inside bytearray is
straightforward, deterministic, and more efficiently than hacking
together something on top. Python should just guarantee it, IMO.

-n

[1] My use case is parsing HTTP out of a receive buffer. If deleting
the first k bytes of an N byte buffer is O(N), then not only does
parsing becomes O(N^2) in the worst case, but it's the sort of O(N^2)
that random untrusted network clients can trigger at will to DoS your
server.

-- 
Nathaniel J. Smith -- https://vorpus.org


More information about the Python-Dev mailing list