[Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)

Wed Oct 12 04:01:43 EDT 2016

On Wed, Oct 12, 2016 at 12:17 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:
> On 12.10.16 09:31, Nathaniel Smith wrote:
>>
>> But amortized O(1) deletes from the front of bytearray are totally
>> different, and more like amortized O(1) appends to list: there are
>> important use cases[1] that simply cannot be implemented without some
>> feature like this, and putting the implementation inside bytearray is
>> straightforward, deterministic, and more efficiently than hacking
>> together something on top. Python should just guarantee it, IMO.
>>
>> -n
>>
>> [1] My use case is parsing HTTP out of a receive buffer. If deleting
>> the first k bytes of an N byte buffer is O(N), then not only does
>> parsing becomes O(N^2) in the worst case, but it's the sort of O(N^2)
>> that random untrusted network clients can trigger at will to DoS your
>> server.
>
>
> Deleting from buffer can be avoided if pass the starting index together with
> the buffer. For example:
>
>     def read_line(buf: bytes, start: int) -> (bytes, int):
>         try:
>             end = buf.index(b'\r\n', start)
>         except ValueError:
>             return b'', start
>
>         return buf[start:end], end+2

It's more complicated than that -- the right algorithm is the one that
Antoine implemented in 3.4. But yes, having implemented this by hand,
I am aware that it can be implemented by hand :-). My point is that
forcing everyone who writes network code in Python to do that is
silly, especially given that CPython's apparently been shipping this
feature for years.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org