[Python-ideas] Integrate some itertools into the Python syntax

Ethan Furman ethan at stoneleaf.us
Tue Mar 22 15:41:46 EDT 2016


On 03/22/2016 12:39 PM, Ethan Furman wrote:
> On 03/22/2016 12:26 PM, Michel Desmoulin wrote:
>> Le 22/03/2016 20:23, Ethan Furman a écrit :
>>> On 03/22/2016 12:10 PM, Koos Zevenhoven wrote:
>>>> On Tue, Mar 22, 2016 at 8:41 PM, Ethan Furman wrote:
>>>>> On 03/22/2016 10:51 AM, Michel Desmoulin wrote:
>>>>>
>>>>>> def foo(p):
>>>>>>     with open(p) as f:
>>>>>>         def begin:
>>>>>>             return x == "BEGIN SECTION"
>>>>>>         def end:
>>>>>>             return x == "STOP"
>>>>>>         return f[begin, end][:10000]
>>>>>>
>>>>>> It's very clean, very convenient, very natural, and memory efficient.
>>>>>
>>>>>
>>>>> Except the 10,000 limit doesn't happen until /after/ the end block is
>>>>> reached -- which could be a million lines later.
>>>>
>>>>
>>>> If f[begin, end] is a generator, the 10,000 limit may happen before the
>>>> end block is reached, which I think was the point.
>>>
>>> That is wrong, which was my point: the `[:10000]` doesn't take effect
>>> until after `f[begin, end]` (whatever it is) is evaluated.
>>
>> [begin, end] and [:10000] are applied one next() at a time.
>>
>> begin, then end, then :10000 for the first next(),
>>
>> then again in that order for the following next() call, etc.
>>
>> That's the whole point.
>
> That may be the point, but that is not what the above code does.  Since
> you don't believe me, let's break it down:
>
> f[begin, end] -> grabs a section of the file at p.  This could be 5 lines or 50,000,000
>
> [:10000] -> takes the first 10,000 lines of the previous result
>
> return -> sends those (up to) 10,000 lines back

In case my point still isn't clear: if you store 49,990,000 lines just to 
throw them away, you are not being memory efficient.
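
For readers following along in the archive, here is a rough sketch, in 
today's Python, of the two readings being argued about above.  The sentinel 
strings ("BEGIN SECTION", "STOP"), the 10,000-line cap, and the helper names 
are assumptions carried over from Michel's example; this only illustrates 
eager versus lazy behaviour, not the proposed f[begin, end] syntax.

    from itertools import dropwhile, islice, takewhile

    def section_eager(path):
        # Ethan's reading: the whole [begin, end] section is built first
        # (possibly 50,000,000 lines), and only then does [:10000] slice it.
        with open(path) as f:
            lines = (line.rstrip("\n") for line in f)
            for line in lines:
                if line == "BEGIN SECTION":
                    break
            section = []
            for line in lines:
                if line == "STOP":
                    break
                section.append(line)   # the entire section is held in memory...
        return section[:10000]         # ...and 49,990,000 lines are thrown away

    def section_lazy(path):
        # The itertools reading: each stage is pulled one next() at a time,
        # so reading stops as soon as 10,000 lines of the section are seen.
        with open(path) as f:
            lines = (line.rstrip("\n") for line in f)
            body = dropwhile(lambda line: line != "BEGIN SECTION", lines)
            next(body, None)           # skip the "BEGIN SECTION" line itself
            body = takewhile(lambda line: line != "STOP", body)
            return list(islice(body, 10000))

The lazy version never stores more than the 10,000 lines it returns, which is 
the behaviour Michel describes; the eager version is the one Ethan is 
objecting to.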

--
~Ethan~

