advice : how do you iterate with an acc ?

Bengt Richter bokr at oz.net
Sat Dec 3 18:46:55 EST 2005


On 3 Dec 2005 03:28:19 -0800, bonono at gmail.com wrote:

>
>Bengt Richter wrote:
>> On 2 Dec 2005 18:34:12 -0800, bonono at gmail.com wrote:
>>
>> >
>> >Bengt Richter wrote:
>> >> It looks to me like itertools.groupby could get you close to what you want,
>> >> e.g., (untested)
>> >Ah, groupby. The generic string.split() equivalent. But the doc said
>> >the input needs to be sorted.
>> >
>>
>>  >>> seq = [3,1,4,'t',0,3,4,2,'t',3,1,4]
>>  >>> import itertools
>>  >>> def condition(item): return item=='t'
>>  ...
>>  >>> def dosomething(it): return 'doing something with %r'%list(it)
>>  ...
>>  >>> for condresult, acciter in itertools.groupby(seq, condition):
>>  ...     if not condresult:
>>  ...         dosomething(acciter)
>>  ...
>>  'doing something with [3, 1, 4]'
>>  'doing something with [0, 3, 4, 2]'
>>  'doing something with [3, 1, 4]'
>>
>> I think the input only needs to be sorted if you are trying to collect all items with the same key into one group.
>> I.e., they only come out together if they form a contiguous run in the input, which
>> is guaranteed only if the input is sorted on that key. But AFAIK the grouping logic just scans, applies the key function,
>> and returns iterators for the subsequences that yield the same key function result, along with that result.
>> So it's a general subsequence extractor. You just have to supply the key function to make the condition value
>> change when a group ends and a new one begins. And the value can be arbitrary, or just toggle between two values, e.g.
>>
>>  >>> for condresult, acciter in itertools.groupby(range(20), lambda x:x%3==0 or x==5):
>>  ...     print '%6s: %r'%(condresult, list(acciter))
>>  ...
>>    True: [0]
>>   False: [1, 2]
>>    True: [3]
>>   False: [4]
>>    True: [5, 6]
>>   False: [7, 8]
>>    True: [9]
>>   False: [10, 11]
>>    True: [12]
>>   False: [13, 14]
>>    True: [15]
>>   False: [16, 17]
>>    True: [18]
>>   False: [19]
>>
>> or a condresult that stays the same in groups, but every group result is different:
>>
>>  >>> for condresult, acciter in itertools.groupby(range(20), lambda x:x//3):
>>  ...     print '%6s: %r'%(condresult, list(acciter))
>>  ...
>>       0: [0, 1, 2]
>>       1: [3, 4, 5]
>>       2: [6, 7, 8]
>>       3: [9, 10, 11]
>>       4: [12, 13, 14]
>>       5: [15, 16, 17]
>>       6: [18, 19]
>>
>Thanks. So it basically has an internal state storing the last
>"condition" result and if it flips(different), a new group starts.
>
So it appears. But note that "flips(different)" seems to be based on ==,
and the default key function is just a passthrough like lambda x:x, so e.g. integers
and floats will group together if their values are equal.
E.g., to elucidate further,

Default key function:
 >>> from itertools import groupby
 >>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j]):
 ...     print k, list(g)
 ...
 0 [0, 0.0, 0j]
 [] [[]]
 () [()]
 None [None]
 1 [1, 1.0]
 1j [1j]

Group by bool value:
 >>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j], key=bool):
 ...     print k, list(g)
 ...
 False [0, 0.0, 0j, [], (), None]
 True [1, 1.0, 1j]

It's not trying to sort, so it doesn't trip on complex numbers:
 >>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j]):
 ...     print k, list(g)
 ...
 0 [0, 0.0, 0j]
 [] [[]]
 () [()]
 None [None]
 1 [1, 1.0]
 1j [1j]
 2j [2j]
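
To make the "last key, compare with ==" behaviour concrete, here is a minimal sketch of run-based grouping in plain Python. It is only a model of what groupby appears to do, not its actual implementation. The name run_groups is made up for illustration, and the code is written for Python 3 (hence print as a function).

```python
from itertools import groupby

def run_groups(iterable, key=None):
    """Yield (key, items) for each run of consecutive items whose keys compare equal."""
    if key is None:
        key = lambda x: x          # default: the item is its own key
    missing = object()             # marker meaning "no run started yet"
    run_key, run = missing, []
    for item in iterable:
        k = key(item)
        if run_key is missing:
            run_key = k            # first item starts the first run
        elif not (k == run_key):   # key changed: close the current run
            yield run_key, run
            run_key, run = k, []
        run.append(item)
    if run_key is not missing:
        yield run_key, run         # flush the last run

data = [0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j]
mine = list(run_groups(data))
theirs = [(k, list(g)) for k, g in groupby(data)]
print(mine == theirs)
```

On this input the sketch and groupby should agree. Nothing is ever ordered here, only compared for equality with the key of the current run, which is why complex numbers are no problem.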

But you have to watch out if you try to pre-sort stuff that includes complex numbers:
 >>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j])):
 ...     print k, list(g)
 ...
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 TypeError: cannot compare complex numbers using <, <=, >, >=

And if you do sort using a key function, it doesn't mean groupby inherits that key function for grouping
unless you specify it:

 >>> def keyfun(x):
 ...     if isinstance(x, (int, long, float)): return x
 ...     else: return type(x).__name__
 ...
 >>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j], key=keyfun)):
 ...     print k, list(g)
 ...
 0 [0, 0.0]
 1 [1, 1.0]
 None [None]
 0j [0j]
 1j [1j]
 2j [2j]
 [] [[]]
 () [()]

Vs giving groupby the same keyfun
 >>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j], key=keyfun), keyfun):
 ...     print k, list(g)
 ...
 0 [0, 0.0]
 1 [1, 1.0]
 NoneType [None]
 complex [0j, 1j, 2j]
 list [[]]
 tuple [()]
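
One way to avoid forgetting the second key is to wrap both steps in a small helper. The sketch below is written for Python 3, where the keyfun above would not work as is (there is no long, and numbers no longer order against strings), so it uses a simpler, hypothetical key that returns only the type name. The names sort_and_group and type_name are made up for illustration.

```python
from itertools import groupby

def type_name(x):
    # Hypothetical key for illustration: group purely by the name of the type.
    return type(x).__name__

def sort_and_group(items, key):
    """Sort by key, then group by the same key, so equal keys land in one group."""
    ordered = sorted(items, key=key)   # stable sort: ties keep their input order
    return [(k, list(g)) for k, g in groupby(ordered, key=key)]

data = [0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j]
groups = sort_and_group(data, type_name)
for k, members in groups:
    print(k, members)
```

Because the same key drives both the sort and the grouping, each type name shows up exactly once, in sorted order of the names.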


Example of unsorted vs sorted subgroup extraction:

 >>> for k,g in groupby('this that other thing note order'.split(), key=lambda s:s[0]):
 ...     print k, list(g)
 ...
 t ['this', 'that']
 o ['other']
 t ['thing']
 n ['note']
 o ['order']

vs.

 >>> for k,g in groupby(sorted('this that other thing note order'.split()), key=lambda s:s[0]):
 ...     print k, list(g)
 ...
 n ['note']
 o ['order', 'other']
 t ['that', 'thing', 'this']

Oops, that key would be less brittle as (untested) key=lambda s:s[:1], e.g., in case a split with args was used,
since that can produce empty strings and s[0] raises IndexError on those.
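
A quick sketch of the difference, assuming an input with a doubled space and an explicit split(' '), which yields an empty string. Written for Python 3; the variable names are just for illustration.

```python
from itertools import groupby

# Splitting on a single space keeps the empty string between the two spaces.
words = 'this  that other'.split(' ')

# s[:1] is '' for an empty string, so the key function never raises.
safe = [(k, list(g)) for k, g in groupby(words, key=lambda s: s[:1])]
print(safe)

# s[0] raises IndexError as soon as the empty string is reached.
try:
    [(k, list(g)) for k, g in groupby(words, key=lambda s: s[0])]
    failed = False
except IndexError:
    failed = True
print(failed)
```

With the slice, the empty string simply becomes its own group under the key ''.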

Regards,
Bengt Richter


