Sequence splitting
Pablo Torres N.
tn.pablo at gmail.com
Fri Jul 3 00:40:13 EDT 2009
On Thu, Jul 2, 2009 at 23:34, Brad<schickb at gmail.com> wrote:
> On Jul 2, 9:08 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
>> Brad <schi... at gmail.com> writes:
>> > On Jul 2, 8:14 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
>> > > schickb <schi... at gmail.com> writes:
>> > > > def split(seq, func=None):
>> > > > if func is None:
>> > > > func = bool
>> > > > t, f = [], []
>> > > > for item in seq:
>> > > > if func(item):
>> > > > t.append(item)
>> > > > else:
>> > > > f.append(item)
>> > > > return (t, f)
>>
>> > > untested:
>>
>> > > def split(seq, func=bool):
>> > > xs = zip(seq, itertools.imap(func, seq))
>> > > t = list(x for (x,y) in xs if y)
>> > > f = list(x for (x,y) in xs if not y)
>> > > return (t, f)
>>
>> > In my testing that is 3.5x slower than the original solution (and less
>> > clear imo). I fixed my version to take a bool default. Either way, I'm
>> > not really looking for additional ways to do this in Python unless
>> > I've totally missed something. What I am considering is writing it in
>> > C, much like filter.
>>
>> I'm a little skeptical that the C version will help much, if it's
>> evaluating a python function at every list element.
>
> Perhaps true, but it would be a nice convenience (for me) as a built-
> in written in either Python or C. Although the default case of a bool
> function would surely be faster.
>
>> Here's a variant of your version:
>>
>> def split(seq, func=bool):
>> t, f = [], []
>> ta, fa = t.append, f.append
>> for item in seq:
>> (ta if func(item) else fa)(item)
>> return (t, f)
>>
>> This avoids some dict lookups and copying. I wonder if that helps
>> significantly.
>
> Faster, but in tests of a few short sequences only 1% so.
>
> -Brad
> --
> http://mail.python.org/mailman/listinfo/python-list
>
If it is speed that we are after, it's my understanding that map and
filter are faster than iterating with the for statement (and also
faster than list comprehensions). So here is a rewrite:
def split(seq, func=bool):
t = filter(func, seq)
f = filter(lambda x: not func(x), seq)
return list(t), list(f)
The lambda thing is kinda ugly, but I can't think of anything else.
Also, is it ok to return lists? Py3k saw a lot of APIs changed to
return iterables instead of lists, so maybe my function should have
'return t, f' as it's last statement.
--
Pablo Torres N.
More information about the Python-list
mailing list