sort by last then by first

Padraig at Linux.ie
Tue Jan 28 12:38:15 EST 2003


Andrew Dalke wrote:
> Padraig at Linux.ie replied to my post:
>  > I was suggesting this as a way to auto convert data that
>  > was already in string format. For example, if importing tabular
>  > data, using split, then you get all string fields. Actually
>  > it would be good if you could pass to split what format the
>  > fields are, something like:
>  >
>  > line.split('\t',4,(str,int,float,str))
> 
>  >>> line = "Spam\t-1\t9.8\tEggs"
>  >>> [f(s) for f, s in zip( (str, int, float, str), line.split('\t') )]
> ['Spam', -1, 9.8000000000000007, 'Eggs']

Nice, but that still has more overhead than what I suggested.
Still, there's always a tradeoff between speed and
usability/generality, and I think you're correct here.
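
For what it's worth, the split-with-types idea can be wrapped in a
small helper without touching str.split itself. A minimal sketch,
assuming a hypothetical helper name split_typed (not a real string
method) and exactly one converter per field:

```python
def split_typed(line, converters, sep="\t"):
    """Split line on sep and convert each field with the matching callable.

    split_typed is a hypothetical helper, not part of str or the stdlib.
    """
    fields = line.split(sep)
    if len(fields) != len(converters):
        raise ValueError(
            "expected %d fields, got %d" % (len(converters), len(fields))
        )
    # Pair each converter with its field, as in the zip() one-liner above.
    return [convert(field) for convert, field in zip(converters, fields)]


row = split_typed("Spam\t-1\t9.8\tEggs", (str, int, float, str))
print(row)  # ['Spam', -1, 9.8, 'Eggs']
```

The length check is the only thing this adds over the list
comprehension; zip() alone would silently drop extra fields.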

>  > > Why not just write this as Python code?
>  >
>  > speed
> 
> which is why I suggested
> ] f = attr("name1").reverse() + attr("name2").as(int) + \
> ]        col(4).as(float).reverse()
>    ....
> ] and for optimization you can also implement
> ]
> ] f.sort(x)
> ]
> ] which would do the fast decorate, sort, undecorate.
> 
> This should be exactly as fast as anything you can do by
> special encoding of the sort parameters in the sort() call.
> Maybe even faster, because the way you want it implies
> there may still be O(n log(n)) function callbacks into
> Python, which is not necessarily the case with
> decorate/sort/undecorate.
> 
> And potentially nicer because I can extend the system as
> I wish, in Python.
> 
> So again, the answer to your question/statement:
>  > wouldn't it be cool if [].sort() took an optional parameter that
>  > was essentially the --key option in gnu sort. in this e.g:
> 
> is no.

Agreed :-)
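
For reference, a minimal sketch of decorate/sort/undecorate applied
to the subject line (sort by last name, then by first). The name
dsu_sort, the key functions, and the sample records are made up for
illustration; this is not Andrew's attr()/col() API:

```python
def dsu_sort(records, key_funcs):
    """Return records sorted by the given key functions, in priority order.

    dsu_sort is a hypothetical name for this sketch.
    """
    # Decorate: compute each key once per record (n calls per key function,
    # rather than a Python callback for every comparison). The index breaks
    # ties, so the records themselves are never compared and ties keep
    # their original order.
    decorated = [
        (tuple(f(rec) for f in key_funcs), index, rec)
        for index, rec in enumerate(records)
    ]
    # Sort: plain tuple comparison, no callbacks into Python code.
    decorated.sort()
    # Undecorate: strip the keys and index again.
    return [rec for _keys, _index, rec in decorated]


# Sample data: (first, last) tuples.
people = [("John", "Smith"), ("Alice", "Jones"), ("Adam", "Smith")]
by_last_then_first = dsu_sort(people, (lambda p: p[1], lambda p: p[0]))
print(by_last_then_first)
# [('Alice', 'Jones'), ('Adam', 'Smith'), ('John', 'Smith')]
```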

Pádraig.




