unicode() vs. s.decode()
Thorsten Kampe
thorsten at thorstenkampe.de
Sat Aug 8 07:16:12 EDT 2009
* Steven D'Aprano (08 Aug 2009 03:29:43 GMT)
> On Fri, 07 Aug 2009 17:13:07 +0200, Thorsten Kampe wrote:
> > One guy claims he has times between 2.7 and 5.7 seconds when
> > benchmarking more or less randomly generated "one million different
> > lines". That *is* *exactly* nothing.
>
> We agree that in the grand scheme of things, a difference of 2.7 seconds
> versus 5.7 seconds is a trivial difference if your entire program takes
> (say) 8 minutes to run. You won't even notice it.
Exactly.
> But why assume that the program takes 8 minutes to run? Perhaps it takes
> 8 seconds to run, and 6 seconds of that is the decoding. Then halving
> that reduces the total runtime from 8 seconds to 5, which is a noticeable
> speed increase to the user, and significant if you then run that program
> tens of thousands of times.
Exactly. That's why it doesn't make sense to benchmark decode()/unicode()
in isolation, meaning outside the context of your actual program.
> By all means, remind people that premature optimization is a
> waste of time, but it's possible to take that attitude too far to Planet
> Bizarro. At the point that you start insisting, and emphasising, that a
> three second time difference is "*exactly*" zero,
Exactly. Because it was not generated in a real-world use case but by
running a simple loop one million times. Why one million times? Because
by running it "only" one hundred thousand times the difference would
have seemed even less relevant.
> it seems to me that this is about you winning rather than you giving
> good advice.
I already gave good advice:
1. don't benchmark
2. don't benchmark until you have an actual performance issue
3. if you benchmark, then benchmark the whole application and not single commands
It's really easy: Michael has working code. With that he can easily
write two versions - one that uses decode() and one that uses unicode().
He can benchmark these with some real world input he often uses by
running it a hundred or a thousand times (even a million if he likes).
Then he can compare the results. I doubt that there will be any
noticeable difference.
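
A rough sketch of such a comparison, in case it helps. Everything named
here is made up for illustration: process_with_decode() and
process_with_unicode() stand in for the two versions of the real
program, and load_real_input() is a placeholder that should be replaced
with the input Michael actually uses. On Python 3 there is no unicode(),
so the sketch falls back to str(bytes, encoding), which is only an
analogue.

```python
import timeit

try:
    unicode  # Python 2: the built-in exists
except NameError:
    unicode = str  # Python 3 analogue (assumption): str(bytes, encoding)


def process_with_decode(lines, encoding="utf-8"):
    # Hypothetical stand-in for the real program, using bytes.decode().
    return [line.decode(encoding).strip().upper() for line in lines]


def process_with_unicode(lines, encoding="utf-8"):
    # Same hypothetical program, using the unicode() constructor instead.
    return [unicode(line, encoding).strip().upper() for line in lines]


def load_real_input():
    # Placeholder data: replace this with the real-world input you care about.
    sample = u"Gr\u00fc\u00dfe aus K\u00f6ln, line %d\n"
    return [(sample % i).encode("utf-8") for i in range(1000)]


def benchmark(func, lines, repeat=5, number=10):
    # Time the whole processing step; return the best of `repeat` runs,
    # each run calling func(lines) `number` times.
    timer = timeit.Timer(lambda: func(lines))
    return min(timer.repeat(repeat=repeat, number=number))


if __name__ == "__main__":
    lines = load_real_input()
    # Both versions must produce the same result before timing means anything.
    assert process_with_decode(lines) == process_with_unicode(lines)
    print("decode():  %.4f s" % benchmark(process_with_decode, lines))
    print("unicode(): %.4f s" % benchmark(process_with_unicode, lines))
```

The numbers only mean something relative to the total runtime of the
real program on the real input, which is why the whole processing step
is timed rather than the bare call.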
Thorsten