unicode() vs. s.decode()
Mark Lawrence
breamoreboy at yahoo.co.uk
Fri Aug 7 03:04:51 EDT 2009
Michael Ströder wrote:
> Thorsten Kampe wrote:
>> * Michael Ströder (Thu, 06 Aug 2009 18:26:09 +0200)
>>> >>> timeit.Timer("unicode('äöüÄÖÜß','utf-8')").timeit(10000000)
>>> 17.23644495010376
>>> >>> timeit.Timer("'äöüÄÖÜß'.decode('utf8')").timeit(10000000)
>>> 72.087096929550171
>>>
>>> That is significant! So the winner is:
>>>
>>> unicode('äöüÄÖÜß','utf-8')
>> Unless you are planning to write a loop that decodes "äöüÄÖÜß" one
>> million times, these benchmarks are meaningless.
>
> Well, I can tell you I would not have posted this here and checked it if it
> would be meaningless for me. You don't have to read and answer this thread if
> it's meaningless to you.
>
> Ciao, Michael.
I believe that the comment "these benchmarks are meaningless" refers to
the length of the strings being used in the tests. Surely something
involving thousands or millions of characters is more meaningful? Or to
go the other way, you are unlikely to write
    for c in 'äöüÄÖÜß':
        u = unicode(c, 'utf-8')
        ...
Yes?
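
For what it's worth, a rough sketch of timing both spellings at two input sizes might look like the following. It assumes Python 3 (the thread itself uses Python 2), where unicode(s, 'utf-8') is spelled str(b, 'utf-8') and the input is a bytes object. The names short, long_ and bench are made up for illustration, and the repeat counts are arbitrary. No timings are shown because they depend on the interpreter and the machine.

```python
import timeit

# Hypothetical test data: the thread's sample string, once and repeated.
short = 'äöüÄÖÜß'.encode('utf-8')   # 7 characters, 14 bytes in UTF-8
long_ = short * 100000              # 1,400,000 bytes


def bench(data, number):
    """Time both decoding spellings on `data`, `number` runs each.

    Returns (seconds for str(data, 'utf-8'), seconds for data.decode('utf-8')).
    """
    env = {'data': data}
    t_ctor = timeit.timeit("str(data, 'utf-8')", globals=env, number=number)
    t_meth = timeit.timeit("data.decode('utf-8')", globals=env, number=number)
    return t_ctor, t_meth


# Many runs for the tiny input, few runs for the large one.
for label, data, number in (('short', short, 100000), ('long', long_, 20)):
    t_ctor, t_meth = bench(data, number)
    print('%-5s %8d bytes  str(): %.4fs  .decode(): %.4fs'
          % (label, len(data), t_ctor, t_meth))
```

With the short input, most of each run is likely call overhead rather than decoding, which is why the two sizes can tell different stories.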
--
Kindest regards.
Mark Lawrence.