should writing Unicode files be so slow
djc
slais-www at ucl.ac.uk
Sun Mar 21 10:29:58 EDT 2010
Antoine Pitrou wrote:
> Le Fri, 19 Mar 2010 17:18:17 +0000, djc a écrit :
>> changing
>> with open(filename, 'rU') as tabfile: to
>> with codecs.open(filename, 'rU', 'utf-8', 'backslashreplace') as
>> tabfile:
>>
>> and
>> with open(outfile, 'wt') as out_part: to
>> with codecs.open(outfile, 'w', 'utf-8') as out_part:
>>
>> causes a program that runs in
>> 43 seconds to take 4 minutes to process the same data.
>
> codecs.open() (and the object it returns) is slow as it is written in
> pure Python.
>
> Accelerated reading and writing of unicode files is available in Python
> 2.7 and 3.1, using the new `io` module.
Thank you for a clear and to-the-point explanation. I shall concentrate on
finding an optimal time to upgrade from Python 2.6.
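
For reference, a minimal sketch of what the two codecs.open() calls quoted
above might look like with io.open() instead. The function name
copy_tab_file is hypothetical, and the loop body is only a placeholder for
the real processing. errors="replace" is an assumption swapped in for
'backslashreplace', which not every Python version accepts when decoding.
io.open() also exists in 2.6, but per the explanation above the accelerated
version arrives with 2.7 and 3.1.

```python
import io


def copy_tab_file(src_path, dst_path, encoding="utf-8"):
    """Copy a text file line by line with io.open(); return the line count.

    Hypothetical helper for illustration only: the real program would do
    its own processing where this one just writes the line back out.
    """
    count = 0
    # Text mode with the default newline=None gives universal-newline
    # reading, which is what the 'U' flag requested in the original call.
    # errors="replace" is an assumed stand-in for 'backslashreplace'.
    with io.open(src_path, "r", encoding=encoding, errors="replace") as tabfile:
        with io.open(dst_path, "w", encoding=encoding) as out_part:
            for line in tabfile:
                # Lines arrive as unicode text and are encoded on write.
                out_part.write(line)
                count += 1
    return count
```

Calling copy_tab_file(filename, outfile) should then read and write UTF-8
without going through the pure-Python codecs wrappers.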
--
David Clark, MSc, PhD. UCL Centre for Publishing
Gower Str London WC1E 6BT
What sort of web animal are you?
<https://www.bbc.co.uk/labuk/experiments/webbehaviour>