How do I display unicode value stored in a string variable using ord()
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Mon Aug 20 01:56:10 EDT 2012
On Mon, 20 Aug 2012 00:44:22 -0400, Roy Smith wrote:
> In article <5031bb2f$0$29972$c3e8da3$5496439d at news.astraweb.com>,
> Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:
>
>> > So it may be with utf-8 someday.
>>
>> Only if you believe that people's ability to generate data will remain
>> lower than people's ability to install more storage.
>
> We're not talking *data*, we're talking *text*. Most of those
> whatever-bytes people are generating are images, video, and music. Text
> is a pittance compared to those.
Paul Rubin already told you about his experience using OCR to generate
multiple terabytes of text, and how he would not be happy if that was
stored in UCS-4.
HTML is text. XML is text. SVG is text. Source code is text. Email is
text. (Well, it's actually bytes, but it looks like ASCII text.) Log
files are text, and they can fill a hard drive pretty quickly. Lots of
data is text.
Pittance or not, I do not believe that people will widely abandon compact
storage formats like UTF-8 and Latin-1 for UCS-4 any time soon. Given
that we're still trying to convince people to use UTF-8 over ASCII, I
reckon it will be at least 40 years before there's even a slim chance of
migrating from UTF-8 to UCS-4 in a widespread manner. In the IT world,
that's close enough to "never" -- we might not even be using Unicode in
2052.
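To put a rough number on the size difference, here is a minimal sketch rather than a benchmark. The sample string is made up for illustration, and I am using Python's "utf-32-le" codec as a stand-in for UCS-4, since both use four bytes per code point and the "-le" variant does not prepend a byte order mark.

```python
# Hypothetical sample: mostly-ASCII markup, repeated to mimic a larger file.
sample = '<p class="note">Log files are text.</p>\n' * 1000

# Encode the same text three ways and record the size in bytes.
sizes = {
    codec: len(sample.encode(codec))
    for codec in ("latin-1", "utf-8", "utf-32-le")
}

for codec, size in sizes.items():
    print(f"{codec:>10}: {size:>8} bytes")

# For pure-ASCII input, UTF-8 and Latin-1 use one byte per character,
# while the fixed-width four-byte encoding uses exactly four times as much.
ratio = sizes["utf-32-le"] / sizes["utf-8"]
print(f"utf-32-le is {ratio:.0f}x the size of utf-8 for this sample")
```

Text with many non-ASCII characters narrows the gap, since UTF-8 then needs two to four bytes per code point, but for markup, source code and log files the factor stays close to four.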
In any case, time will tell who is right.
--
Steven