hex dump w/ or w/out utf-8 chars
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Tue Jul 9 02:53:39 EDT 2013
On Tue, 09 Jul 2013 07:49:45 +1000, Chris Angelico wrote:
> On Tue, Jul 9, 2013 at 6:56 AM, Dave Angel <davea at davea.name> wrote:
>> But Unicode has nothing to do with Guido, and it has existed for about
>> 25 years (if I recall correctly).
>
> Depends how you measure. According to [1], the work kinda began back
> then (25 years ago being 1988), but it wasn't till 1991/92 that the spec
> was published. Also, the full Unicode range with multiple planes came
> about in 1996, with Unicode 2.0, so that could also be considered the
> beginning of Unicode. But that still means it's nearly old enough to
> drink, so programmers ought to be aware of it.

Yes, yes, a thousand times yes. It's really not that hard to get the
basics of Unicode.

"When I discovered that the popular web development tool PHP has almost
complete ignorance of character encoding issues, blithely using 8 bits
for characters, making it darn near impossible to develop good
international web applications, I thought, enough is enough.

So I have an announcement to make: if you are a programmer working in
2003 and you don't know the basics of characters, character sets,
encodings, and Unicode, and I catch you, I'm going to punish you by
making you peel onions for 6 months in a submarine. I swear I will."

http://www.joelonsoftware.com/articles/Unicode.html

Also: http://nedbatchelder.com/text/unipain.html

To start with, if you're writing code for Python 2.x, and not using u''
for strings, then you're making a rod for your own back. Do yourself a
favour and get into the habit of always using u'' strings in Python 2.
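
A minimal sketch of why the u'' habit matters, assuming a UTF-8 encoding and the made-up sample word "café" (neither comes from the thread). It contrasts a text string with its encoded bytes, which is roughly what a plain '' literal holds in Python 2. The variable names are only illustrative.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only: the sample word and the choice of UTF-8
# are assumptions, not anything from the original discussion.

# A text (Unicode) string: a sequence of code points.
# The u'' prefix is accepted by Python 2 and by Python 3.3 and later.
text = u'caf\xe9'

# The same text encoded as UTF-8: a sequence of bytes. In Python 2,
# a plain '' literal gives you this kind of object, not text.
data = text.encode('utf-8')

text_len = len(text)   # counts characters
data_len = len(data)   # counts bytes; the e-acute needs two in UTF-8

# Decoding with the matching encoding recovers the original text.
round_trip = (data.decode('utf-8') == text)

print(text_len, data_len, round_trip)
```

The two lengths differ because len() counts characters for the u'' string and bytes for the encoded one, which is the kind of mismatch that bites when byte strings are treated as text.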
I'll-start-taking-my-own-advice-next-week-I-promise-ly yrs,
--
Steven