a question about Chinese characters in a Python Program
sjmachin at lexicon.net
Wed Oct 22 01:10:54 CEST 2008
On Oct 21, 11:03 pm, Ben Finney <bignose+hates-s... at benfinney.id.au> wrote:
> John Machin <sjmac... at lexicon.net> writes:
> > I don't understand the point or value of filtering out all byte values
> > greater than 127
> That's only done if the encoding isn't otherwise specified. In which
> case, ASCII is the documented default encoding. In which case, it
> *must* be restricted to code points 0+IBM-127, otherwise it's not ASCII.
> The value of doing this is to make it rapidly and repeatably apparent
> when the programmer's assumptions about character encoding are false,
> allowing the programming error to be fixed early rather than late.
"make it rapidly and repeatably apparent ..." is much better achieved
by raising an exception.
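
A minimal sketch of the difference, assuming Python 3. The helper names filter_high_bytes and strict_ascii are made up for illustration and are not taken from the program under discussion. The first silently drops bytes above 127; the second lets the codec raise as soon as it meets one.

```python
# Sample input: ASCII text followed by two Chinese characters encoded as UTF-8.
raw = b"abc" + "\u4e2d\u6587".encode("utf-8")


def filter_high_bytes(data: bytes) -> str:
    """Hypothetical helper: silently discard every byte value above 127."""
    return bytes(b for b in data if b < 128).decode("ascii")


def strict_ascii(data: bytes) -> str:
    """Hypothetical helper: decode strictly, raising on any non-ASCII byte."""
    return data.decode("ascii")


# Filtering loses the Chinese characters without any warning.
filtered = filter_high_bytes(raw)

# Strict decoding fails immediately and reports where the bad byte is.
try:
    strict_ascii(raw)
    raised = False
    error_position = None
except UnicodeDecodeError as exc:
    raised = True
    error_position = exc.start  # index of the first offending byte
```

With filtering, the program carries on with truncated text; with the strict decode, the wrong encoding assumption surfaces at the point of failure, with the byte offset attached.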
> This is, in my estimation, of more value than heuristic magic to
> +IBw-guess+IB0- the encoding, and the resultant debugging nightmare when
> that guesswork fails in unpredictable ways later in the program's
Was I suggesting "heuristic magic"?
What is that 0+IBM-127 +IBw-guess+IB0- gibberish in your posting?