Time we switched to unicode?

Chris “Kwpolska” Warrick kwpolska at gmail.com
Tue Mar 25 18:24:10 CET 2014


On Tue, Mar 25, 2014 at 9:05 AM, Marko Rauhamaa <marko at pacujo.net> wrote:
> Chris Angelico <rosuav at gmail.com>:
>
>> On Tue, Mar 25, 2014 at 4:14 PM, Mark H Harris <harrismh777 at gmail.com> wrote:
>>>>>> Π¹ = pi
>>
>> That's good! (Although typing Π¹ quicker than pi is majorly pushing it.)
>
> I don't think that's good. The lower-case letter π² should be used. The
> upper-case letter is used for a product, although unicode dedicates a
> separate character for the purpose: ∏³.
>
> I often see Americans, especially, confuse upper and lower-case letters
> in symbols ("KM" for "km", "L" for "l" etc).


“L” is actually valid, and so is “l”.  The capital is allowed mainly
because humans (and computers) tend to write “1 l” (one liter, one-ell)
in a way that is hard to tell apart from “11” or “ll” (eleven or
ell-ell), especially if you don’t include the space (which is invalid).

On Tue, Mar 25, 2014 at 9:23 AM, Chris Angelico <rosuav at gmail.com> wrote:
> If you can type a capital ∏³, you can type a lower-case π², unless there's something very weird going on.

Nitpick time!  (because we all love it so much!)

Π¹ = U+03A0 GREEK CAPITAL LETTER PI
π² = U+03C0 GREEK SMALL LETTER PI
∏³ = U+220F N-ARY PRODUCT

“If you can type an N-ARY PRODUCT, you can type a GREEK SMALL LETTER
PI, unless there’s something very weird going on.”

…like, the user is in the past and is using ISO 8859-7 (instead of a
21st-century encoding, like UTF-8).  That encoding has support for
Π¹ and π², but not for ∏³… (of course, this assumes that, if we add
those new characters into Python, we somehow allow any source encoding.)

That’s not too weird, other than the ancient encoding being used
(and even that is a bit less weird on Windows, though there it’d be
Windows-1253).
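
For anyone who wants to check the encoding claim, here is a minimal
sketch, assuming Python 3 and its standard codecs (`iso8859_7`, `cp1253`,
`utf-8`).  The helper `can_encode` and the `support` dict are names I
made up for illustration, not anything from the standard library.

```python
import unicodedata

# The three characters from the nitpick above:
# GREEK CAPITAL LETTER PI, GREEK SMALL LETTER PI, N-ARY PRODUCT.
CHARS = ["\u03a0", "\u03c0", "\u220f"]


def can_encode(ch, encoding):
    """Return True if `ch` has a representation in `encoding` (hypothetical helper)."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Map each character's Unicode name to the codecs that can represent it.
support = {
    unicodedata.name(ch): [
        enc for enc in ("iso8859_7", "cp1253", "utf-8") if can_encode(ch, enc)
    ]
    for ch in CHARS
}

for name, encodings in support.items():
    print(f"{name}: {', '.join(encodings)}")
```

Both Greek letters should come out as encodable in all three codecs,
while N-ARY PRODUCT should only survive UTF-8.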

Oh: and speaking of fancy Unicode characters that are worthless
~duplicates, spot the difference here:

µ μ

If you are lucky enough (and luckiness may involve reading this
e-mail in Helvetica (not Neue though) on a Mac), you can clearly see
that they are different.  If you are using a font that does not
differentiate them, you may think they’re the same.  If you ask some
intelligent software (like `unicodedata.name()` in Python), you’ll
quickly find out the first is MICRO SIGN, and the other is GREEK SMALL
LETTER MU.  Such craziness is what makes Unicode Unicode.
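
A minimal sketch of that lookup, assuming Python 3 and the standard
`unicodedata` module.  The variable names are mine, and the NFKC
comparison at the end is an extra I added to show one way the two
characters can be folded together; it is not something the paragraph
above relies on.

```python
import unicodedata

micro = "\u00b5"  # first character on the line above
mu = "\u03bc"     # second character on the line above

# Ask the Unicode database what each character is called.
names = [unicodedata.name(micro), unicodedata.name(mu)]
print(names)

# They are distinct code points, so a plain comparison fails...
same_code_point = micro == mu

# ...but compatibility normalization (NFKC) maps the first onto the second.
same_after_nfkc = unicodedata.normalize("NFKC", micro) == mu

print(same_code_point, same_after_nfkc)
```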

-- 
Chris “Kwpolska” Warrick <http://kwpolska.tk>
PGP: 5EAAEA16
stop html mail | always bottom-post | only UTF-8 makes sense


