case-sensitivity (was Re: True, False, None)

Thu Nov 13 13:57:40 EST 2003

Alex Martelli <aleax at aleax.it> wrote in message news:<3gNsb.20789$hV.759234 at news2.tin.it>...
> Michele Simionato wrote:
> 
> > Alex Martelli <aleax at aleax.it> wrote in message
> > news:<yRssb.11182$9_.422629 at news1.tin.it>...
> >> So, when I used to have a factory function (as 'int' was), and change
> >> it into a type (or class, same thing), I should rename it and break all
>  ...
> > But these are very rare cases, so probably I could live with an
> > enforced capitalization too.
> 
> You think it's rare, during refactoring, to change between types and
> factory functions?!  I suspect you may not have been maintaining and
> enhancing code (in languages allowing interchange of the two things)
> for long enough.  Consider that *ALL* types in Python's builtins started
> life as factory functions -- that's 100%... "very rare"?!-)

It is very rare in my own code, but I haven't maintained code written 
by others, so I will accept your experience on this.

> >> remember that module FCNTL has an all-uppercase name, htmllib all-lower,
> >> cStringIO weirdly mixed, mimetypes lower, MimeWriter mixed, etc, etc --
> >> totally wasted mnemonic effort.
> > 
> > This is more a problem of inconsistent conventions. It is true that
> > it will disappear with case insensitivity enforced, but somewhat I
> 
> [[ probably mean "case sensitivity enforced" ? ]]

No, this time I really meant case *insensitivity*: if case
insensitivity were enforced, the issue with FCNTL, cStringIO,
MimeWriter, etc. would disappear, and an ordinary memory would be
enough to keep everything straight.

> How am I "exaggerating" in claiming that the SAME module, zipfile,
> spells "zipfile" differently in the module name itself, in class
> zipfile.ZipFile, and in class zipfile.BadZipfile?  Maybe you have a
> photographic memory so that having seen each of these ONCE you are
> never going to forget which ones uppercase exactly which letters, but
> even back when my memory was outstanding (when I was younger) it was
> always more "auditory" than "visual": I could easily recite by heart
> long quotes from books I had read once, but never could recall the 
> details of punctuation (or capitalization, when non-systematic, as 
> it often was e.g. in 17th/18th century English) without painstaking
> and painful explicit memorization effort.

No, I don't have a photographic memory, and sometimes (a few times,
actually) I don't remember the right capitalization, but a few seconds
with the on-line help (trying zipfile or ZipFile or BadZipfile, for
instance) are usually enough to solve the issue. So, as I said,
capitalization does not give me big headaches. Also, I am
absolutely unconvinced that capitalization gives you big headaches
(except as a point of principle, of course). I am not saying case
sensitivity is perfect; I am saying its disadvantages (in real life)
are not as serious as you claim. People have written tons of C, but I
am sure capitalization bugs are scarcely relevant compared, for
instance, to pointer arithmetic bugs ;)
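
As a rough illustration of the "few seconds of lookup" above, here is
a minimal sketch that lists every spelling of a name a module actually
exports, ignoring case. The helper name `spellings` is hypothetical
(it is not part of the standard library), and the exact output depends
on the Python version installed.

```python
import zipfile


def spellings(module, name):
    """Return the module attributes that match `name` when case is ignored."""
    wanted = name.lower()
    return sorted(n for n in dir(module) if n.lower() == wanted)


# Which capitalizations of these names does the zipfile module really use?
print(spellings(zipfile, "zipfile"))
print(spellings(zipfile, "badzipfile"))
```

The same helper works on any imported module, so it can stand in for a
trial-and-error session at the interactive prompt.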

> Should a language cater mostly to the "crowd" (?) of people with
> photographic memories, or shouldn't it rather help the productivity
> of people whose memories aren't _quite_ that good and/or visual...?

I don't have an auditory memory, nor a visual one, but I still manage
to survive with case sensitivity, so I guess everybody can do the
same...

> > You are still exaggerating. 99% of times uppercase constants denote
> > numbers or strings. I don't remember having ever seen an all uppercase
> > function, even if I am sure you do ;)
> 
> Maybe my defect is knowing the Python standard library too well?  It's
> got SEVERAL all-uppercase functions, Michele!  Check out modules
> difflib (functions IS_LINE_JUNK and IS_CHARACTER_JUNK), gzip 
> (functions U32 and LOWU32), stat (all the S_IMODE etc functions),
> token (functions ISTERMINAL and ISNONTERMINAL)...!

Ubi major, minor cessat ;) 

I confess I have never used these functions: not only their
capitalization is new to me, but even their names! Still, I am sure
that, knowing the names, I will be able to remember the capitalization:
100% of the time for all-caps identifiers, 80% of the time for
camel-case identifiers. I think this is average performance.
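
For readers who want to check the quoted claim against their own
installation, the following sketch lists the all-uppercase callables in
three of the modules named above. The helper `all_caps_functions` is a
hypothetical name introduced here; gzip is left out because the helpers
mentioned in the quoted text do not appear to exist in later releases.

```python
import difflib
import stat
import token


def all_caps_functions(module):
    """Return names in `module` that are entirely uppercase and callable."""
    return sorted(
        name
        for name in dir(module)
        if name.isupper() and callable(getattr(module, name))
    )


for mod in (difflib, stat, token):
    print(mod.__name__, all_caps_functions(mod))
```

The callable test matters: it filters out the uppercase numeric
constants (such as the token numbers) and keeps only the functions.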

> Maybe the sum total IS 1% or so of the functions in the library, but
> that's _STILL_ a silly, arbitrary memorization chore which I shouldn't
> have had to undergo in the first place -- and I'm not even sure I
> have in fact remembered all of them...

Even if you enforce case insensitivity, the library will still have
inconsistencies due to different coding conventions: think of the use
of underscores to separate words, for instance. The standard library
index gives plenty of cases, e.g. "add_something" versus
"addsomething", etc.
> 
> > For instance, I can define a matrix type, overload "*" and write the
> > multiplication of a matrix "A" times a vector "a" as
> > 
> > b=A*a
> > 
> > Much more readable to me, that something like
> 
> Please note that here you are suddenly and undocumentedly _switching
> conventions on the fly_ regarding capitalization.  One moment ago,
> leading cap had to mean "type" and all caps had to mean "numeric
> constant" (which in turn made a single-character capital identifier
> ambiguous already...) -- now suddenly neither of these conventions
> exists any more, since that uppercase A means 'matrix' instead of
> 'vector' (and a _number_, i.e. even lower dimensionality, would be
> indicated *HOW*?  Don't you EVER multiply your matrices by scalars?!
> Or is it so crucial to distinguish matrices from vectors but totally
> irrelevant to distinguish either from scalars?!).

That's rhetoric and you know it. As I already said, readability is
more important than foolish consistency; also, in the context of matrix
computations, how likely is anyone to misunderstand the
meaning of variables such as

A=Matrix(2,2); b=vector(2); s=scalar() ??

> My opinion is that, while _habit_ in mathematical formulas may surely
> make one hanker for such case-sensitivity, the preference just does not
> stand up to critical analysis, as above.  You're trying to overload WAY
> too many different and conflicting "conventions" onto a meager "one bit
> [at most] per character" (about 0.87 bits I believe, in fact) of
> "supplementary information" yielded by case-sensitivity.

I strongly like operator overloading because of my mathematical
background. Mathematicians overload the meaning of operators all
the time and never get confused, since the context is always made
clear. Nevertheless, many programmers are against operator
overloading, because it can be abused. In my opinion the advantage
is worth the risk (the same goes for case sensitivity), but I do accept
that others may disagree. If we insist too much, we risk getting into
a religious discussion, so it is better to stop here.
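
To make the ``b=A*a`` notation concrete, here is a minimal sketch of
overloading "*" for a matrix type. The `Matrix` and `Vector` classes
and their constructors are assumptions made up for this illustration;
they are not the signatures used in the one-liner above, nor any
particular library's API.

```python
class Vector:
    """A bare-bones vector holding a list of numbers (illustrative only)."""

    def __init__(self, *components):
        self.components = list(components)

    def __repr__(self):
        return "Vector(%s)" % ", ".join(map(repr, self.components))


class Matrix:
    """A bare-bones matrix stored as a list of rows (illustrative only)."""

    def __init__(self, *rows):
        self.rows = [list(row) for row in rows]

    def __mul__(self, other):
        # Matrix times vector: one dot product per row.
        if isinstance(other, Vector):
            return Vector(
                *(sum(x * y for x, y in zip(row, other.components))
                  for row in self.rows)
            )
        # Matrix times scalar: scale every entry.
        if isinstance(other, (int, float)):
            return Matrix(*([x * other for x in row] for row in self.rows))
        return NotImplemented


A = Matrix([1, 2], [3, 4])   # uppercase: a matrix
a = Vector(1, 1)             # lowercase: a vector
s = 2                        # a scalar
b = A * a
print(b)
```

Here the type of the right-hand operand, not the capitalization of the
name, decides which product is computed; the case of ``A`` and ``a``
is purely a reading aid.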
> 
> Most paladins of case sensitivity would probably be horrified to see
> that the main point in its "favour" now appears to be that it
> encourages you to use shorter (e.g. 1-letter) identifiers (and thus
> more cryptic ones) because it gives you more of them to choose from...!!!

In mathematical formulas, at least, shorter identifiers are clearer,
not more cryptic; also, it is nice to import modules with syntax
such as ``import math as m``, using a single letter, which does not
clash with some other ``M`` (for instance the mass of an
object in a physics program, or a pluggable metaclass in an
object-oriented program, or anything else).
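
A small sketch of that point, with made-up values: in a case-sensitive
language ``m`` (the module) and ``M`` (a mass) are two unrelated names
and can sit side by side in the same formula. The numbers and the
inclined-plane setting are assumptions chosen only for illustration.

```python
import math as m

M = 5.0    # mass of an object, in kilograms (illustrative value)
g = 9.81   # gravitational acceleration, in m/s^2

# Component of the weight along a 30-degree incline.
angle = m.radians(30)
force = M * g * m.sin(angle)
print(force)
```

With case-insensitive identifiers the assignment to ``M`` would rebind
the module alias, and the later ``m.sin`` call would fail.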
> 
> I've tried (e.g. in Dylan) the concept of having punctuation freely
> embeddable in identifiers and didn't particularly like it (I guess it
> works better with a NON-infix-syntax language -- I don't recall it
> feeling like a problem in either Forth or Scheme -- but in Dylan the
> inability of writing a sum as
>     a+b
> because that's an identifier, so you have to write
>     a + b
> instead, _was_ rather uncomfortable to me [maybe I just didn't get
> long-enough practice and experience with it]).

I am against too much punctuation; that's one of the reasons why
I like case sensitivity, since you don't need punctuation to get extra
identifiers: did you miss my point, or am I misreading you?

> I disagree -- once you have to spell out e.g. pi, capital-sigma, etc,
> in Ascii letters anyway, having to make sure you do so in letters that
> are unambiguous in terms of capitalization differences is no big loss.
> Personally, in terms of formulas, I've never found Fortran any less
> readable than C, for example.

I don't like Fortran's verbosity, but others may agree with you.
That's just a matter of personal preference.

> And no, I definitely don't want Unicode characters in identifiers --
> that would ensure a LOT of new and diverse errors as people use the
> wrong "decoration" (accent, circumflex, etc, etc) on letters.  Plain 
> ascii's just great...!-)
> 

Who wants Unicode characters?? I am not that foolish yet!!

> Alex

Michele