Indentation and optional delimiters

castironpi at gmail.com
Thu Feb 28 09:00:12 EST 2008


On Feb 28, 12:46 am, Steven D'Aprano <st... at REMOVE-THIS-
cybersource.com.au> wrote:
> By the way bearophile... the readability of your posts will increase a
> LOT if you break it up into paragraphs, rather than use one or two giant
> run-on paragraphs.
>
> My comments follow.
>
> On Tue, 26 Feb 2008 15:22:16 -0800, bearophileHUGS wrote:
> > Steven D'Aprano:
> >> Usability for beginners is a good thing, but not at the expense of
> >> teaching them the right way to do things. Insisting on explicit
> >> requests before copying data is a *good* thing. If it's a gotcha for
> >> newbies, that's just a sign that newbies don't know the Right Way from
> >> the Wrong Way yet. The solution is to teach them, not to compromise on
> >> the Wrong Way. I don't want to write code where the following is
> >> possible: ...
> >> ... suddenly my code hits an unexpected performance drop ... as
> >> gigabytes of data get duplicated
>
> > I understand your point of view, and I tend to agree. But let me express
> > my other point of view. Computer languages are a way to ask a machine to
> > do some job. As time passes, computers become faster,
>
> But never fast enough, because as they get faster, we demand more from
> them.
>
> > and people find
> > that it becomes possible to create languages that are higher level, that
> > is often more distant from how the CPU actually performs the job,
> > allowing the human to express the job in a way closer to how less
> > trained humans talk to each other and perform jobs.
>
> Yes, but in practice, there is always a gap between what we say and what
> we mean. The discipline of having to write down precisely what we mean is
> not something that will ever go away -- all we can do is use "bigger"
> concepts, and thus change the places where we have to be precise.
>
> e.g. the difference between writing
>
> index = 0
> while index < len(seq):
>     do_something_with(seq[index])
>     index += 1
>
> and
>
> for x in seq:
>     do_something_with(x)
>
> is that iterating over an object is, in some sense, a "bigger" concept
> than merely indexing into an array. If seq happens to be an appropriately-
> written tree structure, the same for-loop will work, while the while loop
> probably won't.
>
> > Probably many years
> > ago a language like Python was too costly in terms of CPU, making
> > it of little use for most non-toy purposes. But there's a need for
> > higher level computer languages. Today Ruby is a bit higher-level than
> > Python (despite being rather close). So my mostly alternative answers to
> > your problem are: 1) The code goes slow if you try to perform that
> > operation? It means the JIT is "broken", and we have to find a smarter
> > JIT (and the user will look for a better language).
>
> [...]
>
> Of course I expect that languages will continue to get smarter, but there
> will always be a gap between "Do What I Say" and "Do What I Mean".
>
> It may also turn out that, in the future, I won't care about Python4000
> copying ten gigabytes of data unexpectedly, because copying 10GB will be
> a trivial operation. But I will care about it copying 100 petabytes of
> data unexpectedly, and complain that Python4000 is slower than G.
>
> The thing is, make-another-copy and make-another-reference are
> semantically different things: they mean something different. Expecting
> the compiler to tell whether I want "x = y" to make a copy or to make
> another reference is never going to work, not without running "import
> telepathy" first. All you can do is shift the Gotcha! moment around.
>
> You should read this article:
>
> http://www.joelonsoftware.com/articles/fog0000000319.html
>
> It specifically talks about C, but it's relevant to Python, and all  
> hypothetical future languages. Think about string concatenation in Python.
>
> > A higher level
> > language means that the user is more free to ignore what's under the
> > hood; the user just cares that the machine will perform the job,
> > regardless of how, and focuses the mind on what job to do; the low
> > level details regarding how to do it are left to the machine.
>
> More free, yes. Completely free, no.
>
> > Despite that, I think today a lot of people who have a 3 GHz CPU
> > may accept using a language 5 times slower than Python, one that for
> > example uses base-10 floating point numbers (they are different from
> > Python Decimal numbers). Almost every day on the Python newsgroup a
> > newbie asks if round() is broken after seeing this:
> > >>> round(1/3.0, 2)
> > 0.33000000000000002
> > A higher level language (like Mathematica) must be designed to give more
> > numerically correct answers, even if it may require more CPU. But such a
> > language isn't just for newbies: if I write a 10-line program that has
> > to print 100 lines of numbers, I want it to reduce my coding time,
> > sparing me from thinking about base-2 floating point numbers.
>
> Sure. But all you're doing is moving the Gotcha around. Now newbies will
> start asking why (2**0.5)**2 doesn't give 2 exactly when (2*0.5)*2 does.
> And if you fix that by creating a surd data type, at more performance
> cost, you'll create a different Gotcha somewhere else.
>
> > If the
> > language uses higher-level numbers by default I can ignore that
> > problem,
>
> But you can't. The problem only occurs somewhere else: Decimal is base
> 10, and there are base 10 numbers that can't be expressed exactly no
> matter how many bits you use. They're different from the numbers you
> can't express exactly in base 2 numbers, and different from the numbers
> you can't express exactly as rationals, but they're there, waiting to
> trip you up:
>
> >>> from decimal import Decimal as d
> >>> x = d(1)/d(3)  # one third
> >>> x
> Decimal("0.3333333333333333333333333333")
> >>> assert x*3 == d(1)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AssertionError
>
> --
> Steven

"Gotcha"s is what I meant by "bugs".  Something in the language
doesn't follow the writer's flow of thought.  You can bring thoughts
to match the language with practice-- piano and chess spring to mind--
and you can bring the language to match thoughts.  But whose, and what
are they?  A good start is the writer's native spoken language (or
perhaps a hypothetical language for musicians, the libraries of which
resemble intervals and harmonies, and syntax rhythms), but context
(user history) is so richly populated, and computer languages so
young, that they don't really measure.  No pun intended.  It'd be like
(bear with me) dozens of different versions of a given import library,
each with subtleties that makes calling functions in it really fluid.

Three thought experiments.  One: remove all the identifiers from a
program; make them numbers or unrelated names.  You can have the
source to the libraries, but their identifiers are gone too.  And you
can have the specs for the architecture.  So, after days of study, you
narrow down which statement is the file write -- which gets harder as
the language gets higher-level.  No, you can't run it; no comments.
(What about strings and literals?  Hmmm, hear me out to the point
first.)  However, a name is consistent from one usage to another: you
can trace names, but not infer anything from them.
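
Purely as an illustration of that first experiment -- this is my own
toy sketch, not part of the argument, and it assumes a recent Python
(ast.unparse needs 3.9 or later); the sample source is made up -- the
mechanical part might look like:

import ast

class Anonymize(ast.NodeTransformer):
    # Rename every identifier to an opaque token.  The same original
    # name always maps to the same replacement, so usages still line
    # up (names stay consistent, but carry no meaning).  String
    # literals are left alone.
    def __init__(self):
        self.mapping = {}

    def _anon(self, name):
        return self.mapping.setdefault(name, "v%d" % len(self.mapping))

    def visit_Name(self, node):
        node.id = self._anon(node.id)
        return node

    def visit_FunctionDef(self, node):
        node.name = self._anon(node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._anon(node.arg)
        return node

source = """
def greet(person):
    message = "hello, " + person
    return message
"""

tree = Anonymize().visit(ast.parse(source))
print(ast.unparse(tree))   # the structure survives, the meanings don't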

For the second one, simulate many typos in a working program, and hand
it to a human to correct.  (For a variation, the compiler can be
present, and running the program is fine.)  The typos include missing
parentheses (ouch), misspelled identifiers, misordered statements
(!?), and indentation mistakes (or brace misplacements).
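
(Again, just my own throwaway sketch of the mechanical half of that
experiment; the function name and sample program are invented.)

import random

def add_typos(source, count=3, seed=None):
    # Introduce `count` random single-character typos: either drop a
    # character (a missing parenthesis or letter) or swap two adjacent
    # ones (a misspelling, or a tiny misordering).
    rng = random.Random(seed)
    chars = list(source)
    for _ in range(count):
        i = rng.randrange(len(chars) - 1)
        if rng.random() < 0.5:
            del chars[i]
        else:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

program = "def total(xs):\n    return sum(x * 2 for x in xs)\n"
print(add_typos(program, seed=4))   # hand the result to a human to repair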

Last, take a working program and explain it, in human language only,
to another person who has no computer experience (none), but who does
accept that you want to do these things and (due to some other facet
of the situation) wants to learn.  He is asked later by another party
about details of it, in order to verify his understanding.

Your success in these experiments is a benchmark for how well a
computer could understand what a human wrote.  The computer doesn't
know what you mean; but here's my point: if any one of the three
aspects the experiments illustrate is missing, the human doesn't
know what you mean either.

The more you have in common with the reader, the easier it is for him
to decipher your code.  If he speaks the same languages (human -and-
computer) as you, took the same class as you, works on the same
project as you at work, and has known you personally for a really long
time, maybe the first two experiments would go pretty quickly.  With
a language in common, a knowledge of math and science, and a history
together, the third is quick too.  But take a foreign stranger really
into astrology, and the third is slow; take an author from a
different background than yours, and the first two are slow too.

What does that say about the perfect computer language?

It can tolerate, or at least question you about, a few errors.  ("Did
you mean...?")  It can refer to idiosyncrasies you've used elsewhere
in your code.  It can refer to other mistakes you've made in (grin!)
other programs, and what their resolution was.  If you're plagued by a
certain gotcha more than other speakers, but your spelling is
terrific, then spelling is probably not the problem, which makes
something else more likely to be.
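
To make that concrete, here is a minimal sketch of the kind of
suggestion such a tool could offer, using only difflib from the
standard library (the identifier list is invented for the example):

import difflib

# Identifiers the writer has already used elsewhere in the code.
known_names = ["total_price", "tax_rate", "print_receipt", "customer"]

def did_you_mean(name, known=known_names):
    # Offer the closest known identifiers when a name lookup fails.
    matches = difflib.get_close_matches(name, known, n=3, cutoff=0.6)
    if matches:
        return "%r is not defined.  Did you mean %s?" % (
            name, " or ".join(matches))
    return "%r is not defined." % name

print(did_you_mean("totl_price"))   # suggests total_price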

Examples of these may not be so easy to find in Python, considering
the One Obvious Way principle, but they may abound in another
language, and I don't just mean brace placement.

Lastly, to be specific: the language itself, of course, can't question
you about anything.  But something about the perfect language, in
either its syntax or its libraries, makes tools like that possible.


