One Python 2.1 idea

Alex Martelli aleaxit at yahoo.com
Sun Dec 24 17:26:07 EST 2000


<rturpin at my-deja.com> wrote in message news:925baq$5sj$1 at nnrp1.deja.com...
> Let me frame, up front, where I think we disagree. We
> both love Python for its level of abstraction, its
> flexibility, its portability, and its ability to serve
> as glue. The issue is NOT Python vs. something else. It

For me, it's Python *plus* something else -- but I
think that much is clear.

> is about how Python should evolve AT THIS POINT in its
> lifecycle. My view is that Python, from a language
> viewpoint, is quite mature. There are not a lot of big
> abstractions yet-to-be implemented. Or if there are, I
> have not seen them discussed. Most of the enhancements

The 'interfaces' strawman-proposal (which doesn't seem
to have made it into a PEP, yet) is the most significant
one I can think of.  Generators, micro-threads, &c, are
also of some importance... but, yes, maturity, and thus
stability, are surely there.

> now proposed are small and incremental. Python has
> reached the point where the greatest gains come from
> improving its underlying infrastructure: more speed,
> running on more platforms, supporting larger-scale
> programming. This is GOOD. It is a sign of success.

Nolo contendere regarding the importance of
infrastructure -- but I wouldn't single out speed as
a particularly crucial aspect.


> Stackless Python is a small but exciting step in the
> direction of improving infrastructure. I wish, instead

Sure, but it's not particularly oriented to improving
speed, is it?  Rather, allowing good exposure of new
abstractions (generators, coroutines, micro-threads)
seems to be what it's all about.


> the abstractions it implements. For 10% penalty, I'll use
> list comprehensions, with the expectation that in a year
> or two, I'll gain an X% bonus. (And if we don't? What
> good does it do to have list comprehensions if programmers
> routinely use loops instead, because they know it will
> get them an easy 10% performance boost? The goal is not
> just to provide nice abstractions, but to see them used!)

I don't think a 10% difference matters substantially, either
way!  _This_ may be the crux of our disagreement.  Sure,
if the infrastructure can be enhanced to get a few percent
here and there, why not.  But that's not going to change my
programming style, nor make much of a difference to the
effectiveness of the applications I write.  And if some coder
_in a higher-level language_ is letting "few-percent issues"
sway his or her style away from clarity and simplicity, then
I think the solution is _education_... explaining exactly how
and why this is a losing trade-off.

*Orders of magnitude* in performance DO make a huge
difference -- but I don't think this is what we're talking
about.  Well, we _might_, I guess, when a list is being
built up by .append calls, with frequent reallocations of the
list -- *that* is something to watch out for, O(N*N) cost
where O(N) is easily achievable.  But if the infrastructure
were somehow enhanced to detect the pattern of 'many
appends in a loop' and switch the internal structure to
something better-performing, this would optimize the
'lower-level' (item-by-item) approach, NOT the
higher-abstraction-level one.  (Actually, I suspect that adding a
couple of new types, a stringbuffer and a listbuffer, that
ARE optimized for speedy in-place modification, to be
used during 'construction phases' then transformed back
into basetypes -- strings and lists -- is an easy approach
for that; however, this moves *away* from abstraction, so
I doubt it would meet with your approval...:-).
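
To make the O(N*N)-vs-O(N) point concrete with today's types --
a hedged sketch, names invented: building a big string by
repeated concatenation copies everything accumulated so far at
every step, while collecting the pieces in a list and joining
once at the end touches each character only a constant number
of times:

def build_quadratic(pieces):
    result = ''
    for p in pieces:
        result = result + p   # copies all of 'result' each time: O(N*N)
    return result

def build_linear(pieces):
    buf = []
    for p in pieces:
        buf.append(p)         # amortized cheap
    return ''.join(buf)       # one final pass: O(N)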


> One BIG benefit to abstractions like JOIN and list
> comprehension is that performance improves without the
> application programmers doing anything. The next release
> of the programming environment makes everything run

But why shouldn't said 'next release' be able to optimize
the 'lower-abstraction' approach, too?  As long as the
language allows and encourages the coder to express
the code's intent exactly and precisely, optimizations are
feasible anyway.  In other words, I do NOT consider
potential optimizations a 'BIG' benefit of higher-level
abstractions over lower-level ones.
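
For instance -- a trivial sketch -- these two spellings state
the same intent equally precisely, and nothing in the language
forbids an implementation from someday running them equally fast:

squares = [x * x for x in range(1000)]

squares = []
for x in range(1000):
    squares.append(x * x)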

> faster. Or so one hopes. This is well understood in the
> DBMS world. We teach beginning SQL programmers to use
> the constructs, even if they think they can beat the
> current implementation. That's what we also ought to
> teach Python programmers, right?

Right, but NOT because of performance considerations!
*Clarity and simplicity*, and the resulting gain in coding
and maintenance productivity, is what we should be
focusing on (hopefully removing *order-of-magnitude*
performance bottlenecks that can result when coding is
*naive* rather than just *simple, direct and clear*!-).


> > Actually, for all-out speed in a higher-level language, ..
>
> Why restrict ourselves to higher-level languages? For all
> out speed with portability, we could program in C.

Yes, but at LARGE productivity costs, if we used C for
_all_ of our coding, rather than just a very few critical
hot-spots.  The productivity costs of using Scheme or
O'Caml instead of Python aren't anywhere close to those
C would impose, in a single-language setting; which
is *why* it makes most sense to 'restrict ourselves to
higher-level languages', for most of our coding.


> > I prefer Python because I do NOT prize speed in a
> > higher-level language above other factors -- great
> > practicality at "playing well with others", clarity,
> > cleanness of syntax, great productivity.
>
> I would explicitly list "working at a higher level of
> abstraction."

When I compare working in Python with working in
Haskell, say, or perl, or, I _think_, ML dialects (not
enough practical real-world experience to be sure!),
it does not seem to me that I'm working at a higher
level of abstraction; the abstraction-level seems quite
comparable.  (Actually, this goes for C++, as well,
most of the time -- but, admittedly, lower-level issues
_do_ keep interfering often enough in that case).

Rather, I find Python's key advantages to be the ones I
listed above.  Wrt Haskell, for example, I notice much
more ease of integration with other parts of my world,
and higher productivity thanks to latent typing; wrt
perl, higher clarity and cleanness, and much higher
productivity in maintenance; and so on.

> Many things improve productivity, some
> of which Python lacks. There is a lot to be said for a
> highly integrated programming environment, as one finds
> in Delphi. But the tie to a specific GUI framework
> limits portability. I am glad that Python made the
> choice of simplicity and abstraction.

Me too!  Highly-integrated environments can always come
later -- simplicity and clarity remain.  'Abstraction' and
'higher-level' are basically synonyms, and I don't find
Python too different from, say, Scheme, in these terms.


> > If you do have another scale of values, Python may not
> > be the best choice for you at this time; specifically,
> > if for some reason the solutions MUST be single-language
> > ones (you can't or won't use a lower-level, faster
> > language for that 10% or so of your code that bottlenecks
> > its speed).  I do not have any such constraint, so Python
> > + some tiny amount of C++ at the right places .. is
> > exactly right for my needs...
>
> This introduces friction that undermines precisely Python's
> greatest selling point: increase in productivity. The
> problem is NOT just programming in C the parts of an
> application that are bottlenecks, but first identifying the
> bottlenecks. That's not too hard for those of us with
> experience at it; more difficult for many. And keep in mind
> that the solution may not be programming in C, but rewriting
> in longer-winded Python, which thereby becomes less readable
> and understandable, now undermining maintainability.

'Longer-winded Python' will not buy you order-of-magnitude
performance enhancements (unless some 'naive' O(N*N)
approach is being turned into O(N), or at least O(N log N),
code -- and in that case, it would be hard to argue much
against 'long-windedness':-).
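
A hedged sketch of that exception, with invented names: finding
duplicated items via list scans is O(N*N), while a few more --
'longer-winded' -- lines with a dictionary make it O(N):

def duplicates_naive(seq):
    dups = []
    for item in seq:
        # count() rescans the whole list for every item: O(N*N)
        if seq.count(item) > 1 and item not in dups:
            dups.append(item)
    return dups

def duplicates_dict(seq):
    seen = {}
    dups = []
    for item in seq:
        count = seen.get(item, 0)   # O(1) average lookup
        if count == 1:              # second sighting: a duplicate
            dups.append(item)
        seen[item] = count + 1
    return dups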

I don't think the art, or science, of identifying bottlenecks
and coding them away will disappear in my lifetime.  Nor
will a few percent's worth of optimization make much
difference in this regard.


> My bottom line is this: I don't think we should make people
> choose either-or. Either take the level of abstraction
> provided by Python (and solve performance problems after
> the fact) OR use something else. Python CAN BE faster. There

Sure -- by a few percent here and there.  Worthwhile.  Hardly
major, I think.

> is nothing in the language that forces list comprehensions
> to run slower than simple lists.

I think it boils down to allocation issues; as list comprehensions
are now implemented in terms of a loop of appends to an
initially-empty list, that is going to cost compared to having
the list allocated all at once.  A temporary mode, for lists
being built up, where more memory is invested (exponential
allocation, with constant amortized cost per item, rather than
linear allocation, with O(N) amortized cost per item) -- or where
the list is at first built up in discontiguous memory, and the
compact block actually needed is obtained only once at the end,
when it's clear exactly how much is needed -- could be a big win
in performance terms (the 'listbuffer' idea I mentioned above).
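
Purely to make the idea concrete -- a hedged, pure-Python sketch
of such a hypothetical listBuffer module (names invented; a real
implementation would have to live at the C level to actually win
anything, since Python-level indirection costs more than it saves):

# listBuffer.py -- hypothetical sketch, NOT an existing module.
# Grows its storage geometrically, so each append has amortized
# O(1) cost; a plain list is recovered at the end via list().
class listBuffer:
    def __init__(self, *items):
        self.size = len(items)
        self.store = list(items) + [None] * 8   # some initial slack
    def append(self, item):
        if self.size == len(self.store):
            # storage full: double it (amortized-constant growth)
            self.store.extend([None] * self.size)
        self.store[self.size] = item
        self.size = self.size + 1
    def __getitem__(self, i):
        if i < 0:
            i = i + self.size
        if not 0 <= i < self.size:
            raise IndexError(i)   # also lets list(aBuffer) terminate
        return self.store[i]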

But if Python had such a 'listbuffer' object, and used it internally
for list-comprehensions, I'd _also_ like to see it exposed for
direct programmer use... why not?  I don't necessarily want to
have to *distort* my code into a list-comprehension, in cases
where some other approach might be clearer, as the only way
to get performance!

Say I want to build up and return a list of Fibonacci numbers,
however many are needed to reach (and include) the first one
larger than an argument N.  The simplest approach:

def listFib(N):
    if N < 1: return [1]
    result = [1, 1]
    next = 1
    # keep appending until the first Fibonacci number > N is included
    while next <= N:
        next = result[-1] + result[-2]
        result.append(next)
    return result

Making this into a list-comprehension would contort things... it
usually does, when each item in the list is defined in terms of
items before it.  Rather, if listbuffer objects existed, I might
get some substantial performance gain through them, without
(IMHO) affecting readability much, if at all:

def listFib(N):
    if N < 1: return [1]
    result = listBuffer.listBuffer(1, 1)
    next = 1
    while next <= N:
        next = result[-1] + result[-2]
        result.append(next)
    return list(result)
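
(For completeness, the contortion I'm alluding to might look
something like this hedged sketch -- it 'works', but only by
abusing the comprehension for its side effects, throwing away
the list of None's it builds, and looping O(N) times where
O(log N) appends suffice:)

def listFib(N):
    if N < 1: return [1]
    result = [1, 1]
    [result.append(result[-1] + result[-2])
     for i in range(N) if result[-1] <= N]
    return result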

I'm all in favor of abstraction, of course (and motherhood and
apple pie, too!), but I don't want to force-fit it into my code...
as I've seen done too often in higher-level languages that
thought they could *force* programmers to use abstraction
by NOT providing necessary lower-level primitives!-).

> Living with this either-or
> makes sense early in a technology's lifecycle, when the
> abstractions are new and getting them out is more important
> than getting them fast. But they're out. It is time to shift
> the emphasis a bit, from features to infrastructure.

I won't _mind_ infrastructure-enhancements... but they won't
take the place of identifying and recoding key hot-spots, or,
at least, such is my current prediction.


> I might be persuaded otherwise if you would point out the
> BIG feature enhancements that are just around the corner.
> Right now, I don't see them. So what do I want next, more
> than anything else? More speed. And availability on all
> the palm platforms.

What *I* want (no idea if it's coming), is more and more ways
to express in my code all that I *know* about the semantics
involved.  "This object here satisfies the XYZ interface", "this
sequence here is sorted", "this function here is a total
ordering on yonder domain", etc.  Such 'rich assertions' may,
or may not, help debugging my code OR make it faster -- I'd
like a compilation mode where the compiler inserts checks for
all assertions it can't prove at compile-time (to help me ensure
I do indeed know what I'm talking about), and another where
it _optimizes_, *trusting* my assertions, and taking advantage
of anything it can infer from them.  But I wouldn't want the
rich-assertion-sublanguage constrained to stuff the compiler is
actually able to use (or check) _now_ -- I'd rather be able to
'code' existential and universal quantifiers, which are what I do
often need to 'express all that I know', and worry about having
effective checks or use of them (the infrastructure:-) later on.

This would get me more clarity and expressiveness now, and a
_possibility_ of enhanced checking and/or higher optimization at
some future time.  And wouldn't that be great?-)
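
Today, one can at best fake a tiny corner of this with plain
functions and assert statements -- a hedged sketch, with invented
names, where the 'trusting' optimizing mode is of course the
wishful part:

def is_sorted(seq):
    for i in range(len(seq) - 1):
        if seq[i] > seq[i + 1]:
            return 0
    return 1

def merge(a, b):
    # 'checking' mode catches my mistakes now; a future 'trusting'
    # mode could instead *exploit* this knowledge to optimize
    assert is_sorted(a) and is_sorted(b)
    result, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            result.append(a[i]); i = i + 1
        else:
            result.append(b[j]); j = j + 1
    return result + a[i:] + b[j:]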

As far as infrastructure goes -- better development environments
(e.g., the planned 'nanny'/'lint-ish' facility; more thorough checks
of coverage in unit-test frameworks; perhaps an ability to ask the
runtime [maybe in a special version] to selectively 'cause' certain
exceptional conditions -- simulate no-memory, disk-full, etc -- so
I can more easily and thoroughly check my exception-handling),
and standard-library enhancements, still seem to me to need to
be given priority over sheer-speed issues.  Not that I'd spit at a
few percent here, a few percent there, mind you -- the greatest
advantages of such small enhancements, however, would be to
give greater inducement to upgrade-to-newest-versions to shops
that let themselves be guided by such minor issues as those:-).
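
(To make the 'selectively cause exceptional conditions' idea
concrete -- a hedged sketch with invented names, not an existing
facility: a wrapper that makes writes fail with ENOSPC once a set
capacity is exhausted, so exception handlers can actually be
exercised by unit tests:)

import errno

class DiskFullFile:
    # wraps a real file object; write() raises IOError(ENOSPC)
    # once 'capacity' bytes have been accepted
    def __init__(self, realfile, capacity):
        self.realfile = realfile
        self.capacity = capacity
    def write(self, data):
        if len(data) > self.capacity:
            raise IOError(errno.ENOSPC, "No space left on device")
        self.capacity = self.capacity - len(data)
        self.realfile.write(data)
    def close(self):
        self.realfile.close()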


Alex





