Typing system vs. Java

Alex Martelli aleaxit at yahoo.com
Fri Aug 10 10:33:33 EDT 2001


"Christopher Barber" <cbarber at curl.com> wrote in message
news:psoy9os5a88.fsf at jekyll.curl.com...
    ...
> > Applications whose "performance bottleneck" is _100%_ of their code?!
> > Never heard of any such thing.  10% to 20% seems to be common.  What
> > counter-examples do you have in mind?
>
> Not a bottleneck, but a continuous degradation of performance across the
> entire application due to memory use, overhead for method dispatching, etc.
> This situation is only likely to happen in very large OO applications.

Yes, and "static typing" doesn't help there, unless you're ready to go whole
hog, a la C++: allow polymorphism to break by having value-objects versus
reference-objects (even value-types versus reference-types, a la C# or
sort-of a la Java & Eiffel, while being a big complication, doesn't help
much there), "slicing" derived objects to base objects on assignment (in
the value cases), having some methods be non-virtual, and so on.

If "this object is of type Foo" means, as in Java, C#, Eiffel, &c, "OR of
any
type derived from Foo", you still need all objects to live in the heap and
all
method calls to dispatch dynamically.  An optimizer MAY be able (if it's
given all the relevant code) to infer that the object is always exactly of
type Foo -- but then there is no performance help at all from the static
type
declaration, which is what we're supposed to be discussing in this
sub-thread.

Far more useful than (Java/Eiffel/&c) type declarations (which still need
dynamic dispatching of methods) is supposed to be runtime caching of
the dispatch (I believe that's what Self uses, for example).  When it
finds an obj.somemethod() [possibly only in a loop or someplace that
is flagged as requiring time-optimization], rather than compiling it
down to:
    "do the complicated search to find the specific somemethod for this
        obj and save it in temp1"
    "call temp1"
it compiles something like:
    "get the typesignature-hash for obj and save it in temp1"
    "if temp1 is different from the previously saved temp2:"
        "do the complicated search to find the specific somemethod for this
            obj and save it in temp3"
        "save temp1 into temp2"
    "call temp3"
or, in a more refined version, saving the last few typesignature-hashes in
an LRU queue, etc, etc.  This kind of optimization (it would seem: I have no
first-hand experience!) _replaces_ (more effectively) the one enabled
by static typing: there would therefore be no substantial performance
advantage to having static typing *as well as* such method-dispatch
information caching.
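The caching scheme sketched above can be made concrete in a few lines of
Python.  This is a toy sketch of a monomorphic inline cache, NOT how any
real implementation (Self, CPython, etc.) lays it out -- the class and
names are illustrative only -- but it shows the temp1/temp2/temp3 dance:

```python
class InlineCache:
    """Caches the result of a method lookup for the last type seen."""
    def __init__(self, name):
        self.name = name           # method name to dispatch on
        self.cached_type = None    # "temp2": type last seen at this call site
        self.cached_method = None  # "temp3": method found for that type

    def call(self, obj, *args):
        t = type(obj)                    # "temp1": the type-signature of obj
        if t is not self.cached_type:    # cache miss: do the slow search
            self.cached_method = getattr(t, self.name)
            self.cached_type = t         # remember the type for next time
        return self.cached_method(obj, *args)  # fast path: cached dispatch

class Greeter:
    def hello(self):
        return "hi"

cache = InlineCache("hello")
print(cache.call(Greeter()))   # first call: full lookup, then cached
print(cache.call(Greeter()))   # second call: same type, hits the cache
```

As long as the call site keeps seeing objects of the same type, the
expensive lookup happens once; only a change of type pays the full price.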

> There are also deployment situations where you might not be allowed to
> make use of C extensions.

Well, sure, Jython being the typical case:-).  But then, there are lots
of deployment situations where I'm not allowed to use Python at all
(e.g., an ISP offering sites which include scripting in PHP or Perl, but
NOT Python), and increasing Python's popularity is the only really
practicable counter to those -- increasing the complexity of the
language in pursuit of modest performance gains is definitely not
a road to popularity increases (vide Dylan).


> > Not by itself, surely -- but, when using Python together with
> > lower-level languages for that 10% to 20% of code whose speed really
> > matters, I can't easily think of examples.
>
> But why can't Python be fast enough without resorting to extensions?
> Forcing

Python can be fast enough for most tasks today (and more every day,
as machines get faster).  It could be fast enough (without extensions)
for a few more tasks (perhaps) if very substantial resources were
expended on adapting advanced optimization technologies (I don't
think projects to that end in the past, such as Vyper, went anyplace,
but that could change in the future -- if enough people thought it
mattered, strongly enough to be willing to donate time, money and/or
skills to the purpose).  Perhaps some language changes might make
optimization slightly easier, although I have my doubts whether such
slightly easier optimization could be achieved without losing the key
characteristics of Python's simplicity and power (untrammeled
signature-based polymorphism, for example: if the libraries started being
full of type-testing to make optimizer-writers' lives slightly easier,
that would strike a huge blow to Python's power AND simplicity).
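To make "signature-based polymorphism" concrete: a Python function cares
only that its argument offers the right methods, not what type it is.  A
minimal sketch (the names Shouter and log_to are invented for illustration):

```python
import io

def log_to(stream, message):
    # Works with ANY object offering .write() -- no isinstance() tests,
    # no declared types: only the signature matters.
    stream.write(message + "\n")

# An in-memory text buffer satisfies the signature...
buf = io.StringIO()
log_to(buf, "hello")
print(buf.getvalue())      # "hello\n"

# ...and so does any user-defined class with a .write() method.
class Shouter:
    def __init__(self):
        self.lines = []
    def write(self, text):
        self.lines.append(text.upper())

s = Shouter()
log_to(s, "hello")
print(s.lines)             # ["HELLO\n"]
```

Sprinkling isinstance() checks through log_to would break the second use
for no gain -- which is exactly the "evil of type-testing" at issue here.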

> users to learn C in order to make their Python code fast detracts from
> Python's advantage as a simple, easy-to-use language.

What would really detract from that advantage would be making
Python any LESS simple, easy to use, or powerful (e.g., *encouraging*
the horrid practice of type-testing).  Most users don't need to code
C themselves -- they'll keep using general-purpose C-coded
extensions such as Numeric, PIL, and so on -- or components
usable cross-language on platforms with decent support for
componentization (such as Microsoft's: *everything* is available
as an Active/X [COM] component, typically in two dozen versions
of varying price, footprint, speed, ease-of-use, stability, etc; one
just has to find and choose the best components for one's use,
just as one does, say, in VB or Javascript).  Hopefully in the future
we'll get decent componentization everywhere -- I consider the
lack of it on BSD and Linux variants today their largest problem
(hopefully XPCOM, or Ximian's bonobo/mono, or *SOMETHING*
will help on this score).  SWIG and friends also offer some chances
to put components together from existing libraries without having
to learn how to code in C.

The crucial issue is for Python to avoid the key error that almost
every other language has made: trying to be everything to
everyone, and thereby becoming too complex and complicated,
and seriously impacting everybody's productivity.  Python keeps
simplicity and power together by *NOT* striving to offer very
easy optimization capabilities -- if advanced compilers can get
good optimization anyway, so much the better, but the purpose
of the *LANGUAGE* is to stay simple and powerful and thereby
give its users incredible productivity, *PERIOD* (and I'm quite
happy the "reference implementation" by Guido and friends also
abides by KISS except in a few hot-spots).


> I think that adding features to make Python powerful enough to remove the
> need for extensions would be worth a little bit of extra complexity.  I

I could hardly disagree more deeply, profoundly and totally than in fact
I do.  Nothing will remove the need for extensions, although their use
may become less frequent.  Consider C++, which NEVER cared about
complexity (well, hardly ever: Stroustrup did make one key decision
to NOT allow overloading by return type, thanks be) and went full steam
ahead for relatively-easy optimizability.  It STILL needs "extensions" in
some cases to get to 100% of a machine's power: why do you think
GMP has such large parts coded in machine language (for some CPU's),
and why do you think Intel goes to the bother of releasing fine-tuned
machine-language libraries for numeric & similar operations on its
own CPU's?  Because that's stuff the C/Fortran/C++ programmer STILL
needs as 'extensions'.  Making Java fast enough for many uses requires
JNI-interfaced C or machine-language extensions too (did you know that
SWIG supports generation of extensions for Java, as well as for Python,
Perl, etc?-).  Etc, etc.

Meanwhile, the complications are something you pay for EVERY day --
in extra bugs in the compilers/libraries/tools, in your own productivity,
in teaching and learning the language.  Thanks, but, *NO*, thanks.

Python IS powerful enough, and all it needs is the removal of a few
_restrictions_ (chiefly those connected to the type/class split) and
sundries.  *IF* (a *BIG* if!) optional type annotations can be added
in a way that adds almost no complexity to the language, why not --
but the big-if remains.  As far as I'm concerned, the concept of
"protocol adaptation" (see the relevant PEP) is more important than
any language-features that might support it (it could very well live
in the libraries -- nice syntax-sugar is no biggie:-) -- not for
performance, but mainly to eradicate the evil of type-testing
for good:-).
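Since the point is that protocol adaptation "could very well live in the
libraries", here is a minimal, hypothetical sketch in that spirit.  The
registry, the register_adapter/adapt names, and the ListWriter class are
all illustrative inventions, not the actual API proposed in the PEP:

```python
_adapters = {}   # (from_type, protocol_name) -> adapter callable

def register_adapter(from_type, protocol, adapter):
    """Declare how to wrap from_type instances to satisfy a protocol."""
    _adapters[(from_type, protocol)] = adapter

def adapt(obj, protocol):
    """Return obj as-is if it already conforms, else a registered wrapper."""
    if hasattr(obj, protocol):         # already offers the needed method
        return obj
    adapter = _adapters.get((type(obj), protocol))
    if adapter is None:
        raise TypeError(f"cannot adapt {type(obj).__name__} to {protocol!r}")
    return adapter(obj)

class ListWriter:
    """Adapts a plain list to the 'write' protocol."""
    def __init__(self, lst):
        self.lst = lst
    def write(self, text):
        self.lst.append(text)

register_adapter(list, "write", ListWriter)

out = []
adapt(out, "write").write("hello")   # no type-testing at the call site
print(out)                           # ["hello"]
```

The caller asks for the object *as* a protocol and never writes an
isinstance() test -- the adaptation machinery, living entirely in library
code, eradicates the type-testing instead of encouraging it.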


Alex


> personally don't think that type declarations would add much complexity,
> but I guess I can see how someone new to them might be confused.  Of
> course, that same person would be lost once they had to learn C.
>
> - Christopher




