It's still I, Miville

Sun Nov 2 06:15:40 EST 2003

François Miville-Dechêne wrote:

> You say in the definition of mappings that at present Python has only
> one type of it, the dictionnary.  I suggest another one, the sparse

dict is the only _built-in_ mapping type that's general purpose and
immediately available to the user.  However, there are others:

>>> x=vars(object)
>>> type(x)
<type 'dictproxy'>

this dictproxy built-in type is another mapping type, rather special
purpose -- you can access it readonly (and it will then go find the
information in the built-in typeobject 'object') but not assign to
its items (because the underlying typeobject 'object' is immutable
in Python).
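
Here is a small sketch of that read-only behaviour.  One assumption to flag: on current Python 3 interpreters the same object reports its type as 'mappingproxy' rather than 'dictproxy', and the "except ... as" syntax below needs a newer Python than the session above.

```python
# vars(object) exposes the type's namespace through a read-only mapping.
x = vars(object)

# Reading works as with any mapping.
has_repr = '__repr__' in x
repr_slot = x['__repr__']

# Item assignment is refused, since the proxy does not support it.
try:
    x['spam'] = 1
except TypeError as exc:
    error = exc
else:
    error = None
```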

Moreover, it's easy to code other mapping types.  Standard library
module UserDict is all about making that easier (including the
provision of a mixin class supplying many "higher-order" methods
if your type just supplies a few elementary ones), and often you
also have the alternative of subclassing dict directly.
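
As a rough illustration of the mixin idea (not the UserDict module itself): in later Python versions the same role is played by collections.abc.MutableMapping, which is what this sketch assumes.  The class name CaseInsensitiveMapping and its behaviour are made up for the example.  The class supplies only five elementary methods and inherits the rest.

```python
from collections.abc import MutableMapping


class CaseInsensitiveMapping(MutableMapping):
    """Hypothetical mapping whose string keys ignore case."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        # update() is one of the methods the mixin supplies for free
        self.update(*args, **kwargs)

    # The five elementary methods we must write ourselves:
    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


m = CaseInsensitiveMapping(Spam=1)
m['EGGS'] = 2
```

Methods such as get, pop, keys, items and the "in" test all come from the mixin, built on top of the five methods above.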

> array, where absence of key would mean not absence of element but
> presence of a default-value element, such as zero for sparse arrays in
> the mathematical sense.  This would enable the use of mapping with the
> distributive operators as I just suggested them in a previous e-mail.

"dict with implicit default" is a popular elementary example, and I
see it's already been proposed to you.  However, I suggest that what
you really want here is rather a _sequence_ type, which may use dict
for its implementation but isn't particularly interested in exposing
the mapping interface to the outside -- what it needs to provide is
quite different.

Assume that dd is an instance of your "sparse array" class (and for
simplicity, without loss of generality, let's focus on 1-dimensional
cases).  What should len(dd) return?  If dd is a mapping, len(dd)
must return the number of items dd has (the number of nonempty cells
in the sparse array).  But how useful is that?!  Typically what I
want to know more often about an array -- no matter whether its
underlying implementation is sparse or not -- is how many 'cells'
it has, not how many are "nonempty".  And what about slicing dd[a:b]?
A dict doesn't support that (slice objects are nonhashable, dict items
are unordered so "slicing" makes no sense) but a sparse array sure should,
if you want it to be interchangeable with dense arrays (lists).
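
To make the point concrete, here is a minimal sketch of such a sequence-flavoured class.  Everything in it (the name SparseArray, the _cells attribute, returning slices as plain lists) is my own assumption for illustration, not an existing library's API.  It uses a dict internally but exposes list-like len(), indexing and slicing.

```python
class SparseArray:
    """Hypothetical 1-dimensional sparse array with list-like behaviour."""

    def __init__(self, length, default=0):
        self._length = length
        self._default = default
        self._cells = {}    # index -> value, for non-default cells only

    def __len__(self):
        # Number of cells, NOT the number of stored (non-default) cells.
        return self._length

    def _check(self, index):
        # Normalize a negative index and reject out-of-range ones.
        if index < 0:
            index += self._length
        if not 0 <= index < self._length:
            raise IndexError('sparse array index out of range')
        return index

    def __getitem__(self, index):
        if isinstance(index, slice):
            # slice.indices() clips start/stop/step to our length.
            # Simplification: the result is a plain (dense) list.
            positions = range(*index.indices(self._length))
            return [self._cells.get(i, self._default) for i in positions]
        return self._cells.get(self._check(index), self._default)

    def __setitem__(self, index, value):
        index = self._check(index)
        if value == self._default:
            self._cells.pop(index, None)    # keep the storage sparse
        else:
            self._cells[index] = value


dd = SparseArray(1000)
dd[3] = 7
dd[-1] = 2
```

With this, len(dd) is 1000 even though only two cells are actually stored, and dd[2:5] behaves as the corresponding slice of a dense list would.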

All in all, I think that for most uses related to sparse arrays I
would choose Numeric (a key Python add-on package for all kinds of
numeric array computations) and PySparse to go with it (I know there
are alternative Numeric addons for special uses, such as SPAI, but I 
don't know much about them).

> I will be pestering you with my suggestions from time to time.  You say
> Python is an evolving language, prove it.

Python comprises a small core language, a small-ish set of built-ins, 
a large standard library, and a huge variety of third-party add-ons 
(packages, etc).  It's not the desired direction of evolution to have
the core language fatten by comprising features that are easily done
in the built-ins, nor the built-ins fatten by swallowing stuff that
is well located in the standard library.  A 3rd party package can be
made part of the standard library under somewhat stringent conditions
of licensing and maintenance-guarantees, but that doesn't happen all
that often.  E.g., Numeric itself, even though it's essential for
anybody doing numeric array computations in Python, remains outside:
this allows it to keep evolving at _its own_ pace, unconstrained by
Python's release schedule (and indeed, the successor 'numarray' package
is likely to gradually replace Numeric for most uses).

Of course, inevitably, Python over the years accrues "cruft" towards
the center, particularly because the filters weren't so strict once.
Python 3.0's theme will be "back to simplicity" -- removing some of
the duplication accumulated in the core language, built-ins, etc, over
the years.  Of course, that can only be done in a 2.* -> 3.0 move --
as long as we're within 2.*, backwards compatibility constrains this
kind of simplification -- and 3.0 is, I would guess, at least 3 years
away (sigh).  Until the simplification occurs, the filters against
"just piling stuff on" are EXTREMELY strict.

The process of Python's evolution is guided by the PEPs (Python
Enhancement Proposals).  In theory, no change to Python takes place
without a PEP (in practice, it does happen, but _shouldn't_:-).  If
you think you can prove that some new feature is absolutely needed,
write a PEP (no doubt after discussing the feature's details) -- it
is open to anybody to write PEPs, though the PEP editors may bounce
them if they don't meet the criteria laid out in the low-numbered
PEPs (call them meta-PEPs:-).  Reading the existing PEPs first is
VERY advisable.  If your pet feature is a close match for something
that's in an existing PEP (including a rejected one), and you show
you haven't done your homework by failing to address this, it's very
unlikely that your PEP will be accepted -- just as you wouldn't
expect a refereed journal paper to be accepted if it didn't mention
all relevant parts of the literature, of course.  When a PEP is
accepted it potentially affects a huge number of people, after all --
depending on how you measure (see my "googling for fun" and "popularity"
recent threads) Python is somewhere between the 4th and 8th most
popular programming language today, with millions of lines of code in
production, hundreds of thousands of people whose primary development
language is Python, millions with at least a nodding acquaintance with
it, hundreds of firms depending on it for mission-critical applications
(see the "python success stories" booklets and websites for details).

Therefore, Python must first of all be _STABLE_ even while it's evolving.
Like all programming language tradeoffs, this is a delicate balance too,
but so far I think we've been doing pretty well, considering.  The PEPs
and other filters against too-easy changes to core Python are part of it.

> To take on another subject, an object-oriented language such as yours
> should explicitly tackle the Microsoft Office constructs, which even

Python is mostly cross-platform.  Specific platforms are most often
best supported in add-on packages.  For Windows, in particular, look
at Hammond's win32all extension package.  OpenOffice.org on the other
hand is supported by an OpenOffice-released add-on called PyUNO (it
is included in OO.o 1.1, as is a reference Python implementation, but
I've experimentally used PyUNO with standard Python w/o problems).

So, for MSOffice-related (and more generally specifically COM-related)
issues, they're better addressed on the win32all mailing list (such
discussions are welcome here too, but on win32all you're more likely
to have the specific experts listening at any given time).

> though their actual binary coding is hidden from the public, are in
> principle defined as hierarchically-embedded objects, and just for the
> sake of legal definition Microsoft is liable to give the object-oriented
> definition of the products its programs churn out.  VBasic does a bad
> job with them (when it is functioning), you can do a better one.  I

VBasic is a bad language (particularly VB6 -- MS acknowledges that by
making so many incompatible changes in VB7 aka VB.NET, most of those
changes taking VB semantically closer to Python:-) but its semantics
are often closely modeled on those of the underlying COM platform (and
for VB7, those of the underlying .NET platform).  I definitely don't
think Python should distort its semantics to meet either (besides, if
it had COM semantics it couldn't at the same time have incompatible
dotNET ones -- we're better off with Python's native semantics!-).

So, for example, "reference parameters" to method calls are out of the
question; [out] parameters disappear as parameters and become return 
values instead (throughout win32all but particularly in the COM-specific
parts), [in] are normal Python parameters, [in, out] are both parameters
for the [in] part AND return values for the [out].  But then, that issue
is more general -- check out standard library module select for a
definitely not MS-specific example of the same idiom.
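
A small sketch of that idiom with select, assuming a Python recent enough to have socket.socketpair on your platform.  The underlying C call modifies its descriptor sets in place (in effect [in, out] parameters); the Python function leaves the lists you pass alone and hands the results back as a tuple of three new lists.  The variable names here are just for the example.

```python
import select
import socket

a, b = socket.socketpair()
try:
    a.sendall(b'ping')      # makes b readable; a has nothing to read

    watched = [a, b]        # the [in] part: what we ask about
    # The [out] part comes back as return values, not via the arguments.
    readable, writable, exceptional = select.select(watched, [], [], 1.0)

    data = b.recv(4)
finally:
    a.close()
    b.close()
```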

> personnaly don't like that much Office, but will it or not we are stuck
> with their products for at least a decade.  Your product is free

Actually I've migrated almost exclusively to OO.o 1.1 and so far I'm
pretty happy with the results.

> (although profitable), VBasic is not.  You should intrude into
> Microsoft's domain right at this place.  People are rightly scared when
> they hear of VBasic, so they are turned off from programming altogether,
> prefering to let the Microsoft monsters do all kinds of jobs by the
> means of commands far more difficult to master than learning a
> language.  Your language appears in a calculator fashion, the user is
> reassured and feels empowered.  It is like a toy in his hands.  If he
> could play with Microsoft objects with this toy, he would tackle many
> jobs far more easily than at present.

It's pretty easy to install Python+win32all (e.g. ActivePython distro
has both, plus more) and do that today.  The irreducible complexity in
the Office objectmodel (OpenOffice's too) is of course another issue.

Alex