What's so funny? WAS Re: rotor replacement

Wed Jan 26 21:36:51 EST 2005

"Martin v. Löwis" <martin at v.loewis.de> writes:
> > That it's not appropriate for the
> > distro maintainers to look at the spec and the reference (pure Python)
> > implementation and say "yes, we want this, go write the C version
> > and we'll include it after it's had some testing".
> 
> I know that I'm not going to give a blanket promise to include some
> code in the future. I can give blanket promises to *review* such
> code. Actually, assuming I find the time, I will review *any*
> code that is contributed.

I don't see why you can't make up your mind enough to issue simple
statements like "the Python lib should have a module that does
so-and-so, and it should meet such-and-such requirements, so if
someone submits one that meets the requirements and passes code review
and testing and doesn't have unexpected issues or otherwise fail to
meet reasonable expectations, we'll use it".

> Sure. That is because O'Reilly is willing to take a financial risk
> of failure of a book product. I'm not willing to take similar risks
> for Python code (risk of having to redesign the interface, 

Again, we're talking about straightforward modules whose basic
interface needs are obvious.  And interfaces in stdlib do get extended
from version to version all the time, if users turn out to need
additional features beyond the obvious basics.

> fix bugs in the implementation, 

Obviously there must be testing and review before inclusion.
Acceptance is contingent on the module passing tests and review.

> or have the submitter run away after the contribution).

There is no way to know in advance whether that's going to happen.  A
lot of work on the ANSI X9 crypto standards came to a screeching halt
a few years ago when one of the more prolific contributors tripped
over his vacuum cleaner cord, fell down the stairs, and was killed.
So if you have to be absolutely sure that the submitter will always be
around, you can never accept anything.  I think you mostly have to go
by how maintainable the code looks and how much maintenance you think
it will actually need and how many people you think are around who can
take care of it when needed.  And I do believe that experienced
programmers are capable of making reasonable judgements about those
questions, so they should not refuse to ever make such judgements.

> > Similarly, look at how the whole PEP process works.  There are lots of
> > times when a PEP has been accepted before any working code is
> > distributed.
> 
> Indeed. For new language features, it is more difficult to try them out
> in advance than for new library API.

I don't see why that should be.  Nothing stops anyone from
implementing and testing a language feature before standardizing it.
It will get even easier with PyPy, which is perhaps a reason to ban
further language changes until PyPy is deployed.

> Taking it to the extreme misses the point. I'm asking for field
> testing for new modules - not for all changes.

"Field testing" to most people means testing that a specific
implementation is reliable enough for inclusion.  It is the final step
in a normal process of declaring a feature ready for deployment, not
the initial step:

  1) think about whether you want the feature
  2) decide you want it
  3) implement
  4) field test (this naturally recognizes the possibility of
     reversing step 2, if something unexpectedly goes wrong in testing
     that's not easily repaired, but step 2 declares the basic
     intention for what should happen after a successful test).
  5) deploy

Requiring step 4 to always happen before step 5 would be reasonable,
but you wanted much more than that.  You claimed step 4 should always
happen before step 1, which is silly.

> > That's bizarre and abnormal as a development process.  What kind of
> > development process in industry doesn't decide whether to include a
> > feature, until after the feature is completely implemented at a
> > production scale?
> 
> Any high-quality standardization process. Standards in IETF and OMG
> are only accepted after implementations have been available.

I don't know what OMG is, but there is no IETF requirement that any
implementations be available in any particular language.  There are
also plenty of instances where the IETF decides that it wants
something to be standardized (e.g. IPSEC) so it invites a working
group to develop a specification.  The WG members then spend a lot of
time in meetings and discussions reaching a consensus on what the spec
should say.  They are willing to spend that time because the IETF has
already given them reasonable expectation that their end result will
actually be used.  The IETF doesn't say "go develop a complete
standard and implementation and put it in the field for a year before
we [the IETF] will even think about whether we want to standardize
it".  They are capable of announcing in advance that they want to
standardize something.  I don't see why the Python folks have to be
incapable of ever doing the same.

For the module we're discussing, there is already a published working
implementation, written in Python.  It implements every feature of the
API and has been integrated into various applications/demos and tested
with them.  And, there was reasonable consensus on clpy and on the
python-crypto list that the API did the right things.  That is all
that the IETF normally requires ("rough consensus and working code").
The Python implementation is unsuitable for the stdlib because it's
too slow to use in serious production apps and so it needs to be
rewritten in C, not because anything is missing or untested about it.
(There's also an unpublished hybrid Python/C implementation that's
tolerably fast, but unsuitable for contribution because it depends on
external packages that can't go in the Python distro).  But the IETF
would not care about that.  All they care is that usable
implementations exist.

Finally, the IETF generally only specifies protocols and doesn't care
about either the implementation specifics or the API (they require
implementations to exist only in order to prove that the protocol is
usable).  The protocols here would be the FIPS operating modes (ECB,
CBC, CFB, etc).  Those are already standardized as FIPS and have test
vectors published by NIST.  The Python implementation computes all the
published test vectors correctly and interoperates with various
non-Python programs that use those same modes.  So again, there is
already enough available to satisfy an IETF-like standardization
process.  Your IETF example does not support your stance.

> The volunteers are free to work on whatever they please. If you chose
> not to write an AES module - that's fine with me. 

However, the result of my not writing an AES module is that Python
doesn't have an AES module.  Because as far as I can tell, while
numerous people want to use such a module, nobody except me has
expressed willingness to write and contribute one.  If someone else
did it, I would be overjoyed, and I'd happily use whatever they wrote,
if it was usable at all, dropping any plans to write my own.  (And
frankly, the only feature such a module absolutely needs to provide is
simple ECB mode block encryption and decryption.  The other modes are
helpful for good performance and convenience but can be done as pure
Python functions that call the ECB mode primitive, with tolerable
speed for lots of useful apps.  It's just the block cipher primitive
that's painfully slow in Python).
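
To make the layering concrete, here is a minimal sketch of CBC mode
written as pure Python functions on top of a single-block (ECB)
primitive.  The names toy_encrypt_block and toy_decrypt_block are
hypothetical stand-ins, and they are NOT a real cipher; a real module
would supply the AES block operation in their place.  Padding and IV
generation are left out.

```python
# Sketch only: CBC mode layered on a one-block (ECB) primitive.
# toy_encrypt_block / toy_decrypt_block are hypothetical placeholders
# with no security value; they only stand in for a real block cipher.

BLOCK = 16  # block size in bytes (AES uses 16)


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key, block):
    """Placeholder block primitive: XOR with key, rotate left one byte."""
    x = xor_bytes(block, key)
    return x[1:] + x[:1]


def toy_decrypt_block(key, block):
    """Inverse of toy_encrypt_block: rotate right one byte, XOR with key."""
    x = block[-1:] + block[:-1]
    return xor_bytes(x, key)


def cbc_encrypt(encrypt_block, key, iv, plaintext):
    """CBC: XOR each plaintext block with the previous ciphertext block
    (the IV for the first block), then apply the block primitive."""
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be a multiple of the block size")
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = encrypt_block(key, xor_bytes(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(decrypt_block, key, iv, ciphertext):
    """Inverse of cbc_encrypt."""
    if len(ciphertext) % BLOCK:
        raise ValueError("ciphertext must be a multiple of the block size")
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(xor_bytes(decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)
```

Only the two block functions would need to be in C; the mode logic
around them is a loop and an XOR per block.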

> Still, people do contribute to Python, and they do so without asking
> for permission first. 

They have not done that for an AES module.  That is a fact on the
ground.  I don't have any burning desire to be the author of a Python
AES module.  I just think Python should have one that I and other app
writers can rely on, so I've been willing to volunteer to write and
contribute it, since I'm qualified and nobody else has stepped
forward.  But there appears to currently be a policy in force saying
"there will be no crypto module in the Python stdlib regardless of its
technical quality".  Under that policy, there's no point in my writing
one to contribute it.

> Typically, they develop the code because it solves their own needs -
> then it doesn't really matter whether it also solves the needs of others.

In fact that's been done with AES and DES numerous times, and the
resulting modules do meet my needs for my own end-use (I'm currently
using a C library wrapped with SWIG).  What's missing is a module in
the stdlib that lets me ship pure-Python crypto apps that other end
users can run without installing C code.  And, as we discussed
already, the modules people have written for their own needs (at least
so far) aren't really technically suitable for the stdlib, because a
good general purpose stdlib module has somewhat different
characteristics than a module written to support a specific
application, which is what people write to solve their own needs.

Note that the same thing is true for most other types of library
modules besides crypto (i.e. a general purpose module suitable for the
stdlib tends to be different from what someone writes to solve their
own needs).  So either most stdlib modules were not written
purely to solve the author's needs and then submitted as an
afterthought as you describe, or else the modules in the stdlib
typically don't have good general-purpose designs.  (In fact the
latter is often true, which suggests that the practice of contributing
modules originally written only to solve an individual need often
results in lousy stdlib modules, and a policy change towards
encouraging designing intentionally for the stdlib is probably a good
thing.  To paraphrase RMS, a good stdlib needs to come from a vision
and a plan, not just from scratching a bunch of different itches).

> >>In either case, the user would best use the pre-compiled binary that
> >> somebody else provided for the platform.
> > Who's the "somebody else"?
> 
> Some user. I find that users contribute binaries for all kinds of
> platforms for the code I publish. This is how open source works.
> 
> > Do you do that
> > with your own modules, and still say that it's easy?
> 
> I publish or link to binaries of my code that others have created,
> and find no problems in doing so.

And you seriously believe that the result is as painless for end users
as having a module in the stdlib that apps can just import and call
without the end-user having to first locate, download and install a
3rd party module?  Do you think the Python philosophy of "batteries
included" really is a) meaningless, b) worthwhile in the past but now
obsolete, c) a mistake to begin with, d) sort of worthwhile but not
really important, e) something else?  That's not a rhetorical
question, I'm really having a hard time figuring out what you think
"batteries included" is supposed to mean.  I personally consider it
meaningful and extremely important, or else I'd be using Scheme
instead of Python.

> No, it's three steps
> 1. decide that you want to do it
> 2. do it
> 3. decide whether you are pleased with the result, and only
>     use it if you are
> 
> IOW, there should not be a blanket guarantee to use it after step 1.

But, it's completely normal to say before step 1 that "if the result
of step 2 does so-and-so, then I'll be pleased in step 3", stating
some clear requirements and sticking to them.  People do that in real
life all the time, including in standardization processes, and often
with ironclad guarantees (called "contracts").

It's pathetic if the Python maintainers are so indecisive as to be
incapable of ever saying (even -without- making ironclad guarantees)
that "yeah, we really ought to have an XYZ module and we'll use one if
someone submits it and it does so-and-so", giving an informal
assurance to someone thinking of doing an implementation that they're
probably not wasting their time.  That's often what allows step 1 to
happen.

> > I think that question isn't the right one.  We need to ask how
> > many users the sha module was required to have, before Greg and
> > Andrew could have reasonable confidence that the sha module would
> > go into the core once it was tested enough and shown to be
> > reliable.
> 
> They did not have *any* guarantee until they asked. I guess when they
> asked it was accepted immediately.

And who do you think they asked, hmm?  When I asked Guido about
submitting a crypto module, he told me that he defers all technical
crypto issues to Andrew.  So I think Andrew had a reasonable
expectation of what would happen when he submitted it.  You and
Frederik seem to think there's something inappropriate or
self-inflated about wanting that expectation before committing to do a
pile of work that's primarily for other people's benefit.  I think
your stated attitude is completely bizarre, and that you can't really
believe anything so silly, so you're really just acting bureaucratic,
looking for excuses to say no instead of yes to worthwhile proposals.

> Again, we would have to ask - but I would not be surprised if
> AMK started implementing the [sha] module without even *considering*
> a later inclusion in the Python core at that time. He has done
> so on many occasions (including PyXML, which I inherited from him).

I would be surprised.  The need for an sha module was completely
obvious, what it needed to do was completely obvious, and the
requirements and implementation have nothing like the subtlety of PyXML.