From jeremy@beopen.com  Tue Dec  5 02:34:24 2000
From: jeremy@beopen.com (Jeremy Hylton)
Date: Tue Dec  5 02:38:43 2000
Subject: [Python-Dev] 2.0 final is nearly ready

There are provisional source tarballs available at
    ftp://python.beopen.com/pub/python/2.0

These are NOT the source tarballs that we intend to release.  They are
known to contain old README and Misc/NEWS files.  But any reports of
successful builds on your platform would be appreciated.  I expect to
put the final release in place in a few hours; any reports of success
after the release will be expected <0.5 wink>.

Jeremy


From fredrik@effbot.org  Fri Dec  1 06:39:57 2000
From: fredrik@effbot.org (Fredrik Lundh)
Date: Fri, 1 Dec 2000 07:39:57 +0100
Subject: [Python-Dev] TypeError: foo, bar
Message-ID: <008f01c05b61$877263b0$3c6340d5@hagrid>

just stumbled upon yet another (high-profile) python newbie
confused by a "TypeError: read-only character buffer, dictionary"
message.

how about changing "read-only character buffer" to "string
or read-only character buffer", and the "foo, bar" format to
"expected foo, found bar", so we get:

    "TypeError: expected string or read-only character
    buffer, found dictionary"
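
for reference, the same class of mistake -- a dictionary handed to an
API that wants a string -- still reproduces the pattern; a minimal
sketch, with the exact wording varying across Python versions:

```python
import io

try:
    # a dictionary where a string (or character buffer) is expected
    io.StringIO().write({})
except TypeError as exc:
    message = str(exc)

# recent CPython phrases this roughly as
# "string argument expected, got 'dict'" -- essentially the
# "expected foo, got bar" format proposed above
```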

</F>



From tim.one@home.com  Fri Dec  1 06:58:53 2000
From: tim.one@home.com (Tim Peters)
Date: Fri, 1 Dec 2000 01:58:53 -0500
Subject: [Python-Dev] TypeError: foo, bar
In-Reply-To: <008f01c05b61$877263b0$3c6340d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEAOICAA.tim.one@home.com>

[Fredrik Lundh]
> just stumbled upon yet another (high-profile) python newbie
> confused by a "TypeError: read-only character buffer, dictionary"
> message.
>
> how about changing "read-only character buffer" to "string
> or read-only character buffer", and the "foo, bar" format to
> "expected foo, found bar", so we get:
> 
>     "TypeError: expected string or read-only character
>     buffer, found dictionary"

+0.  +1 if "found" is changed to "got".

"found"-implies-a-search-ly y'rs  - tim



From thomas.heller@ion-tof.com  Fri Dec  1 08:10:21 2000
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 1 Dec 2000 09:10:21 +0100
Subject: [Python-Dev] PEP 229 and 222
References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us> <20001130181438.A21596@ludwig.cnri.reston.va.us>
Message-ID: <014301c05b6e$269716a0$e000a8c0@thomasnotebook>

> > Beats me.  I'm not even sure if the Distutils offers a way to compile
> > a static Python binary.  (GPW: well, does it?)
> 
> It's in the CCompiler interface, but hasn't been exposed to the outside
> world.  (IOW, it's mainly a question of designing the right setup
> script/command line interface: the implementation should be fairly
> straightforward, assuming the existing CCompiler classes do the right
> thing for generating binary executables.)
Distutils currently only supports build_*** commands for
C-libraries and Python extensions.

Shouldn't there also be build commands for shared libraries,
executable programs and static Python binaries?

Thomas

BTW: Distutils-sig seems pretty dead these days...




From ping@lfw.org  Fri Dec  1 10:23:56 2000
From: ping@lfw.org (Ka-Ping Yee)
Date: Fri, 1 Dec 2000 02:23:56 -0800 (PST)
Subject: [Python-Dev] Cryptic error messages
Message-ID: <Pine.LNX.4.10.10011181405020.504-100000@skuld.kingmanhall.org>

An attempt to use sockets for the first time yesterday left a
friend of mine bewildered:

    >>> import socket
    >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    >>> s.connect('localhost:234')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: 2-sequence, 13-sequence
    >>> 

"What the heck does '2-sequence, 13-sequence' mean?" he rightfully asked.
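
The fix on the caller's side is to pass a (host, port) tuple; a small
sketch, which also shows that the failure happens while parsing the
address, before any network traffic:

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s.connect("localhost:234")   # wrong: AF_INET wants a (host, port) tuple
except TypeError as exc:
    caught = exc                 # raised while converting the address
finally:
    s.close()

# the correct spelling is: s.connect(("localhost", 234))
```

Recent CPython words this TypeError along the lines of "AF_INET address
must be tuple, not str", which is a good deal clearer than
"2-sequence, 13-sequence".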


I see in getargs.c (line 275) that this type of message is documented:

    /* Convert a tuple argument.
    [...]
       If the argument is invalid:
    [...]
          *msgbuf contains an error message, whose format is:
             "<typename1>, <typename2>", where:
                <typename1> is the name of the expected type, and
                <typename2> is the name of the actual type,
             (so you can surround it by "expected ... found"),
          and msgbuf is returned.
    */

It's clear that the socketmodule is not prepending "expected" and
appending "found", as the author of converttuple intended.

But when i grepped through the source code, i couldn't find anyone
applying this "expected %s found" % msgbuf convention outside of
getargs.c.  Is it really in use?

Could we just change getargs.c so that converttuple() returns a
message like "expected ..., got ..." instead of seterror()?

Additionally it would be nice to say '13-character string' instead
of '13-sequence'...


-- ?!ng

"All models are wrong; some models are useful."
    -- George Box



From mwh21@cam.ac.uk  Fri Dec  1 11:20:23 2000
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 01 Dec 2000 11:20:23 +0000
Subject: [Python-Dev] Cryptic error messages
In-Reply-To: Ka-Ping Yee's message of "Fri, 1 Dec 2000 02:23:56 -0800 (PST)"
References: <Pine.LNX.4.10.10011181405020.504-100000@skuld.kingmanhall.org>
Message-ID: <m37l5k5qx4.fsf@atrus.jesus.cam.ac.uk>

Ka-Ping Yee <ping@lfw.org> writes:

> An attempt to use sockets for the first time yesterday left a
> friend of mine bewildered:
> 
>     >>> import socket
>     >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>     >>> s.connect('localhost:234')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     TypeError: 2-sequence, 13-sequence
>     >>> 
> 
> "What the heck does '2-sequence, 13-sequence' mean?" he rightfully asked.
> 

I'm not sure about the general case, but in this case you could do
something like:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102599&group_id=5470

Now you get an error message like:

TypeError: getsockaddrarg: AF_INET address must be tuple, not string

Cheers,
M.

-- 
  I have gathered a posie of other men's flowers, and nothing but the
  thread that binds them is my own.                       -- Montaigne



From guido@python.org  Fri Dec  1 13:02:02 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 01 Dec 2000 08:02:02 -0500
Subject: [Python-Dev] TypeError: foo, bar
In-Reply-To: Your message of "Fri, 01 Dec 2000 07:39:57 +0100."
 <008f01c05b61$877263b0$3c6340d5@hagrid>
References: <008f01c05b61$877263b0$3c6340d5@hagrid>
Message-ID: <200012011302.IAA31609@cj20424-a.reston1.va.home.com>

> just stumbled upon yet another (high-profile) python newbie
> confused by a "TypeError: read-only character buffer, dictionary"
> message.
> 
> how about changing "read-only character buffer" to "string
> or read-only character buffer", and the "foo, bar" format to
> "expected foo, found bar", so we get:
> 
>     "TypeError: expected string or read-only character
>     buffer, found dictionary"

The first was easy, and I've done it.  The second one, for some
reason, is hard.  I forget why.  Sorry.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From cgw@fnal.gov  Fri Dec  1 13:41:04 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Fri, 1 Dec 2000 07:41:04 -0600 (CST)
Subject: [Python-Dev] TypeError: foo, bar
In-Reply-To: <008f01c05b61$877263b0$3c6340d5@hagrid>
References: <008f01c05b61$877263b0$3c6340d5@hagrid>
Message-ID: <14887.43632.812342.414156@buffalo.fnal.gov>

Fredrik Lundh writes:

 > how about changing "read-only character buffer" to "string
 > or read-only character buffer", and the "foo, bar" format to
 > "expected foo, found bar", so we get:
 > 
 >     "TypeError: expected string or read-only character
 >     buffer, found dictionary"

+100.  Recently, I've been teaching Python to some beginners and they
find this message absolutely inscrutable.

Also agree with Tim about "found" vs. "got", but this is of secondary
importance.



From Moshe Zadka <moshez@math.huji.ac.il>  Fri Dec  1 14:26:03 2000
From: Moshe Zadka <moshez@math.huji.ac.il> (Moshe Zadka)
Date: Fri, 1 Dec 2000 16:26:03 +0200 (IST)
Subject: [Python-Dev] [OT] Change of Address
Message-ID: <Pine.GSO.4.10.10012011624170.1366-100000@sundial>

I'm sorry to bother you all with this, but from time to time you might
need to reach me by e-mail...
30 days from now, this e-mail address will no longer be valid.
Please use anything@zadka.site.co.il to reach me.

Thank you for your time.
--
Moshe Zadka <moshez@zadka.site.co.il> -- 95855124
http://advogato.org/person/moshez



From gward@mems-exchange.org  Fri Dec  1 15:14:53 2000
From: gward@mems-exchange.org (Greg Ward)
Date: Fri, 1 Dec 2000 10:14:53 -0500
Subject: [Python-Dev] PEP 229 and 222
In-Reply-To: <014301c05b6e$269716a0$e000a8c0@thomasnotebook>; from thomas.heller@ion-tof.com on Fri, Dec 01, 2000 at 09:10:21AM +0100
References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us> <20001130181438.A21596@ludwig.cnri.reston.va.us> <014301c05b6e$269716a0$e000a8c0@thomasnotebook>
Message-ID: <20001201101452.A26074@ludwig.cnri.reston.va.us>

On 01 December 2000, Thomas Heller said:
> Distutils currently only supports build_*** commands for
> C-libraries and Python extensions.
> 
> Shouldn't there also be build commands for shared libraries,
> executable programs and static Python binaries?

Andrew and I talked about this a bit yesterday, and the proposed
interface is as follows:

    python setup.py build_ext --static

will compile all extensions in the current module distribution, but
instead of creating a .so (.pyd) file for each one, will create a new
python binary in build/bin.<plat>.

Issue to be resolved: what to call the new python binary, especially
when installing it (presumably we *don't* want to clobber the stock
binary, but supplement it with (eg.) "foopython").

Note that there is no provision for selectively building some extensions
as shared.  This means that Andrew's Distutil-ization of the standard
library will have to override the build_ext command and have some extra
way to select extensions for shared/static.  Neither of us considered
this a problem.

> BTW: Distutils-sig seems pretty dead these days...

Yeah, that's a combination of me playing on other things and python.net
email being dead for over a week.  I'll cc the sig on this and see if
this interface proposal gets anyone's attention.

        Greg


From jeremy@alum.mit.edu  Fri Dec  1 19:27:14 2000
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 1 Dec 2000 14:27:14 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
Message-ID: <14887.64402.88530.714821@bitdiddle.concentric.net>

There was recently some idle chatter in Guido's living room about
using a unit testing framework (like PyUnit) for the Python regression
test suite.  We're also writing tests for some DC projects, and need
to decide what framework to use.

Does anyone have opinions on test frameworks?  A quick web search
turned up PyUnit (pyunit.sourceforge.net) and a script by Tres Seaver
that implements xUnit-style unit tests.  Are there other tools
we should consider?

Is anyone else interested in migrating the current test suite to a new
framework?  I hope the new framework will allow us to improve the test
suite in a number of ways:

    - run an entire test suite to completion instead of stopping on
      the first failure

    - clearer reporting of what went wrong

    - better support for conditional tests, e.g. write a test for
      httplib that only runs if the network is up.  This is tied into
      better error reporting, since the current test suite could only
      report that httplib succeeded or failed.

Jeremy


From fdrake@acm.org  Fri Dec  1 19:24:46 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 1 Dec 2000 14:24:46 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net>
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
Message-ID: <14887.64254.399477.935828@cj42289-a.reston1.va.home.com>

Jeremy Hylton writes:
 >     - better support for conditional tests, e.g. write a test for
 >       httplib that only runs if the network is up.  This is tied into
 >       better error reporting, since the current test suite could only
 >       report that httplib succeeded or failed.

  There is a TestSkipped exception that can be raised with an
explanation of why.  It's used in the largefile test (at least).  I
think it is documented in the README.
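
In today's unittest module the same idea is spelled SkipTest; a sketch
of a conditional network test (the connectivity probe here is just a
placeholder):

```python
import unittest

class HttplibTests(unittest.TestCase):
    def test_fetch(self):
        network_up = False            # placeholder probe, not a real check
        if not network_up:
            raise unittest.SkipTest("network is down")
        self.fail("never reached when the test is skipped")

suite = unittest.defaultTestLoader.loadTestsFromTestCase(HttplibTests)
result = unittest.TestResult()
suite.run(result)
# the skip is reported as a skip, not as a pass or a failure
```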


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From akuchlin@mems-exchange.org  Fri Dec  1 19:58:27 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 1 Dec 2000 14:58:27 -0500
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 02:27:14PM -0500
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
Message-ID: <20001201145827.D16751@kronos.cnri.reston.va.us>

On Fri, Dec 01, 2000 at 02:27:14PM -0500, Jeremy Hylton wrote:
>There was recently some idle chatter in Guido's living room about
>using a unit testing framework (like PyUnit) for the Python regression
>test suite.  We're also writing tests for some DC projects, and need

Someone remembered my post of 23 Nov, I see...  The only other test
framework I know of is the unittest.py inside Quixote, written because
we thought PyUnit was kind of clunky.  Greg Ward, who primarily wrote
it, used more sneaky interpreter tricks to make the interface more
natural, and it still worked with Jython last time we checked (some
time ago).  No GUI, but it can optionally show the code coverage of a
test suite, too.

See http://x63.deja.com/=usenet/getdoc.xp?AN=683946404 for some notes
on using it.  Obviously I think the Quixote unittest.py is the best
choice for the stdlib.

--amk


From jeremy@alum.mit.edu  Fri Dec  1 20:55:28 2000
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 1 Dec 2000 15:55:28 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <20001201145827.D16751@kronos.cnri.reston.va.us>
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
 <20001201145827.D16751@kronos.cnri.reston.va.us>
Message-ID: <14888.4160.838336.537708@bitdiddle.concentric.net>

Is there any documentation for the Quixote unittest tool?  The Example
page is helpful, but it feels like there are some details that are not
explained.

Jeremy


From akuchlin@mems-exchange.org  Fri Dec  1 21:12:12 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 1 Dec 2000 16:12:12 -0500
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14888.4160.838336.537708@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 03:55:28PM -0500
References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net>
Message-ID: <20001201161212.A12372@kronos.cnri.reston.va.us>

On Fri, Dec 01, 2000 at 03:55:28PM -0500, Jeremy Hylton wrote:
>Is there any documentation for the Quixote unittest tool?  The Example
>page is helpful, but it feels like there are some details that are not
>explained.

I don't believe we've written docs at all for internal use.  What
details seem to be missing?

--amk



From jeremy@alum.mit.edu  Fri Dec  1 21:21:27 2000
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 1 Dec 2000 16:21:27 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <20001201161212.A12372@kronos.cnri.reston.va.us>
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
 <20001201145827.D16751@kronos.cnri.reston.va.us>
 <14888.4160.838336.537708@bitdiddle.concentric.net>
 <20001201161212.A12372@kronos.cnri.reston.va.us>
Message-ID: <14888.5719.844387.435471@bitdiddle.concentric.net>

>>>>> "AMK" == Andrew Kuchling <akuchlin@mems-exchange.org> writes:

  AMK> On Fri, Dec 01, 2000 at 03:55:28PM -0500, Jeremy Hylton wrote:
  >> Is there any documentation for the Quixote unittest tool?  The
  >> Example page is helpful, but it feels like there are some details
  >> that are not explained.

  AMK> I don't believe we've written docs at all for internal use.
  AMK> What details seem to be missing?

Details:

   - I assume setup/shutdown are equivalent to setUp/tearDown 
   - Is it possible to override constructor for TestScenario?
   - Is there something equivalent to PyUnit self.assert_
   - What does parse_args() do?
   - What does run_scenarios() do?
   - If I have multiple scenarios, how do I get them to run?

Jeremy



From akuchlin@mems-exchange.org  Fri Dec  1 21:34:30 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 1 Dec 2000 16:34:30 -0500
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14888.5719.844387.435471@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 04:21:27PM -0500
References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net> <20001201161212.A12372@kronos.cnri.reston.va.us> <14888.5719.844387.435471@bitdiddle.concentric.net>
Message-ID: <20001201163430.A12417@kronos.cnri.reston.va.us>

On Fri, Dec 01, 2000 at 04:21:27PM -0500, Jeremy Hylton wrote:
>   - I assume setup/shutdown are equivalent to setUp/tearDown 

Correct.

>   - Is it possible to override constructor for TestScenario?

Beats me; I see no reason why you couldn't, though.

>   - Is there something equivalent to PyUnit self.assert_

Probably test_bool(), I guess: self.test_bool('self.run.is_draft()')
asserts that self.run.is_draft() will return true.  Or does
self.assert_() do something more?

>   - What does parse_args() do?
>   - What does run_scenarios() do?
>   - If I have multiple scenarios, how do I get them to run?

These 3 questions are all related, really.  At the bottom of our test
scripts, we have the following stereotyped code:

if __name__ == "__main__":
    (scenarios, options) = parse_args()
    run_scenarios (scenarios, options)

parse_args() ensures consistent arguments to test scripts; -c measures
code coverage, -v is verbose, etc.  It also looks in the __main__
module and finds all subclasses of TestScenario, so you can do:  

python test_process_run.py  # Runs all N scenarios
python test_process_run.py ProcessRunTest # Runs all cases for 1 scenario
python test_process_run.py ProcessRunTest:check_access # Runs one test case
                                                       # in one scenario class

--amk



From tim.one@home.com  Fri Dec  1 21:47:54 2000
From: tim.one@home.com (Tim Peters)
Date: Fri, 1 Dec 2000 16:47:54 -0500
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECFICAA.tim.one@home.com>

[Jeremy Hylton]
> There was recently some idle chatter in Guido's living room about
> using a unit testing framework (like PyUnit) for the Python regression
> test suite.  We're also writing tests for some DC projects, and need
> to decide what framework to use.
>
> Does anyone have opinions on test frameworks?  A quick web search
> turned up PyUnit (pyunit.sourceforge.net) and a script by Tres Seaver
> that implements xUnit-style unit tests.  Are there other tools
> we should consider?

My own doctest is loved by people other than just me <wink>, but is aimed at
ensuring that examples in docstrings work exactly as shown (which is why it
starts with "doc" instead of "test").

> Is anyone else interested in migrating the current test suite to a new
> framework?

Yes.

> I hope the new framework will allow us to improve the test
> suite in a number of ways:
>
>     - run an entire test suite to completion instead of stopping on
>       the first failure

doctest does that.

>     - clearer reporting of what went wrong

Ditto.

>     - better support for conditional tests, e.g. write a test for
>       httplib that only runs if the network is up.  This is tied into
>       better error reporting, since the current test suite could only
>       report that httplib succeeded or failed.

A doctest test is simply an interactive Python session pasted into a
docstring (or more than one session, and/or interspersed with prose).  If
you can write an example in the interactive shell, doctest will verify it
still works as advertised.  This allows for embedding unit tests into the
docs for each function, method and class.  Nothing about them "looks like"
an artificial test tacked on:  the examples in the docs *are* the test
cases.
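
A tiny self-contained illustration of the idea, driving doctest's
finder/runner by hand rather than through the usual doctest.testmod()
entry point:

```python
import doctest

def square(x):
    """Return x*x.  The interactive examples below *are* the test.

    >>> square(3)
    9
    >>> [square(n) for n in range(4)]
    [0, 1, 4, 9]
    """
    return x * x

finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(square, "square", module=False,
                        globs={"square": square}):
    runner.run(test)
# runner.tries counts examples executed; runner.failures counts mismatches
```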

I need to try the other frameworks.  I dare say doctest is ideal for
computational functions, where the intended input->output relationship can
be clearly explicated via examples.  It's useless for GUIs.  Usefulness
varies accordingly between those extremes (doctest is natural exactly to the
extent that a captured interactive session is helpful for documentation
purposes).

testing-ain't-easy-ly y'rs  - tim



From barry@digicool.com  Sat Dec  2 03:52:29 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Fri, 1 Dec 2000 22:52:29 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
Message-ID: <14888.29181.355023.669030@anthem.concentric.net>

I've just uploaded PEP 231, which describes a new hook in the instance
access mechanism, called __findattr__() after a similar mechanism that
exists in Jython (but is not exposed at the Python layer).

You can do all kinds of interesting things with __findattr__(),
including implement the __of__() protocol of ExtensionClass, and thus
implicit and explicit acquisitions, in pure Python.  You can also do
Java Bean-like interfaces and C++-like access control.  The PEP
contains sample implementations of all of these, although the latter
isn't as clean as I'd like, due to other restrictions in Python.

My hope is that __findattr__() would eliminate most, if not all, the
need for ExtensionClass, at least within the Zope and ZODB contexts.
I haven't tried to implement Persistent using it though.

Since it's a long PEP, I won't include it here.  You can read about it
at this URL

    http://python.sourceforge.net/peps/pep-0231.html

It includes a link to the patch implementing this feature on
SourceForge.

Enjoy,
-Barry


From Moshe Zadka <moshez@zadka.site.co.il>  Sat Dec  2 09:11:50 2000
From: Moshe Zadka <moshez@zadka.site.co.il> (Moshe Zadka)
Date: Sat, 2 Dec 2000 11:11:50 +0200 (IST)
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <14888.29181.355023.669030@anthem.concentric.net>
Message-ID: <Pine.GSO.4.10.10012021109320.1366-100000@sundial>

On Fri, 1 Dec 2000, Barry A. Warsaw wrote:

> I've just uploaded PEP 231, which describes a new hook in the instance
> access mechanism, called __findattr__() after a similar mechanism that
> exists in Jython (but is not exposed at the Python layer).

There's one thing that bothers me about this: what exactly is "the
call stack"? Let me clarify: what happens when you have threads?
Both machine-level threads and stackless threads confuse the issues
here, to say nothing of stackless continuations. Can you add a few
words to the PEP about dealing with those?



From mal@lemburg.com  Sat Dec  2 10:03:11 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 02 Dec 2000 11:03:11 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
Message-ID: <3A28C8DF.E430484F@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> I've just uploaded PEP 231, which describes a new hook in the instance
> access mechanism, called __findattr__() after a similar mechanism that
> exists in Jython (but is not exposed at the Python layer).
> 
> You can do all kinds of interesting things with __findattr__(),
> including implement the __of__() protocol of ExtensionClass, and thus
> implicit and explicit acquisitions, in pure Python.  You can also do
> Java Bean-like interfaces and C++-like access control.  The PEP
> contains sample implementations of all of these, although the latter
> isn't as clean as I'd like, due to other restrictions in Python.
> 
> My hope is that __findattr__() would eliminate most, if not all, the
> need for ExtensionClass, at least within the Zope and ZODB contexts.
> I haven't tried to implement Persistent using it though.

The PEP does define when and how __findattr__() is called,
but makes no statement about what it should do or return...

Here's a slightly different idea:

Given the name, I would expect it to go look for an attribute
and then return the attribute and its container (this
doesn't seem to be what you have in mind here, though).

An alternative approach given the semantics above would
then be to first try a __getattr__() lookup and revert
to __findattr__() in case this fails. I don't think there
is any need to overload __setattr__() in such a way, because
you cannot be sure which object actually gets the new attribute.

By exposing the functionality using a new builtin, findattr(),
this could be used for all the examples you give too.
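
The fallback order suggested here -- normal lookup first, hook only on
failure -- is essentially what the existing __getattr__ hook already
gives; a minimal sketch of that ordering:

```python
class Acquirer:
    """__getattr__ is consulted only after normal lookup fails."""

    def __init__(self, **attrs):
        self.__dict__.update(attrs)

    def __getattr__(self, name):
        # reached only for attributes the instance does not actually have
        return "acquired:" + name

obj = Acquirer(colour="red")
assert obj.colour == "red"               # normal lookup wins
assert obj.weight == "acquired:weight"   # falls back to the hook
```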

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From barry@digicool.com  Sat Dec  2 16:50:02 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Sat, 2 Dec 2000 11:50:02 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
 <3A28C8DF.E430484F@lemburg.com>
Message-ID: <14889.10298.621133.961677@anthem.concentric.net>

>>>>> "M" == M  <mal@lemburg.com> writes:

    M> The PEP does define when and how __findattr__() is called,
    M> but makes no statement about what it should do or return...

Good point.  I've clarified that in the PEP.

    M> Here's a slightly different idea:

    M> Given the name, I would expect it to go look for an attribute
    M> and then return the attribute and its container (this doesn't
    M> seem to be what you have in mind here, though).

No, because some applications won't need a wrapped object.  E.g. in
the Java bean example, it just returns the attribute (which is stored
with a slightly different name).

    M> An alternative approach given the semantics above would then be
    M> to first try a __getattr__() lookup and revert to
    M> __findattr__() in case this fails.

I don't think this is as useful.  What would that buy you that you
can't already do today?

The key concept here is that you want to give the class first crack to
interpose on every attribute access.  You want this hook to get called
before anybody else can get at, or set, your attributes.  That gives
you (the class) total control to implement whatever policy is useful.
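
That "first crack at every access" idea later did land for attribute
gets, as __getattribute__ on new-style classes (Python 2.2); a sketch
of the ordering it provides:

```python
class FirstCrack:
    def __init__(self):
        self.log = []
        self.x = 42

    def __getattribute__(self, name):
        # runs on *every* attribute get, before the instance dict is consulted
        if name != "log":
            object.__getattribute__(self, "log").append(name)
        return object.__getattribute__(self, name)

obj = FirstCrack()
value = obj.x
# value is 42, and the hook saw the access before the instance dict did
```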
    
    M> I don't think there is any need to overload __setattr__() in
    M> such a way, because you cannot be sure which object actually
    M> gets the new attribute.

    M> By exposing the functionality using a new builtin, findattr(),
    M> this could be used for all the examples you give too.

No, because then people couldn't use the object in the normal
dot-notational way.

-Barry


From tismer@tismer.com  Sat Dec  2 16:27:33 2000
From: tismer@tismer.com (Christian Tismer)
Date: Sat, 02 Dec 2000 18:27:33 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
Message-ID: <3A2922F5.C2E0D10@tismer.com>

Hi Barry,

"Barry A. Warsaw" wrote:
> 
> I've just uploaded PEP 231, which describes a new hook in the instance
> access mechanism, called __findattr__() after a similar mechanism that
> exists in Jython (but is not exposed at the Python layer).
> 
> You can do all kinds of interesting things with __findattr__(),
> including implement the __of__() protocol of ExtensionClass, and thus
> implicit and explicit acquisitions, in pure Python.  You can also do
> Java Bean-like interfaces and C++-like access control.  The PEP
> contains sample implementations of all of these, although the latter
> isn't as clean as I'd like, due to other restrictions in Python.
> 
> My hope is that __findattr__() would eliminate most, if not all, the
> need for ExtensionClass, at least within the Zope and ZODB contexts.
> I haven't tried to implement Persistent using it though.

I have been using ExtensionClass for quite a long time, and
I have to say that you indeed eliminate most of its need
through this super-elegant idea. Congratulations!

Besides acquisition and persistence interception,
wrapping plain C objects and giving them Class-like behavior
while retaining fast access to internal properties but being
able to override methods by Python methods was my other use
of ExtensionClass. I assume this is the other "20%" part you
mention, which is much harder to achieve?
But that part also looks easier to implement now, by the support
of the __findattr__ method.

> Since it's a long PEP, I won't include it here.  You can read about it
> at this URL
> 
>     http://python.sourceforge.net/peps/pep-0231.html

Great. I had to read it twice, but it was fun.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From tismer@tismer.com  Sat Dec  2 16:55:21 2000
From: tismer@tismer.com (Christian Tismer)
Date: Sat, 02 Dec 2000 18:55:21 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <Pine.GSO.4.10.10012021109320.1366-100000@sundial>
Message-ID: <3A292979.60BB1731@tismer.com>


Moshe Zadka wrote:
> 
> On Fri, 1 Dec 2000, Barry A. Warsaw wrote:
> 
> > I've just uploaded PEP 231, which describes a new hook in the instance
> > access mechanism, called __findattr__() after a similar mechanism that
> > exists in Jython (but is not exposed at the Python layer).
> 
> There's one thing that bothers me about this: what exactly is "the
> call stack"? Let me clarify: what happens when you have threads?
> Both machine-level threads and stackless threads confuse the issues
> here, to say nothing of stackless continuations. Can you add a few
> words to the PEP about dealing with those?

As far as I understood the patch (just skimmed), there is no
stack involved directly; instead the instance increments and decrements
a variable, infindaddr.

+       if (v != NULL && !inst->infindaddr &&
+           (func = inst->in_class->cl_findattr))
+       {
+               PyObject *args, *res;
+               args = Py_BuildValue("(OOO)", inst, name, v);
+               if (args == NULL)
+                       return -1;
+               ++inst->infindaddr;
+               res = PyEval_CallObject(func, args);
+               --inst->infindaddr;

That is: the call modifies the instance's state while calling
the findattr method.
You are right: I see a serious problem with this. It doesn't
even need continuations to get things messed up. Guido's
proposed coroutines, together with uThread-Switching, might
be able to enter the same instance twice with ease.

Barry, on second thought, I feel this can become
a problem in the future. This infindattr attribute
only works correctly if we are guaranteed to use
strict stack order of execution.
What you're *intending* to do is to tell PyEval_CallObject
that it should not find the __findattr__ attribute. But
this should be done only for this call and all of its descendants,
not for any *fresh* access from elsewhere.
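
The hazard can be sketched in a few lines of present-day Python (all
names here are hypothetical; the real patch keeps the flag in C):

```python
# Hypothetical illustration: a per-instance boolean guard suppresses
# *any* overlapping hook call, including a fresh, unrelated access
# that arrives via a coroutine switch or another thread.

class Instance:
    def __init__(self):
        self.infindattr = 0    # the guard, as in the patch
        self.log = []

    def findattr(self, name, switch=None):
        if self.infindattr:            # guard is up: hook suppressed
            self.log.append(("suppressed", name))
            return
        self.infindattr += 1
        try:
            self.log.append(("hook", name))
            if switch is not None:     # simulate a coroutine switch...
                switch()               # ...into a fresh access
        finally:
            self.infindattr -= 1

inst = Instance()
inst.findattr("a", switch=lambda: inst.findattr("b"))
print(inst.log)   # [('hook', 'a'), ('suppressed', 'b')]
```

The access to "b" is logically independent of the running hook for
"a", yet the shared flag silently suppresses it.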

The hard way to get out of this would be to stop scheduling
in that case. Maybe this is very cheap, but quite inelegant.

We have a quite peculiar system state here: A function call
acts like an escape, to make all subsequent calls behave
differently, until this call is finished.

Without blocking microthreads, a clean way to do this would be
to search up the frame chain for a running __findattr__
method of this object. Fairly expensive. Well, the problem
also exists with real threads, if they are allowed to switch
in such a context.

I fear it is necessary to either block this stuff until it is
ready, or to maintain some thread-wise structure for the
state of this object.

Ok, after thinking some more, I'll start an extra message
to Barry on this topic.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From tismer@tismer.com  Sat Dec  2 17:21:18 2000
From: tismer@tismer.com (Christian Tismer)
Date: Sat, 02 Dec 2000 19:21:18 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
Message-ID: <3A292F8D.7C616449@tismer.com>


"Barry A. Warsaw" wrote:
> 
> I've just uploaded PEP 231, which describes a new hook in the instance
> access mechanism, called __findattr__() after a similar mechanism that
> exists in Jython (but is not exposed at the Python layer).

Ok, as I announced already, here some thoughts on __findattr__,
system state, and how it could work.

Looking at your patch, I realize that you are blocking __findattr__
for your whole instance, until this call ends.
This is not what you want to do, I guess. It affects
the whole system state when threads are involved.
Also you cannot use __findattr__ on any other attribute during
this call.

You most probably want to do this:
__findattr__ should not be invoked again for this instance,
with this attribute name, for this "thread", until you are done.

The correct way to find out whether __findattr__ is active or
not would be to walk up the frame chain and inspect it.
Moshe also asked about continuations: I think this would resolve
quite fine. However we jump around, the current chain of frames
dictates the semantics of __findattr__. It even applies to
Guido's tamed coroutines, given that an explicit switch were
allowed in the context of __findattr__.

In a sense, we get some kind of dynamic context here, since
we need to do a lookup for something in the dynamic call chain.
I guess this would be quite messy to implement, and inefficient.

Isn't there a way to accomplish the desired effect without
modifying the instance? In the context of __findattr__, *we*
know that we don't want to get a recursive call.
Let's assume __getattr__ and __setattr__ had yet another
optional parameter: infindattr, defaulting to 0.
We would then have to pass a positive value in this context,
which would tell object.c not to try to invoke __findattr__
again.
With explicit passing of state, no problems with threads
can occur. Readability might improve as well.
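
A rough sketch of the explicit-state idea (the API and names are
invented for illustration; the real change would live in object.c):

```python
# Pass the "inside __findattr__" state as a parameter instead of
# storing it on the instance -- the state then travels with the call.

def getattr_hooked(inst, name, infindattr=0):
    hook = inst.__class__.__dict__.get("__findattr__")
    if hook is not None and not infindattr:
        return hook(inst, name)
    return inst.__dict__[name]       # plain lookup, hook bypassed

class Watched:
    def __findattr__(self, name):
        # The hook does its own lookups with infindattr=1, so it can
        # never recurse into itself; no shared mutable flag involved.
        return ("hooked", getattr_hooked(self, name, infindattr=1))

w = Watched()
w.__dict__["x"] = 42
print(getattr_hooked(w, "x"))   # ('hooked', 42)
```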

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From Moshe Zadka <moshez@zadka.site.co.il>  Sun Dec  3 13:14:43 2000
From: Moshe Zadka <moshez@zadka.site.co.il> (Moshe Zadka)
Date: Sun, 3 Dec 2000 15:14:43 +0200 (IST)
Subject: [Python-Dev] Another Python Developer Missing
Message-ID: <Pine.GSO.4.10.10012031512430.7826-100000@sundial>

Gordon McMillan is not a possible assignee in the assign_to field.


--
Moshe Zadka <moshez@zadka.site.co.il> -- 95855124
http://moshez.org



From tim.one@home.com  Sun Dec  3 17:35:36 2000
From: tim.one@home.com (Tim Peters)
Date: Sun, 3 Dec 2000 12:35:36 -0500
Subject: [Python-Dev] Another Python Developer Missing
In-Reply-To: <Pine.GSO.4.10.10012031512430.7826-100000@sundial>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEDOICAA.tim.one@home.com>

[Moshe Zadka]
> Gordon McMillan is not a possible assignee in the assign_to field.

We almost never add people as Python developers unless they ask for that,
since it comes with responsibility as well as riches beyond the dreams of
avarice.  If Gordon would like to apply, we won't charge him any interest
until 2001 <wink>.



From mal@lemburg.com  Sun Dec  3 19:21:11 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 03 Dec 2000 20:21:11 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib urllib.py,1.107,1.108
References: <200012031830.KAA30620@slayer.i.sourceforge.net>
Message-ID: <3A2A9D27.AF43D665@lemburg.com>

"Martin v. Löwis" wrote:
> 
> Update of /cvsroot/python/python/dist/src/Lib
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv30506
> 
> Modified Files:
>         urllib.py
> Log Message:
> Convert Unicode strings to byte strings before passing them into specific
> protocols. Closes bug #119822.
> 
> ...
> +
> + def toBytes(url):
> +     """toBytes(u"URL") --> 'URL'."""
> +     # Most URL schemes require ASCII. If that changes, the conversion
> +     # can be relaxed
> +     if type(url) is types.UnicodeType:
> +         try:
> +             url = url.encode("ASCII")

You should make this: 'ascii' -- encoding names are lower case 
per convention (and the implementation has a short-cut to speed up
conversion to 'ascii' -- not for 'ASCII').

> +         except UnicodeError:
> +             raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters")

Would it be better to use a simple ValueError here? (UnicodeError
is a subclass of ValueError, but the error doesn't really have anything
to do with Unicode conversions...)

> +     return url
> 
>   def unwrap(url):
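
For reference, the conversion under discussion restated as a small
present-day sketch with the lowercase codec name (details simplified,
function name kept):

```python
def to_bytes(url):
    """Encode a URL to ASCII bytes; pass byte strings through."""
    if isinstance(url, str):
        try:
            return url.encode("ascii")   # lower case per codec convention
        except UnicodeError:
            raise UnicodeError(
                "URL %r contains non-ASCII characters" % url)
    return url

print(to_bytes("http://example.com/"))   # b'http://example.com/'
```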


-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tismer@tismer.com  Sun Dec  3 20:01:07 2000
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 03 Dec 2000 22:01:07 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib filecmp.py,1.6,1.7
References: <200012032048.MAA10353@slayer.i.sourceforge.net>
Message-ID: <3A2AA683.3840AA8A@tismer.com>


Moshe Zadka wrote:
> 
> Update of /cvsroot/python/python/dist/src/Lib
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv9465
> 
> Modified Files:
>         filecmp.py
> Log Message:
> Call of _cmp had wrong number of parameters.
> Fixed definition of _cmp.

...

> !         return not abs(cmp(a, b, sh, st))
>       except os.error:
>           return 2

Ugh! Wouldn't that be a fine chance to rename the cmp
function in this module? Overriding a built-in
is really not nice to have in a library.
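
The trap Christian is pointing at, in a minimal (hypothetical) module:

```python
# Once a module defines its own cmp, every later use of the name in
# that module silently means the file-comparison function, not the
# builtin (which in Python 2 returned -1, 0, or 1).

def cmp(f1, f2):          # shadows the builtin for the whole module
    return f1 == f2       # stand-in for the real file comparison

def report(a, b):
    return cmp(a, b)      # resolves to the module's cmp, not the builtin

print(report(1, 2))   # False -- builtin cmp(1, 2) would have been -1
```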

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From Moshe Zadka <moshez@zadka.site.co.il>  Sun Dec  3 21:01:07 2000
From: Moshe Zadka <moshez@zadka.site.co.il> (Moshe Zadka)
Date: Sun, 3 Dec 2000 23:01:07 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 filecmp.py,1.6,1.7
In-Reply-To: <3A2AA683.3840AA8A@tismer.com>
Message-ID: <Pine.GSO.4.10.10012032300100.7608-100000@sundial>

On Sun, 3 Dec 2000, Christian Tismer wrote:

> Ugh! Wouldn't that be a fine chance to rename the cmp
> function in this module? Overriding a built-in
> is really not nice to have in a library.

The fine chance was when we moved cmp.py->filecmp.py. 
Now it would just break backwards compatibility.
--
Moshe Zadka <moshez@zadka.site.co.il> -- 95855124
http://moshez.org



From tismer@tismer.com  Sun Dec  3 20:12:15 2000
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 03 Dec 2000 22:12:15 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS:
 python/dist/src/Libfilecmp.py,1.6,1.7
References: <Pine.GSO.4.10.10012032300100.7608-100000@sundial>
Message-ID: <3A2AA91F.843E2BAE@tismer.com>


Moshe Zadka wrote:
> 
> On Sun, 3 Dec 2000, Christian Tismer wrote:
> 
> > Ugh! Wouldn't that be a fine chance to rename the cmp
> > function in this module? Overriding a built-in
> > is really not nice to have in a library.
> 
> The fine chance was when we moved cmp.py->filecmp.py.
> Now it would just break backwards compatability.

Yes, I see. cmp belongs to the module's interface.
Maybe it could be renamed anyway, and be assigned
to cmp at the very end of the file, but without using
cmp anywhere in the code. My first reaction on reading
the patch was "yuck!" since I didn't know this module.

python-dev/null - ly y'rs - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From martin@loewis.home.cs.tu-berlin.de  Sun Dec  3 21:56:44 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 3 Dec 2000 22:56:44 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
Message-ID: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>

> Isn't there a way to accomplish the desired effect without modifying
> the instance? In the context of __findattr__, *we* know that we
> don't want to get a recursive call.  Let's assume __getattr__ and
> __setattr__ had yet another optional parameter: infindattr,
> defaulting to 0.  We would than have to pass a positive value in
> this context, which would object.c tell to not try to invoke
> __findattr__ again.

Who is "we" here? The Python code implementing __findattr__? How would
it pass a value to __setattr__? It doesn't call __setattr__, instead
it has "self.__myfoo = x"...

I agree that the current implementation is not thread-safe. To solve
that, you'd need to associate with each instance not a single
"infindattr" attribute, but a whole set of them - one per "thread of
execution" (which would be a thread-id in most threading systems). Of
course, that would need some cooperation from any thread scheme
(including uthreads), which would need to provide an identification
for a "calling context".

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Sun Dec  3 22:07:17 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 3 Dec 2000 23:07:17 +0100
Subject: [Python-Dev] Re: CVS: python/dist/src/Lib urllib.py,1.107,1.108
Message-ID: <200012032207.XAA03394@loewis.home.cs.tu-berlin.de>

> You should make this: 'ascii' -- encoding names are lower case per
> convention (and the implementation has a short-cut to speed up
> conversion to 'ascii' -- not for 'ASCII').

With conventions, it is a difficult story. I'm pretty certain that
users typically see that particular American standard as ASCII (to the
extent of calling it "a s c two"), not ascii.

As for speed - feel free to change the code if you think it matters.

> +             raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters")

> Would it be better to use a simple ValueError here ? (UnicodeError
> is a subclass of ValueError, but the error doesn't really have
> something to do with Unicode conversions...)

Why does it not have to do with Unicode conversion? A conversion from
Unicode to ASCII was attempted, and failed.

I guess I would be more open to suggested changes if you had put them
into the patch manager at the time you reviewed the patch...

Regards,
Martin


From tismer@tismer.com  Sun Dec  3 21:38:11 2000
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 03 Dec 2000 23:38:11 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
Message-ID: <3A2ABD43.AB56BD60@tismer.com>


"Martin v. Loewis" wrote:
> 
> > Isn't there a way to accomplish the desired effect without modifying
> > the instance? In the context of __findattr__, *we* know that we
> > don't want to get a recursive call.  Let's assume __getattr__ and
> > __setattr__ had yet another optional parameter: infindattr,
> > defaulting to 0.  We would than have to pass a positive value in
> > this context, which would object.c tell to not try to invoke
> > __findattr__ again.
> 
> Who is "we" here? The Python code implementing __findattr__? How would
> it pass a value to __setattr__? It doesn't call __setattr__, instead
> it has "self.__myfoo = x"...

Ouch - right! Sorry :)

> I agree that the current implementation is not thread-safe. To solve
> that, you'd need to associate with each instance not a single
> "infindattr" attribute, but a whole set of them - one per "thread of
> execution" (which would be a thread-id in most threading systems). Of
> course, that would need some cooperation from the any thread scheme
> (including uthreads), which would need to provide an identification
> for a "calling context".

Right, that is one possible way to do it. I also thought about
some alternatives, but they all sound too complicated to
justify them. Also I don't think this is only thread-related,
since a mess can happen even with an explicit coroutine jump.
Furthermore, how do we deal with multiple attribute names?
The function works incorrectly if __findattr__ tries to inspect
another attribute.

IMO, the state of the current interpreter changes here
(or should do so), and this changed state needs to be carried
down with all subsequent function calls.

confused - ly chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From mal@lemburg.com  Sun Dec  3 22:51:10 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 03 Dec 2000 23:51:10 +0100
Subject: [Python-Dev] Re: CVS: python/dist/src/Lib urllib.py,1.107,1.108
References: <200012032207.XAA03394@loewis.home.cs.tu-berlin.de>
Message-ID: <3A2ACE5E.A9F860A8@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > You should make this: 'ascii' -- encoding names are lower case per
> > convention (and the implementation has a short-cut to speed up
> > conversion to 'ascii' -- not for 'ASCII').
> 
> With conventions, it is a difficult story. I'm pretty certain that
> users typically see that particular american standard as ASCII (to the
> extend of calling it "a s c two"), not ascii.

It's a convention in the codec registry design and used as such
in the Unicode implementation.
 
> As for speed - feel free to change the code if you think it matters.

Hey... this was just a suggestion. I thought that you didn't
know of the internal short-cut and wanted to hint at it.
 
> > +             raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters")
> 
> > Would it be better to use a simple ValueError here ? (UnicodeError
> > is a subclass of ValueError, but the error doesn't really have
> > something to do with Unicode conversions...)
> 
> Why does it not have to do with Unicode conversion? A conversion from
> Unicode to ASCII was attempted, and failed.

Sure, but the fact that URLs have to be ASCII is not something
that is enforced by the Unicode implementation.
 
> I guess I would be more open to suggested changes if you had put them
> into the patch manager at the time you've reviewed the patch...

I didn't review the patch, only the summary...

Don't have much time to look into these things closely right now, so
all I can do is comment.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From barry@scottb.demon.co.uk  Mon Dec  4 00:55:32 2000
From: barry@scottb.demon.co.uk (Barry Scott)
Date: Mon, 4 Dec 2000 00:55:32 -0000
Subject: [Python-Dev] A house upon the sand
In-Reply-To: <20001130181937.B21596@ludwig.cnri.reston.va.us>
Message-ID: <000201c05d8c$e7a15b10$060210ac@private>

I fully support Greg Ward's view. If string was removed I'd not
update the old code but add in my own string module.

Given the effort you guys went to to keep the C extension protocol the
same (in the context of crashing on importing a 1.5 DLL into 2.0), I'm
amazed you think that string could be removed...

Could you split the lib into blessed and backward compatibility sections?
Then by some suitable mechanism I can choose the compatibility I need?

Oh and as for join obviously a method of a list...

	['thats','better'].join(' ')

		Barry



From fredrik@pythonware.com  Mon Dec  4 10:37:18 2000
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 4 Dec 2000 11:37:18 +0100
Subject: [Python-Dev] unit testing and Python regression test
References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us>
Message-ID: <00e701c05dde$2d77c240$0900a8c0@SPIFF>

andrew kuchling wrote:
> Someone remembered my post of 23 Nov, I see...  The only other test
> framework I know of is the unittest.py inside Quixote, written because
> we thought PyUnit was kind of clunky.

the pythonware team agrees -- we've been using an internal
reimplementation of Kent Beck's original Smalltalk work, but
we're switching to unittest.py.

> Obviously I think the Quixote unittest.py is the best choice for the stdlib.

+1 from here.

</F>



From mal@lemburg.com  Mon Dec  4 11:14:20 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 04 Dec 2000 12:14:20 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
 <3A28C8DF.E430484F@lemburg.com> <14889.10298.621133.961677@anthem.concentric.net>
Message-ID: <3A2B7C8C.D6B889EE@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "M" == M  <mal@lemburg.com> writes:
> 
>     M> The PEP does define when and how __findattr__() is called,
>     M> but makes no statement about what it should do or return...
> 
> Good point.  I've clarified that in the PEP.
> 
>     M> Here's a slightly different idea:
> 
>     M> Given the name, I would expect it to go look for an attribute
>     M> and then return the attribute and its container (this doesn't
>     M> seem to be what you have in mind here, though).
> 
> No, because some applications won't need a wrapped object.  E.g. in
> the Java bean example, it just returns the attribute (which is stored
> with a slightly different name).

I was thinking of a standardised helper which could then be
used for all kinds of attribute retrieval techniques. Acquisition
would be easy to do, access control too. In most cases __findattr__
would simply return (self, self.attrname).
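
The (container, attribute) return convention would make acquisition a
few lines; a hypothetical sketch (this is not the PEP's actual
protocol):

```python
class Acquirer:
    def __init__(self, parent=None, **attrs):
        self.parent = parent
        self.attrs = attrs

    def findattr(self, name):
        if name in self.attrs:
            return (self, self.attrs[name])    # (container, value)
        if self.parent is not None:
            return self.parent.findattr(name)  # acquire from the parent
        raise AttributeError(name)

root = Acquirer(color="red")
child = Acquirer(parent=root)
container, value = child.findattr("color")
print(value, container is root)   # red True
```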
 
>     M> An alternative approach given the semantics above would then be
>     M> to first try a __getattr__() lookup and revert to
>     M> __findattr__() in case this fails.
> 
> I don't think this is as useful.  What would that buy you that you
> can't already do today?

Forget that idea... *always* calling __findattr__ is the more
useful way, just like you intended.
 
> The key concept here is that you want to give the class first crack to
> interpose on every attribute access.  You want this hook to get called
> before anybody else can get at, or set, your attributes.  That gives
> you (the class) total control to implement whatever policy is useful.

Right.
 
>     M> I don't think there is any need to overload __setattr__() in
>     M> such a way, because you cannot be sure which object actually
>     M> gets the new attribute.
> 
>     M> By exposing the functionality using a new builtin, findattr(),
>     M> this could be used for all the examples you give too.
> 
> No, because then people couldn't use the object in the normal
> dot-notational way.

Uhm, why not ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From gvwilson@nevex.com  Mon Dec  4 14:40:58 2000
From: gvwilson@nevex.com (Greg Wilson)
Date: Mon, 4 Dec 2000 09:40:58 -0500
Subject: [Python-Dev] Q: Python standard library re-org plans/schedule?
In-Reply-To: <20001201145827.D16751@kronos.cnri.reston.va.us>
Message-ID: <NEBBIACCCGNFLECLCLMHCEADCBAA.gvwilson@nevex.com>

Hi, everyone.  A potential customer has asked whether there are any
plans to re-organize and rationalize the Python standard library.
If there are any firm plans, and a schedule (however tentative),
I'd be grateful for a pointer.

Thanks,
Greg


From barry@digicool.com  Mon Dec  4 15:13:23 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 4 Dec 2000 10:13:23 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
Message-ID: <14891.46227.785856.307437@anthem.concentric.net>

>>>>> "MvL" == Martin v Loewis <martin@loewis.home.cs.tu-berlin.de> writes:

    MvL> I agree that the current implementation is not
    MvL> thread-safe. To solve that, you'd need to associate with each
    MvL> instance not a single "infindattr" attribute, but a whole set
    MvL> of them - one per "thread of execution" (which would be a
    MvL> thread-id in most threading systems). Of course, that would
    MvL> need some cooperation from the any thread scheme (including
    MvL> uthreads), which would need to provide an identification for
    MvL> a "calling context".

I'm still catching up on several hundred emails over the weekend.  I
had a sneaking suspicion that infindattr wasn't thread-safe, so I'm
convinced this is a bug in the implementation.  One approach might be
to store the info in the thread state object (isn't that how the
recursive repr stop flag is stored?).  That would also save having to
allocate an extra int for every instance (yuck) but might impose a bit
more of a performance overhead.
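
In present-day terms the idea corresponds to thread-local storage (the
2.0 C implementation would hang the flag off the thread state struct;
this Python sketch only mirrors the behavior):

```python
import threading

_state = threading.local()   # one slot per thread of execution

def in_findattr():
    return getattr(_state, "infindattr", 0)

def enter_findattr():
    # No per-instance int needed; each thread sees only its own flag.
    _state.infindattr = in_findattr() + 1

def leave_findattr():
    _state.infindattr = in_findattr() - 1

enter_findattr()
print(in_findattr())   # 1, visible in this thread only
leave_findattr()
```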

I'll work more on this later today.
-Barry


From jeremy@alum.mit.edu  Mon Dec  4 15:23:10 2000
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 4 Dec 2000 10:23:10 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <00e701c05dde$2d77c240$0900a8c0@SPIFF>
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
 <20001201145827.D16751@kronos.cnri.reston.va.us>
 <00e701c05dde$2d77c240$0900a8c0@SPIFF>
Message-ID: <14891.46814.359333.76720@bitdiddle.concentric.net>

>>>>> "FL" == Fredrik Lundh <fredrik@pythonware.com> writes:

  FL> andrew kuchling wrote:
  >> Someone remembered my post of 23 Nov, I see...  The only other
  >> test framework I know of is the unittest.py inside Quixote,
  >> written because we thought PyUnit was kind of clunky.

  FL> the pythonware teams agree -- we've been using an internal
  FL> reimplementation of Kent Beck's original Smalltalk work, but
  FL> we're switching to unittest.py.

Can you provide any specifics about what you like about unittest.py
(perhaps as opposed to PyUnit)?

Jeremy


From guido@python.org  Mon Dec  4 15:20:11 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 04 Dec 2000 10:20:11 -0500
Subject: [Python-Dev] Q: Python standard library re-org plans/schedule?
In-Reply-To: Your message of "Mon, 04 Dec 2000 09:40:58 EST."
 <NEBBIACCCGNFLECLCLMHCEADCBAA.gvwilson@nevex.com>
References: <NEBBIACCCGNFLECLCLMHCEADCBAA.gvwilson@nevex.com>
Message-ID: <200012041520.KAA20979@cj20424-a.reston1.va.home.com>

> Hi, everyone.  A potential customer has asked whether there are any
> plans to re-organize and rationalize the Python standard library.
> If there are any firms plans, and a schedule (however tentative),
> I'd be grateful for a pointer.

Alas, none that I know of except the ineffable Python 3000
schedule. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Mon Dec  4 15:46:53 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 4 Dec 2000 10:46:53 -0500
Subject: [Python-Dev] Quixote unit testing docs (Was: unit testing)
In-Reply-To: <14891.46814.359333.76720@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Mon, Dec 04, 2000 at 10:23:10AM -0500
References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <00e701c05dde$2d77c240$0900a8c0@SPIFF> <14891.46814.359333.76720@bitdiddle.concentric.net>
Message-ID: <20001204104653.A19387@kronos.cnri.reston.va.us>

Prodded by Jeremy, I went and actually wrote some documentation for
the Quixote unittest.py; please see 
<URL:http://www.amk.ca/python/unittest.html>.

The HTML is from a manually hacked Library Reference, so ignore the
broken image links and other formatting goofyness.  In case anyone
needs it, the LaTeX is in /files/python/.  The plain text version
comes out to around 290 lines; I can post it to this list if that's
desired.

--amk



From pf@artcom-gmbh.de  Mon Dec  4 17:59:54 2000
From: pf@artcom-gmbh.de (Peter Funk)
Date: Mon, 4 Dec 2000 18:59:54 +0100 (MET)
Subject: Tim Peter's doctest compared to Quixote unit testing (was Re: [Python-Dev] Quixote unit testing docs)
In-Reply-To: <20001204104653.A19387@kronos.cnri.reston.va.us> from Andrew Kuchling at "Dec 4, 2000 10:46:53 am"
Message-ID: <m142zu6-000Dm8C@artcom0.artcom-gmbh.de>

Hi all,

Andrew Kuchling:
> ... I ... actually wrote some documentation for
> the Quixote unittest.py; please see 
> <URL:http://www.amk.ca/python/unittest.html>.
[...]
> comes out to around 290 lines; I can post it to this list if that's
> desired.

After reading Andrew's docs, I think Quixote basically offers 
three additional features compared with Tim Peters' 'doctest':
 1. integration of Skip Montanaro's code coverage analysis. 
 2. the idea of Scenario objects, useful to share the setup needed to
    test related functions or methods of a class (same start condition).
 3. some useful functions to check whether the result returned
    by some test fulfills certain properties, without having to be
    as explicit as a cut'n'paste from an interactive interpreter
    session would have been.

As I've pointed out before in private mail to Jeremy, I've used Tim
Peters' 'doctest.py' to accomplish all testing of Python apps in our company.

In doctest each doc string is an independent unit which starts fresh.
Sometimes this leads to duplicated setup stuff, which is needed
to test each method of a set of related methods from a class.
This is distracting if you intend the test cases to play their
double role of being at the same time useful documentation examples 
of the intended use of the provided API.
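
The duplication looks like this in a runnable miniature (class and
values invented):

```python
import doctest

class Account:
    """
    >>> a = Account(100)
    >>> a.balance
    100
    """
    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        """
        >>> a = Account(100)   # the same setup, repeated per docstring
        >>> a.deposit(50)
        >>> a.balance
        150
        """
        self.balance += amount

# Each docstring runs as its own independent unit:
runner = doctest.DocTestRunner()
for test in doctest.DocTestFinder().find(
        Account, "Account", globs={"Account": Account}):
    runner.run(test)
print(runner.failures)   # 0
```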

Tim_one: Do you read this?  What do you think about the idea of adding 
something like the following two functions to 'doctest':
use_module_scenario() -- imports all objects created and preserved during
    execution of the module doc string examples.
use_class_scenario() -- imports all objects created and preserved during 
    the execution of doc string examples of a class.  Only allowed in doc
    string examples of methods.  

This would make it easy to provide the same setup scenario to a group
of related test cases.

As far as I understand, doctest handles test shutdown automatically, iff
the doc string test examples leave no persistent resources behind.

Regards, Peter


From moshez@zadka.site.co.il  Tue Dec  5 03:31:18 2000
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Tue, 05 Dec 2000 05:31:18 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw)
 of "Mon, 04 Dec 2000 10:13:23 EST." <14891.46227.785856.307437@anthem.concentric.net>
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>  <14891.46227.785856.307437@anthem.concentric.net>
Message-ID: <20001205033118.9135CA817@darjeeling.zadka.site.co.il>

> I'm still catching up on several hundred emails over the weekend.  I
> had a sneaking suspicion that infindattr wasn't thread-safe, so I'm
> convinced this is a bug in the implementation.  One approach might be
> to store the info in the thread state object

I don't think this is a good idea -- continuations and coroutines might
mess it up. Maybe the right thing is to mess with the *compilation* of
__findattr__ so that it would call __setattr__ and __getattr__ with
special flags that stop them from calling __findattr__? This is 
ugly, but I can't think of a better way.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From tismer@tismer.com  Mon Dec  4 18:35:19 2000
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 04 Dec 2000 20:35:19 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>  <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il>
Message-ID: <3A2BE3E7.60A8E220@tismer.com>


Moshe Zadka wrote:
> 
> > I'm still catching up on several hundred emails over the weekend.  I
> > had a sneaking suspicion that infindattr wasn't thread-safe, so I'm
> > convinced this is a bug in the implementation.  One approach might be
> > to store the info in the thread state object
> 
> I don't think this is a good idea -- continuations and coroutines might
> mess it up. Maybe the right thing is to mess with the *compilation* of
> __findattr__ so that it would call __setattr__ and __getattr__ with
> special flags that stop them from calling __findattr__? This is
> ugly, but I can't think of a better way.

Yeah, this is what I tried to say by "different machine state";
compiling different behavior in the case of a special method
is an interesting idea. It is somewhat limited, since the
changed system state is not inherited by called functions.
But if __findattr__ performs its one, single task in its
body alone, we are fine.

still-thinking-of-alternatives - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From tismer@tismer.com  Mon Dec  4 18:52:43 2000
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 04 Dec 2000 20:52:43 +0200
Subject: [Python-Dev] A house upon the sand
References: <000201c05d8c$e7a15b10$060210ac@private>
Message-ID: <3A2BE7FB.831F2F93@tismer.com>


Barry Scott wrote:
> 
> I fully support Greg Wards view. If string was removed I'd not
> update the old code but add in my own string module.
> 
> Given the effort you guys went to to keep the C extension protocol the
> same (in the context of crashing on importing a 1.5 dll into 2.0) I'm
> amazed you think that string could be removed...
> 
> Could you split the lib into blessed and backward compatibility sections?
> Then by some suitable mechanism I can choose the compatibility I need?
> 
> Oh and as for join obviously a method of a list...
> 
>         ['thats','better'].join(' ')

The above is the way it is defined in JavaScript. But in
JavaScript, the list join method performs an implicit str()
on the list elements.
As has been discussed some time ago, Python's lists are
too versatile to justify a string-centric method.

Marc André pointed out that one could do a reduction with the
semantics of the "+" operator, but Guido said that he wouldn't
like to see

      [2, 3, 5].join(7)

being reduced to 2+7+3+7+5 == 24.
That could only be avoided if there were a way to distinguish
numeric addition from concatenation.
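For concreteness, that reduction can be sketched in plain Python (the
name `join_reduce` is hypothetical; no such list method exists):

```python
def join_reduce(items, sep):
    # Fold the separator in with "+", as a hypothetical list join with
    # "+"-semantics would: [2, 3, 5] joined with 7 -> 2+7+3+7+5.
    result = items[0]
    for item in items[1:]:
        result = result + sep + item
    return result

print(join_reduce([2, 3, 5], 7))       # 24 -- numeric "+" adds
print(join_reduce(['a', 'b'], '-'))    # a-b -- string "+" concatenates
```

The ambiguity is visible directly: the very same code adds or
concatenates depending on the element type.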

but-I-could-live-with-it - ly y'rs - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From barry@digicool.com  Mon Dec  4 21:23:00 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 4 Dec 2000 16:23:00 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
 <14891.46227.785856.307437@anthem.concentric.net>
 <20001205033118.9135CA817@darjeeling.zadka.site.co.il>
Message-ID: <14892.2868.982013.313562@anthem.concentric.net>

>>>>> "CT" == Christian Tismer <tismer@tismer.com> writes:

    CT> You most probably want to do this: __findattr__ should not be
    CT> invoked again for this instance, with this attribute name, for
    CT> this "thread", until you are done.

First, I think the rule should be "__findattr__ should not be invoked
again for this instance, in this thread, until you are done".
I.e. once in __findattr__, you want all subsequent attribute
references to bypass findattr, because presumably, your instance now
has complete control for all accesses in this thread.  You don't want
to limit it to just the currently named attribute.

Second, if "this thread" is defined as _PyThreadState_Current, then we
have a simple solution, as I mapped out earlier.  We do a
PyThreadState_GetDict() and store the instance in that dict on entry
to __findattr__ and remove it on exit from __findattr__.  If the
instance can be found in the current thread's dict, we bypass
__findattr__.
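In pure Python, the same guard looks roughly like this (a sketch only;
the real patch would do this in C via PyThreadState_GetDict, and all
names here are made up):

```python
import threading

_active = threading.local()   # per-thread record of instances in the hook

class Guarded:
    def __getattr__(self, name):
        ids = getattr(_active, 'ids', None)
        if ids is None:
            ids = _active.ids = set()
        if id(self) in ids:
            # Re-entry from the same thread: bypass the hook entirely.
            raise AttributeError(name)
        ids.add(id(self))
        try:
            return self._findattr(name)
        finally:
            ids.discard(id(self))

    def _findattr(self, name):
        # Stand-in for the __findattr__ body; attribute references made
        # here re-enter __getattr__ above and are bypassed by the guard.
        return 'computed:' + name
```

Note the guard is keyed on the instance and the thread, not on the
attribute name, matching the rule proposed above.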

>>>>> "MZ" == Moshe Zadka <moshez@zadka.site.co.il> writes:

    MZ> I don't think this is a good idea -- continuations and
    MZ> coroutines might mess it up.

You might be right, but I'm not sure.

If we make __findattr__ thread safe according to the definition above,
and if uthread/coroutine/continuation safety can be accomplished by
the __findattr__ programmer's discipline, then I think that is enough.
IOW, if we can tell the __findattr__ author to not relinquish the
uthread explicitly during the __findattr__ call, we're cool.  Oh, and
as long as we're not somehow substantially reducing the utility of
__findattr__ by making that restriction.

What I worry about is re-entrancy that isn't under the programmer's
control, like the Real Thread-safety problem.

-Barry


From barry@digicool.com  Mon Dec  4 22:58:33 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 4 Dec 2000 17:58:33 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
 <14891.46227.785856.307437@anthem.concentric.net>
 <20001205033118.9135CA817@darjeeling.zadka.site.co.il>
 <14892.2868.982013.313562@anthem.concentric.net>
 <3A2C0E0D.E042D026@tismer.com>
Message-ID: <14892.8601.41178.81475@anthem.concentric.net>

>>>>> "CT" == Christian Tismer <tismer@tismer.com> writes:

    CT> Hmm. What do you think about Moshe's idea to change compiling
    CT> of the method? It has the nice advantage that there are no
    CT> Thread-safety problems by design. The only drawback is that
    CT> the contract of not-calling-myself only holds for this
    CT> function.

I'm not sure I understand what Moshe was proposing.  Moshe: are you
saying that we should change the way the compiler works, so that it
somehow recognizes this special case?  I'm not sure I like that
approach.  I think I want something more runtime-y, but I'm not sure
why (maybe just because I'm more comfortable mucking about in the
run-time than in the compiler).

-Barry


From guido@python.org  Mon Dec  4 23:16:17 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 04 Dec 2000 18:16:17 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: Your message of "Mon, 04 Dec 2000 16:23:00 EST."
 <14892.2868.982013.313562@anthem.concentric.net>
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il>
 <14892.2868.982013.313562@anthem.concentric.net>
Message-ID: <200012042316.SAA23081@cj20424-a.reston1.va.home.com>

I'm unconvinced by the __findattr__ proposal as it now stands.

- Do you really think that JimF would do away with ExtensionClasses if
  __findattr__ was introduced?  I kinda doubt it.  See [*footnote].
  It seems that *using* __findattr__ is expensive (even if *not* using
  is cheap :-).

- Why is deletion not supported?  What if you want to enforce a policy
  on deletions too?

- It's ugly to use the same call for get and set.  The examples
  indicate that it's not such a great idea: every example has *two*
  tests whether it's get or set.  To share a policy, the proper thing
  to do is to write a method that either get or set can use.

- I think it would be sufficient to *only* use __findattr__ for
  getattr -- __setattr__ and __delattr__ already have full control.
  The "one routine to implement the policy" argument doesn't really
  hold, I think.

- The PEP says that the "in-findattr" flag is set on the instance.
  We've already determined that this is not thread-safe.  This is not
  just a bug in the implementation -- it's a bug in the specification.
  I also find it ugly.  But if we decide to do this, it can go in the
  thread-state -- if we ever add coroutines, we have to decide on what
  stuff to move from the thread state to the coroutine state anyway.

- It's also easy to conceive situations where recursive __findattr__
  calls on the same instance in the same thread/coroutine are
  perfectly desirable -- e.g. when __findattr__ ends up calling a
  method that uses a lot of internal machinery of the class.  You
  don't want all the machinery to have to be aware of the fact that it
  may be called with __findattr__ on the stack and without it.  So
  perhaps it may be better to only treat the body of __findattr__
  itself special, as Moshe suggested.  What does Jython do here?

- The code examples require a *lot* of effort to understand.  These
  are complicated issues!  (I rewrote the Bean example using
  __getattr__ and __setattr__ and found no need for __findattr__; the
  __getattr__ version is simpler and easier to understand.  I'm still
  studying the other __findattr__ examples.)

- The PEP really isn't that long, except for the code examples.  I
  recommend reading the patch first -- the patch is probably shorter
  than any specification of the feature can be.

--Guido van Rossum (home page: http://www.python.org/~guido/)

[*footnote]

  There's an easy way (that few people seem to know) to cause
  __getattr__ to be called for virtually all attribute accesses: put
  *all* (user-visible) attributes in a separate dictionary.  If you want
  to prevent access to this dictionary too (for Zope security
  enforcement), make it a global indexed by id() -- a
  destructor(__del__) can take care of deleting entries here.


From martin@loewis.home.cs.tu-berlin.de  Mon Dec  4 23:10:43 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 5 Dec 2000 00:10:43 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <14891.46227.785856.307437@anthem.concentric.net>
 (barry@digicool.com)
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net>
Message-ID: <200012042310.AAA00786@loewis.home.cs.tu-berlin.de>

> I'm still catching up on several hundred emails over the weekend.  I
> had a sneaking suspicion that in-findattr wasn't thread-safe, so I'm
> convinced this is a bug in the implementation.  One approach might be
> to store the info in the thread state object (isn't that how the
> recursive repr stop flag is stored?)

Whether this works depends on how exactly the info is stored. A single
flag won't be sufficient, since multiple objects may have __findattr__
in progress in a given thread. With a set of instances, it would work,
though.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Mon Dec  4 23:13:15 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 5 Dec 2000 00:13:15 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <20001205033118.9135CA817@darjeeling.zadka.site.co.il> (message
 from Moshe Zadka on Tue, 05 Dec 2000 05:31:18 +0200)
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>  <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il>
Message-ID: <200012042313.AAA00832@loewis.home.cs.tu-berlin.de>

> I don't think this is a good idea -- continuations and coroutines
> might mess it up.

If coroutines and continuations operate preemptively, then
they should present themselves as an implementation of the thread API;
perhaps the thread API needs to be extended to allow for such a feature.

If yielding control is in the hands of the implementation, it would be
easy to rule out a context switch while findattr is in progress.

Regards,
Martin



From martin@loewis.home.cs.tu-berlin.de  Mon Dec  4 23:19:37 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 5 Dec 2000 00:19:37 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <14892.8601.41178.81475@anthem.concentric.net>
 (barry@digicool.com)
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
 <14891.46227.785856.307437@anthem.concentric.net>
 <20001205033118.9135CA817@darjeeling.zadka.site.co.il>
 <14892.2868.982013.313562@anthem.concentric.net>
 <3A2C0E0D.E042D026@tismer.com> <14892.8601.41178.81475@anthem.concentric.net>
Message-ID: <200012042319.AAA00877@loewis.home.cs.tu-berlin.de>

> I'm not sure I understand what Moshe was proposing.  Moshe: are you
> saying that we should change the way the compiler works, so that it
> somehow recognizes this special case?  I'm not sure I like that
> approach.  I think I want something more runtime-y, but I'm not sure
> why (maybe just because I'm more comfortable mucking about in the
> run-time than in the compiler).

I guess you are also uncomfortable with the problem that the
compile-time analysis cannot "see" through levels of indirection.
E.g. if findattr is implemented as

   return self.compute_attribute(real_attribute)

then compile-time analysis could figure out to call compute_attribute
directly. However, that method may be implemented as 

  def compute_attribute(self,name):
    return self.mapping[name]

where the access to mapping could not be detected statically.

Regards,
Martin



From tismer@tismer.com  Mon Dec  4 21:35:09 2000
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 04 Dec 2000 23:35:09 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
 <14891.46227.785856.307437@anthem.concentric.net>
 <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net>
Message-ID: <3A2C0E0D.E042D026@tismer.com>


"Barry A. Warsaw" wrote:
> 
> >>>>> "CT" == Christian Tismer <tismer@tismer.com> writes:
> 
>     CT> You most probably want to do this: __findattr__ should not be
>     CT> invoked again for this instance, with this attribute name, for
>     CT> this "thread", until you are done.
> 
> First, I think the rule should be "__findattr__ should not be invoked
> again for this instance, in this thread, until you are done".

Maybe this is better. Surely easier. :)

[ThreadState solution - well fine so far]

>     MZ> I don't think this is a good idea -- continuations and
>     MZ> coroutines might mess it up.
> 
> You might be right, but I'm not sure.
> 
> If we make __findattr__ thread safe according to the definition above,
> and if uthread/coroutine/continuation safety can be accomplished by
> the __findattr__ programmer's discipline, then I think that is enough.
> IOW, if we can tell the __findattr__ author to not relinquish the
> uthread explicitly during the __findattr__ call, we're cool.  Oh, and
> as long as we're not somehow substantially reducing the utility of
> __findattr__ by making that restriction.
> 
> What I worry about is re-entrancy that isn't under the programmer's
> control, like the Real Thread-safety problem.

Hmm. What do you think about Moshe's idea to change compiling
of the method? It has the nice advantage that there are no
Thread-safety problems by design. The only drawback is that
the contract of not-calling-myself only holds for this function.

I don't know how Threadstate scale up when there are more things
like these invented. Well, for the moment, the simple solution
with Stackless would just be to let the interpreter recurse
in this call, the same as it happens during __init__ and
anything else that isn't easily turned into tail-recursion.
It just blocks :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From barry@digicool.com  Tue Dec  5 02:54:23 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 4 Dec 2000 21:54:23 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
 <14891.46227.785856.307437@anthem.concentric.net>
 <20001205033118.9135CA817@darjeeling.zadka.site.co.il>
 <14892.2868.982013.313562@anthem.concentric.net>
 <200012042316.SAA23081@cj20424-a.reston1.va.home.com>
Message-ID: <14892.22751.921264.156010@anthem.concentric.net>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> - Do you really think that JimF would do away with
    GvR> ExtensionClasses if __findattr__ was introduced?  I kinda
    GvR> doubt it.  See [*footnote].  It seems that *using*
    GvR> __findattr__ is expensive (even if *not* using is cheap :-).

That's not even the real reason why JimF wouldn't stop using
ExtensionClass.  He's already got too much code invested in EC.
However EC can be a big pill to swallow for some applications because
it's a C extension (and because it has some surprising non-Pythonic
side effects).  In those situations, a pure Python approach, even
though slower, is useful.

    GvR> - Why is deletion not supported?  What if you want to enforce
    GvR> a policy on deletions too?

It could be, without much work.

    GvR> - It's ugly to use the same call for get and set.  The
    GvR> examples indicate that it's not such a great idea: every
    GvR> example has *two* tests whether it's get or set.  To share a
    GvR> policy, the proper thing to do is to write a method that
    GvR> either get or set can use.

I don't have strong feelings either way.

    GvR> - I think it would be sufficient to *only* use __findattr__
    GvR> for getattr -- __setattr__ and __delattr__ already have full
    GvR> control.  The "one routine to implement the policy" argument
    GvR> doesn't really hold, I think.

What about the ability to use "normal" x.name attribute access syntax
inside the hook?  Let me guess your answer. :)

    GvR> - The PEP says that the "in-findattr" flag is set on the
    GvR> instance.  We've already determined that this is not
    GvR> thread-safe.  This is not just a bug in the implementation --
    GvR> it's a bug in the specification.  I also find it ugly.  But
    GvR> if we decide to do this, it can go in the thread-state -- if
    GvR> we ever add coroutines, we have to decide on what stuff to
    GvR> move from the thread state to the coroutine state anyway.

Right.  That's where we've ended up in subsequent messages on this thread.

    GvR> - It's also easy to conceive situations where recursive
    GvR> __findattr__ calls on the same instance in the same
    GvR> thread/coroutine are perfectly desirable -- e.g. when
    GvR> __findattr__ ends up calling a method that uses a lot of
    GvR> internal machinery of the class.  You don't want all the
    GvR> machinery to have to be aware of the fact that it may be
    GvR> called with __findattr__ on the stack and without it.

Hmm, okay, I don't really understand your example.  I suppose I'm
envisioning __findattr__ as a way to provide an interface to clients
of the class.  Maybe it's a bean interface, maybe it's an acquisition
interface or an access control interface.  The internal machinery has
to know something about how that interface is implemented, so whether
__findattr__ is recursive or not doesn't seem to enter into it.

And also, allowing __findattr__ to be recursive will just impose
different constraints on the internal machinery methods, just like
__setattr__ currently does.  I.e. you better know that you're in
__setattr__ and not do self.name type things, or you'll recurse
forever. 

    GvR> So perhaps it may be better to only treat the body of
    GvR> __findattr__ itself special, as Moshe suggested.

Maybe I'm being dense, but I'm not sure exactly what this means, or
how you would do this.
    
    GvR> What does Jython do here?

It's not exactly equivalent, because Jython's __findattr__ can't call
back into Python.

    GvR> - The code examples require a *lot* of effort to understand.
    GvR> These are complicated issues!  (I rewrote the Bean example
    GvR> using __getattr__ and __setattr__ and found no need for
    GvR> __findattr__; the __getattr__ version is simpler and easier
    GvR> to understand.  I'm still studying the other __findattr__
    GvR> examples.)

Is it simpler because you separated out the set and get behavior?  If
__findattr__ only did getting, I think it would be a lot simpler too
(but I'd still be interested in seeing your __getattr__ only
example).  The acquisition examples are complicated because I wanted
to support the same interface that EC's acquisition classes support.
All that detail isn't necessary for example code.

    GvR> - The PEP really isn't that long, except for the code
    GvR> examples.  I recommend reading the patch first -- the patch
    GvR> is probably shorter than any specification of the feature can
    GvR> be.

Would it be more helpful to remove the examples?  If so, where would
you put them?  It's certainly useful to have examples someplace I
think.

    GvR>   There's an easy way (that few people seem to know) to cause
    GvR> __getattr__ to be called for virtually all attribute
    GvR> accesses: put *all* (user-visible) attributes in a separate
    GvR> dictionary.  If you want to prevent access to this dictionary
    GvR> too (for Zope security enforcement), make it a global indexed
    GvR> by id() -- a destructor(__del__) can take care of deleting
    GvR> entries here.

Presumably that'd be a module global, right?  Maybe within Zope that
could be protected, but outside of that, that global's always going to
be accessible.  So are methods, even if given private names.  And I
don't think that such code would be any more readable since instead of
self.name you'd see stuff like

    def __getattr__(self, name):
        global instdict
	mydict = instdict[id(self)]
	obj = mydict[name]
	...

    def __setattr__(self, name, val):
	global instdict
	mydict = instdict[id(self)]
	mydict[name] = val
	...

and that /might/ be a problem with Jython currently, because id()'s
may be reused.  And relying on __del__ may have unfortunate side
effects when viewed in conjunction with garbage collection.
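Put together with the __del__ cleanup, the footnote's pattern would look
something like this (a sketch; `instdict` and `Hidden` are made-up names):

```python
instdict = {}   # module-global: id(instance) -> private attribute dict

class Hidden:
    def __init__(self):
        instdict[id(self)] = {}     # write directly, bypassing __setattr__

    def __getattr__(self, name):
        try:
            return instdict[id(self)][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        instdict[id(self)][name] = value

    def __del__(self):
        # Without this, entries outlive their instances -- and id() reuse
        # can then hand one object another object's state.
        instdict.pop(id(self), None)

h = Hidden()
h.x = 42
print(h.x)          # 42
print(h.__dict__)   # {} -- nothing lands on the instance itself
```

It illustrates both caveats above: correctness hinges on id() not being
recycled while an entry is live, and on __del__ actually running.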

You're probably still unconvinced <wink>, but are you dead-set against
it?  I can try implementing __findattr__() as a pre-__getattr__ hook
only.  Then we can live with the current __setattr__() restrictions
and see what the examples look like in that situation.

-Barry


From guido@python.org  Tue Dec  5 12:54:20 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 05 Dec 2000 07:54:20 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: Your message of "Mon, 04 Dec 2000 21:54:23 EST."
 <14892.22751.921264.156010@anthem.concentric.net>
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com>
 <14892.22751.921264.156010@anthem.concentric.net>
Message-ID: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>

> >>>>> "GvR" == Guido van Rossum <guido@python.org> writes:
> 
>     GvR> - Do you really think that JimF would do away with
>     GvR> ExtensionClasses if __findattr__ was introduced?  I kinda
>     GvR> doubt it.  See [*footnote].  It seems that *using*
>     GvR> __findattr__ is expensive (even if *not* using is cheap :-).
> 
> That's not even the real reason why JimF wouldn't stop using
> ExtensionClass.  He's already got too much code invested in EC.
> However EC can be a big pill to swallow for some applications because
> it's a C extension (and because it has some surprising non-Pythonic
> side effects).  In those situations, a pure Python approach, even
> though slower, is useful.

Agreed.  But I'm still hoping to find the silver bullet that lets Jim
(and everybody else) do what ExtensionClass does without needing
another extension.

>     GvR> - Why is deletion not supported?  What if you want to enforce
>     GvR> a policy on deletions too?
> 
> It could be, without much work.

Then it should be -- except I prefer to do only getattr anyway, see
below.

>     GvR> - It's ugly to use the same call for get and set.  The
>     GvR> examples indicate that it's not such a great idea: every
>     GvR> example has *two* tests whether it's get or set.  To share a
>     GvR> policy, the proper thing to do is to write a method that
>     GvR> either get or set can use.
> 
> I don't have strong feelings either way.

What does Jython do?  I thought it only did set (hence the name :-).
I think there's no *need* for findattr to catch the setattr operation,
because __setattr__ *already* gets invoked on each set not just ones
where the attr doesn't yet exist.

>     GvR> - I think it would be sufficient to *only* use __findattr__
>     GvR> for getattr -- __setattr__ and __delattr__ already have full
>     GvR> control.  The "one routine to implement the policy" argument
>     GvR> doesn't really hold, I think.
> 
> What about the ability to use "normal" x.name attribute access syntax
> inside the hook?  Let me guess your answer. :)

Aha!  You got me there.  Clearly the REAL reason for wanting
__findattr__ is the no-recursive-calls rule -- which is also the most
uncooked feature...  Traditional getattr hooks don't need this as much
because they don't get called when the attribute already exists;
traditional setattr hooks deal with it by switching on the attribute
name.  The no-recursive-calls rule certainly SEEMS an attractive way
around this.  But I'm not sure that it really is...

I need to get my head around this more.  (The only reason I'm still
posting this reply is to test the new mailing lists setup via
mail.python.org.)

>     GvR> - The PEP says that the "in-findattr" flag is set on the
>     GvR> instance.  We've already determined that this is not
>     GvR> thread-safe.  This is not just a bug in the implementation --
>     GvR> it's a bug in the specification.  I also find it ugly.  But
>     GvR> if we decide to do this, it can go in the thread-state -- if
>     GvR> we ever add coroutines, we have to decide on what stuff to
>     GvR> move from the thread state to the coroutine state anyway.
> 
> Right.  That's where we've ended up in subsequent messages on this thread.
> 
>     GvR> - It's also easy to conceive situations where recursive
>     GvR> __findattr__ calls on the same instance in the same
>     GvR> thread/coroutine are perfectly desirable -- e.g. when
>     GvR> __findattr__ ends up calling a method that uses a lot of
>     GvR> internal machinery of the class.  You don't want all the
>     GvR> machinery to have to be aware of the fact that it may be
>     GvR> called with __findattr__ on the stack and without it.
> 
> Hmm, okay, I don't really understand your example.  I suppose I'm
> envisioning __findattr__ as a way to provide an interface to clients
> of the class.  Maybe it's a bean interface, maybe it's an acquisition
> interface or an access control interface.  The internal machinery has
> to know something about how that interface is implemented, so whether
> __findattr__ is recursive or not doesn't seem to enter into it.

But the class is also a client of itself, and not all cases where it
is a client of itself are inside a findattr call.  Take your bean
example.  Suppose your bean class also has a spam() method.  The
findattr code needs to account for this, e.g.:

    def __findattr__(self, name, *args):
	if name == "spam" and not args:
	    return self.spam
	...original body here...

Or you have to add a _get_spam() method:

    def _get_spam(self):
	return self.spam

Either solution gets tedious if there are a lot of methods; instead,
findattr could check if the attr is defined on the class, and then
return that:

    def __findattr__(self, name, *args):
        if not args and name[0] != '_' and hasattr(self.__class__, name):
	    return getattr(self, name)
	...original body here...

Anyway, let's go back to the spam method.  Suppose it references
self.foo.  The findattr machinery will access it.  Fine.  But now
consider another attribute (bar) with _set_bar() and _get_bar()
methods that do a little more.  Maybe bar is really calculated from
the value of self.foo.  Then _get_bar cannot use self.foo (because
it's inside findattr so findattr won't resolve it, and self.foo
doesn't actually exist on the instance) so it has to use self.__myfoo.
Fine -- after all this is inside a _get_* handler, which knows it's
being called from findattr.  But what if, instead of needing self.foo,
_get_bar wants to call self.spam() in order?  Then self.spam() is
being called from inside findattr, so when it access self.foo,
findattr isn't used -- and it fails with an AttributeError!

Sorry for the long detour, but *that's* the problem I was referring
to.  I think the scenario is quite realistic.
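The failure can be simulated today with an explicit flag standing in
for the proposed bypass rule (all names here are hypothetical; this
only models the semantics):

```python
class Bean:
    def __init__(self):
        self.__dict__['_busy'] = False   # stands in for "inside findattr"
        self.__dict__['_myfoo'] = 3

    def __getattr__(self, name):
        if self.__dict__['_busy']:
            # Simulate the rule: while inside findattr, the hook is bypassed.
            raise AttributeError(name)
        self.__dict__['_busy'] = True
        try:
            if name == 'foo':
                return self.__dict__['_myfoo']
            if name == 'bar':
                return self.spam() * 2   # a method that needs self.foo
            raise AttributeError(name)
        finally:
            self.__dict__['_busy'] = False

    def spam(self):
        return self.foo + 1   # fails while the bypass is active

b = Bean()
print(b.foo)   # 3 -- fine
try:
    b.bar
except AttributeError as e:
    print('AttributeError:', e)   # spam() could not see self.foo
```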

> And also, allowing __findattr__ to be recursive will just impose
> different constraints on the internal machinery methods, just like
> __setattr__ currently does.  I.e. you better know that you're in
> __setattr__ and not do self.name type things, or you'll recurse
> forever. 

Actually, this is usually solved by having __setattr__ check for
specific names only, and for others do self.__dict__[name] = value;
that way, recursive __setattr__ calls are okay.  Similar for
__getattr__ (which has to raise AttributeError for unrecognized
names).
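That idiom, spelled out (names like `balance` are illustrative only):

```python
class Watched:
    def __setattr__(self, name, value):
        if name == 'balance':
            # Policy for one specific name; assign through __dict__ so
            # this does not re-enter __setattr__.
            self.__dict__['balance'] = max(0, value)
        else:
            # All other names go straight into __dict__; no recursion.
            self.__dict__[name] = value

    def __getattr__(self, name):
        # Only reached for names absent from __dict__; must raise
        # AttributeError for anything unrecognized.
        raise AttributeError(name)

w = Watched()
w.balance = -5   # clamped to 0 by the policy
w.note = 'ok'
print(w.balance, w.note)   # 0 ok
```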

>     GvR> So perhaps it may be better to only treat the body of
>     GvR> __findattr__ itself special, as Moshe suggested.
> 
> Maybe I'm being dense, but I'm not sure exactly what this means, or
> how you would do this.

Read Moshe's messages (and Martin's replies) again.  I don't care that
much for it so I won't explain it again.

>     GvR> What does Jython do here?
> 
> It's not exactly equivalent, because Jython's __findattr__ can't call
> back into Python.

I'd say that Jython's __findattr__ is an entirely different beast than
what we have here.  Its main purpose in life appears to be to serve as
a getattr equivalent that returns NULL instead of raising an exception
when the attribute isn't found -- which is reasonable because from
within Java, testing for null is much cheaper than checking for an
exception, and you often need to check whether a given attribute exists
and do some default action if not.  (In fact, I'd say that CPython could
also use a findattr of this kind...)

This is really too bad.  Based on the name similarity and things I
thought you'd said in private before, I thought that they would be
similar.  Then the experience with Jython would be a good argument for
adding a findattr hook to CPython.  But now that they are totally
different beasts it doesn't help at all.

>     GvR> - The code examples require a *lot* of effort to understand.
>     GvR> These are complicated issues!  (I rewrote the Bean example
>     GvR> using __getattr__ and __setattr__ and found no need for
>     GvR> __findattr__; the __getattr__ version is simpler and easier
>     GvR> to understand.  I'm still studying the other __findattr__
>     GvR> examples.)
> 
> Is it simpler because you separated out the set and get behavior?  If
> __findattr__ only did getting, I think it would be a lot similar too
> (but I'd still be interested in seeing your __getattr__ only
> example).

Here's my getattr example.  It's more lines of code, but cleaner IMHO:

    class Bean:
	def __init__(self, x):
	    self.__myfoo = x

	def __isprivate(self, name):
	    return name.startswith('_')

	def __getattr__(self, name):
	    if self.__isprivate(name):
		raise AttributeError, name
	    return getattr(self, "_get_" + name)()

	def __setattr__(self, name, value):
	    if self.__isprivate(name):
		self.__dict__[name] = value
	    else:
		return getattr(self, "_set_" + name)(value)

	def _set_foo(self, x):
	    self.__myfoo = x

	def _get_foo(self):
	    return self.__myfoo


    b = Bean(3)
    print b.foo
    b.foo = 9
    print b.foo

> The acquisition examples are complicated because I wanted
> to support the same interface that EC's acquisition classes support.
> All that detail isn't necessary for example code.

I *still* have to study the examples... :-(  Will do next.

>     GvR> - The PEP really isn't that long, except for the code
>     GvR> examples.  I recommend reading the patch first -- the patch
>     GvR> is probably shorter than any specification of the feature can
>     GvR> be.
> 
> Would it be more helpful to remove the examples?  If so, where would
> you put them?  It's certainly useful to have examples someplace I
> think.

No, my point is that the examples need more explanation.  Right now
the EC example is over 200 lines of brain-exploding code! :-)

>     GvR>   There's an easy way (that few people seem to know) to cause
>     GvR> __getattr__ to be called for virtually all attribute
    GvR> accesses: put *all* (user-visible) attributes in a separate
>     GvR> dictionary.  If you want to prevent access to this dictionary
>     GvR> too (for Zope security enforcement), make it a global indexed
>     GvR> by id() -- a destructor(__del__) can take care of deleting
>     GvR> entries here.
> 
> Presumably that'd be a module global, right?  Maybe within Zope that
> could be protected,

Yes.

> but outside of that, that global's always going to
> be accessible.  So are methods, even if given private names.

Aha!  Another thing that I expect has been on your agenda for a long
time, but which isn't explicit in the PEP (AFAICT): findattr gives
*total* control over attribute access, unlike __getattr__ and
__setattr__ and private name mangling, which can all be defeated.

And this may be one of the things that Jim is after with
ExtensionClasses in Zope.  Although I believe that in DTML, he doesn't
trust this: he uses source-level (or bytecode-level) transformations
to turn all X.Y operations into a call into a security manager.

So I'm not sure that the argument is very strong.

> And I
> don't think that such code would be any more readable since instead of
> self.name you'd see stuff like
> 
>     def __getattr__(self, name):
>         global instdict
> 	mydict = instdict[id(self)]
> 	obj = mydict[name]
> 	...
> 
>     def __setattr__(self, name, val):
> 	global instdict
> 	mydict = instdict[id(self)]
> 	mydict[name] = val
> 	...
> 
> and that /might/ be a problem with Jython currently, because id()'s
> may be reused.  And relying on __del__ may have unfortunate side
> effects when viewed in conjunction with garbage collection.

Fair enough.  I withdraw the suggestion, and propose restricted
execution instead.  There, you can use Bastions -- which have problems
of their own, but you do get total control.

> You're probably still unconvinced <wink>, but are you dead-set against
> it?  I can try implementing __findattr__() as a pre-__getattr__ hook
> only.  Then we can live with the current __setattr__() restrictions
> and see what the examples look like in that situation.

I am dead-set against introducing a feature that I don't fully
understand.  Let's continue this discussion.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bckfnn@worldonline.dk  Tue Dec  5 15:40:10 2000
From: bckfnn@worldonline.dk (Finn Bock)
Date: Tue, 05 Dec 2000 15:40:10 GMT
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com>   <14892.22751.921264.156010@anthem.concentric.net>  <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
Message-ID: <3a2d0c29.242749@smtp.worldonline.dk>

On Tue, 05 Dec 2000 07:54:20 -0500, you wrote:

>>     GvR> What does Jython do here?
>> 
>> It's not exactly equivalent, because Jython's __findattr__ can't call
>> back into Python.
>
>I'd say that Jython's __findattr__ is an entirely different beast than
>what we have here.  Its main purpose in life appears to be a getattr
>equivalent that returns NULL instead of raising an exception when the
>attribute isn't found -- which is reasonable because from within Java,
>testing for null is much cheaper than checking for an exception, and
>you often need to check whether a given attribute exists and do some
>default action if not. 

Correct. It is also the method to override when making a new builtin
type, and it will be called on such a type subclass regardless of the
presence of any __getattr__ hook and __dict__ content. So I think it
has some of the properties which Barry wants.


regards,
finn


From greg@cosc.canterbury.ac.nz  Tue Dec  5 23:07:06 2000
From: greg@cosc.canterbury.ac.nz (greg@cosc.canterbury.ac.nz)
Date: Wed, 06 Dec 2000 12:07:06 +1300 (NZDT)
Subject: Are you all mad? (Re: [Python-Dev] PEP 231, __findattr__())
In-Reply-To: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
Message-ID: <200012052307.MAA01082@s454.cosc.canterbury.ac.nz>

I can't believe you're even considering a magic
dynamically-scoped flag that invisibly changes the
semantics of fundamental operations. To me the
idea is utterly insane!

If I understand correctly, the problem is that if
you do something like

  def __findattr__(self, name):
    if name == 'spam':
      return self.__dict__['spam']

then self.__dict__ is going to trigger a recursive
__findattr__ call. 

It seems to me that if you're going to have some sort
of hook that is always called on any x.y reference,
you need some way of explicitly bypassing it and getting
at the underlying machinery.

I can think of a couple of ways:

1) Make the __dict__ attribute special, so that accessing
it always bypasses __findattr__.

2) Provide some other way of getting direct access to the
attributes of an object, e.g. new builtins called
peekattr() and pokeattr().

This assumes that you always know when you write a particular
access whether you want it to be a "normal" or "special"
one, so that you can use the appropriate mechanism.
Are there any cases where this is not true?
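Option (2) might look something like this. (peekattr and pokeattr are only proposed names, not real builtins; this sketch fakes them in plain Python by going straight to the instance dict, which is enough to show the intended "bypass the hook" semantics.)

```python
# Hypothetical sketch: direct access to an instance's attribute storage,
# bypassing any findattr-style hook defined on the class.  The names
# peekattr/pokeattr are the proposal's, not existing builtins.

def peekattr(obj, name):
    """Read an attribute straight from the instance dict."""
    try:
        return obj.__dict__[name]
    except KeyError:
        raise AttributeError(name)

def pokeattr(obj, name, value):
    """Write an attribute straight into the instance dict."""
    obj.__dict__[name] = value

class Noisy:
    def __getattr__(self, name):        # stand-in for a findattr-style hook
        return "hooked:" + name

n = Noisy()
pokeattr(n, "spam", 42)
assert peekattr(n, "spam") == 42        # direct access, no hook involved
assert n.missing == "hooked:missing"    # normal failed access still hooked
```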

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From barry@digicool.com  Wed Dec  6 02:20:40 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 5 Dec 2000 21:20:40 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
 <14891.46227.785856.307437@anthem.concentric.net>
 <20001205033118.9135CA817@darjeeling.zadka.site.co.il>
 <14892.2868.982013.313562@anthem.concentric.net>
 <200012042316.SAA23081@cj20424-a.reston1.va.home.com>
 <14892.22751.921264.156010@anthem.concentric.net>
 <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
 <3a2d0c29.242749@smtp.worldonline.dk>
Message-ID: <14893.41592.701128.58110@anthem.concentric.net>

>>>>> "FB" == Finn Bock <bckfnn@worldonline.dk> writes:

    FB> Correct. It is also the method to override when making a new
    FB> builtin type and it will be called on such a type subclass
    FB> regardless of the presence of any __getattr__ hook and
    FB> __dict__ content. So I think it has some of the properties
    FB> which Barry wants.

We had a discussion about this PEP at our group meeting today.  Rather
than write it all twice, I'm going to try to update the PEP and patch
tonight.  I think what we came up with will solve most of the problems
raised, and will be implementable in Jython (I'll try to work up a
Jython patch too, if I don't fall asleep first :)

-Barry


From barry@digicool.com  Wed Dec  6 02:54:36 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 5 Dec 2000 21:54:36 -0500
Subject: Are you all mad? (Re: [Python-Dev] PEP 231, __findattr__())
References: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
 <200012052307.MAA01082@s454.cosc.canterbury.ac.nz>
Message-ID: <14893.43628.61063.905227@anthem.concentric.net>

>>>>> "greg" ==   <greg@cosc.canterbury.ac.nz> writes:

    | 1) Make the __dict__ attribute special, so that accessing
    | it always bypasses __findattr__.

You're not far from what I came up with right after our delicious
lunch.  We're going to invent a new protocol which passes __dict__
into the method as an argument.  That way self.__dict__ doesn't need
to be special cased at all because you can get at all the attributes
via a local!  So no recursion stop hack is necessary.
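As a rough sketch of the idea (the method name, signature, and dispatch below are guesses, not the actual PEP 231 interface, which hadn't been published at this point):

```python
# Speculative sketch of the protocol described above: the hook receives
# the instance __dict__ as a plain argument, so all storage access goes
# through a local name and never re-enters the hook.

class Findable:
    def __init__(self):
        self.__dict__['_storage'] = {'spam': 3}

    def _findattr_(self, d, name):
        # 'd' is an ordinary local; indexing it cannot recurse.
        storage = d['_storage']
        if name in storage:
            return storage[name]
        raise AttributeError(name)

    def __getattr__(self, name):
        # Failed lookups are routed to the hook along with the dict.
        return self._findattr_(self.__dict__, name)

f = Findable()
assert f.spam == 3          # served out of _storage via the hook
```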

More in the updated PEP and patch.

-Barry


From dgoodger@bigfoot.com  Thu Dec  7 04:33:33 2000
From: dgoodger@bigfoot.com (David Goodger)
Date: Wed, 06 Dec 2000 23:33:33 -0500
Subject: [Python-Dev] unit testing and Python regression test
Message-ID: <B6547D4C.BE96%dgoodger@bigfoot.com>

There is another unit testing implementation out there, OmPyUnit, available
from:

    http://www.objectmentor.com/freeware/downloads.html

-- 
David Goodger    dgoodger@bigfoot.com    Open-source projects:
 - The Go Tools Project: http://gotools.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net (soon!)



From fdrake@users.sourceforge.net  Thu Dec  7 06:26:54 2000
From: fdrake@users.sourceforge.net (Fred L. Drake)
Date: Wed, 6 Dec 2000 22:26:54 -0800
Subject: [Python-Dev] [development doc updates]
Message-ID: <200012070626.WAA22103@orbital.p.sourceforge.net>

The development version of the documentation has been updated:

	http://python.sourceforge.net/devel-docs/


Lots of small changes, but most important, more DOM documentation:

	http://python.sourceforge.net/devel-docs/lib/module-xml.dom.html


From guido@python.org  Thu Dec  7 17:48:53 2000
From: guido@python.org (Guido van Rossum)
Date: Thu, 07 Dec 2000 12:48:53 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
Message-ID: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>

After perusing David Ascher's proposal, several versions of his
patches, and hundreds of emails exchanged on this subject (almost all
of them dated April or May of 1998), I've produced a reasonable
semblance of PEP 207.  Get it from CVS or here on the web:

  http://python.sourceforge.net/peps/pep-0207.html

I'd like to hear your comments, praise, and criticisms!

The PEP still needs work; in particular, the minority point of view
back then (that comparisons should return only Boolean results) is not
adequately represented (but I *did* work in a reference to tabnanny,
to ensure Tim's support :-).

I'd like to work on a patch next, but I think there will be
interference with Neil's coercion patch.  I'm not sure how to resolve
that yet; maybe I'll just wait until Neil's coercion patch is checked
in.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Thu Dec  7 17:54:51 2000
From: guido@python.org (Guido van Rossum)
Date: Thu, 07 Dec 2000 12:54:51 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
Message-ID: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>

I'm maybe about three quarters of the way through writing PEP 230 -- far
enough along to be asking for comments.  Get it from CVS or go to:

  http://python.sourceforge.net/peps/pep-0230.html

A prototype implementation in Python is included in the PEP; I think
this shows that the implementation is not too complex (Paul Prescod's
fear about my proposal).

This is pretty close to what I proposed earlier (Nov 5), except that I
have added warning category classes (inspired by Paul's proposal).
This class also serves as the exception to be raised when warnings are
turned into exceptions.

Do I need to include a discussion of Paul's counter-proposal and why I
rejected it?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Barrett@stsci.edu  Thu Dec  7 22:49:02 2000
From: Barrett@stsci.edu (Paul Barrett)
Date: Thu,  7 Dec 2000 17:49:02 -0500 (EST)
Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays
Message-ID: <14896.1191.240597.632888@nem-srvr.stsci.edu>

What is the status of PEP 209?  I see David Ascher is the champion of
this PEP, but nothing has been written up.  Is the intention of this
PEP to make the current Numeric a built-in feature of Python or to
re-implement and replace the current Numeric module?

The reason that I ask these questions is because I'm working on a
prototype of a new N-dimensional Array module which I call Numeric 2.
This new module will be much more extensible than the current Numeric.
For example, new array types and universal functions can be loaded or
imported on demand.  We also intend to implement a record (or
C-structure) type, because 1-D arrays or lists of records are a common
data structure for storing photon events in astronomy and related
fields.

The current Numeric does not handle record types efficiently,
particularly when the data type is not aligned and is in non-native
endian format.  To handle such data, temporary arrays must be created
and alignment and byte-swapping done on them.  Numeric 2 does such
pre- and post-processing inside the inner-most loop which is more
efficient in both time and memory.  It also does type conversion at
this level which is consistent with that proposed for PEP 208.

Since many scientific users would like direct access to the array data
via C pointers, we have investigated using the buffer object.  We have
not had much success with it, because of its implementation.  I have
scanned the python-dev mailing list for discussions of this issue and
found that it now appears to be deprecated.

My opinion on this is that a new _fundamental_ built-in type should be
created for memory allocation with features and an interface similar
to the _mmap_ object.  I'll call this a _malloc_ object.  This would
allow Numeric 2 to use either object interchangeably depending on the
circumstance.  The _string_ type could also benefit from this new
object by using a read-only version of it.  Since it's an object, its
memory area should be safe from inadvertent deletion.

Because of these and other new features in Numeric 2, I have a keen
interest in the status of PEPs 207, 208, 211, 225, and 228; and also
in the proposed buffer object.  

I'm willing to implement this new _malloc_ object if members of the
python-dev list are in agreement.  Actually I see no alternative,
given the current design of Numeric 2, since the Array class will
initially be written completely in Python and will need a mutable
memory buffer, while the _string_ type is meant to be a read-only
object.

All comments welcome.

 -- Paul

-- 
Dr. Paul Barrett       Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Group
FAX:   410-338-4767    Baltimore, MD 21218


From DavidA@ActiveState.com  Fri Dec  8 01:13:04 2000
From: DavidA@ActiveState.com (David Ascher)
Date: Thu, 7 Dec 2000 17:13:04 -0800 (Pacific Standard Time)
Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional
 Arrays
In-Reply-To: <14896.1191.240597.632888@nem-srvr.stsci.edu>
Message-ID: <Pine.WNT.4.21.0012071712410.1360-100000@loom>

On Thu, 7 Dec 2000, Paul Barrett wrote:

> What is the status of PEP 209?  I see David Ascher is the champion of
> this PEP, but nothing has been written up.  Is the intention of this

I put my name on the PEP just to make sure it wasn't forgotten.  If
someone wants to champion it, their name should go on it.

--david



From guido@python.org  Fri Dec  8 16:10:50 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 11:10:50 -0500
Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays
In-Reply-To: Your message of "Thu, 07 Dec 2000 17:49:02 EST."
 <14896.1191.240597.632888@nem-srvr.stsci.edu>
References: <14896.1191.240597.632888@nem-srvr.stsci.edu>
Message-ID: <200012081610.LAA30679@cj20424-a.reston1.va.home.com>

> What is the status of PEP 209?  I see David Ascher is the champion of
> this PEP, but nothing has been written up.  Is the intention of this
> PEP to make the current Numeric a built-in feature of Python or to
> re-implement and replace the current Numeric module?

David has already explained why his name is on it -- basically,
David's name is on several PEPs but he doesn't currently have any time
to work on these, so other volunteers are most welcome to join.

It is my understanding that the current Numeric is sufficiently messy
in implementation and controversial in semantics that it would not be
a good basis to start from.

However, I do think that a basic multi-dimensional array object would
be a welcome addition to core Python.

> The reason that I ask these questions is because I'm working on a
> prototype of a new N-dimensional Array module which I call Numeric 2.
> This new module will be much more extensible than the current Numeric.
> For example, new array types and universal functions can be loaded or
> imported on demand.  We also intend to implement a record (or
> C-structure) type, because 1-D arrays or lists of records are a common
> data structure for storing photon events in astronomy and related
> fields.

I'm not familiar with the use of computers in astronomy and related
fields, so I'll take your word for that! :-)

> The current Numeric does not handle record types efficiently,
> particularly when the data type is not aligned and is in non-native
> endian format.  To handle such data, temporary arrays must be created
> and alignment and byte-swapping done on them.  Numeric 2 does such
> pre- and post-processing inside the inner-most loop which is more
> efficient in both time and memory.  It also does type conversion at
> this level which is consistent with that proposed for PEP 208.
> 
> Since many scientific users would like direct access to the array data
> via C pointers, we have investigated using the buffer object.  We have
> not had much success with it, because of its implementation.  I have
> scanned the python-dev mailing list for discussions of this issue and
> found that it now appears to be deprecated.

Indeed.  I think it's best to leave the buffer object out of your
implementation plans.  There are several problems with it, and one of
the backburner projects is to redesign it to be much more to the point
(providing less, not more functionality).

> My opinion on this is that a new _fundamental_ built-in type should be
> created for memory allocation with features and an interface similar
> to the _mmap_ object.  I'll call this a _malloc_ object.  This would
> allow Numeric 2 to use either object interchangeably depending on the
> circumstance.  The _string_ type could also benefit from this new
> object by using a read-only version of it.  Since its an object, it's
> memory area should be safe from inadvertent deletion.

Interesting.  I'm actually not sufficiently familiar with mmap to
comment.  But would the existing array module's array object be at all
useful?  You can get to the raw bytes in C (using the C buffer API,
which is not deprecated) and it is extensible.
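From the Python side, the array object already behaves like a typed, mutable memory buffer; for illustration (using the modern spelling tobytes(); at the time this method was called tostring()):

```python
import array
import struct

# Eight contiguous C doubles, mutable from Python; buffer_info() hands
# C code the base address and item count of the underlying memory.
buf = array.array('d', [0.0] * 8)
buf[3] = 2.5
addr, nitems = buf.buffer_info()

assert nitems == 8
assert buf.itemsize == struct.calcsize('d')     # sizeof(double)
# The raw bytes of element 3 match a native-endian packed double.
assert buf.tobytes()[3 * buf.itemsize:4 * buf.itemsize] == struct.pack('d', 2.5)
```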

> Because of these and other new features in Numeric 2, I have a keen
> interest in the status of PEPs 207, 208, 211, 225, and 228; and also
> in the proposed buffer object.  

Here are some quick comments on the mentioned PEPs.

207: Rich Comparisons.  This will go into Python 2.1.  (I just
finished the first draft of the PEP, please read it and comment.)

208: Reworking the Coercion Model.  This will go into Python 2.1.
Neil Schemenauer has mostly finished the patches already.  Please
comment.

211: Adding New Linear Algebra Operators (Greg Wilson).  This is
unlikely to go into Python 2.1.  I don't like the idea much.  If you
disagree, please let me know!  (Also, a choice has to be made between
211 and 225; I don't want to accept both, so until 225 is rejected,
211 is in limbo.)

225: Elementwise/Objectwise Operators (Zhu, Lielens).  This will
definitely not go into Python 2.1.  It adds too many new operators.

228: Reworking Python's Numeric Model.  This is a total pie-in-the-sky
PEP, and this kind of change is not likely to happen before Python
3000.

> I'm willing to implement this new _malloc_ object if members of the
> python-dev list are in agreement.  Actually I see no alternative,
> given the current design of Numeric 2, since the Array class will
> initially be written completely in Python and will need a mutable
> memory buffer, while the _string_ type is meant to be a read-only
> object.

Would you be willing to take over authorship of PEP 209?  David Ascher
and the Numeric Python community will thank you.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Dec  8 18:43:39 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 13:43:39 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: Your message of "Thu, 30 Nov 2000 17:46:52 EST."
 <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com>
Message-ID: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>

After the last round of discussion, I was left with the idea that the
best thing we could do to help destructive iteration is to introduce a
{}.popitem() that returns an arbitrary (key, value) pair and deletes
it.  I wrote about this:

> > One more concern: if you repeatedly remove the *first* item, the hash
> > table will start looking lopsided.  Since we don't resize the hash
> > table on deletes, maybe picking an item at random (but not using an
> > expensive random generator!) would be better.

and Tim replied:

> Which is the reason SETL doesn't specify *which* set item is removed:  if
> you always start looking at "the front" of a dict that's being consumed, the
> dict fills with turds without shrinking, you skip over them again and again,
> and consuming the entire dict is still quadratic time.
> 
> Unfortunately, while using a random start point is almost always quicker
> than that, the expected time for consuming the whole dict remains quadratic.
> 
> The clearest way around that is to save a per-dict search finger, recording
> where the last search left off.  Start from its current value.  Failure if
> it wraps around.  This is linear time in non-pathological cases (a
> pathological case is one in which it isn't linear time <wink>).

I've implemented this, except I use a static variable for the finger
instead of a per-dict finger.  I'm concerned about adding 4-8 extra
bytes to each dict object for a feature that most dictionaries never
need.  So, instead, I use a single shared finger.  This works just as
well as long as this is used for a single dictionary.  For multiple
dictionaries (either used by the same thread or in different threads),
it'll work almost as well, although it's possible to make up a
pathological example that would behave quadratically.

A simple pathological example is to call popitem() for two identical
dictionaries in lock step.

Comments please!  We could:

- Live with the pathological cases.

- Forget the whole thing; and then also forget about firstkey()
  etc. which has the same problem only worse.

- Fix the algorithm. Maybe jumping criss-cross through the hash table
  like lookdict does would improve that; but I don't understand the
  math used for that ("Cycle through GF(2^n)-{0}" ???).

I've placed a patch on SourceForge:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102733&group_id=5470

The algorithm is:

static PyObject *
dict_popitem(dictobject *mp, PyObject *args)
{
	static int finger = 0;
	int i;
	dictentry *ep;
	PyObject *res;

	if (!PyArg_NoArgs(args))
		return NULL;
	if (mp->ma_used == 0) {
		PyErr_SetString(PyExc_KeyError,
				"popitem(): dictionary is empty");
		return NULL;
	}
	i = finger;
	if (i >= mp->ma_size)
		ir = 0;
	while ((ep = &mp->ma_table[i])->me_value == NULL) {
		i++;
		if (i >= mp->ma_size)
			i = 0;
	}
	finger = i+1;
	res = PyTuple_New(2);
	if (res != NULL) {
		PyTuple_SET_ITEM(res, 0, ep->me_key);
		PyTuple_SET_ITEM(res, 1, ep->me_value);
		Py_INCREF(dummy);
		ep->me_key = dummy;
		ep->me_value = NULL;
		mp->ma_used--;
	}
	return res;
}
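A rough Python model of the same finger idea (a toy, not the C implementation: the "hash table" is just a list with None holes, and the finger is per-table rather than static) shows the scan resuming where the last pop left off:

```python
# Toy model of the search-finger algorithm: each popitem() starts
# scanning at the slot after the previous hit, so consuming the whole
# table touches each slot only a bounded number of times.

class FingerTable:
    def __init__(self, size):
        self.table = [None] * size
        self.used = 0
        self.finger = 0                 # a static variable in the C patch

    def insert(self, i, item):
        if self.table[i] is None:
            self.used += 1
        self.table[i] = item

    def popitem(self):
        if self.used == 0:
            raise KeyError("popitem(): table is empty")
        i = self.finger % len(self.table)
        while self.table[i] is None:    # skip holes, wrapping around
            i = (i + 1) % len(self.table)
        self.finger = i + 1             # resume here next time
        item, self.table[i] = self.table[i], None
        self.used -= 1
        return item

t = FingerTable(8)
for slot in (1, 4, 6):
    t.insert(slot, "item%d" % slot)
popped = [t.popitem() for _ in range(3)]
assert popped == ["item1", "item4", "item6"]
```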

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Dec  8 18:51:49 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 13:51:49 -0500
Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use
Message-ID: <200012081851.NAA32254@cj20424-a.reston1.va.home.com>

Moshe proposes to add an overridable function sys.displayhook(obj)
which will be called by the interpreter for the PRINT_EXPR opcode,
instead of hardcoding the behavior.  The default implementation will
of course have the current behavior, but this makes it much simpler to
experiment with alternatives, e.g. using str() instead of repr() (or
to choose between str() and repr() based on the type).
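For example, a replacement hook that shows str() instead of repr() could look like this (sys.displayhook as proposed; 'builtins' is the modern module name, spelled __builtin__ at the time):

```python
import sys

def str_displayhook(value):
    # Called for the result of each interactive expression.
    if value is None:               # mirror the default: suppress None
        return
    sys.stdout.write(str(value) + "\n")
    import builtins
    builtins._ = value              # keep the interactive "_" convenience

old_hook = sys.displayhook
sys.displayhook = str_displayhook
sys.displayhook("no quotes here")   # str(): printed without quotes
sys.displayhook = old_hook          # restore the default behavior
```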

Moshe has asked me to pronounce on this PEP.  I've thought about it,
and I'm now all for it.  Moshe (or anyone else), please submit a patch
to SF that shows the complete implementation!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Fri Dec  8 19:06:50 2000
From: tim.one@home.com (Tim Peters)
Date: Fri, 8 Dec 2000 14:06:50 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEAJIDAA.tim.one@home.com>

[Guido, on sharing a search finger and getting worse-than-linear
 behavior in a simple test case]

See my reply on SourceForge (crossed in the mails).  I predict that fixing
this in an acceptable way (not bulletproof, but linear-time for all
predictably common cases) is a two-character change.

Surprise, although maybe I'm hallucinating (would someone please confirm?):
when I went to the SF patch manager page to look for your patch (using the
Open Patches view), I couldn't find it.  My guess is that if there are "too
many" patches to fit on one screen, then unlike the SF *bug* manager, you
don't get any indication that more patches exist or any control to go to the
next page.



From barry@digicool.com  Fri Dec  8 19:18:26 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Fri, 8 Dec 2000 14:18:26 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCIEAJIDAA.tim.one@home.com>
Message-ID: <14897.13314.469255.853298@anthem.concentric.net>

>>>>> "TP" == Tim Peters <tim.one@home.com> writes:

    TP> Surprise, although maybe I'm hallucinating (would someone
    TP> please confirm?): when I went to the SF patch manager page to
    TP> look for your patch (using the Open Patches view), I couldn't
    TP> find it.  My guess is that if there are "too many" patches to
    TP> fit on one screen, then unlike the SF *bug* manager, you don't
    TP> get any indication that more patches exist or any control to
    TP> go to the next page.

I haven't checked recently, but this was definitely true a few weeks
ago.  I think I even submitted an admin request on it, but I don't
remember for sure.

-Barry


From Barrett@stsci.edu  Fri Dec  8 21:22:39 2000
From: Barrett@stsci.edu (Paul Barrett)
Date: Fri,  8 Dec 2000 16:22:39 -0500 (EST)
Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays
In-Reply-To: <200012081610.LAA30679@cj20424-a.reston1.va.home.com>
References: <14896.1191.240597.632888@nem-srvr.stsci.edu>
 <200012081610.LAA30679@cj20424-a.reston1.va.home.com>
Message-ID: <14897.10309.686024.254701@nem-srvr.stsci.edu>

Guido van Rossum writes:
 > > What is the status of PEP 209?  I see David Ascher is the champion of
 > > this PEP, but nothing has been written up.  Is the intention of this
 > > PEP to make the current Numeric a built-in feature of Python or to
 > > re-implement and replace the current Numeric module?
 > 
 > David has already explained why his name is on it -- basically,
 > David's name is on several PEPs but he doesn't currently have any time
 > to work on these, so other volunteers are most welcome to join.
 > 
 > It is my understanding that the current Numeric is sufficiently messy
 > in implementation and controversial in semantics that it would not be
 > a good basis to start from.

That is our (Rick, Perry, and I) belief also.

 > However, I do think that a basic multi-dimensional array object would
 > be a welcome addition to core Python.

That's re-assuring.

 > Indeed.  I think it's best to leave the buffer object out of your
 > implementation plans.  There are several problems with it, and one of
 > the backburner projects is to redesign it to be much more to the point
 > (providing less, not more functionality).

I agree and have already made the decision to leave it out.

 > > My opinion on this is that a new _fundamental_ built-in type should be
 > > created for memory allocation with features and an interface similar
 > > to the _mmap_ object.  I'll call this a _malloc_ object.  This would
 > > allow Numeric 2 to use either object interchangeably depending on the
 > > circumstance.  The _string_ type could also benefit from this new
 > > object by using a read-only version of it.  Since it's an object, its
 > > memory area should be safe from inadvertent deletion.
 > 
 > Interesting.  I'm actually not sufficiently familiar with mmap to
 > comment.  But would the existing array module's array object be at all
 > useful?  You can get to the raw bytes in C (using the C buffer API,
 > which is not deprecated) and it is extensible.

I tried using this but had problems.  I'll look into it again.

 > > Because of these and other new features in Numeric 2, I have a keen
 > > interest in the status of PEPs 207, 208, 211, 225, and 228; and also
 > > in the proposed buffer object.  
 > 
 > Here are some quick comments on the mentioned PEPs.

I've got these PEPs on my desk and will comment on them when I can.

 > > I'm willing to implement this new _malloc_ object if members of the
 > > python-dev list are in agreement.  Actually I see no alternative,
 > > given the current design of Numeric 2, since the Array class will
 > > initially be written completely in Python and will need a mutable
 > > memory buffer, while the _string_ type is meant to be a read-only
 > > object.
 > 
 > Would you be willing to take over authorship of PEP 209?  David Ascher
 > and the Numeric Python community will thank you.

Yes, I'd gladly wield vast and inconsiderate power over unsuspecting
pythoneers. ;-)

 -- Paul




From guido@python.org  Fri Dec  8 22:58:03 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 17:58:03 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Thu, 07 Dec 2000 12:54:51 EST."
 <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
Message-ID: <200012082258.RAA02389@cj20424-a.reston1.va.home.com>

Nobody seems to care much about the warnings PEP so far.  What's up?
Are you all too busy buying presents for the holidays?  Then get me
some too, please? :-)

>   http://python.sourceforge.net/peps/pep-0230.html

I've now produced a prototype implementation for the C code:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102715&group_id=5470

Issues:

- This defines a C API PyErr_Warn(category, message) instead of
  Py_Warn(message, category) as the PEP proposes.  I actually like
  this better: it's consistent with PyErr_SetString() etc. rather than
  with the Python warn(message[, category]) function.

- This calls the Python module from C.  We'll have to see if this is
  fast enough.  I wish I could postpone the import of warnings.py
  until the first call to PyErr_Warn(), but unfortunately the warning
  category classes must be initialized first (so they can be passed
  into PyErr_Warn()).  The current version of warnings.py imports
  rather a lot of other modules (e.g. re and getopt); this can be
  reduced by placing those imports inside the functions that use them.

- All the issues listed in the PEP.

Please comment!

BTW: somebody overwrote the PEP on SourceForge with an older version.
Please remember to do a "cvs update" before running "make install" in
the peps directory!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein@lyra.org  Fri Dec  8 23:26:51 2000
From: gstein@lyra.org (Greg Stein)
Date: Fri, 8 Dec 2000 15:26:51 -0800
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Dec 08, 2000 at 01:43:39PM -0500
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <20001208152651.H30644@lyra.org>

On Fri, Dec 08, 2000 at 01:43:39PM -0500, Guido van Rossum wrote:
>...
> Comments please!  We could:
> 
> - Live with the pathological cases.

I agree: live with it. The typical case will operate just fine.

> - Forget the whole thing; and then also forget about firstkey()
>   etc. which has the same problem only worse.

No opinion.

> - Fix the algorithm. Maybe jumping criss-cross through the hash table
>   like lookdict does would improve that; but I don't understand the
>   math used for that ("Cycle through GF(2^n)-{0}" ???).

No need. The keys were inserted randomly, so sequencing through is
effectively random. :-)

>...
> static PyObject *
> dict_popitem(dictobject *mp, PyObject *args)
> {
> 	static int finger = 0;
> 	int i;
> 	dictentry *ep;
> 	PyObject *res;
> 
> 	if (!PyArg_NoArgs(args))
> 		return NULL;
> 	if (mp->ma_used == 0) {
> 		PyErr_SetString(PyExc_KeyError,
> 				"popitem(): dictionary is empty");
> 		return NULL;
> 	}
> 	i = finger;
> 	if (i >= mp->ma_size)
> 		ir = 0;

Should be "i = 0"

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From tismer@tismer.com  Sat Dec  9 16:44:14 2000
From: tismer@tismer.com (Christian Tismer)
Date: Sat, 09 Dec 2000 18:44:14 +0200
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <3A32615E.D39B68D2@tismer.com>


Guido van Rossum wrote:
> 
> After the last round of discussion, I was left with the idea that the
> best thing we could do to help destructive iteration is to introduce a
> {}.popitem() that returns an arbitrary (key, value) pair and deletes
> it.  I wrote about this:
> 
> > > One more concern: if you repeatedly remove the *first* item, the hash
> > > table will start looking lopsided.  Since we don't resize the hash
> > > table on deletes, maybe picking an item at random (but not using an
> > > expensive random generator!) would be better.
> 
> and Tim replied:
> 
> > Which is the reason SETL doesn't specify *which* set item is removed:  if
> > you always start looking at "the front" of a dict that's being consumed, the
> > dict fills with turds without shrinking, you skip over them again and again,
> > and consuming the entire dict is still quadratic time.
> >
> > Unfortunately, while using a random start point is almost always quicker
> > than that, the expected time for consuming the whole dict remains quadratic.
> >
> > The clearest way around that is to save a per-dict search finger, recording
> > where the last search left off.  Start from its current value.  Failure if
> > it wraps around.  This is linear time in non-pathological cases (a
> > pathological case is one in which it isn't linear time <wink>).
> 
> I've implemented this, except I use a static variable for the finger
> instead of a per-dict finger.  I'm concerned about adding 4-8 extra
> bytes to each dict object for a feature that most dictionaries never
> need.  So, instead, I use a single shared finger.  This works just as
> well as long as this is used for a single dictionary.  For multiple
> dictionaries (either used by the same thread or in different threads),
> it'll work almost as well, although it's possible to make up a
> pathological example that would work quadratically.
> 
> An easy example of such a pathological example is to call popitem()
> for two identical dictionaries in lock step.
> 
> Comments please!  We could:
> 
> - Live with the pathological cases.
> 
> - Forget the whole thing; and then also forget about firstkey()
>   etc. which has the same problem only worse.
> 
> - Fix the algorithm. Maybe jumping criss-cross through the hash table
>   like lookdict does would improve that; but I don't understand the
>   math used for that ("Cycle through GF(2^n)-{0}" ???).

That algorithm is really a gem which you should know,
so let me try to explain it.


Intro: A little story about finite field theory (very basic).
-------------------------------------------------------------

For every prime p and every power p^n, there
exists a Galois Field ( GF(p^n) ), which is
a finite field.
The additive group is called "elementary Abelian":
it is commutative, and it looks a little like a
vector space, since addition works in cycles modulo p
in every coordinate.
The multiplicative group is cyclic, and it never
touches 0. Cyclic groups are generated by a single
primitive element; the powers of that element make up all the
other elements. For all elements x of the multiplicative
group GF(p^n)*, the equality x^(p^n - 1) == 1 holds. A generator
element is therefore a primitive (p^n-1)'th root of unity.

From another point of view, the elements of GF(p^n)
can be seen as coefficients of polynomials over
GF(p). It can be shown that every generator
of the multiplicative group is a root of an irreducible
polynomial of degree n with coefficients in GF(p).

An irreducible polynomial over a field does not vanish
for any value of the field: it has no
zero in the field. By adjoining such a zero to the
field, we turn it into an extension field: GF(p^n).

Now on the dictionary case.
---------------------------

The idea is to conceive every non-zero bit pattern
as coefficients of a polynomial over GF(2)[x].
We need to find an irreducible polynomial of degree
n over the prime field GF(2). There exists a primitive
(2^n-1)'th root µ of unity in GF(2^n) which generates every
non-zero bit pattern of length n, being coefficients
of a polynomial over µ.
That means, every given bit pattern can be seen as
some power of µ. µ enumerates the whole multiplicative
group, and the given pattern is just one position in
that enumeration. We can go to the next position
in this cycle simply by multiplying the pattern by µ.
But since we are dealing with polynomials over µ,
this multiplication isn't much more than adding one
to every exponent in the polynomial, hence a left
shift of our pattern.
Adjusting the overflow of this pattern involves
a single addition, which is just an XOR in GF(2^n).

Example: p=2  n=3  G = GF(8) = GF(2^3)
----------------------------------------
"""
Since 8 = 2^3, the prime field is GF(2) and we need to
find a monic irreducible cubic polynomial over that
field. Since the coefficients can only be 0 and 1, the
list of irreducible candidates is easily obtained. 

     x^3 + 1 
     x^3 + x + 1 
     x^3 + x^2 + 1 
     x^3 + x^2 + x + 1

Now substituting 0 gives 1 in all cases, and substituting
1 will give 0 only if there are an odd number of x terms,
so the irreducible cubics are just x^3 + x + 1 and x^3 + x^2 + 1.
Now the multiplicative group of this field is a cyclic
group of order 7 and so every nonidentity element is a
generator. Letting µ be a root of the first polynomial, we
have µ^3 + µ + 1 = 0, or µ^3 = µ + 1, so the powers of µ are:

     µ^1 = µ
     µ^2 = µ^2
     µ^3 = µ + 1
     µ^4 = µ^2 + µ
     µ^5 = µ^2 + µ + 1
     µ^6 = µ^2 + 1
     µ^7 = 1

"""

We could of course equally choose the second polynomial with
an isomorphic result.

The above example was taken from
http://www-math.cudenver.edu/~wcherowi/courses/finflds.html
Note that finding the irreducible polynomial was so easy
since a reducible cubic always has a linear factor. In the
general case, we have to check against all possible
subpolynomials or use much more of the theory.
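The cycle from the example can be replayed in a few lines of Python (a sketch; the function name is mine, and the bit patterns encode the polynomial coefficients):

```python
# Enumerate GF(2^3)* by repeated multiplication with the generator:
# shift left (multiply by the generator), and on overflow fold back
# by XORing with the irreducible polynomial.
def gf_cycle(poly, mask, start=1):
    seen = []
    incr = start
    while True:
        seen.append(incr)
        incr <<= 1                # multiply by the generator
        if incr > mask:           # overflow: add (== XOR) the polynomial
            incr ^= poly
        if incr == start:         # the cycle closed
            return seen

# x^3 + x + 1 over GF(2) is the bit pattern 0b1011; mask is 2^3 - 1.
print(sorted(gf_cycle(0b1011, 0b111)))   # -> [1, 2, 3, 4, 5, 6, 7]
```

Every non-zero 3-bit pattern appears exactly once before the cycle closes, which is exactly the property the probe sequence needs.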


Application of the example to the dictionary algorithm (DA)
-----------------------------------------------------------

We stay in GF(8) and use the above example.
The maximum allowed pattern value in our system is
2^n - 1. This is the variable "mask" in the program.

We assume a given non-zero bit pattern with coefficients
   (a2, a1, a0)
and write down a polynomial in µ for it:

    p = a2*µ^2 + a1*µ + a0

To obtain the next pattern in the group's enumeration,
we multiply by the generator polynomial µ:

    p*µ = a2*µ^3 + a1*µ^2 + a0*µ

In the program, this is line 210:
		incr = incr << 1;

a simple shift.
But in the case that a2 is not zero, we get an overflow,
and we have to fold the bit back, by the following identity:

    µ^3 = µ+1

That means, we have to subtract µ^3 from the pattern and
add µ+1 instead. But since addition and subtraction are
identical in GF(2), we just add the whole polynomial.
From a different POV, we just add zero here, since

    µ^3 + µ + 1  =  0

The full progam to perform the polynomial multiplication
gets down to just a shift, a test and an XOR:

		incr = incr << 1;
		if (incr > mask)
			incr ^= mp->ma_poly;

Summary
=======

For every power n of two, we can find a generator element
µ of GF(2^n). Every non-zero bit pattern can be taken as
coefficients of a polynomial in µ.
The powers of µ reach all these patterns. Therefore, each
pattern *is* some power of µ. By multiplication with µ
we can reach every possible pattern exactly once.
Since these patterns are used as distances from the
primary hash-computed slot modulo 2^n, and the distances
are never zero, all slots can be reached.


-----------------------------------

Appendix, on the use of finger:
-------------------------------

Instead of using a global finger variable, you can do the
following (involving a cast from object to int) :

- if the 0'th slot of the dict is non-empty:
  return this element and insert the dummy element
  as key. Set the value field to the Dictionary Algorithm
  would give for the removed object's hash. This is the
  next finger.
- else:
  treat the value field of the 0'th slot as the last finger.
  If it is zero, initialize it with 2^n-1.
  Repetitively use the DA until you find an entry. Save
  the finger in slot 0 again.

This doesn't cost an extra slot, and even when the dictionary
is written between removals, the chance to lose the finger
is just 1:(2^n-1) on every insertion.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From nas@arctrix.com  Sat Dec  9 11:30:06 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Sat, 9 Dec 2000 03:30:06 -0800
Subject: [Python-Dev] PEP 208 and __coerce__
Message-ID: <20001209033006.A3737@glacier.fnational.com>

While working on the implementation of PEP 208, I discovered that
__coerce__ has some surprising properties.  Initially I
implemented __coerce__ so that the numeric operation currently
being performed was called on the values returned by __coerce__.
This caused test_class to blow up due to code like this:

    class Test:
        def __coerce__(self, other):
            return (self, other)

Python 2.0 "solves" this by not calling __coerce__ again if the
objects returned by __coerce__ are instances.  This has the
effect of making code like:

    class A:
        def __coerce__(self, other):
            return B(), other

    class B:
        def __coerce__(self, other):
            return 1, other

    A() + 1

fail to work in the expected way.  The question is: how should
__coerce__ work?  One option is to leave it working the way it does
in 2.0.  Alternatively, I could change it so that if coerce
returns (self, *) then __coerce__ is not called again.
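The difference between the two rules can be modeled outside the interpreter. Here is a toy dispatcher (names and logic are mine, not the real C code) that retries __coerce__ until neither operand defines it, but stops retrying when coercion returns the object itself, i.e. the "(self, *)" rule:

```python
# Toy model of binary-op dispatch with __coerce__: keep coercing
# until neither operand has __coerce__, but stop when coercion
# returns the same object (the test_class case above).
def add(a, b):
    while hasattr(a, "__coerce__") or hasattr(b, "__coerce__"):
        obj = a if hasattr(a, "__coerce__") else b
        other = b if obj is a else a
        new_obj, new_other = obj.__coerce__(other)
        if new_obj is obj:              # coercion returned self: give up
            break
        if obj is a:
            a, b = new_obj, new_other
        else:
            b, a = new_obj, new_other
    return a + b

class A:
    def __coerce__(self, other):
        return B(), other

class B:
    def __coerce__(self, other):
        return 1, other

class Test:                              # the test_class case
    def __coerce__(self, other):
        return self, other
    def __add__(self, other):
        return "Test.__add__"

print(add(A(), 1))      # chained coercion A -> B -> 1 keeps working
print(add(Test(), 1))   # no infinite loop; falls back to Test.__add__
```

Under the "(self, *)" rule the A/B chain still resolves to 1 + 1, while the self-returning Test class terminates immediately instead of recursing.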


  Neil


From mal@lemburg.com  Sat Dec  9 18:49:29 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 09 Dec 2000 19:49:29 +0100
Subject: [Python-Dev] PEP 208 and __coerce__
References: <20001209033006.A3737@glacier.fnational.com>
Message-ID: <3A327EB9.BD2CA3CC@lemburg.com>

Neil Schemenauer wrote:
> 
> While working on the implementation of PEP 208, I discovered that
> __coerce__ has some surprising properties.  Initially I
> implemented __coerce__ so that the numeric operation currently
> being performed was called on the values returned by __coerce__.
> This caused test_class to blow up due to code like this:
> 
>     class Test:
>         def __coerce__(self, other):
>             return (self, other)
> 
> Python 2.0 "solves" this by not calling __coerce__ again if the
> objects returned by __coerce__ are instances.  This has the
> effect of making code like:
> 
>     class A:
>         def __coerce__(self, other):
>             return B(), other
> 
>     class B:
>         def __coerce__(self, other):
>             return 1, other
> 
>     A() + 1
> 
> fail to work in the expected way.  The question is: how should
> __coerce__ work?  One option is to leave it working the way it does
> in 2.0.  Alternatively, I could change it so that if coerce
> returns (self, *) then __coerce__ is not called again.

+0 -- the idea behind PEP 208 is to get rid of the
centralized coercion mechanism, so fixing it to allow yet
more obscure variants should be carefully considered.

I see __coerce__ et al. as old-style mechanisms -- operator methods
have much more information available to do the right thing
than the single bottleneck __coerce__.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim.one@home.com  Sat Dec  9 20:49:04 2000
From: tim.one@home.com (Tim Peters)
Date: Sat, 9 Dec 2000 15:49:04 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECEIDAA.tim.one@home.com>

[Guido]
> ...
> I've implemented this, except I use a static variable for the finger
> instead of a per-dict finger.  I'm concerned about adding 4-8 extra
> bytes to each dict object for a feature that most dictionaries never
> need.

It's a bit ironic that dicts are guaranteed to be at least 1/3 wasted space
<wink>.  Let's pick on Christian's idea to reclaim a few bytes of that.

> So, instead, I use a single shared finger.  This works just as
> well as long as this is used for a single dictionary.  For multiple
> dictionaries (either used by the same thread or in different threads),
> it'll work almost as well, although it's possible to make up a
> pathological example that would work quadratically.
>
> An easy example of such a pathological example is to call popitem()
> for two identical dictionaries in lock step.

Please see my later comments attached to the patch:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102733&group_id=5470

In short, for me (truly) identical dicts perform well with or without my
suggestion, while dicts cloned via dict.copy() perform horribly with or
without my suggestion (their internal structures differ); still curious as
to whether that's also true for you (am I looking at a Windows bug?  I don't
see how, but it's possible ...).  In any case, my suggestion turned out to
be worthless on my box.

Playing around via simulations suggests that a shared finger is going to be
disastrous when consuming more than one dict unless they have identical
internal structure (not just compare equal).  As soon as they get a little
out of synch, it just gets worse with each succeeding probe.

> Comments please!  We could:
>
> - Live with the pathological cases.

How boring <wink>.

> - Forget the whole thing; and then also forget about firstkey()
>   etc. which has the same problem only worse.

I don't know that this is an important idea for dicts in general (it is
important for sets) -- it's akin to an xrange for dicts.  But then I've had
more than one real-life program that built giant dicts then ran out of
memory trying to iterate over them!  I'd like to fix that.

> - Fix the algorithm. Maybe jumping criss-cross through the hash table
>   like lookdict does would improve that; but I don't understand the
>   math used for that ("Cycle through GF(2^n)-{0}" ???).

Christian explained that well (thanks!).

However, I still don't see any point to doing that business in .popitem():
when inserting keys, the jitterbug probe sequence has the crucial benefit of
preventing primary clustering when keys collide.  But when we consume a
dict, we just want to visit every slot as quickly as possible.

[Christian]
> Appendix, on the use of finger:
> -------------------------------
>
> Instead of using a global finger variable, you can do the
> following (involving a cast from object to int) :
>
> - if the 0'th slot of the dict is non-empty:
>   return this element and insert the dummy element
>   as key. Set the value field to the Dictionary Algorithm
>   would give for the removed object's hash. This is the
>   next finger.
> - else:
>   treat the value field of the 0'th slot as the last finger.
>   If it is zero, initialize it with 2^n-1.
>   Repetitively use the DA until you find an entry. Save
>   the finger in slot 0 again.
>
> This doesn't cost an extra slot, and even when the dictionary
> is written between removals, the chance to lose the finger
> is just 1:(2^n-1) on every insertion.

I like that, except:

1) As above, I don't believe the GF business buys anything over
   a straightforward search when consuming a dict.

2) Overloading the value field bristles with problems, in part
   because it breaks the invariant that a slot is unused if
   and only if the value field is NULL, in part because C
   doesn't guarantee that you can get away with casting an
   arbitrary int to a pointer and back again.

None of the problems in #2 arise if we abuse the me_hash field instead, so
the attached does that.  Here's a typical run of Guido's test case using
this (on an 866MHz machine w/ 256Mb RAM -- the early values jump all over
the place from run to run):

run = 0
log2size = 10 size = 1024
    7.4 usec per item to build (total 0.008 sec)
    3.4 usec per item to destroy twins (total 0.003 sec)
log2size = 11 size = 2048
    6.7 usec per item to build (total 0.014 sec)
    3.4 usec per item to destroy twins (total 0.007 sec)
log2size = 12 size = 4096
    7.0 usec per item to build (total 0.029 sec)
    3.7 usec per item to destroy twins (total 0.015 sec)
log2size = 13 size = 8192
    7.1 usec per item to build (total 0.058 sec)
    5.9 usec per item to destroy twins (total 0.048 sec)
log2size = 14 size = 16384
    14.7 usec per item to build (total 0.241 sec)
    6.4 usec per item to destroy twins (total 0.105 sec)
log2size = 15 size = 32768
    12.2 usec per item to build (total 0.401 sec)
    3.9 usec per item to destroy twins (total 0.128 sec)
log2size = 16 size = 65536
    7.8 usec per item to build (total 0.509 sec)
    4.0 usec per item to destroy twins (total 0.265 sec)
log2size = 17 size = 131072
    7.9 usec per item to build (total 1.031 sec)
    4.1 usec per item to destroy twins (total 0.543 sec)

The last one is over 100 usec per item using the original patch (with or
without my first suggestion).

if-i-were-a-betting-man-i'd-say-"bingo"-ly y'rs  - tim


Drop-in replacement for the popitem in the patch:

static PyObject *
dict_popitem(dictobject *mp, PyObject *args)
{
	int i = 0;
	dictentry *ep;
	PyObject *res;

	if (!PyArg_NoArgs(args))
		return NULL;
	if (mp->ma_used == 0) {
		PyErr_SetString(PyExc_KeyError,
				"popitem(): dictionary is empty");
		return NULL;
	}
	/* Set ep to "the first" dict entry with a value.  We abuse the hash
	 * field of slot 0 to hold a search finger:
	 * If slot 0 has a value, use slot 0.
	 * Else slot 0 is being used to hold a search finger,
	 * and we use its hash value as the first index to look.
	 */
	ep = &mp->ma_table[0];
	if (ep->me_value == NULL) {
		i = (int)ep->me_hash;
		/* The hash field may be uninitialized trash, or it
		 * may be a real hash value, or it may be a legit
		 * search finger, or it may be a once-legit search
		 * finger that's out of bounds now because it
		 * wrapped around or the table shrunk -- simply
		 * make sure it's in bounds now.
		 */
		if (i >= mp->ma_size || i < 1)
			i = 1;	/* skip slot 0 */
		while ((ep = &mp->ma_table[i])->me_value == NULL) {
			i++;
			if (i >= mp->ma_size)
				i = 1;
		}
	}
	res = PyTuple_New(2);
	if (res != NULL) {
		PyTuple_SET_ITEM(res, 0, ep->me_key);
		PyTuple_SET_ITEM(res, 1, ep->me_value);
		Py_INCREF(dummy);
		ep->me_key = dummy;
		ep->me_value = NULL;
		mp->ma_used--;
	}
	assert(mp->ma_table[0].me_value == NULL);
	mp->ma_table[0].me_hash = i + 1;  /* next place to start */
	return res;
}
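Stripped of the C details, the finger-based scan can be modeled in a few lines of Python (a toy: the table is a fixed-size list with None marking empty slots, and the finger lives in an attribute rather than being smuggled into slot 0's hash field):

```python
# Toy model of the finger-based popitem: remember where the last
# search stopped, so repeated pops don't rescan the same empty
# slots and consuming the table stays linear.
class ToyTable:
    def __init__(self, size):
        self.table = [None] * size
        self.used = 0
        self.finger = 0

    def insert(self, slot, key, value):
        if self.table[slot] is None:
            self.used += 1
        self.table[slot] = (key, value)

    def popitem(self):
        if self.used == 0:
            raise KeyError("popitem(): table is empty")
        i = self.finger
        if i >= len(self.table) or i < 0:  # stale finger: rebound it
            i = 0
        while self.table[i] is None:       # skip empty slots
            i += 1
            if i >= len(self.table):
                i = 0
        res = self.table[i]
        self.table[i] = None
        self.used -= 1
        self.finger = i + 1                # next place to start
        return res

t = ToyTable(8)
t.insert(2, "a", 1)
t.insert(5, "b", 2)
print(t.popitem())   # -> ('a', 1)
print(t.popitem())   # -> ('b', 2)
```

The second pop starts scanning at slot 3 (the finger) rather than slot 0, which is the whole point of the trick.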



From tim.one@home.com  Sat Dec  9 21:09:30 2000
From: tim.one@home.com (Tim Peters)
Date: Sat, 9 Dec 2000 16:09:30 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEECEIDAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECEIDAA.tim.one@home.com>

> 	assert(mp->ma_table[0].me_value == NULL);
> 	mp->ma_table[0].me_hash = i + 1;  /* next place to start */

Ack, those two lines should move up into the "if (res != NULL)" block.

errors-are-error-prone-ly y'rs  - tim



From gvwilson@nevex.com  Sun Dec 10 16:11:09 2000
From: gvwilson@nevex.com (Greg Wilson)
Date: Sun, 10 Dec 2000 11:11:09 -0500
Subject: [Python-Dev] re: So You Want to Write About Python?
Message-ID: <NEBBIACCCGNFLECLCLMHCEFLCBAA.gvwilson@nevex.com>

Hi, folks.  Jon Erickson (Doctor Dobb's Journal), Frank Willison (O'Reilly),
and I (professional loose cannon) are doing a workshop at IPC on writing
books and magazine articles about Python.  It would be great to have a few
articles (in various stages of their lives) and/or book proposals from
people on this list to use as examples.  So, if you think the world oughta
know about the things you're doing, and would like to use this to help get
yourself motivated to start writing, please drop me a line.  I'm
particularly interested in:

- the real-world issues involved in moving to Unicode

- non-trivial XML processing using SAX and DOM (where "non-trivial" means
  "including namespaces, entity references, error handling, and all that")

- the theory and practice of stackless, generators, and continuations

- the real-world tradeoffs between the various memory management schemes
  that are now available for Python

- feature comparisons of various Foobars that can be used with Python (where
  "Foobar" could be "GUI toolkit", "IDE", "web scripting toolkit", or just
  about anything else)

- performance analysis and tuning of Python itself (as an example of how you
  speed up real applications --- this is something that matters a lot in the
  real world, but tends to get forgotten in school)

- just about anything else that you wish someone had written for you before
  you started your last big project

Thanks,
Greg



From paul@prescod.net  Sun Dec 10 18:02:27 2000
From: paul@prescod.net (Paul Prescod)
Date: Sun, 10 Dec 2000 10:02:27 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com>
Message-ID: <3A33C533.ABA27C7C@prescod.net>

Guido van Rossum wrote:
> 
> Nobody seems to care much about the warnings PEP so far.  What's up?
> Are you all too busy buying presents for the holidays?  Then get me
> some too, please? :-)

My opinions:

 * it should be a built-in or keyword, not a function in "sys". Warning
is supposed to be as easy as possible so people will do it often.
<irrelevant_aside>sys.argv and sys.stdout annoy me as it is.</>

 * the term "level" applied to warnings typically means "warning level"
as in -W1 -W2 -Wall. Proposal: call it "stacklevel" or something.

 * this level idea gives rise to another question. What if I want to see
the full stack context of a warning? Do I have to implement a whole new
warning output hook? It seems like I should be able to specify this as a
command line option alongside the action.

 * I prefer ":*:*:" to ":::" for leaving parts of the warning spec out.

 * should there be a sys.formatwarning? What if I want to redirect
warnings to a socket -- I'd like to use the standard formatting
machinery. Or vice versa, I might want to change the formatting but not
override the destination.

 * there should be a "RuntimeWarning" -- base category for warnings
about dubious runtime behaviors (e.g. integer division truncated value)

 * it should be possible to strip warnings as an optimization step. That
may require interpreter and syntax support.

 * warnings will usually be tied to tests which the user will want to be
able to optimize out also. (e.g. if __debug__ and type(foo)==StringType:
warn "Should be Unicode!")


I propose:

	>>> warn conditional, message[, category]

to be very parallel with 

	>>> assert conditional, message

I'm not proposing the use of the assert keyword anymore, but I am trying
to reuse the syntax for familiarity. Perhaps -Wstrip would strip
warnings out of the bytecode.

 Paul Prescod


From nas@arctrix.com  Sun Dec 10 13:46:46 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Sun, 10 Dec 2000 05:46:46 -0800
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
Message-ID: <20001210054646.A5219@glacier.fnational.com>

Sourceforge uploads are not working.  The latest version of the
patch for PEP 208 is here:

    http://arctrix.com/nas/python/coerce-6.0.diff

Operations on instances now call __coerce__ if it exists.  I
think the patch is now complete.  Converting other builtin types
to "new style numbers" can be done with a separate patch.

  Neil


From guido@python.org  Sun Dec 10 22:17:08 2000
From: guido@python.org (Guido van Rossum)
Date: Sun, 10 Dec 2000 17:17:08 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Sun, 10 Dec 2000 10:02:27 PST."
 <3A33C533.ABA27C7C@prescod.net>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com>
 <3A33C533.ABA27C7C@prescod.net>
Message-ID: <200012102217.RAA12550@cj20424-a.reston1.va.home.com>

> My opinions:
> 
>  * it should be a built-in or keyword, not a function in "sys". Warning
> is supposed to be as easy as possible so people will do it often.

Disagree.  Warnings are there mostly for the Python system to warn the
Python programmer.  The most heavy use will come from the standard
library, not from user code.

> <irrelevant_aside>sys.argv and sys.stdout annoy me as it is.</>

Too bad.

>  * the term "level" applied to warnings typically means "warning level"
> as in -W1 -W2 -Wall. Proposal: call it "stacklevel" or something.

Good point.

>  * this level idea gives rise to another question. What if I want to see
> the full stack context of a warning? Do I have to implement a whole new
> warning output hook? It seems like I should be able to specify this as a
> command line option alongside the action.

Turn warnings into errors and you'll get a full traceback.  If you
really want a full traceback without exiting, some creative use of
sys._getframe() and the traceback module will probably suit you well.

>  * I prefer ":*:*:" to ":::" for leaving parts of the warning spec out.

I don't.

>  * should there be a sys.formatwarning? What if I want to redirect
> warnings to a socket -- I'd like to use the standard formatting
> machinery. Or vice versa, I might want to change the formatting but not
> override the destination.

Good point.  I'm changing this to:

def showwarning(message, category, filename, lineno, file=None):
    """Hook to write a warning to a file; replace if you like."""

and

def formatwarning(message, category, filename, lineno):
    """Hook to format a warning the standard way."""
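Assuming the module ends up exposing the hooks under these names, Paul's socket scenario then composes naturally: replace only the output hook and reuse the standard formatting hook (a sketch written against a current Python; the stream here is a StringIO stand-in for the socket):

```python
import io
import warnings

# Redirect warnings to any file-like object by replacing the output
# hook, while reusing the standard formatting hook unchanged.
buf = io.StringIO()

def showwarning(message, category, filename, lineno, file=None, line=None):
    buf.write(warnings.formatwarning(message, category, filename, lineno))

warnings.simplefilter("always")
warnings.showwarning = showwarning
warnings.warn("spam may be stale", DeprecationWarning)
print("DeprecationWarning" in buf.getvalue())   # -> True
```

The reverse (custom formatting, default destination) works the same way by replacing only formatwarning.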

>  * there should be a "RuntimeWarning" -- base category for warnings
> about dubious runtime behaviors (e.g. integer division truncated value)

OK.

>  * it should be possible to strip warnings as an optimization step. That
> may require interpreter and syntax support.

I don't see the point of this.  I think this comes from our different
views on who should issue warnings.

>  * warnings will usually be tied to tests which the user will want to be
> able to optimize out also. (e.g. if __debug__ and type(foo)==StringType:
> warn "Should be Unicode!")
> 
> I propose:
> 
> 	>>> warn conditional, message[, category]

Sorry, this is not worth a new keyword.

> to be very parallel with 
> 
> 	>>> assert conditional, message
> 
> I'm not proposing the use of the assert keyword anymore, but I am trying
> to reuse the syntax for familiarity. Perhaps -Wstrip would strip
> warnings out of the bytecode.

Why?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@effbot.org  Mon Dec 11 00:16:25 2000
From: fredrik@effbot.org (Fredrik Lundh)
Date: Mon, 11 Dec 2000 01:16:25 +0100
Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use
References: <200012081851.NAA32254@cj20424-a.reston1.va.home.com>
Message-ID: <000901c06307$9a814d60$3c6340d5@hagrid>

Guido wrote:
> Moshe proposes to add an overridable function sys.displayhook(obj)
> which will be called by the interpreter for the PRINT_EXPR opcode,
> instead of hardcoding the behavior.  The default implementation will
> of course have the current behavior, but this makes it much simpler to
> experiment with alternatives, e.g. using str() instead of repr() (or
> to choose between str() and repr() based on the type).

hmm.  instead of patching here and there, what's stopping us
from doing it the right way?  I'd prefer something like:

    import code

    class myCLI(code.InteractiveConsole):
        def displayhook(self, data):
            # non-standard display hook
            print str(data)

    sys.setcli(myCLI())

(in other words, why not move the *entire* command line interface
over to Python code)

</F>



From guido@python.org  Mon Dec 11 02:24:20 2000
From: guido@python.org (Guido van Rossum)
Date: Sun, 10 Dec 2000 21:24:20 -0500
Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use
In-Reply-To: Your message of "Mon, 11 Dec 2000 01:16:25 +0100."
 <000901c06307$9a814d60$3c6340d5@hagrid>
References: <200012081851.NAA32254@cj20424-a.reston1.va.home.com>
 <000901c06307$9a814d60$3c6340d5@hagrid>
Message-ID: <200012110224.VAA12844@cj20424-a.reston1.va.home.com>

> Guido wrote:
> > Moshe proposes to add an overridable function sys.displayhook(obj)
> > which will be called by the interpreter for the PRINT_EXPR opcode,
> > instead of hardcoding the behavior.  The default implementation will
> > of course have the current behavior, but this makes it much simpler to
> > experiment with alternatives, e.g. using str() instead of repr() (or
> > to choose between str() and repr() based on the type).

Effbot regurgitates:
> hmm.  instead of patching here and there, what's stopping us
> from doing it the right way?  I'd prefer something like:
> 
>     import code
> 
>     class myCLI(code.InteractiveConsole):
>         def displayhook(self, data):
>             # non-standard display hook
>             print str(data)
> 
>     sys.setcli(myCLI())
> 
> (in other words, why not move the *entire* command line interface
> over to Python code)

Indeed, this is why I've been hesitant to bless Moshe's hack.  I
finally decided to go for it because I don't see this redesign of the
CLI happening anytime soon.  In order to do it right, it would require
a redesign of the parser input handling, which is probably the oldest
code in Python (short of the long integer math, which predates Python
by several years).  The current code module is a hack, alas, and
doesn't always get it right the same way as the *real* CLI does
things.

So, rather than wait forever for the perfect solution, I think it's
okay to settle for less sooner.  "Now is better than never."

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paulp@ActiveState.com  Mon Dec 11 06:59:29 2000
From: paulp@ActiveState.com (Paul Prescod)
Date: Sun, 10 Dec 2000 22:59:29 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com>
 <3A33C533.ABA27C7C@prescod.net> <200012102217.RAA12550@cj20424-a.reston1.va.home.com>
Message-ID: <3A347B51.ADB3F12C@ActiveState.com>

Guido van Rossum wrote:
> 
>...
> 
> Disagree.  Warnings are there mostly for the Python system to warn the
> Python programmer.  The most heavy use will come from the standard
> library, not from user code.

Most Python code is part of some library or another. It may not be the
standard library, but it's still a library. Perl and Java both make
warnings (especially about deprecation) very easy *for user code*.

> >  * it should be possible to strip warnings as an optimization step. That
> > may require interpreter and syntax support.
> 
> I don't see the point of this.  I think this comes from our different
> views on who should issue warnings.

Everyone who creates a reusable library will want to issue warnings.
That is to say, most serious Python programmers.

Anyhow, let's presume that it is only the standard library that issues
warnings (for argument's sake). What if I have a speed-critical module
that triggers warnings in an inner loop? Turning off the warning doesn't
turn off the overhead of the warning infrastructure. I should be able to
turn off the overhead easily -- ideally from the Python command line.
And I still feel that part of that "overhead" is in the code that tests
to determine whether to issue the warnings. There should be a way to
turn off that overhead also.

 Paul


From paulp@ActiveState.com  Mon Dec 11 07:23:17 2000
From: paulp@ActiveState.com (Paul Prescod)
Date: Sun, 10 Dec 2000 23:23:17 -0800
Subject: [Python-Dev] Online help PEP
Message-ID: <3A3480E5.C2577AE6@ActiveState.com>

PEP: ???
Title: Python Online Help
Version: $Revision: 1.0 $
Author: paul@prescod.net, paulp@activestate.com (Paul Prescod)
Status: Draft
Type: Standards Track
Python-Version: 2.1
Status: Incomplete

Abstract

    This PEP describes a command-line driven online help facility
    for Python. The facility should be able to build on existing
    documentation facilities such as the Python documentation 
    and docstrings. It should also be extensible for new types and
    modules.

Interactive use:

    Simply typing "help" describes the help function (through repr 
    overloading).

    "help" can also be used as a function:

    The function takes the following forms of input:

        help( "string" ) -- built-in topic or global
        help( <ob> ) -- docstring from object or type
        help( "doc:filename" ) -- filename from Python documentation

    If you ask for a global, it can be a fully-qualified name such as 
    help("xml.dom").

    You can also use the facility from the command line:

    python --help if

    In either situation, the output does paging similar to the "more"
    command. 

Implementation

    The help function is implemented in an onlinehelp module which is
    demand-loaded.

    There should be options for fetching help information from
    environments other than the command line through the onlinehelp
    module:

        onlinehelp.gethelp(object_or_string) -> string

    It should also be possible to override the help display function by
    assigning to onlinehelp.displayhelp(object_or_string).

    The module should be able to extract module information from either 
    the HTML or LaTeX versions of the Python documentation. Links should
    be accommodated in a "lynx-like" manner. 

    Over time, it should also be able to recognize when docstrings are
    in "special" syntaxes like structured text, HTML and LaTeX and
    decode them appropriately.

    A prototype implementation is available with the Python source 
    distribution as nondist/sandbox/doctools/onlinehelp.py.

Built-in Topics

    help( "intro" )  - What is Python? Read this first!
    help( "keywords" )  - What are the keywords?
    help( "syntax" )  - What is the overall syntax?
    help( "operators" )  - What operators are available?
    help( "builtins" )  - What functions, types, etc. are built-in?
    help( "modules" )  - What modules are in the standard library?
    help( "copyright" )  - Who owns Python?
    help( "moreinfo" )  - Where is there more information?
    help( "changes" )  - What changed in Python 2.0?
    help( "extensions" )  - What extensions are installed?
    help( "faq" )  - What questions are frequently asked?
    help( "ack" )  - Who has done work on Python lately?

Security Issues

    This module will attempt to import modules with the same names as
    requested topics. Don't use the modules if you are not confident
    that everything in your pythonpath is from a trusted source.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:


From tim.one@home.com  Mon Dec 11 07:36:57 2000
From: tim.one@home.com (Tim Peters)
Date: Mon, 11 Dec 2000 02:36:57 -0500
Subject: [Python-Dev] FW: [Python-Help] indentation
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDPIDAA.tim.one@home.com>

While we're talking about pluggable CLIs, I share this fellow's confusion
over IDLE's CLI variant:  block code doesn't "look right" under IDLE because
sys.ps2 doesn't exist under IDLE.  Some days you can't make *anybody* happy
<wink>.

-----Original Message-----
...

Subject: [Python-Help] indentation
Sent: Sunday, December 10, 2000 7:32 AM

...

My problem has to do with indentation:

I typed the following into IDLE:

>>> if not 1:
	print 'Hallo'
	else:

SyntaxError: invalid syntax

I get the message above.

I know that the else must be 4 spaces to the left, but IDLE doesn't let me
do this.

My only alternative is to move it all the way to the left, but then I
disturb the block structure and get the error message again.

I want to have it like this:

>>> if not 1:
	print 'Hallo'
    else:

Can you help me?

...



From fredrik@pythonware.com  Mon Dec 11 11:36:53 2000
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 11 Dec 2000 12:36:53 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com>
Message-ID: <033701c06366$ab746580$0900a8c0@SPIFF>

christian wrote:
> That algorithm is really a gem which you should know,
> so let me try to explain it.

I think someone just won the "brain exploder 2000" award ;-)

to paraphrase Bertrand Russell,

    "Mathematics may be defined as the subject where I never
    know what you are talking about, nor whether what you are
    saying is true"

cheers /F



From thomas@xs4all.net  Mon Dec 11 12:12:09 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 11 Dec 2000 13:12:09 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <033701c06366$ab746580$0900a8c0@SPIFF>; from fredrik@pythonware.com on Mon, Dec 11, 2000 at 12:36:53PM +0100
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <20001211131208.G4396@xs4all.nl>

On Mon, Dec 11, 2000 at 12:36:53PM +0100, Fredrik Lundh wrote:
> christian wrote:
> > That algorithm is really a gem which you should know,
> > so let me try to explain it.

> I think someone just won the "brain exploder 2000" award ;-)

By acclamation, I'd expect. I know it was the best laugh I had since last
week's Have I Got News For You, even though trying to understand it made me
glad I had boring meetings to recuperate in ;)

Highschool-dropout-ly y'rs,

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Mon Dec 11 12:33:18 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Dec 2000 13:33:18 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <3A34C98E.7C42FD24@lemburg.com>

Fredrik Lundh wrote:
> 
> christian wrote:
> > That algorithm is really a gem which you should know,
> > so let me try to explain it.
> 
> I think someone just won the "brain exploder 2000" award ;-)
> 
> to paraphrase Bertrand Russell,
> 
>     "Mathematics may be defined as the subject where I never
>     know what you are talking about, nor whether what you are
>     saying is true"

Hmm, I must have missed that one... care to repost ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tismer@tismer.com  Mon Dec 11 13:49:48 2000
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 11 Dec 2000 15:49:48 +0200
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <3A34DB7C.FF7E82CE@tismer.com>


Fredrik Lundh wrote:
> 
> christian wrote:
> > That algorithm is really a gem which you should know,
> > so let me try to explain it.
> 
> I think someone just won the "brain exploder 2000" award ;-)
> 
> to paraphrase Bertrand Russell,
> 
>     "Mathematics may be defined as the subject where I never
>     know what you are talking about, nor whether what you are
>     saying is true"

:-))

Well, I was primarily targeting Guido, who said that he
came from math, and one cannot study math without passing
a basic algebra course, I think. I tried my best to explain
it for those who know at least how groups, fields, rings
and automorphisms work. Going into more details of the
theory would be off-topic for python-dev, but I will try
it in an upcoming DDJ article.

As you might have guessed, I didn't do this just for fun.
It is the old game of explaining what is there, convincing
everybody that you at least know what you are talking about,
and then three days later coming up with an improved
application of the theory.

Today is Monday, 2 days left. :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From guido@python.org  Mon Dec 11 15:12:24 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 10:12:24 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: Your message of "Mon, 11 Dec 2000 15:49:48 +0200."
 <3A34DB7C.FF7E82CE@tismer.com>
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
 <3A34DB7C.FF7E82CE@tismer.com>
Message-ID: <200012111512.KAA23622@cj20424-a.reston1.va.home.com>

> Fredrik Lundh wrote:
> > 
> > christian wrote:
> > > That algorithm is really a gem which you should know,
> > > so let me try to explain it.
> > 
> > I think someone just won the "brain exploder 2000" award ;-)
> > 
> > to paraphrase Bertrand Russell,
> > 
> >     "Mathematics may be defined as the subject where I never
> >     know what you are talking about, nor whether what you are
> >     saying is true"
> 
> :-))
> 
> Well, I was primarily targeting Guido, who said that he
> came from math, and one cannot study math without passing
> a basic algebra course, I think. I tried my best to explain
> it for those who know at least how groups, fields, rings
> and automorphisms work. Going into more details of the
> theory would be off-topic for python-dev, but I will try
> it in an upcoming DDJ article.

I do have a math degree, but it is 18 years old and I had to give up
after the first paragraph of your explanation.  It made me vividly
recall the first and only class on Galois Theory that I ever took --
after one hour I realized that this was not for me and I didn't have a
math brain after all.  I went back to the basement where the software
development lab was (i.e. a row of card punches :-).

> As you might have guessed, I didn't do this just for fun.
> It is the old game of explaining what is there, convincing
> everybody that you at least know what you are talking about,
> and then three days later coming up with an improved
> application of the theory.
> 
> Today is Monday, 2 days left. :-)

I'm very impressed.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Dec 11 15:15:02 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 10:15:02 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Sun, 10 Dec 2000 22:59:29 PST."
 <3A347B51.ADB3F12C@ActiveState.com>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <200012102217.RAA12550@cj20424-a.reston1.va.home.com>
 <3A347B51.ADB3F12C@ActiveState.com>
Message-ID: <200012111515.KAA23764@cj20424-a.reston1.va.home.com>

[me]
> > Disagree.  Warnings are there mostly for the Python system to warn the
> > Python programmer.  The most heavy use will come from the standard
> > library, not from user code.

[Paul Prescod]
> Most Python code is part of some library or another. It may not be the
> standard library, but it's still a library. Perl and Java both make
> warnings (especially about deprecation) very easy *for user code*.

Hey.  I'm not making it impossible to use warnings.  I'm making it
very easy.  All you have to do is put "from warnings import warn" at
the top of your library module.  Requesting a built-in or even a new
statement is simply excessive.
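The pattern Guido describes — a library author needing only one import and one call — can be sketched like this (the function names old_api/new_api are hypothetical stand-ins for a real library's deprecated and replacement entry points):

```python
import warnings

def new_api():
    return 42

def old_api():
    # One warn() call flags the deprecated entry point; stacklevel=2
    # makes the warning point at the *caller*, not at this line.
    warnings.warn("old_api() is deprecated; use new_api() instead",
                  DeprecationWarning, stacklevel=2)
    return new_api()

# Demonstrate that the warning fires and the call still works.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api()
```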

> > >  * it should be possible to strip warnings as an optimization step. That
> > > may require interpreter and syntax support.
> > 
> > I don't see the point of this.  I think this comes from our different
> > views on who should issue warnings.
> 
> Everyone who creates a reusable library will want to issue warnings.
> That is to say, most serious Python programmers.
> 
> Anyhow, let's presume that it is only the standard library that issues
> warnings (for argument's sake). What if I have a speed-critical module
> that triggers warnings in an inner loop? Turning off the warning doesn't
> turn off the overhead of the warning infrastructure. I should be able to
> turn off the overhead easily -- ideally from the Python command line.
> And I still feel that part of that "overhead" is in the code that tests
> to determine whether to issue the warnings. There should be a way to
> turn off that overhead also.

So rewrite your code so that it doesn't trigger the warning.  When you
get a warning, you're doing something that could be done in a better
way.  So don't whine about the performance.

It's a quality of implementation issue whether C code that tests for
issues that deserve warnings can do the test without slowing down code
that doesn't deserve a warning.  Ditto for standard library code.

Here's an example.  I expect there will eventually (not in 2.1 yet!)
warnings in the deprecated string module.  If you get such a warning
in a time-critical piece of code, the solution is to use string
methods -- not to whine about the performance of the backwards
compatibility code.
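The migration Guido has in mind looks roughly like this — the 2.x string module's functions map one-for-one onto string methods, which involve no deprecation machinery at all (sketched with modern names; the string-module calls are shown only in comments):

```python
s = "hello world"

# string-module style (the warning-prone calls in this scenario):
#   string.upper(s), string.split(s), string.join(words, " ")
# method style, the recommended replacement:
upper = s.upper()
words = s.split()
joined = " ".join(words)
```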

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Mon Dec 11 16:02:29 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 11 Dec 2000 11:02:29 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
 <200012082258.RAA02389@cj20424-a.reston1.va.home.com>
 <3A33C533.ABA27C7C@prescod.net>
Message-ID: <14900.64149.910989.998348@anthem.concentric.net>

Some of my thoughts after reading the PEP and Paul/Guido's exchange.

- A function in the warn module is better than one in the sys module.
  "from warnings import warn" is good enough to not warrant a
  built-in.  I get the sense that the PEP description is behind
  Guido's current implementation here.

- When PyErr_Warn() returns 1, does that mean a warning has been
  transmuted into an exception, or some other exception occurred
  during the setting of the warning?  (I think I know, but the PEP
  could be clearer here).

- It would be nice if lineno can be a range specification.  Other
  matches are based on regexps -- think of this as a line number
  regexp.

- Why not do setupwarnings() in site.py?

- Regexp matching on messages should be case insensitive.

- The second argument to sys.warn() or PyErr_Warn() can be any class,
  right?  If so, it's easy for me to have my own warning classes.
  What if I want to set up my own warnings filters?  Maybe if `action'
  could be a callable as well as a string.  Then in my IDE, I could
  set that to "mygui.popupWarningsDialog".

-Barry


From guido@python.org  Mon Dec 11 15:57:33 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 10:57:33 -0500
Subject: [Python-Dev] Online help PEP
In-Reply-To: Your message of "Sun, 10 Dec 2000 23:23:17 PST."
 <3A3480E5.C2577AE6@ActiveState.com>
References: <3A3480E5.C2577AE6@ActiveState.com>
Message-ID: <200012111557.KAA24266@cj20424-a.reston1.va.home.com>

I approve of the general idea.  Barry, please assign a PEP number.

> PEP: ???
> Title: Python Online Help
> Version: $Revision: 1.0 $
> Author: paul@prescod.net, paulp@activestate.com (Paul Prescod)
> Status: Draft
> Type: Standards Track
> Python-Version: 2.1
> Status: Incomplete
> 
> Abstract
> 
>     This PEP describes a command-line driven online help facility
>     for Python. The facility should be able to build on existing
>     documentation facilities such as the Python documentation 
>     and docstrings. It should also be extensible for new types and
>     modules.
> 
> Interactive use:
> 
>     Simply typing "help" describes the help function (through repr 
>     overloading).

Cute -- like license, copyright, credits I suppose.

>     "help" can also be used as a function:
> 
>     The function takes the following forms of input:
> 
>         help( "string" ) -- built-in topic or global

Why does a global require string quotes?

>         help( <ob> ) -- docstring from object or type
>         help( "doc:filename" ) -- filename from Python documentation

I'm missing

          help() -- table of contents

I'm not sure if the table of contents should be printed by the repr
output.

>     If you ask for a global, it can be a fully-qualified name such as 
>     help("xml.dom").

Why are the string quotes needed?  When are they useful?

>     You can also use the facility from the command line:
> 
>     python --help if

Is this really useful?  Sounds like Perlism to me.

>     In either situation, the output does paging similar to the "more"
>     command. 

Agreed.  But how to implement paging in a platform-dependent manner?
On Unix, os.system("more") or "$PAGER" is likely to work.  On Windows,
I suppose we could use its MORE, although that's pretty braindead.  On
the Mac?  Also, inside IDLE or Pythonwin, invoking the system pager
isn't a good idea.
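One portable fallback that sidesteps the system pager entirely is to page in pure Python; a minimal sketch (the function name and page size are illustrative, not part of any proposal):

```python
import io
from contextlib import redirect_stdout

def page(text, pagesize=22):
    """Print text one screenful at a time, like a bare-bones 'more'."""
    lines = text.splitlines()
    for start in range(0, len(lines), pagesize):
        print("\n".join(lines[start:start + pagesize]))
        if start + pagesize < len(lines):          # more to come?
            if input("-- more --").strip().lower().startswith("q"):
                break

# Non-interactive check: two lines fit on one "page", so no prompt.
buf = io.StringIO()
with redirect_stdout(buf):
    page("line 1\nline 2", pagesize=5)
```

An embedding environment like IDLE or Pythonwin could then swap in its own page() that scrolls a text widget instead of prompting on stdin.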

> Implementation
> 
>     The help function is implemented in an onlinehelp module which is
>     demand-loaded.

What does "demand-loaded" mean in a Python context?

>     There should be options for fetching help information from
>     environments other than the command line through the onlinehelp
>     module:
> 
>         onlinehelp.gethelp(object_or_string) -> string

Good idea.

>     It should also be possible to override the help display function by
>     assigning to onlinehelp.displayhelp(object_or_string).

Good idea.  Pythonwin and IDLE could use this.  But I'd like it to
work at least "okay" if they don't.

>     The module should be able to extract module information from either 
>     the HTML or LaTeX versions of the Python documentation. Links should
>     be accommodated in a "lynx-like" manner. 

I think this is beyond the scope.  The LaTeX isn't installed anywhere
(and processing would be too much work).  The HTML is installed only
on Windows, where there already is a way to get it to pop up in your
browser (actually two: it's in the Start menu, and also in IDLE's Help
menu).

>     Over time, it should also be able to recognize when docstrings are 
>     in "special" syntaxes like structured text, HTML and LaTeX and
>     decode them appropriately.

A standard syntax for docstrings is under development, PEP 216.  I
don't agree with the proposal there, but in any case the help PEP
should not attempt to legalize a different format than PEP 216.

>     A prototype implementation is available with the Python source 
>     distribution as nondist/sandbox/doctools/onlinehelp.py.

Neat.  I noticed that in a 24-line screen, the pagesize must be set to
21 to avoid stuff scrolling off the screen.  Maybe there's an off-by-3
error somewhere?

I also noticed that it always prints '1' when invoked as a function.
The new license pager in site.py avoids this problem.

help("operators") and several others raise an
AttributeError('handledocrl').

The "lynx-line links" don't work.

> Built-in Topics
> 
>     help( "intro" )  - What is Python? Read this first!
>     help( "keywords" )  - What are the keywords?
>     help( "syntax" )  - What is the overall syntax?
>     help( "operators" )  - What operators are available?
>     help( "builtins" )  - What functions, types, etc. are built-in?
>     help( "modules" )  - What modules are in the standard library?
>     help( "copyright" )  - Who owns Python?
>     help( "moreinfo" )  - Where is there more information?
>     help( "changes" )  - What changed in Python 2.0?
>     help( "extensions" )  - What extensions are installed?
>     help( "faq" )  - What questions are frequently asked?
>     help( "ack" )  - Who has done work on Python lately?

I think it's naive to expect this help facility to replace browsing
the website or the full documentation package.  There should be one
entry that says to point your browser there (giving the local
filesystem URL if available), and that's it.  The rest of the online
help facility should be concerned with exposing doc strings.

> Security Issues
> 
>     This module will attempt to import modules with the same names as
>     requested topics. Don't use the modules if you are not confident
>     that everything in your pythonpath is from a trusted source.

Yikes!  Another reason to avoid the "string" -> global variable
option.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Dec 11 16:53:37 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 11:53:37 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 11:02:29 EST."
 <14900.64149.910989.998348@anthem.concentric.net>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net>
 <14900.64149.910989.998348@anthem.concentric.net>
Message-ID: <200012111653.LAA24545@cj20424-a.reston1.va.home.com>

> Some of my thoughts after reading the PEP and Paul/Guido's exchange.
> 
> - A function in the warn module is better than one in the sys module.
>   "from warnings import warn" is good enough to not warrant a
>   built-in.  I get the sense that the PEP description is behind
>   Guido's current implementation here.

Yes.  I've updated the PEP to match the (2nd) implementation.

> - When PyErr_Warn() returns 1, does that mean a warning has been
>   transmuted into an exception, or some other exception occurred
>   during the setting of the warning?  (I think I know, but the PEP
>   could be clearer here).

I've clarified this now: it returns 1 in either case.  You have to do
exception handling in either case.  I'm not telling why -- you don't
need to know.  The caller of PyErr_Warn() should not attempt to catch
the exception -- if that's your intent, you shouldn't be calling
PyErr_Warn().  And PyErr_Warn() is complicated enough that it has to
allow raising an exception.

> - It would be nice if lineno can be a range specification.  Other
>   matches are based on regexps -- think of this as a line number
>   regexp.

Too much complexity already.

> - Why not do setupwarnings() in site.py?

See the PEP and the current implementation.  The delayed-loading of
the warnings module means that we have to save the -W options as
sys.warnoptions.  (This also makes them work when multiple
interpreters are used -- they all get the -W options.)

> - Regexp matching on messages should be case insensitive.

Good point!  Done in my version of the code.

> - The second argument to sys.warn() or PyErr_Warn() can be any class,
>   right?

Almost.  It must be derived from __builtin__.Warning.

>   If so, it's easy for me to have my own warning classes.
>   What if I want to set up my own warnings filters?  Maybe if `action'
>   could be a callable as well as a string.  Then in my IDE, I could
>   set that to "mygui.popupWarningsDialog".

No, for that purpose you would override warnings.showwarning().
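The override Guido points at can be sketched in a few lines — replace warnings.showwarning to route warnings somewhere other than stderr. Here popup_warning is a hypothetical stand-in for something like Barry's mygui.popupWarningsDialog:

```python
import warnings

shown = []  # a GUI would open a dialog; this sketch just records

def popup_warning(message, category, filename, lineno,
                  file=None, line=None):
    # Same signature as the default warnings.showwarning().
    shown.append((category.__name__, str(message)))

warnings.simplefilter("always")        # make sure the warning fires
warnings.showwarning = popup_warning
warnings.warn("example warning", UserWarning)
```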

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Mon Dec 11 16:58:39 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 11 Dec 2000 17:58:39 +0100
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <14900.64149.910989.998348@anthem.concentric.net>; from barry@digicool.com on Mon, Dec 11, 2000 at 11:02:29AM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net>
Message-ID: <20001211175839.H4396@xs4all.nl>

On Mon, Dec 11, 2000 at 11:02:29AM -0500, Barry A. Warsaw wrote:

> - A function in the warn module is better than one in the sys module.
>   "from warnings import warn" is good enough to not warrant a
>   built-in.  I get the sense that the PEP description is behind
>   Guido's current implementation here.

+1 on this. I have a response to Guido's first posted PEP on my laptop, but
due to a weekend in Germany wasn't able to post it before he updated the
PEP. I guess I can delete the arguments for this, now ;) but let's just say I
think 'sys' is being a bit overused, and the case of a function in sys and
its data in another module is just plain silly.

> - When PyErr_Warn() returns 1, does that mean a warning has been
>   transmuted into an exception, or some other exception occurred
>   during the setting of the warning?  (I think I know, but the PEP
>   could be clearer here).

How about returning 1 for 'warning turned into exception' and -1 for 'normal
exception' ? It would be slightly more similar to other functions if '-1'
meant 'exception', and it would be easy to put in an if statement -- and
still allow C code to ignore the produced error, if it wanted to.

> - It would be nice if lineno can be a range specification.  Other
>   matches are based on regexps -- think of this as a line number
>   regexp.

+0 on this... I'm not sure if such fine-grained control is really necessary.
I liked the hint at 'per function' granularity, but I realise it's tricky to
do right, what with naming issues and all that. 

> - Regexp matching on messages should be case insensitive.

How about being able to pass in compiled regexp objects as well as strings ?
I haven't looked at the implementation at all, so I'm not sure how expensive
it would be, but it might also be nice to have users (= programmers) pass in
an object with its own 'match' method, so you can 'interactively' decide
whether or not to raise an exception, popup a window, and what not. Sort of
like letting 'action' be a callable, which I think is a good idea as well.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@python.org  Mon Dec 11 17:11:02 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 12:11:02 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 17:58:39 +0100."
 <20001211175839.H4396@xs4all.nl>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net>
 <20001211175839.H4396@xs4all.nl>
Message-ID: <200012111711.MAA24818@cj20424-a.reston1.va.home.com>

> > - When PyErr_Warn() returns 1, does that mean a warning has been
> >   transmuted into an exception, or some other exception occurred
> >   during the setting of the warning?  (I think I know, but the PEP
> >   could be clearer here).
> 
> How about returning 1 for 'warning turned into exception' and -1 for 'normal
> exception' ? It would be slightly more similar to other functions if '-1'
> meant 'exception', and it would be easy to put in an if statement -- and
> still allow C code to ignore the produced error, if it wanted to.

Why would you want this?  The user clearly said that they wanted the
exception!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@effbot.org  Mon Dec 11 17:13:10 2000
From: fredrik@effbot.org (Fredrik Lundh)
Date: Mon, 11 Dec 2000 18:13:10 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34C98E.7C42FD24@lemburg.com>
Message-ID: <009a01c06395$a9da3220$3c6340d5@hagrid>

> Hmm, I must have missed that one... care to repost ?

doesn't everyone here read the daily URL?

here's a link:
http://mail.python.org/pipermail/python-dev/2000-December/010913.html

</F>



From barry@digicool.com  Mon Dec 11 17:18:04 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 11 Dec 2000 12:18:04 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
 <200012082258.RAA02389@cj20424-a.reston1.va.home.com>
 <3A33C533.ABA27C7C@prescod.net>
 <14900.64149.910989.998348@anthem.concentric.net>
 <200012111653.LAA24545@cj20424-a.reston1.va.home.com>
Message-ID: <14901.3149.109401.151742@anthem.concentric.net>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> I've clarified this now: it returns 1 in either case.  You
    GvR> have to do exception handling in either case.  I'm not
    GvR> telling why -- you don't need to know.  The caller of
    GvR> PyErr_Warn() should not attempt to catch the exception -- if
    GvR> that's your intent, you shouldn't be calling PyErr_Warn().
    GvR> And PyErr_Warn() is complicated enough that it has to allow
    GvR> raising an exception.

Makes sense.

    >> - It would be nice if lineno can be a range specification.
    >> Other matches are based on regexps -- think of this as a line
    >> number regexp.

    GvR> Too much complexity already.

Okay, no biggie I think.

    >> - Why not do setupwarnings() in site.py?

    GvR> See the PEP and the current implementation.  The
    GvR> delayed-loading of the warnings module means that we have to
    GvR> save the -W options as sys.warnoptions.  (This also makes
    GvR> them work when multiple interpreters are used -- they all get
    GvR> the -W options.)

Cool.

    >> - Regexp matching on messages should be case insensitive.

    GvR> Good point!  Done in my version of the code.

Cool.

    >> - The second argument to sys.warn() or PyErr_Warn() can be any
    >> class, right?

    GvR> Almost.  It must be derived from __builtin__.Warning.

__builtin__.Warning == exceptions.Warning, right?

    >> If so, it's easy for me to have my own warning classes.  What
    >> if I want to set up my own warnings filters?  Maybe if `action'
    >> could be a callable as well as a string.  Then in my IDE, I
    >> could set that to "mygui.popupWarningsDialog".

    GvR> No, for that purpose you would override
    GvR> warnings.showwarning().

Cool.
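Overriding the display hook the way Guido describes might look like the sketch below; popupWarningsDialog is a stand-in for a real GUI call, and the captured list exists only to keep the sketch self-contained.

```python
import warnings

captured = []

def popupWarningsDialog(text):
    # stand-in for a real GUI dialog in an IDE
    captured.append(text)

def showwarning(message, category, filename, lineno, file=None, line=None):
    # route the formatted warning to the GUI instead of stderr
    popupWarningsDialog(warnings.formatwarning(message, category, filename, lineno))

warnings.showwarning = showwarning
warnings.warn("spam is deprecated")
```

After this, every warning that the filters let through goes to the dialog rather than to stderr.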

Looks good.
-Barry


From thomas@xs4all.net  Mon Dec 11 18:04:56 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 11 Dec 2000 19:04:56 +0100
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012111711.MAA24818@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 12:11:02PM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> <200012111711.MAA24818@cj20424-a.reston1.va.home.com>
Message-ID: <20001211190455.I4396@xs4all.nl>

On Mon, Dec 11, 2000 at 12:11:02PM -0500, Guido van Rossum wrote:

> > How about returning 1 for 'warning turned into exception' and -1 for 'normal
> > exception' ? It would be slightly more similar to other functions if '-1'
> > meant 'exception', and it would be easy to put in an if statement -- and
> > still allow C code to ignore the produced error, if it wanted to.

> Why would you want this?  The user clearly said that they wanted the
> exception!

The difference is that in one case, the user will see the original
warning-turned-exception, and in the other she won't -- the warning will be
lost. At best she'll see (by looking at the traceback) the code intended to
give a warning (that might or might not have been turned into an exception)
and failed. The warning code might decide to do something aditional to
notify the user of the thing it intended to warn about, which ended up as a
'real' exception because of something else.

It's no biggy, obviously, except that if you change your mind it will be
hard to add it without breaking code. Even if you explicitly state the
return value should be tested for boolean value, not greater-than-zero
value.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@python.org  Mon Dec 11 18:16:58 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 13:16:58 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 19:04:56 +0100."
 <20001211190455.I4396@xs4all.nl>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> <200012111711.MAA24818@cj20424-a.reston1.va.home.com>
 <20001211190455.I4396@xs4all.nl>
Message-ID: <200012111816.NAA25214@cj20424-a.reston1.va.home.com>

> > > How about returning 1 for 'warning turned into exception' and -1 for 'normal
> > > exception' ? It would be slightly more similar to other functions if '-1'
> > > meant 'exception', and it would be easy to put in an if statement -- and
> > > still allow C code to ignore the produced error, if it wanted to.
> 
> > Why would you want this?  The user clearly said that they wanted the
> > exception!
> 
> The difference is that in one case, the user will see the original
> warning-turned-exception, and in the other she won't -- the warning will be
> lost. At best she'll see (by looking at the traceback) the code intended to
> give a warning (that might or might not have been turned into an exception)
> and failed.

Yes -- this is a standard convention in Python.  If there's a bug in
code that is used to raise or handle an exception, you get a traceback
from that bug.

> The warning code might decide to do something additional to
> notify the user of the thing it intended to warn about, which ended up as a
> 'real' exception because of something else.

Nah.  The warning code shouldn't worry about that.  If there's a bug
in PyErr_Warn(), that should get top priority until it's fixed.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Mon Dec 11 18:12:56 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Dec 2000 19:12:56 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34C98E.7C42FD24@lemburg.com> <009a01c06395$a9da3220$3c6340d5@hagrid>
Message-ID: <3A351928.3A41C970@lemburg.com>

Fredrik Lundh wrote:
> 
> > Hmm, I must have missed that one... care to repost ?
> 
> doesn't everyone here read the daily URL?

No time for pull logic... only push logic ;-)

> here's a link:
> http://mail.python.org/pipermail/python-dev/2000-December/010913.html

Thanks.

A very nice introduction indeed. The only thing which
didn't come through in the first reading: why do we need
GF(p^n)'s in the first place ? The second reading then made this
clear: we need to assure that by iterating through the set of
possible coefficients we can actually reach all slots in the
dictionary... a gem indeed.

Now if we could only figure out an equally simple way of
producing perfect hash functions on-the-fly we could eliminate
the need for the PyObject_Compare()s... ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim.one@home.com  Mon Dec 11 20:22:55 2000
From: tim.one@home.com (Tim Peters)
Date: Mon, 11 Dec 2000 15:22:55 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFJIDAA.tim.one@home.com>

[/F, on Christian's GF tutorial]
> I think someone just won the "brain exploder 2000" award ;-)

Well, anyone can play.  When keys collide, what we need is a function f(i)
such that repeating
    i = f(i)
visits every int in (0, 2**N) exactly once before setting i back to its
initial value, for a fixed N and where the first i is in (0, 2**N).  This is
the quickest:

def f(i):
    i -= 1
    if i == 0:
        i = 2**N-1
    return i

Unfortunately, this leads to performance-destroying "primary collisions"
(see Knuth, or any other text w/ a section on hashing).

Other *good* possibilities include a pseudo-random number generator of
maximal period, or viewing the ints in (0, 2**N) as bit vectors indicating
set membership and generating all subsets of an N-element set in a Gray code
order.

The *form* of the function dictobject.c actually uses is:

def f(i):
    i <<= 1
    if i >= 2**N:
       i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

which is suitably non-linear and as fast as the naive method.  Given the
form of the function, you don't need any theory at all to find a value for
MAGIC_CONSTANT_DEPENDING_ON_N that simply works.  In fact, I verified all
the magic values in dictobject.c via brute force, because the person who
contributed the original code botched the theory slightly and gave us some
values that didn't work.  I'll rely on the theory if and only if we have to
extend this to 64-bit machines someday:  I'm too old to wait for a brute
search of a space with 2**64 elements <wink>.

mathematics-is-a-battle-against-mortality-ly y'rs  - tim
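The brute-force verification Tim mentions is short enough to sketch; N and MAGIC below are small assumed example values, not the constants dictobject.c actually uses.

```python
# Check, by brute force, that repeating i = f(i) visits every int in
# (0, 2**N) exactly once before returning to the starting value.
N = 3
MAGIC = 0b1011  # assumed example constant for N == 3

def f(i):
    i <<= 1
    if i >= 2 ** N:
        i ^= MAGIC
    return i

seen, i = set(), 1
while i not in seen:
    seen.add(i)
    i = f(i)
assert seen == set(range(1, 2 ** N))  # the full cycle over (0, 2**N)
```

A constant that fails this check would produce a shorter cycle, i.e. some slots the probe sequence can never reach.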



From greg@cosc.canterbury.ac.nz  Mon Dec 11 21:46:11 2000
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Dec 2000 10:46:11 +1300 (NZDT)
Subject: [Python-Dev] Online help PEP
In-Reply-To: <200012111557.KAA24266@cj20424-a.reston1.va.home.com>
Message-ID: <200012112146.KAA01771@s454.cosc.canterbury.ac.nz>

Guido:
> Paul Prescod:
> > In either situation, the output does paging similar to the "more"
> > command. 
> Agreed.

Only if it can be turned off! I usually prefer to use the
scrolling capabilities of whatever shell window I'm using
rather than having some program's own idea of how to do
paging forced upon me when I don't want it.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From moshez@zadka.site.co.il  Tue Dec 12 06:33:02 2000
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Tue, 12 Dec 2000 08:33:02 +0200 (IST)
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
Message-ID: <20001212063302.05E0BA82E@darjeeling.zadka.site.co.il>

On Mon, 11 Dec 2000 15:22:55 -0500, "Tim Peters" <tim.one@home.com> wrote:

> Well, anyone can play.  When keys collide, what we need is a function f(i)
> such that repeating
>     i = f(i)
> visits every int in (0, 2**N) exactly once before setting i back to its
> initial value, for a fixed N and where the first i is in (0, 2**N).  

OK, maybe this is me being *real* stupid, but why? Why not [0, 2**n)?
Did 0 harm you in your childhood, and you're trying to get back? <0 wink>.

If we had an affine operation, instead of a linear one, we could have 
[0, 2**n). I won't repeat the proof here but changing

> def f(i):
>     i <<= 1
      i^=1 # This is the line I added
>     if i >= 2**N:
>        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i

Makes you waltz all over [0, 2**n) if the original made you cover
(0, 2**n).

if-i'm-wrong-then-someone-should-shoot-me-to-save-me-the-embarrassment-ly y'rs,
Z.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From tim.one@home.com  Mon Dec 11 22:38:56 2000
From: tim.one@home.com (Tim Peters)
Date: Mon, 11 Dec 2000 17:38:56 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <20001212063302.05E0BA82E@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEGBIDAA.tim.one@home.com>

[Tim]
> Well, anyone can play.  When keys collide, what we need is a
> function f(i) such that repeating
>     i = f(i)
> visits every int in (0, 2**N) exactly once before setting i back to its
> initial value, for a fixed N and where the first i is in (0, 2**N).

[Moshe Zadka]
> OK, maybe this is me being *real* stupid, but why? Why not [0, 2**n)?
> Did 0 harm you in your childhood, and you're trying to get
> back? <0 wink>.

We don't need f at all unless we've already determined there's a collision
at some index h.  The i sequence is used to offset h (mod 2**N).  An
increment of 0 would waste time (h+0 == h, but we've already done a full
compare on the h'th table entry and already determined it wasn't equal to
what we're looking for).

IOW, there are only 2**N-1 slots still of interest by the time f is needed.

> If we had an affine operation, instead of a linear one, we could have
> [0, 2**n). I won't repeat the proof here but changing
>
> def f(i):
>     i <<= 1
>     i^=1 # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i
>
> Makes you waltz all over [0, 2**n) if the original made you cover
> (0, 2**n).

But, Moshe!  The proof would have been the most interesting part <wink>.
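One way to supply (or refute) the missing proof is brute force.  With the small assumed constants N = 3 and MAGIC = 0b1011 (example values, not dictobject.c's), whether the modified map covers all of [0, 2**n) turns out to depend on the constant:

```python
# Test Moshe's modified f empirically for one small constant.
N = 3
MAGIC = 0b1011  # assumed example constant

def f(i):
    i <<= 1
    i ^= 1          # the added line
    if i >= 2 ** N:
        i ^= MAGIC
    return i

seen, i = set(), 0
while i not in seen:
    seen.add(i)
    i = f(i)
# For this particular constant the cycle starting at 0 does not cover
# all of [0, 2**N): 6 maps to itself under f.
assert f(6) == 6
assert len(seen) == 2 ** N - 1
```

So the claim at least needs a per-constant check, which is exactly the kind of brute-force validation described earlier in the thread.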



From gstein@lyra.org  Tue Dec 12 00:15:50 2000
From: gstein@lyra.org (Greg Stein)
Date: Mon, 11 Dec 2000 16:15:50 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012111653.LAA24545@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 11:53:37AM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com>
Message-ID: <20001211161550.Y7732@lyra.org>

On Mon, Dec 11, 2000 at 11:53:37AM -0500, Guido van Rossum wrote:
>...
> > - The second argument to sys.warn() or PyErr_Warn() can be any class,
> >   right?
> 
> Almost.  It must be derived from __builtin__.Warning.

Since you must do "from warnings import warn" before using the warnings,
then I think it makes sense to put the Warning classes into the warnings
module. (e.g. why increase the size of the builtins?)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From guido@python.org  Tue Dec 12 00:39:31 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 19:39:31 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 16:15:50 PST."
 <20001211161550.Y7732@lyra.org>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com>
 <20001211161550.Y7732@lyra.org>
Message-ID: <200012120039.TAA02983@cj20424-a.reston1.va.home.com>

> Since you must do "from warnings import warn" before using the warnings,
> then I think it makes sense to put the Warning classes into the warnings
> module. (e.g. why increase the size of the builtins?)

I don't particularly care whether the Warning category classes are
builtins, but I can't declare them in the warnings module.  Typical
use from C is:

    if (PyErr_Warn(PyExc_DeprecationWarning,
		   "the strop module is deprecated"))
            return NULL;

PyErr_Warn() imports the warnings module on its first call.  But the
value of PyExc_DeprecationWarning c.s. must be available *before* the
first call, so they can't be imported from the warnings module!

My first version imported warnings at the start of the program, but
this almost doubled the start-up time, hence the design where the
module is imported only when needed.

The most convenient place to create the Warning category classes is in
the _exceptions module; doing it the easiest way there means that they
are automatically exported to __builtin__.  This doesn't bother me
enough to try and hide them.

--Guido van Rossum (home page: http://www.python.org/~guido/)
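The delayed-loading pattern Guido describes for PyErr_Warn() can be sketched in pure Python: the warnings machinery is imported only on the first warning call, so normal start-up pays nothing.

```python
_warnings = None  # module cache, filled on first use

def warn(message, category=UserWarning):
    # import the warnings module lazily, as PyErr_Warn() does in C
    global _warnings
    if _warnings is None:
        import warnings
        _warnings = warnings
    _warnings.warn(message, category, stacklevel=2)
```

Note that the category classes still have to exist before the first call, which is the constraint that keeps them out of the warnings module itself.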


From gstein@lyra.org  Tue Dec 12 01:11:02 2000
From: gstein@lyra.org (Greg Stein)
Date: Mon, 11 Dec 2000 17:11:02 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012120039.TAA02983@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 07:39:31PM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com>
Message-ID: <20001211171102.C7732@lyra.org>

On Mon, Dec 11, 2000 at 07:39:31PM -0500, Guido van Rossum wrote:
> > Since you must do "from warnings import warn" before using the warnings,
> > then I think it makes sense to put the Warning classes into the warnings
> > module. (e.g. why increase the size of the builtins?)
> 
> I don't particularly care whether the Warning category classes are
> builtins, but I can't declare them in the warnings module.  Typical
> use from C is:
> 
>     if (PyErr_Warn(PyExc_DeprecationWarning,
> 		   "the strop module is deprecated"))
>             return NULL;
> 
> PyErr_Warn() imports the warnings module on its first call.  But the
> value of PyExc_DeprecationWarning c.s. must be available *before* the
> first call, so they can't be imported from the warnings module!

Do the following:

pywarn.h or pyerrors.h:

#define PyWARN_DEPRECATION "DeprecationWarning"

     ...
     if (PyErr_Warn(PyWARN_DEPRECATION,
 		   "the strop module is deprecated"))
             return NULL;

The PyErr_Warn would then use the string to dynamically look up / bind to
the correct value from the warnings module. By using the symbolic constant,
you will catch typos in the C code (e.g. if people passed raw strings, then
a typo won't be found until runtime; using symbols will catch the problem at
compile time).

The above strategy will allow for fully-delayed loading, and for all the
warnings to be located in the "warnings" module.
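In Python terms, the string-keyed scheme might look like this sketch; the _categories dict stands in for the warnings module's namespace, and PYWARN_DEPRECATION mirrors the proposed C constant.

```python
PYWARN_DEPRECATION = "DeprecationWarning"

# stand-in for the warnings module's namespace holding the classes
_categories = {"DeprecationWarning": DeprecationWarning}

def warn_by_name(name, message):
    # a typo in name fails fast with KeyError, mirroring the
    # compile-time check the symbolic constant gives C code
    category = _categories[name]
    import warnings  # still loaded only on first use
    warnings.warn(message, category, stacklevel=2)
```

The lookup is late-bound, so the categories never need to exist before the first warning call.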

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From guido@python.org  Tue Dec 12 01:21:41 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 20:21:41 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 17:11:02 PST."
 <20001211171102.C7732@lyra.org>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com>
 <20001211171102.C7732@lyra.org>
Message-ID: <200012120121.UAA04576@cj20424-a.reston1.va.home.com>

> > PyErr_Warn() imports the warnings module on its first call.  But the
> > value of PyExc_DeprecationWarning c.s. must be available *before* the
> > first call, so they can't be imported from the warnings module!
> 
> Do the following:
> 
> pywarn.h or pyerrors.h:
> 
> #define PyWARN_DEPRECATION "DeprecationWarning"
> 
>      ...
>      if (PyErr_Warn(PyWARN_DEPRECATION,
>  		   "the strop module is deprecated"))
>              return NULL;
> 
> The PyErr_Warn would then use the string to dynamically look up / bind to
> the correct value from the warnings module. By using the symbolic constant,
> you will catch typos in the C code (e.g. if people passed raw strings, then
> a typo won't be found until runtime; using symbols will catch the problem at
> compile time).
> 
> The above strategy will allow for fully-delayed loading, and for all the
> warnings to be located in the "warnings" module.

Yeah, that would be a possibility, if it was deemed evil that the
warnings appear in __builtin__.  I don't see what's so evil about
that.

(There's also the problem that the C code must be able to create new
warning categories, as long as they are derived from the Warning base
class.  Your approach above doesn't support this.  I'm sure you can
figure a way around that too.  But I prefer to hear why you think it's
needed first.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein@lyra.org  Tue Dec 12 01:26:00 2000
From: gstein@lyra.org (Greg Stein)
Date: Mon, 11 Dec 2000 17:26:00 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012120121.UAA04576@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 08:21:41PM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com> <20001211171102.C7732@lyra.org> <200012120121.UAA04576@cj20424-a.reston1.va.home.com>
Message-ID: <20001211172600.E7732@lyra.org>

On Mon, Dec 11, 2000 at 08:21:41PM -0500, Guido van Rossum wrote:
>...
> > The above strategy will allow for fully-delayed loading, and for all the
> > warnings to be located in the "warnings" module.
> 
> Yeah, that would be a possibility, if it was deemed evil that the
> warnings appear in __builtin__.  I don't see what's so evil about
> that.
> 
> (There's also the problem that the C code must be able to create new
> warning categories, as long as they are derived from the Warning base
> class.  Your approach above doesn't support this.  I'm sure you can
> figure a way around that too.  But I prefer to hear why you think it's
> needed first.)

I'm just attempting to avoid dumping more names into __builtins__ is all. I
don't believe there is anything intrinsically bad about putting more names
in there, but avoiding the kitchen-sink metaphor for __builtins__ has got to
be a Good Thing :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From guido@python.org  Tue Dec 12 13:43:59 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 12 Dec 2000 08:43:59 -0500
Subject: [Python-Dev] Request review of gdbm patch
Message-ID: <200012121343.IAA06713@cj20424-a.reston1.va.home.com>

I'm asking for a review of the patch to gdbm at 

http://sourceforge.net/patch/?func=detailpatch&patch_id=102638&group_id=5470

I asked the author for clarification and this is what I got.

Can anybody suggest what to do?  His mail doesn't give me much
confidence in the patch. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Tue, 12 Dec 2000 13:24:13 +0100
From:    Damjan <arhiv@freemail.org.mk>
To:      Guido van Rossum <guido@python.org>
Subject: Re: your gdbm patch for Python

On Mon, Dec 11, 2000 at 03:51:03PM -0500, Guido van Rossum wrote:
> I'm looking at your patch at SourceForge:

First, I'm sorry it was such a mess of a patch, but I couldn't figure out how
to send a more elaborate comment. (But then again, I wouldn't have an email from
Guido van Rossum in my mail-box to show off to my friends :)

> and wondering two things:
> 
> (1) what does the patch do?
> 
> (2) why does the patch remove the 'f' / GDBM_FAST option?

 From the gdbm info page:
     ...The following may also be
     logically or'd into the database flags: GDBM_SYNC, which causes
     all database operations to be synchronized to the disk, and
     GDBM_NOLOCK, which prevents the library from performing any
     locking on the database file.  The option GDBM_FAST is now
     obsolete, since `gdbm' defaults to no-sync mode...
     ^^^^^^^^
(1) My patch adds two options to the gdbm.open(..) function. These are 'u' for
GDBM_NOLOCK, and 's' for GDBM_SYNC.

(2) GDBM_FAST is obsolete because gdbm defaults to GDBM_FAST, so it's removed.

I'm also thinking about adding lock and unlock methods to the gdbm object,
but it seems that a gdbm database can only be locked and not unlocked.


- -- 
Damjan Georgievski		|           Дамјан Георгиевски
Skopje, Macedonia		|           Скопје, Македонија

------- End of Forwarded Message



From mal@lemburg.com  Tue Dec 12 13:49:40 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Dec 2000 14:49:40 +0100
Subject: [Python-Dev] Codec aliasing and naming
Message-ID: <3A362CF4.2082A606@lemburg.com>

I just wanted to inform you of a change I plan for the standard
encodings search function to enable better support for aliasing
of encoding names.

The current implementation caches the aliases returned from the
codecs .getaliases() function in the encodings lookup cache
rather than in the alias cache. As a consequence, the hyphen to
underscore mapping is not applied to the aliases. A codec would
have to return a list of all combinations of names with hyphens
and underscores in order to emulate the standard lookup 
behaviour.

I have a patch which fixes this and also assures that aliases
cannot be overwritten by codecs which register at some later
point in time. This assures that we won't run into situations
where a codec import suddenly overrides behaviour of previously
active codecs.

I would also like to propose the use of a new naming scheme
for codecs which enables drop-in installation. As discussed
on the i18n-sig list, people would like to install codecs
without requiring users to call a codec registration function
or to touch site.py.

The standard search function in the encodings package has a
nice property (which I only noticed after the fact ;) which
allows using Python package names in the encoding names,
e.g. you can install a package 'japanese' and then access the
codecs in that package using 'japanese.shiftjis' without
having to bother registering a new codec search function
for the package -- the encodings package search function
will redirect the lookup to the 'japanese' package.

Using package names in the encoding name has several
advantages:
* you know where the codec comes from
* you can have multiple codecs for the same encoding
* drop-in installation without registration is possible
* the need for a non-default encoding package is visible in the
  source code
* you no longer need to drop new codecs into the Python
  standard lib

Perhaps someone could add a note about this possibility
to the codec docs ?!

If no one objects, I'll apply the patch for the enhanced alias
support later today.

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
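A codec search function of the kind the encodings package uses can be sketched as follows; the codec here is a toy that just aliases a made-up name onto the built-in ASCII codec, to keep the example self-contained.

```python
import codecs

def search(name):
    # a real search function would map package-qualified names like
    # "japanese.shiftjis" onto codec modules; this toy handles one
    # made-up name by reusing the ASCII codec's functions
    if name == "myencoding":
        info = codecs.lookup("ascii")
        return codecs.CodecInfo(info.encode, info.decode, name="myencoding")
    return None

codecs.register(search)
```

After registration, `"abc".encode("myencoding")` round-trips through the ASCII codec; the encodings package's own search function additionally applies the hyphen-to-underscore alias mapping discussed above before names reach codec lookups.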


From guido@python.org  Tue Dec 12 13:57:01 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 12 Dec 2000 08:57:01 -0500
Subject: [Python-Dev] Codec aliasing and naming
In-Reply-To: Your message of "Tue, 12 Dec 2000 14:49:40 +0100."
 <3A362CF4.2082A606@lemburg.com>
References: <3A362CF4.2082A606@lemburg.com>
Message-ID: <200012121357.IAA06846@cj20424-a.reston1.va.home.com>

> Perhaps someone could add a note about this possibility
> to the codec docs ?!

You can check it in yourself or mail it to Fred or submit it to SF...
I don't expect anyone else will jump in and document this properly.

> If no one objects, I'll apply the patch for the enhanced alias
> support later today.

Fine with me (but I don't use codecs -- where's the Dutch language
support? :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Tue Dec 12 14:38:20 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Dec 2000 15:38:20 +0100
Subject: [Python-Dev] Codec aliasing and naming
References: <3A362CF4.2082A606@lemburg.com> <200012121357.IAA06846@cj20424-a.reston1.va.home.com>
Message-ID: <3A36385C.60C7F2B@lemburg.com>

Guido van Rossum wrote:
> 
> > Perhaps someone could add a note about this possibility
> > to the codec docs ?!
> 
> You can check it in yourself or mail it to Fred or submit it to SF...
> I don't expect anyone else will jump in and document this properly.

I'll submit a bug report so that this doesn't get lost in
the archives. Don't have time for it myself... alas, no one
really seems to have time these days ;-)
 
> > If no one objects, I'll apply the patch for the enhanced alias
> > support later today.
> 
> Fine with me (but I don't use codecs -- where's the Dutch language
> support? :-).

OK. 

About the Dutch language support: this would make a nice
Christmas fun-project... a new standard module which interfaces
to babel.altavista.com (hmm, they don't list Dutch as a supported
language yet, but maybe if we bug them enough... ;).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From paulp@ActiveState.com  Tue Dec 12 18:11:13 2000
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 12 Dec 2000 10:11:13 -0800
Subject: [Python-Dev] Online help PEP
References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com>
Message-ID: <3A366A41.1A14EFD4@ActiveState.com>

Guido van Rossum wrote:
> 
>...
> >         help( "string" ) -- built-in topic or global
> 
> Why does a global require string quotes?

It doesn't, but if you happen to say 

help( "dir" ) instead of help( dir ), I think it should do the right
thing.

> I'm missing
> 
>           help() -- table of contents
> 
> I'm not sure if the table of contents should be printed by the repr
> output.

I don't see any benefit in having different behaviors for help and
help().

> >     If you ask for a global, it can be a fully-qualified name such as
> >     help("xml.dom").
> 
> Why are the string quotes needed?  When are they useful?

When you haven't imported the thing you are asking about. Or when the
string comes from another UI like an editor window, command line or web
form.
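Resolving such a dotted-name string without importing anything ahead of time might look like the sketch below (importlib is the modern spelling; the names are just examples):

```python
import importlib

def resolve(dotted):
    # import the leading module, then walk attribute by attribute, so
    # a string like "os.path.dirname" finds the actual function
    parts = dotted.split(".")
    obj = importlib.import_module(parts[0])
    for attr in parts[1:]:
        try:
            obj = getattr(obj, attr)
        except AttributeError:
            # a submodule not yet imported, e.g. "xml.dom"
            obj = importlib.import_module(obj.__name__ + "." + attr)
    return obj
```

This is what lets a help facility accept input from editors, command lines, or web forms where only the string is available.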

> >     You can also use the facility from a command-line
> >
> >     python --help if
> 
> Is this really useful?  Sounds like Perlism to me.

I'm just trying to make it easy to quickly get answers to Python
questions. I could totally see someone writing code in VIM switching to
a bash window to type:

python --help os.path.dirname

That's a lot easier than:

$ python
Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import os
>>> help(os.path.dirname)

And what does it hurt?

> >     In either situation, the output does paging similar to the "more"
> >     command.
> 
> Agreed.  But how to implement paging in a platform-dependent manner?
> On Unix, os.system("more") or "$PAGER" is likely to work.  On Windows,
> I suppose we could use its MORE, although that's pretty braindead.  On
> the Mac?  Also, inside IDLE or Pythonwin, invoking the system pager
> isn't a good idea.

The current implementation does paging internally. You could override it
to use the system pager (or no pager).
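An internal pager of that sort can be tiny; in this sketch the write and prompt parameters exist so the behaviour can be overridden (or the paging turned off), and the default page size is just an example value — real code would query the terminal height.

```python
def page(text, pagesize=21, write=print, prompt=input):
    # emit pagesize lines, then wait for the user before continuing
    lines = text.splitlines()
    for start in range(0, len(lines), pagesize):
        for line in lines[start:start + pagesize]:
            write(line)
        if start + pagesize < len(lines):
            prompt("-- more --")
```

Passing `prompt=lambda _: None` disables the pauses, which addresses the "only if it can be turned off!" objection earlier in the thread.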

> What does "demand-loaded" mean in a Python context?

When you "touch" the help object, it loads the onlinehelp module which
has the real implementation. The thing in __builtins__ is just a
lightweight proxy.
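The "lightweight proxy" idea can be sketched generically: attribute access triggers the real import on first touch. The onlinehelp module itself isn't assumed here; math below is just a demonstration module.

```python
import importlib

class LazyModuleProxy:
    # the real module is imported only the first time an attribute
    # that isn't on the proxy itself is looked up
    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)
```

Putting such a proxy into __builtins__ costs essentially nothing at start-up, which is the point of the demand-loading design.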

> >     It should also be possible to override the help display function by
> >     assigning to onlinehelp.displayhelp(object_or_string).
> 
> Good idea.  Pythonwin and IDLE could use this.  But I'd like it to
> work at least "okay" if they don't.

Agreed. 

> >     The module should be able to extract module information from either
> >     the HTML or LaTeX versions of the Python documentation. Links should
> >     be accommodated in a "lynx-like" manner.
> 
> I think this is beyond the scope.  

Well, we have to do one of:

 * re-write a subset of the docs in a form that can be accessed from the
command line
 * access the existing docs in a form that's installed
 * auto-convert the docs into a form that's compatible

I've already implemented HTML parsing and LaTeX parsing is actually not
that far off. I just need impetus to finish a LaTeX-parsing project I
started on my last vacation.

The reason that LaTeX is interesting is because it would be nice to be
able to move documentation from existing LaTeX files into docstrings.

> The LaTeX isn't installed anywhere
> (and processing would be too much work).  
> The HTML is installed only
> on Windows, where there already is a way to get it to pop up in your
> browser (actually two: it's in the Start menu, and also in IDLE's Help
> menu).

If the documentation becomes an integral part of the Python code, then
it will be installed. It's ridiculous that it isn't already.
ActivePython does install the docs on all platforms.

> A standard syntax for docstrings is under development, PEP 216.  I
> don't agree with the proposal there, but in any case the help PEP
> should not attempt to legalize a different format than PEP 216.

I won't hold my breath for a standard Python docstring format. I've gone
out of my way to make the code format-independent.

> Neat.  I noticed that in a 24-line screen, the pagesize must be set to
> 21 to avoid stuff scrolling off the screen.  Maybe there's an off-by-3
> error somewhere?

Yes.

> I also noticed that it always prints '1' when invoked as a function.
> The new license pager in site.py avoids this problem.

Okay.

> help("operators") and several others raise an
> AttributeError('handledocrl').

Fixed.

> The "lynx-line links" don't work.

I don't think that's implemented yet.

> I think it's naive to expect this help facility to replace browsing
> the website or the full documentation package.  There should be one
> entry that says to point your browser there (giving the local
> filesystem URL if available), and that's it.  The rest of the online
> help facility should be concerned with exposing doc strings.

I don't want to replace the documentation. But there is no reason we
should set out to make it incomplete. If it's integrated with the HTML
then people can choose whatever access mechanism is easiest for them
right now.

I'm trying hard not to be "naive". Realistically, nobody is going to
write a million docstrings between now and Python 2.1. It is much more
feasible to leverage the existing documentation that Fred and others
have spent months on.

> > Security Issues
> > 
> >     This module will attempt to import modules with the same names as
> >     requested topics. Don't use the modules if you are not confident
> >     that everything in your pythonpath is from a trusted source.
> Yikes!  Another reason to avoid the "string" -> global variable
> option.

I don't think we should lose that option. People will want to look up
information from non-executable environments like command lines, GUIs
and web pages. Perhaps you can point me to techniques for extracting
information from Python modules and packages without executing them.

 Paul


From guido@python.org  Tue Dec 12 20:46:09 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 12 Dec 2000 15:46:09 -0500
Subject: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE
Message-ID: <200012122046.PAA16915@cj20424-a.reston1.va.home.com>

------- Forwarded Message

Date:    Tue, 12 Dec 2000 12:38:20 -0800
From:    noreply@sourceforge.net
To:      noreply@sourceforge.net
Subject: SourceForge: PROJECT DOWNTIME NOTICE

ATTENTION SOURCEFORGE PROJECT ADMINISTRATORS

This update is being sent to project administrators only and contains
important information regarding your project. Please read it in its
entirety.


INFRASTRUCTURE UPGRADE, EXPANSION AND RELOCATION

As noted in the sitewide email sent this week, the SourceForge.net
infrastructure is being upgraded (and relocated). As part of this
project, plans are underway to further increase capacity and
responsiveness.

We are scheduling the relocation of the systems serving project
subdomain web pages.


IMPORTANT:

This move will affect you in the following ways:

1. Service and availability of SourceForge.net and the development
tools provided will continue uninterrupted.

2. Project page webservers hosting subdomains
(yourprojectname.sourceforge.net) will be down Friday December 15 from
9PM PST (12AM EST) until 3AM PST.

3. CVS will be unavailable (read only part of the time) from 7PM
until 3AM PST

4. Mailing lists and mail aliases will be unavailable until 3AM PST


- ---------------------
This email was sent from sourceforge.net. To change your email receipt
preferences, please visit the site and edit your account via the
"Account Maintenance" link.

Direct any questions to admin@sourceforge.net, or reply to this email.

------- End of Forwarded Message



From greg@cosc.canterbury.ac.nz  Tue Dec 12 22:42:01 2000
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 13 Dec 2000 11:42:01 +1300 (NZDT)
Subject: [Python-Dev] Online help PEP
In-Reply-To: <3A366A41.1A14EFD4@ActiveState.com>
Message-ID: <200012122242.LAA01902@s454.cosc.canterbury.ac.nz>

Paul Prescod:
> Guido:
> > Why are the string quotes needed?  When are they useful?
> When you haven't imported the thing you are asking about.

It would be interesting if the quoted form allowed you to
extract doc info from a module *without* having the side
effect of importing it.

This could no doubt be done for pure Python modules.
Would be rather tricky for extension modules, though,
I expect.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From barry@digicool.com  Wed Dec 13 02:21:36 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 12 Dec 2000 21:21:36 -0500
Subject: [Python-Dev] Two new PEPs, 232 & 233
Message-ID: <14902.56624.20961.768525@anthem.concentric.net>

I've just uploaded two new PEPs.  232 is a revision of my pre-PEP era
function attribute proposal.  233 is Paul Prescod's proposal for an
on-line help facility.

http://python.sourceforge.net/peps/pep-0232.html
http://python.sourceforge.net/peps/pep-0233.html

Let the games begin,
-Barry


From tim.one@home.com  Wed Dec 13 03:34:35 2000
From: tim.one@home.com (Tim Peters)
Date: Tue, 12 Dec 2000 22:34:35 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEGBIDAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEJHIDAA.tim.one@home.com>

[Moshe Zadka]
> If we had an affine operation, instead of a linear one, we could have
> [0, 2**n). I won't repeat the proof here but changing
>
> def f(i):
>     i <<= 1
>     i^=1 # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i
>
> Makes you waltz all over [0, 2**n) if the original made you cover
> (0, 2**n).

[Tim]
> But, Moshe!  The proof would have been the most interesting part <wink>.

Turns out the proof would have been intensely interesting, as you can see by
running the attached with and without the new line commented out.

don't-ever-trust-a-theoretician<wink>-ly y'rs  - tim


N = 2
MAGIC_CONSTANT_DEPENDING_ON_N = 7

def f(i):
    i <<= 1
    # i^=1 # This is the line I added
    if i >= 2**N:
        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

i = 1
for nothing in range(4):
    print i,
    i = f(i)
print i
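
The punch line can be made explicit with a small cycle check (a sketch in
modern Python; the orbit() helper is mine, not from the thread): with N = 2,
the "affine" variant still visits only three of the four values in [0, 2**N).

```python
# Cycle check for Moshe's f(), with and without the extra "i ^= 1".
# N and MAGIC follow Tim's example; the orbit() helper is illustrative.
N = 2
MAGIC = 7  # MAGIC_CONSTANT_DEPENDING_ON_N for N == 2

def f(i, affine):
    i <<= 1
    if affine:
        i ^= 1          # the line Moshe added
    if i >= 2 ** N:
        i ^= MAGIC
    return i

def orbit(start, affine):
    """States visited from `start` before the first repeat."""
    seen, i = [], start
    while i not in seen:
        seen.append(i)
        i = f(i, affine)
    return seen

print(orbit(1, affine=False))   # [1, 2, 3] -- misses 0, as claimed
print(orbit(1, affine=True))    # [1, 3, 0] -- misses 2: still not all of [0, 4)
```

(With the xor, 2 even becomes a fixed point of its own, so no starting value
reaches the full range.)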



From akuchlin@mems-exchange.org  Wed Dec 13 03:55:33 2000
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Tue, 12 Dec 2000 22:55:33 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
Message-ID: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>

At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
received a patch from Thomas Gellekum that adds support for the panel
library that will add another 500 lines.  I'd like to split the C file
into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
get #included from the master _cursesmodule.c file.  

Do the powers that be approve of this idea?

--amk


From tim.one@home.com  Wed Dec 13 03:54:20 2000
From: tim.one@home.com (Tim Peters)
Date: Tue, 12 Dec 2000 22:54:20 -0500
Subject: FW: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE
Message-ID: <LNBBLJKPBEHFEDALKOLCKEJKIDAA.tim.one@home.com>

FYI, looks like SourceForge is scheduled to be unusable in a span covering
late Friday thru early Saturday (OTT -- One True Time, defined by the clocks
in Guido's house).

-----Original Message-----
From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On
Behalf Of Guido van Rossum
Sent: Tuesday, December 12, 2000 3:46 PM
To: python-dev@python.org
Subject: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE



------- Forwarded Message

Date:    Tue, 12 Dec 2000 12:38:20 -0800
From:    noreply@sourceforge.net
To:      noreply@sourceforge.net
Subject: SourceForge: PROJECT DOWNTIME NOTICE

ATTENTION SOURCEFORGE PROJECT ADMINISTRATORS

This update is being sent to project administrators only and contains
important information regarding your project. Please read it in its
entirety.


INFRASTRUCTURE UPGRADE, EXPANSION AND RELOCATION

As noted in the sitewide email sent this week, the SourceForge.net
infrastructure is being upgraded (and relocated). As part of this
projects, plans are underway to further increase capacity and
responsiveness.

We are scheduling the relocation of the systems serving project
subdomain web pages.


IMPORTANT:

This move will affect you in the following ways:

1. Service and availability of SourceForge.net and the development
tools provided will continue uninterrupted.

2. Project page webservers hosting subdomains
(yourprojectname.sourceforge.net) will be down Friday December 15 from
9PM PST (12AM EST) until 3AM PST.

3. CVS will be unavailable (read only part of the time) from 7PM
until 3AM PST

4. Mailing lists and mail aliases will be unavailable until 3AM PST


---------------------
This email was sent from sourceforge.net. To change your email receipt
preferences, please visit the site and edit your account via the
"Account Maintenance" link.

Direct any questions to admin@sourceforge.net, or reply to this email.

------- End of Forwarded Message


_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://www.python.org/mailman/listinfo/python-dev



From esr@thyrsus.com  Wed Dec 13 04:29:17 2000
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 12 Dec 2000 23:29:17 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>; from amk@mira.erols.com on Tue, Dec 12, 2000 at 10:55:33PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
Message-ID: <20001212232917.A22839@thyrsus.com>

A.M. Kuchling <amk@mira.erols.com>:
> At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
> received a patch from Thomas Gellekum that adds support for the panel
> library that will add another 500 lines.  I'd like to split the C file
> into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
> get #included from the master _cursesmodule.c file.  
> 
> Do the powers that be approve of this idea?

I doubt I qualify as a power that be, but I'm certainly +1 on panel support.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"The biggest hypocrites on gun control are those who live in upscale
developments with armed security guards -- and who want to keep other
people from having guns to defend themselves.  But what about
lower-income people living in high-crime, inner city neighborhoods?
Should such people be kept unarmed and helpless, so that limousine
liberals can 'make a statement' by adding to the thousands of gun laws
already on the books?"
	--Thomas Sowell


From fdrake@acm.org  Wed Dec 13 06:24:01 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 01:24:01 -0500 (EST)
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
Message-ID: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>

A.M. Kuchling writes:
 > At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
 > received a patch from Thomas Gellekum that adds support for the panel
 > library that will add another 500 lines.  I'd like to split the C file
 > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
 > get #included from the master _cursesmodule.c file.  

  Would it be reasonable to add panel support as a second extension
module?  Is there really a need for them to be in the same module,
since the panel library is a separate library?


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From gstein@lyra.org  Wed Dec 13 07:58:38 2000
From: gstein@lyra.org (Greg Stein)
Date: Tue, 12 Dec 2000 23:58:38 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>; from amk@mira.erols.com on Tue, Dec 12, 2000 at 10:55:33PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
Message-ID: <20001212235838.T8951@lyra.org>

On Tue, Dec 12, 2000 at 10:55:33PM -0500, A.M. Kuchling wrote:
> At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
> received a patch from Thomas Gellekum that adds support for the panel
> library that will add another 500 lines.  I'd like to split the C file
> into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
> get #included from the master _cursesmodule.c file.  

Why should they be #included? I thought that we can build multiple .c files
into a module...
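
For reference, the multi-file build Greg alludes to looks like this as a
modern setuptools sketch (file names are illustrative, not the actual split):

```python
# One extension module built from several C source files,
# no #include tricks needed.  File names are hypothetical.
from setuptools import Extension

ext = Extension(
    "_curses",
    sources=[
        "_cursesmodule.c",      # master file with the module init
        "_curses_window.c",     # window object implementation
        "_curses_panel.c",      # panel wrappers
    ],
)
print(ext.name, len(ext.sources))
```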

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Wed Dec 13 08:05:05 2000
From: gstein@lyra.org (Greg Stein)
Date: Wed, 13 Dec 2000 00:05:05 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects dictobject.c,2.68,2.69
In-Reply-To: <200012130102.RAA31828@slayer.i.sourceforge.net>; from tim_one@users.sourceforge.net on Tue, Dec 12, 2000 at 05:02:49PM -0800
References: <200012130102.RAA31828@slayer.i.sourceforge.net>
Message-ID: <20001213000505.U8951@lyra.org>

On Tue, Dec 12, 2000 at 05:02:49PM -0800, Tim Peters wrote:
> Update of /cvsroot/python/python/dist/src/Objects
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv31776/python/dist/src/objects
> 
> Modified Files:
> 	dictobject.c 
> Log Message:
> Bring comments up to date (e.g., they still said the table had to be
> a prime size, which is in fact never true anymore ...).
>...
> --- 55,78 ----
>   
>   /*
> ! There are three kinds of slots in the table:
> ! 
> ! 1. Unused.  me_key == me_value == NULL
> !    Does not hold an active (key, value) pair now and never did.  Unused can
> !    transition to Active upon key insertion.  This is the only case in which
> !    me_key is NULL, and is each slot's initial state.
> ! 
> ! 2. Active.  me_key != NULL and me_key != dummy and me_value != NULL
> !    Holds an active (key, value) pair.  Active can transition to Dummy upon
> !    key deletion.  This is the only case in which me_value != NULL.
> ! 
> ! 3. Dummy.  me_key == dummy && me_value == NULL
> !    Previously held an active (key, value) pair, but that was deleted and an
> !    active pair has not yet overwritten the slot.  Dummy can transition to
> !    Active upon key insertion.  Dummy slots cannot be made Unused again
> !    (cannot have me_key set to NULL), else the probe sequence in case of
> !    collision would have no way to know they were once active.

4. The popitem finger.


:-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From moshez@zadka.site.co.il  Wed Dec 13 19:19:53 2000
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Wed, 13 Dec 2000 21:19:53 +0200 (IST)
Subject: [Python-Dev] Splitting up _cursesmodule
Message-ID: <20001213191953.7208DA82E@darjeeling.zadka.site.co.il>

On Tue, 12 Dec 2000 23:29:17 -0500, "Eric S. Raymond" <esr@thyrsus.com> wrote:
> A.M. Kuchling <amk@mira.erols.com>:
> > At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
> > received a patch from Thomas Gellekum that adds support for the panel
> > library that will add another 500 lines.  I'd like to split the C file
> > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
> > get #included from the master _cursesmodule.c file.  
> > 
> > Do the powers that be approve of this idea?
> 
> I doubt I qualify as a power that be, but I'm certainly +1 on panel support.
 
I'm +1 on panel support, but that seems the wrong solution. Why not
have several C modules (_curses_panel,...) and manage a more unified
namespace with the Python wrapper modules?

/curses/panel.py -- from _curses_panel import *
etc.
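
Spelled out, the layout Moshe is suggesting would be something like the
following sketch (the C modules are simulated with plain namespaces here,
and all names are illustrative):

```python
# Hypothetical package layout for Moshe's suggestion:
#
#   curses/__init__.py   -> from _curses import *
#   curses/panel.py      -> from _curses_panel import *
#
# Thin pure-Python wrappers let each C module stay small while the
# user sees one unified namespace.  Simulated with SimpleNamespace
# objects standing in for the real C modules:
from types import SimpleNamespace

_curses = SimpleNamespace(newwin=lambda h, w: "window %dx%d" % (h, w))
_curses_panel = SimpleNamespace(new_panel=lambda win: "panel over " + win)

curses = SimpleNamespace(**vars(_curses))              # curses/__init__.py
curses.panel = SimpleNamespace(**vars(_curses_panel))  # curses/panel.py

win = curses.newwin(10, 40)
print(curses.panel.new_panel(win))  # -> "panel over window 10x40"
```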
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From akuchlin@mems-exchange.org  Wed Dec 13 12:44:23 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 13 Dec 2000 07:44:23 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 01:24:01AM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
Message-ID: <20001213074423.A30348@kronos.cnri.reston.va.us>

[CC'ing Thomas Gellekum <tg@melaten.rwth-aachen.de>]

On Wed, Dec 13, 2000 at 01:24:01AM -0500, Fred L. Drake, Jr. wrote:
>  Would it be reasonable to add panel support as a second extension
>module?  Is there really a need for them to be in the same module,
>since the panel library is a separate library?

Quite possibly, though the patch isn't structured that way.  The panel
module will need access to the type object for the curses window
object, so it'll have to ensure that _curses is already imported, but
that's no problem.

Thomas, do you feel capable of implementing it as a separate module,
or should I work on it?  Probably a _cursesmodule.h header will have
to be created to make various definitions available to external users
of the basic objects in _curses.  (Bonus: this means that the menu and
form libraries, if they ever get wrapped, can be separate modules, too.)

--amk



From tg@melaten.rwth-aachen.de  Wed Dec 13 14:00:46 2000
From: tg@melaten.rwth-aachen.de (Thomas Gellekum)
Date: 13 Dec 2000 15:00:46 +0100
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: Andrew Kuchling's message of "Wed, 13 Dec 2000 07:44:23 -0500"
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
 <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
 <20001213074423.A30348@kronos.cnri.reston.va.us>
Message-ID: <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>

Andrew Kuchling <akuchlin@mems-exchange.org> writes:

> [CC'ing Thomas Gellekum <tg@melaten.rwth-aachen.de>]
> 
> On Wed, Dec 13, 2000 at 01:24:01AM -0500, Fred L. Drake, Jr. wrote:
> >  Would it be reasonable to add panel support as a second extension
> >module?  Is there really a need for them to be in the same module,
> >since the panel library is a separate library?
> 
> Quite possibly, though the patch isn't structured that way.  The panel
> module will need access to the type object for the curses window
> object, so it'll have to ensure that _curses is already imported, but
> that's no problem.

You mean as separate modules like

import curses
import panel

? Hm. A panel object is associated with a window object, so it's
created from a window method. This means you'd need to add
window.new_panel() to PyCursesWindow_Methods[] and
curses.update_panels(), curses.panel_above() and curses.panel_below()
(or whatever they're called after we're through discussing this ;-))
to PyCurses_Methods[].

Also, the curses.panel_{above,below}() wrappers need access to the
list_of_panels via find_po().
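
For the record, usage under the split being discussed would look roughly
like this (a sketch only; the names follow the thread, and the final API
may well differ):

```python
# Rough usage sketch of a separate panel module.  Nothing here touches
# a real terminal: demo() is only defined so the API shape is visible.
import curses
import curses.panel

def demo(stdscr):
    win = curses.newwin(10, 40, 2, 2)        # a plain curses window
    pan = curses.panel.new_panel(win)        # panel tied to that window
    pan.top()                                # raise in the panel stack
    curses.panel.update_panels()             # recompute stacking
    curses.doupdate()
    stdscr.getch()

# curses.wrapper(demo) would run it in a real terminal.
```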

> Thomas, do you feel capable of implementing it as a separate module,
> or should I work on it?

It's probably finished a lot sooner when you do it; OTOH, it would be
fun to try it. Let's carry this discussion a bit further.

>  Probably a _cursesmodule.h header will have
> to be created to make various definitions available to external
> users of the basic objects in _curses.

That's easy. The problem is that we want to extend those basic objects
in _curses.

>  (Bonus: this means that the
> menu and form libraries, if they ever get wrapped, can be separate
> modules, too.)

Sure, if we solve this for panel, the others are a SMOP. :-)

tg


From guido@python.org  Wed Dec 13 14:31:52 2000
From: guido@python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 09:31:52 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src README,1.106,1.107
In-Reply-To: Your message of "Wed, 13 Dec 2000 06:14:35 PST."
 <200012131414.GAA20849@slayer.i.sourceforge.net>
References: <200012131414.GAA20849@slayer.i.sourceforge.net>
Message-ID: <200012131431.JAA21243@cj20424-a.reston1.va.home.com>

> + --with-cxx=<compiler>: Some C++ compilers require that main() is
> +         compiled with the C++ if there is any C++ code in the application.
> +         Specifically, g++ on a.out systems may require that to support
> +         construction of global objects. With this option, the main() function
> +         of Python will be compiled with <compiler>; use that only if you
> +         plan to use C++ extension modules, and if your compiler requires
> +         compilation of main() as a C++ program.

Thanks for documenting this; see my continued reservation in the
(reopened) bug report.

Another question remains regarding the docs though: why is it bad to
always compile main.c with a C++ compiler?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Wed Dec 13 15:19:01 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 10:19:01 -0500 (EST)
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
 <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
 <20001213074423.A30348@kronos.cnri.reston.va.us>
 <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>
Message-ID: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>

Thomas Gellekum writes:
 > You mean as separate modules like
 > 
 > import curses
 > import panel

  Or better yet:

	import curses
	import curses.panel

 > ? Hm. A panel object is associated with a window object, so it's
 > created from a window method. This means you'd need to add
 > window.new_panel() to PyCursesWindow_Methods[] and
 > curses.update_panels(), curses.panel_above() and curses.panel_below()
 > (or whatever they're called after we're through discussing this ;-))
 > to PyCurses_Methods[].

  Do these new functions have to be methods on the window objects, or
can they be functions in the new module that take a window as a
parameter?  The underlying window object can certainly provide slots
for the use of the panel (forms, ..., etc.) bindings, and simply
initialize them to NULL (or whatever) for newly created windows.

 > Also, the curses.panel_{above,below}() wrappers need access to the
 > list_of_panels via find_po().

  There's no reason that underlying utilities can't be provided by
_curses using a CObject.  The Extending & Embedding manual has a
section on using CObjects to provide a C API to a module without
having to link to it directly.

 > That's easy. The problem is that we want to extend those basic objects
 > in _curses.

  Again, I'm curious about the necessity of this.  I suspect it can be
avoided.  I think the approach I've hinted at above will allow you to
avoid this, and will allow the panel (forms, ...) support to be added
simply by adding additional modules as they are written and the
underlying libraries are installed on the host.
  I know the question of including these modules in the core
distribution has come up before, but the resurgence in interest in
these makes me want to bring it up again:  Does the curses package
(and the associated C extension(s)) belong in the standard library, or
does it make sense to spin out a distutils-based package?  I've no
objection to them being in the core, but it seems that the release
cycle may want to diverge from Python's.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From guido@python.org  Wed Dec 13 15:48:50 2000
From: guido@python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 10:48:50 -0500
Subject: [Python-Dev] Online help PEP
In-Reply-To: Your message of "Tue, 12 Dec 2000 10:11:13 PST."
 <3A366A41.1A14EFD4@ActiveState.com>
References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com>
 <3A366A41.1A14EFD4@ActiveState.com>
Message-ID: <200012131548.KAA21344@cj20424-a.reston1.va.home.com>

[Paul's PEP]
> > >         help( "string" ) -- built-in topic or global

[me]
> > Why does a global require string quotes?

[Paul]
> It doesn't, but if you happen to say 
> 
> help( "dir" ) instead of help( dir ), I think it should do the right
> thing.

Fair enough.

> > I'm missing
> > 
> >           help() -- table of contents
> > 
> > I'm not sure if the table of contents should be printed by the repr
> > output.
> 
> I don't see any benefit in having different behaviors for help and
> help().

Having the repr() overloading invoke the pager is dangerous.  The beta
version of the license command did this, and it caused some strange
side effects, e.g. vars(__builtins__) would start reading from input
and confuse the users.  The new version's repr() returns the desired
string if it's less than a page, and 'Type license() to see the full
license text' if the pager would need to be invoked.

> > >     If you ask for a global, it can be a fully-qualfied name such as
> > >     help("xml.dom").
> > 
> > Why are the string quotes needed?  When are they useful?
> 
> When you haven't imported the thing you are asking about. Or when the
> string comes from another UI like an editor window, command line or web
> form.

The implied import is a major liability.  If you can do this without
importing (e.g. by source code inspection), fine.  Otherwise, you
might issue some kind of message like "you must first import XXX.YYY".

> > >     You can also use the facility from a command-line
> > >
> > >     python --help if
> > 
> > Is this really useful?  Sounds like Perlism to me.
> 
> I'm just trying to make it easy to quickly get answers to Python
> questions. I could totally see someone writing code in VIM switching to
> a bash window to type:
> 
> python --help os.path.dirname
> 
> That's a lot easier than:
> 
> $ python
> Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
> Type "copyright", "credits" or "license" for more information.
> >>> import os
> >>> help(os.path.dirname)
> 
> And what does it hurt?

The hurt is code bloat in the interpreter and creeping featurism.  If
you need command line access to the docs (which may be a reasonable
thing to ask for, although to me it sounds backwards :-), it's better
to provide a separate command, e.g. pythondoc.  (Analog to perldoc.)
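
A minimal version of such a command is easy to sketch (illustrative only;
"pythondoc" is just the suggested name, and the lookup logic here is mine):

```python
# pythondoc sketch: resolve a dotted name and show its docstring.
# Purely illustrative -- not an actual tool from this thread.
import importlib

def find_doc(dotted):
    parts = dotted.split('.')
    obj = None
    # Try the longest importable prefix first: "a.b.c", then "a.b", ...
    for i in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module('.'.join(parts[:i]))
            break
        except ImportError:
            pass
    if obj is None:
        raise ValueError("cannot import any prefix of %r" % dotted)
    for name in parts[i:]:          # walk the remaining attributes
        obj = getattr(obj, name)
    return obj.__doc__ or "(no docstring)"

print(find_doc("json.dumps").splitlines()[0])
```

Wired to sys.argv, this would give exactly the "python --help os.path.dirname"
workflow without bloating the interpreter itself.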

> > >     In either situation, the output does paging similar to the "more"
> > >     command.
> > 
> > Agreed.  But how to implement paging in a platform-dependent manner?
> > On Unix, os.system("more") or "$PAGER" is likely to work.  On Windows,
> > I suppose we could use its MORE, although that's pretty braindead.  On
> > the Mac?  Also, inside IDLE or Pythonwin, invoking the system pager
> > isn't a good idea.
> 
> The current implementation does paging internally. You could override it
> to use the system pager (or no pager).

Yes.  Please add that option to the PEP.

> > What does "demand-loaded" mean in a Python context?
> 
> When you "touch" the help object, it loads the onlinehelp module which
> has the real implementation. The thing in __builtins__ is just a
> lightweight proxy.

Please suggest an implementation.

> > >     It should also be possible to override the help display function by
> > >     assigning to onlinehelp.displayhelp(object_or_string).
> > 
> > Good idea.  Pythonwin and IDLE could use this.  But I'd like it to
> > work at least "okay" if they don't.
> 
> Agreed. 

Glad you're so agreeable. :)

> > >     The module should be able to extract module information from either
> > >     the HTML or LaTeX versions of the Python documentation. Links should
> > >     be accommodated in a "lynx-like" manner.
> > 
> > I think this is beyond the scope.  
> 
> Well, we have to do one of:
> 
>  * re-write a subset of the docs in a form that can be accessed from the
> command line
>  * access the existing docs in a form that's installed
>  * auto-convert the docs into a form that's compatible

I really don't think that this tool should attempt to do everything.

If someone *really* wants to browse the existing (large) doc set in a
terminal emulation window, let them use lynx and point it to the
documentation set.  (I agree that the HTML docs should be installed,
by the way.)

> I've already implemented HTML parsing and LaTeX parsing is actually not
> that far off. I just need impetus to finish a LaTeX-parsing project I
> started on my last vacation.

A LaTeX parser would be most welcome -- if it could replace
latex2html!  That old Perl program is really ready for retirement.
(Ask Fred.)

> The reason that LaTeX is interesting is because it would be nice to be
> able to move documentation from existing LaTeX files into docstrings.

That's what some people think.  I disagree that it would be either
feasible or a good idea to put all documentation for a typical module
in its doc strings.

> > The LaTeX isn't installed anywhere
> > (and processing would be too much work).  
> > The HTML is installed only
> > on Windows, where there already is a way to get it to pop up in your
> > browser (actually two: it's in the Start menu, and also in IDLE's Help
> > menu).
> 
> If the documentation becomes an integral part of the Python code, then
> it will be installed. It's ridiculous that it isn't already.

Why is that ridiculous?  It's just as easy to access them through the
web for most people.  If it's not, they are available in easily
downloadable tarballs supporting a variety of formats.  That's just
too much to be included in the standard RPMs.  (Also, latex2html
requires so much hand-holding, and is so slow, that it's really not a
good idea to let "make install" install the HTML by default.)

> ActivePython does install the docs on all platforms.

Great.  More power to you.

> > A standard syntax for docstrings is under development, PEP 216.  I
> > don't agree with the proposal there, but in any case the help PEP
> > should not attempt to legalize a different format than PEP 216.
> 
> I won't hold my breath for a standard Python docstring format. I've gone
> out of my way to make the code format independent.

To tell you the truth, I'm not holding my breath either. :-)  So your
code should just dump the doc string on stdout without interpreting it
in any way (except for paging).

> > Neat.  I noticed that in a 24-line screen, the pagesize must be set to
> > 21 to avoid stuff scrolling off the screen.  Maybe there's an off-by-3
> > error somewhere?
> 
> Yes.

It's buggier than just that.  The output of the pager prints an extra
"| " at the start of each page except for the first, and the first
page is a line longer than subsequent pages.
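
The off-by-3 smells like screen "chrome" (prompt, header, the "| " marker)
not being subtracted from the page size.  A sketch of a pager that accounts
for it (not Paul's implementation):

```python
# Pager sketch showing where the off-by-N creeps in: every line of
# chrome (prompt, header, continuation marker) must come out of the
# page size, or output scrolls off the screen.
def page(lines, screen_height=24, chrome=3, wait=input):
    size = screen_height - chrome      # 24-line screen -> 21 text lines
    for start in range(0, len(lines), size):
        for line in lines[start:start + size]:
            print(line)
        if start + size < len(lines):
            wait('-- more --')         # injectable for testing
```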

BTW, another bug: try help(cgi).  It's nice that it gives the default
value for arguments, but the defaults for FieldStorage.__init__ happen
to include os.environ.  Its entire value is dumped -- which causes the
pager to be off (it wraps over about 20 lines for me).  I think you
may have to truncate long values a bit, e.g. by using the repr module.
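
The repr module mentioned here (reprlib in today's Python) does exactly this
kind of truncation; a sketch, with limits chosen for illustration:

```python
# Truncating an os.environ-sized default value with the repr module
# (reprlib since Python 3), instead of dumping 20 wrapped lines.
import os
import reprlib

short = reprlib.Repr()
short.maxdict = 3        # show at most 3 dict items
short.maxstring = 20     # truncate long strings inside them

print(short.repr(dict(os.environ)))  # a few items plus '...', one short line
```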

> > I also noticed that it always prints '1' when invoked as a function.
> > The new license pager in site.py avoids this problem.
> 
> Okay.

Where's the check-in? :-)

> > help("operators") and several others raise an
> > AttributeError('handledocrl').
> 
> Fixed.
> 
> > The "lynx-line links" don't work.
> 
> I don't think that's implemented yet.

I'm not sure what you intended to implement there.  I prefer to see
the raw URLs, then I can do whatever I normally do to paste them into
my preferred web browser (which is *not* lynx :-).

> > I think it's naive to expect this help facility to replace browsing
> > the website or the full documentation package.  There should be one
> > entry that says to point your browser there (giving the local
> > filesystem URL if available), and that's it.  The rest of the online
> > help facility should be concerned with exposing doc strings.
> 
> I don't want to replace the documentation. But there is no reason we
> should set out to make it incomplete. If it's integrated with the HTML
> then people can choose whatever access mechanism is easiest for them
> right now.
> 
> I'm trying hard not to be "naive". Realistically, nobody is going to
> write a million docstrings between now and Python 2.1. It is much more
> feasible to leverage the existing documentation that Fred and others
> have spent months on.

I said above, and I'll say it again: I think the majority of people
would prefer to use their standard web browser to read the standard
docs.  It's not worth the effort to try to make those accessible
through help().  In fact, I'd encourage the development of a
command-line-invoked help facility that shows doc strings in the
user's preferred web browser -- the webbrowser module makes this
trivial.
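A minimal sketch of that idea with the webbrowser module (the URL and helper name are illustrative, not from the thread):

```python
import webbrowser

def show_docs(url="http://www.python.org/doc/"):
    """Open the standard documentation in the user's preferred
    browser; webbrowser picks the browser from the environment.
    Returns True if a browser could be launched."""
    return webbrowser.open(url)
```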

> > > Security Issues
> > > 
> > >     This module will attempt to import modules with the same names as
> > >     requested topics. Don't use the modules if you are not confident
> > >     that everything in your pythonpath is from a trusted source.
> > Yikes!  Another reason to avoid the "string" -> global variable
> > option.
> 
> I don't think we should lose that option. People will want to look up
> information from non-executable environments like command lines, GUIs
> and web pages. Perhaps you can point me to techniques for extracting
> information from Python modules and packages without executing them.

I don't know specific tools, but any serious docstring processing tool
ends up parsing the source code for this very reason, so there's
probably plenty of prior art.
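One way to get at docstrings without importing (and hence executing) a module is to parse its source; a sketch using today's ast module (which postdates this thread -- the era's equivalent was the parser/compiler machinery):

```python
import ast

# The source string stands in for a module file read from disk.
source = '''"""Module docstring."""

def f():
    """Function docstring."""
'''

# Parsing never executes the module, so untrusted code on the path
# stays inert while its docstrings are still reachable.
tree = ast.parse(source)
print(ast.get_docstring(tree))
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        print(node.name, "->", ast.get_docstring(node))
```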

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Wed Dec 13 16:07:22 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 11:07:22 -0500 (EST)
Subject: [Python-Dev] Online help PEP
In-Reply-To: <200012131548.KAA21344@cj20424-a.reston1.va.home.com>
References: <3A3480E5.C2577AE6@ActiveState.com>
 <200012111557.KAA24266@cj20424-a.reston1.va.home.com>
 <3A366A41.1A14EFD4@ActiveState.com>
 <200012131548.KAA21344@cj20424-a.reston1.va.home.com>
Message-ID: <14903.40634.569192.704368@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > A LaTeX parser would be most welcome -- if it could replace
 > latex2html!  That old Perl program is really ready for retirement.
 > (Ask Fred.)

  Note that Doc/tools/sgmlconv/latex2esis.py already includes a
moderate start at a LaTeX parser.  Paragraph marking is done as a
separate step in Doc/tools/sgmlconv/docfixer.py, but I'd like to push
that down into the LaTeX handler.
  (Note that these tools are mostly broken at the moment, except for
latex2esis.py, which does most of what I need other than paragraph
marking.)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From Barrett@stsci.edu  Wed Dec 13 16:34:40 2000
From: Barrett@stsci.edu (Paul Barrett)
Date: Wed, 13 Dec 2000 11:34:40 -0500 (EST)
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
In-Reply-To: <20001210054646.A5219@glacier.fnational.com>
References: <20001210054646.A5219@glacier.fnational.com>
Message-ID: <14903.41669.883591.420446@nem-srvr.stsci.edu>

Neil Schemenauer writes:
 > Sourceforge uploads are not working.  The latest version of the
 > patch for PEP 208 is here:
 > 
 >     http://arctrix.com/nas/python/coerce-6.0.diff
 > 
 > Operations on instances now call __coerce__ if it exists.  I
 > think the patch is now complete.  Converting other builtin types
 > to "new style numbers" can be done with a separate patch.

My one concern about this patch is whether the non-commutativity of
operators is preserved.  This issue is important for matrix
operations (not to be confused with element-wise array operations).

 -- Paul




From guido@python.org  Wed Dec 13 16:45:12 2000
From: guido@python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 11:45:12 -0500
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
In-Reply-To: Your message of "Wed, 13 Dec 2000 11:34:40 EST."
 <14903.41669.883591.420446@nem-srvr.stsci.edu>
References: <20001210054646.A5219@glacier.fnational.com>
 <14903.41669.883591.420446@nem-srvr.stsci.edu>
Message-ID: <200012131645.LAA21719@cj20424-a.reston1.va.home.com>

> Neil Schemenauer writes:
>  > Sourceforge uploads are not working.  The latest version of the
>  > patch for PEP 208 is here:
>  > 
>  >     http://arctrix.com/nas/python/coerce-6.0.diff
>  > 
>  > Operations on instances now call __coerce__ if it exists.  I
>  > think the patch is now complete.  Converting other builtin types
>  > to "new style numbers" can be done with a separate patch.
> 
> My one concern about this patch is whether the non-commutativity of
> operators is preserved.  This issue is important for matrix
> operations (not to be confused with element-wise array operations).

Yes, this is preserved.  (I'm spending most of my waking hours
understanding this patch -- it is a true piece of wizardry.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Wed Dec 13 17:38:00 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 13 Dec 2000 18:38:00 +0100
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
References: <20001210054646.A5219@glacier.fnational.com>
 <14903.41669.883591.420446@nem-srvr.stsci.edu> <200012131645.LAA21719@cj20424-a.reston1.va.home.com>
Message-ID: <3A37B3F7.5640FAFC@lemburg.com>

Guido van Rossum wrote:
> 
> > Neil Schemenauer writes:
> >  > Sourceforge uploads are not working.  The latest version of the
> >  > patch for PEP 208 is here:
> >  >
> >  >     http://arctrix.com/nas/python/coerce-6.0.diff
> >  >
> >  > Operations on instances now call __coerce__ if it exists.  I
> >  > think the patch is now complete.  Converting other builtin types
> >  > to "new style numbers" can be done with a separate patch.
> >
> > My one concern about this patch is whether the non-commutativity of
> > operators is preserved.  This issue is important for matrix
> > operations (not to be confused with element-wise array operations).
> 
> Yes, this is preserved.  (I'm spending most of my waking hours
> understanding this patch -- it is a true piece of wizardry.)

The fact that coercion didn't allow detection of parameter
order was the initial motivation for my attempt to fix it back then.
I was confronted with the fact that at the C level there was no way
to tell whether the operands were in the order left, right or
right, left -- as a result I used a gross hack in mxDateTime
to still make this work...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From esr@thyrsus.com  Wed Dec 13 21:01:46 2000
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Dec 2000 16:01:46 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 10:19:01AM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
Message-ID: <20001213160146.A24753@thyrsus.com>

Fred L. Drake, Jr. <fdrake@acm.org>:
>   I know the question of including these modules in the core
> distribution has come up before, but the resurgence in interest in
> these makes me want to bring it up again:  Does the curses package
> (and the associated C extension(s)) belong in the standard library, or
> does it make sense to spin out a distutils-based package?  I've no
> objection to them being in the core, but it seems that the release
> cycle may want to diverge from Python's.

Curses needs to be in the core for political reasons.  Specifically, 
to support CML2 without requiring any extra packages or downloads
beyond the stock Python interpreter.

And what makes CML2 so constrained and so important?  It's my bid to
replace the Linux kernel's configuration machinery.  It has many
advantages over the existing config system, but the linux developers
are *very* resistant to adding things to the kernel's minimum build
kit.  Python alone may prove too much for them to swallow (though
there are hopeful signs they will); Python plus a separately
downloadable curses module would definitely be too much.

Guido attaches sufficient importance to getting Python into the kernel
build machinery that he approved adding ncurses to the standard modules
on that basis.  This would be a huge design win for us, raising Python's
visibility considerably.

So curses must stay in the core.  I don't have a requirement for
panels; my present curses front end simulates them. But if panels were
integrated into the core I could simplify the front-end code
significantly.  Every line I can remove from my stuff (even if it, in
effect, is just migrating into the Python core) makes it easier to
sell CML2 into the kernel.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Experience should teach us to be most on our guard to protect liberty
when the government's purposes are beneficient...  The greatest dangers
to liberty lurk in insidious encroachment by men of zeal, well meaning
but without understanding."
	-- Supreme Court Justice Louis Brandeis


From jheintz@isogen.com  Wed Dec 13 21:10:32 2000
From: jheintz@isogen.com (John D. Heintz)
Date: Wed, 13 Dec 2000 15:10:32 -0600
Subject: [Python-Dev] Announcing ZODB-Corba code release
Message-ID: <3A37E5C8.7000800@isogen.com>

Here is the first release of code that exposes a ZODB database through 
CORBA (omniORB).

The code is functioning, the docs are sparse, and it should work on your 
machines.  ;-)

I am only going to be in town for the next two days, then I will be 
unavailable until Jan 1.

See http://www.zope.org/Members/jheintz/ZODB_CORBA_Connection to 
download the code.

It's not perfect, but it works for me.

Enjoy,
John


-- 
. . . . . . . . . . . . . . . . . . . . . . . .

John D. Heintz | Senior Engineer

1016 La Posada Dr. | Suite 240 | Austin TX 78752
T 512.633.1198 | jheintz@isogen.com

w w w . d a t a c h a n n e l . c o m



From guido@python.org  Wed Dec 13 21:19:01 2000
From: guido@python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 16:19:01 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: Your message of "Wed, 13 Dec 2000 16:01:46 EST."
 <20001213160146.A24753@thyrsus.com>
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
 <20001213160146.A24753@thyrsus.com>
Message-ID: <200012132119.QAA11060@cj20424-a.reston1.va.home.com>

> So curses must stay in the core.  I don't have a requirement for
> panels; my present curses front end simulates them. But if panels were
> integrated into the core I could simplify the front-end code
> significantly.  Every line I can remove from my stuff (even if it, in
> effect, is just migrating into the Python core) makes it easier to
> sell CML2 into the kernel.

On the other hand you may want to be conservative.  You already have
to require Python 2.0 (I presume).  The panel stuff will be available
in 2.1 at the earliest.  You probably shouldn't throw out your panel
emulation until your code has already been accepted...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@loewis.home.cs.tu-berlin.de  Wed Dec 13 21:56:27 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 13 Dec 2000 22:56:27 +0100
Subject: [Python-Dev] CVS: python/dist/src README,1.106,1.107
Message-ID: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de>

> Another question remains regarding the docs though: why is it bad to
> always compile main.c with a C++ compiler?

For the whole thing to work, it may also be necessary to link the
entire application with a C++ compiler; that in turn may bind to the
C++ library. Linking with the system's C++ library means that the
Python executable cannot be as easily exchanged between installations
of the operating system - you'd also need to have the right version of
the C++ library to run it. If the C++ library is static, that may also
increase the size of the executable.

I can't really point to a specific problem that would occur on a
specific system I use if main() was compiled with a C++
compiler. However, on the systems I use (Windows, Solaris, Linux), you
can build C++ extension modules even if Python was not compiled as a
C++ application.

On Solaris and Windows, you'd also have to choose the C++ compiler you
want to use (MSVC++, SunPro CC, or g++); in turn, different C++
runtime systems would be linked into the application.

Regards,
Martin


From esr@thyrsus.com  Wed Dec 13 22:03:59 2000
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Dec 2000 17:03:59 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012132119.QAA11060@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 04:19:01PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com>
Message-ID: <20001213170359.A24915@thyrsus.com>

Guido van Rossum <guido@python.org>:
> > So curses must stay in the core.  I don't have a requirement for
> > panels; my present curses front end simulates them. But if panels were
> > integrated into the core I could simplify the front-end code
> > significantly.  Every line I can remove from my stuff (even if it, in
> > effect, is just migrating into the Python core) makes it easier to
> > sell CML2 into the kernel.
> 
> On the other hand you may want to be conservative.  You already have
> to require Python 2.0 (I presume).  The panel stuff will be available
> in 2.1 at the earliest.  You probably shouldn't throw out your panel
> emulation until your code has already been accepted...

Yes, that's how I am currently expecting it to play out -- but if the 2.4.0
kernel is delayed another six months, I'd change my mind.  I'll explain this,
because python-dev people should grok what the surrounding politics and timing 
are.

I actually debated staying with 1.5.2 as a base version.  What changed
my mind was two things.  One: by going to 2.0 I could drop close to 600
lines and three entire support modules from CML2, slimming down its 
footprint in the kernel tree significantly (by more than 10% of the 
entire code volume, actually).

Second: CML2 is not going to be seriously evaluated until 2.4.0 final is out.
Linus made this clear when I demoed it for him at LWE.  My best guess about 
when that will happen is late January into February.  By the time Red Hat
issues its next distro after that (probably May or thereabouts) it's a safe
bet 2.0 will be on it, and everywhere else.

But if the 2.4.0 kernel slips another six months yet again, and our
2.1 comes out relatively quickly (like, just before the 9th Python
Conference :-)) then we *might* have time to get 2.1 into the distros
before CML2 gets the imprimatur.

So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel
will be delayed yet again :-).
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Ideology, politics and journalism, which luxuriate in failure, are
impotent in the face of hope and joy.
	-- P. J. O'Rourke


From nas@arctrix.com  Wed Dec 13 15:37:45 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 07:37:45 -0800
Subject: [Python-Dev] CVS: python/dist/src README,1.106,1.107
In-Reply-To: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Wed, Dec 13, 2000 at 10:56:27PM +0100
References: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de>
Message-ID: <20001213073745.C17148@glacier.fnational.com>

These are issues to consider for Python 3000 as well.  AFAIK, C++
ABIs are a nightmare.

  Neil


From fdrake@acm.org  Wed Dec 13 22:29:25 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 17:29:25 -0500 (EST)
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001213170359.A24915@thyrsus.com>
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
 <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
 <20001213074423.A30348@kronos.cnri.reston.va.us>
 <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>
 <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
 <20001213160146.A24753@thyrsus.com>
 <200012132119.QAA11060@cj20424-a.reston1.va.home.com>
 <20001213170359.A24915@thyrsus.com>
Message-ID: <14903.63557.282592.796169@cj42289-a.reston1.va.home.com>

Eric S. Raymond writes:
 > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel
 > will be delayed yet again :-).

  Politics aside, I think development of curses-related extensions
like panels and forms doesn't need to be delayed.  I've posted what I
think are relevant technical comments already, and leave it up to the
developers of any new modules to get them written -- I don't know
enough curses to offer any help there.
  Regardless of how the curses package is distributed and deployed, I
don't see any reason to delay development in its existing location in
the Python CVS repository.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From nas@arctrix.com  Wed Dec 13 15:41:54 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 07:41:54 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001213170359.A24915@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 13, 2000 at 05:03:59PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com>
Message-ID: <20001213074154.D17148@glacier.fnational.com>

On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote:
> CML2 is not going to be seriously evaluated until 2.4.0 final
> is out.  Linus made this clear when I demoed it for him at LWE.
> My best guess about when that will happen is late January into
> February.  By the time Red Hat issues its next distro after
> that (probably May or thenabouts) it's a safe bet 2.0 will be
> on it, and everywhere else.

I don't think that is a very safe bet.  Python 2.0 missed the
Debian Potato boat.  I have no idea when Woody is expected to be
released but I expect it may take longer than that if history is
any indication.

  Neil


From guido@python.org  Wed Dec 13 23:03:31 2000
From: guido@python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 18:03:31 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: Your message of "Wed, 13 Dec 2000 07:41:54 PST."
 <20001213074154.D17148@glacier.fnational.com>
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com>
 <20001213074154.D17148@glacier.fnational.com>
Message-ID: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>

> I don't think that is a very safe bet.  Python 2.0 missed the
> Debian Potato boat.

This may have had to do more with the unresolved GPL issues.  I
recently received a mail from Stallman indicating that an agreement
with CNRI has been reached; they have agreed (in principle, at least)
to specific changes to the CNRI license that will defuse the
choice-of-law clause when it is combined with GPL-licensed code "in a
non-separable way".  A glitch here is that the BeOpen license probably
has to be changed too, but I believe that that's all doable.

> I have no idea when Woody is expected to be
> released but I expect it may take longer than that if history is
> any indication.

And who or what is Woody?

Feeling-left-out,

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein@lyra.org  Wed Dec 13 23:16:09 2000
From: gstein@lyra.org (Greg Stein)
Date: Wed, 13 Dec 2000 15:16:09 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com>
Message-ID: <20001213151609.E8951@lyra.org>

On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote:
>...
> > I have no idea when Woody is expected to be
> > released but I expect it may take longer than that if history is
> > any indication.
> 
> And who or what is Woody?

One of the Debian releases. Dunno if it is the "next" release, but there ya
go.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Wed Dec 13 23:18:34 2000
From: gstein@lyra.org (Greg Stein)
Date: Wed, 13 Dec 2000 15:18:34 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001213170359.A24915@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 13, 2000 at 05:03:59PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com>
Message-ID: <20001213151834.F8951@lyra.org>

On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote:
>...
> So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel
> will be delayed yet again :-).

The kernel is not going to be delayed that much. Linus wants it to go out
this month. Worst case, I could see January. But no way on six months.

But as Fred said: that should not change panels going into the curses
support at all. You can always have a "compat.py" module in CML2 that
provides functionality for prior-to-2.1 releases of Python.

I'd also be up for a separate _curses_panels module, loaded into the curses
package.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From esr@thyrsus.com  Wed Dec 13 23:33:02 2000
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Dec 2000 18:33:02 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001213151834.F8951@lyra.org>; from gstein@lyra.org on Wed, Dec 13, 2000 at 03:18:34PM -0800
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213151834.F8951@lyra.org>
Message-ID: <20001213183302.A25160@thyrsus.com>

Greg Stein <gstein@lyra.org>:
> On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote:
> >...
> > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel
> > will be delayed yet again :-).
> 
> The kernel is not going to be delayed that much. Linus wants it to go out
> this month. Worst case, I could see January. But no way on six months.

I know what Linus wants.  That's why I'm estimating end of January or
early February -- the man's error curve on these estimates has a
certain, er, *consistency* about it.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Alcohol still kills more people every year than all `illegal' drugs put
together, and Prohibition only made it worse.  Oppose the War On Some Drugs!


From nas@arctrix.com  Wed Dec 13 17:18:48 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 09:18:48 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com>
Message-ID: <20001213091848.A17326@glacier.fnational.com>

On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote:
> > I don't think that is a very safe bet.  Python 2.0 missed the
> > Debian Potato boat.
> 
> This may have had to do more with the unresolved GPL issues.

I can't remember the exact dates but I think Debian Potato was
frozen before Python 2.0 was released.  Once a Debian release is
frozen packages are not upgraded except under unusual
circumstances.

> I recently received a mail from Stallman indicating that an
> agreement with CNRI has been reached; they have agreed (in
> principle, at least) to specific changes to the CNRI license
> that will defuse the choice-of-law clause when it is combined
> with GPL-licensed code "in a non-separable way".  A glitch here
> is that the BeOpen license probably has to be changed too, but
> I believe that that's all doable.

This is great news.

> > I have no idea when Woody is expected to be
> > released but I expect it may take longer than that if history is
> > any indication.
> 
> And who or what is Woody?

Woody would be another character from the Pixar movie "Toy Story"
(just like Rex, Bo, Potato, Slink, and Hamm).  I believe Bruce
Perens used to work at Pixar.  Debian uses a code name for the
development release until a release number is assigned.  This
avoids some problems but has the disadvantage of confusing people
who are not familiar with Debian.  I should have said "the next
stable release of Debian".


  Neil (aka nas@debian.org)


From akuchlin@mems-exchange.org  Thu Dec 14 00:26:32 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 13 Dec 2000 19:26:32 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 10:19:01AM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
Message-ID: <20001213192632.A30585@kronos.cnri.reston.va.us>

On Wed, Dec 13, 2000 at 10:19:01AM -0500, Fred L. Drake, Jr. wrote:
>  Do these new functions have to be methods on the window objects, or
>can they be functions in the new module that take a window as a
>parameter?  The underlying window object can certainly provide slots

Panels and windows have a 1-1 association, but they're separate
objects.  The window.new_panel method could become just a function
which takes a window as its first argument; it would only need the
TypeObject for PyCursesWindow, in order to do typechecking.

> > Also, the curses.panel_{above,below}() wrappers need access to the
> > list_of_panels via find_po().

The list_of_panels is used only in the curses.panel module, so it
could be private to that module, since only panel-related functions
care about it.  

I'm ambivalent about the list_of_panels.  It's a linked list storing
(PyWindow, PyPanel) pairs.  Probably it should use a dictionary
instead of implementing a little list, just to reduce the amount of
code.
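The suggested dictionary replacement can be sketched in Python (the real code is C; the Panel stand-in class is an assumption, while the names follow the list_of_panels/find_po usage above):

```python
# Map each window object to its panel, replacing the hand-rolled
# linked list of (PyWindow, PyPanel) pairs.
list_of_panels = {}

class Panel:
    """Stand-in for the real curses panel object."""
    def __init__(self, window):
        self.window = window

def new_panel(window):
    panel = Panel(window)
    list_of_panels[window] = panel   # windows and panels are 1-1
    return panel

def find_po(window):
    # Was a linear search over the linked list; now a dict lookup.
    return list_of_panels.get(window)
```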

>does it make sense to spin out a distutils-based package?  I've no
>objection to them being in the core, but it seems that the release
>cycle may want to diverge from Python's.

Consensus seemed to be to leave it in; I'd have no objection to
removing it, but either course is fine with me.

So, I suggest we create _curses_panel.c, which would be available as
curses.panel.  (A panel.py module could then add any convenience
functions that are required.)

Thomas, do you want to work on this, or should I?

--amk


From nas@arctrix.com  Wed Dec 13 17:43:06 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 09:43:06 -0800
Subject: [Python-Dev] OT: Debian and Python
In-Reply-To: <20001214010534.M4396@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 14, 2000 at 01:05:34AM +0100
References: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl>
Message-ID: <20001213094306.C17326@glacier.fnational.com>

On Thu, Dec 14, 2000 at 01:05:34AM +0100, Thomas Wouters wrote:
> Note to the debian-pythoneers: woody still carries Python 1.5.2, not 2.0.
> Someone created a separate set of 2.0-packages, but they didn't include
> readline and gdbm support because of the licencing issues. (Posted on c.l.py
> sometime this week.)

I've had Python packages for Debian stable for a while.  I guess
I should have posted a link:

    http://arctrix.com/nas/python/debian/

Most useful modules are enabled.

> I'm *almost* tempted enough to learn enough about
> dpkg/.deb files to build my own licence-be-damned set

It's quite easy.  Debian source packages are basically a diff.
Applying the diff will create a "debian" directory, and in that
directory will be a makefile called "rules".  Use the target
"binary" to create new binary packages.  Good things to know are
that you must be in the source directory when you run the
makefile (i.e. ./debian/rules binary).  You should be running a
shell under fakeroot to get the install permissions right
(running "fakeroot" will do).  You need to have the Debian
developer tools installed.  There is a list somewhere on
debian.org.  "apt-get source <packagename>" will get, extract and
patch a package ready for tweaking and building (handy for
getting stuff from unstable to run on stable).  

This is too off topic for python-dev.  If anyone needs more info
they can email me directly.

  Neil


From thomas@xs4all.net  Thu Dec 14 00:05:34 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 14 Dec 2000 01:05:34 +0100
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com>
Message-ID: <20001214010534.M4396@xs4all.nl>

On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote:
> > I don't think that is a very safe bet.  Python 2.0 missed the Debian
> > Potato boat.
> 
> This may have had to do more with the unresolved GPL issues.

This is very likely. Debian is very licence -- or at least GPL -- aware.
Which is a pity, really, because I already prefer it over RedHat in all
other cases (and RedHat is also pretty licence aware, just less piously,
devoutly, beyond-practicality-IMHO dedicated to the GPL.)

> > I have no idea when Woody is expected to be released but I expect it may
> > take longer than that if history is any indication.

BTW, I believe Debian uses a fairly steady release schedule, something like
an unstable->stable switch every year or 6 months or so ? I seem to recall
seeing something like that on the debian website, but can't check right now.

> And who or what is Woody?

Woody is Debian's current development branch, the current bearer of the
alias 'unstable'. It'll become Debian 2.3 (I believe, I don't pay attention
to version numbers, I just run unstable :) once it's stabilized. 'potato' is
the previous development branch, and currently the 'stable' branch. You can
compare them with 'rawhide' and 'redhat-7.0', respectively :)

(With the enormous difference that you can upgrade your debian install to a
new version (even the devel version, or update your machine to the latest
devel snapshot) while you are using it, without having to reboot ;)

Note to the debian-pythoneers: woody still carries Python 1.5.2, not 2.0.
Someone created a separate set of 2.0-packages, but they didn't include
readline and gdbm support because of the licencing issues. (Posted on c.l.py
sometime this week.) I'm *almost* tempted enough to learn enough about
dpkg/.deb files to build my own licence-be-damned set, but it'd be a lot of
work to mirror the current debian 1.5.2 set of packages (which include
numeric, imaging, mxTools, GTK/GNOME, and a shitload of 3rd party modules)
in 2.0. Ponder, maybe it could be done semi-automatically, from the
src-deb's of those packages.

By the way, in woody, there are 52 packages with 'python' in the name, and
32 with 'perl' in the name... Pity all of my perl-hugging hippy-friends are
still blindly using RedHat, and refuse to listen to my calls from the
Debian/Python-dark-side :-)

Oh, and the names 'woody' and 'potato' came from the movie Toy Story, in
case you wondered ;)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From esr@snark.thyrsus.com  Thu Dec 14 00:46:37 2000
From: esr@snark.thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Dec 2000 19:46:37 -0500
Subject: [Python-Dev] Business related to the upcoming Python conference
Message-ID: <200012140046.TAA25289@snark.thyrsus.com>

I'm sending this to python-dev because I believe most or all of the
reviewers for my PC9 paper are on this list.  Paul, would you please
forward to any who were not?

First, my humble apologies for not having got my PC9 reviews in on time.
I diligently read my assigned papers early, but I couldn't do the
reviews early because of technical problems with my Foretec account --
and then I couldn't do them late because the pre-deadline crunch
happened while I was on a ten-day speaking and business trip in Japan
and California, with mostly poor or nonexistent Internet access.

Matters were not helped by a nasty four-month-old problem in my
personal life coming to a head right in the middle of the trip.  Nor
by the fact that the trip included the VA Linux Systems annual
stockholders' meeting and the toughest Board of Directors' meeting in
my tenure.  We had to hammer out a strategic theory of what to do now
that the dot-com companies who used to be our best companies aren't
getting funded any more.  Unfortunately, it's at times like this that
Board members earn their stock options.  Management oversight.
Fiduciary responsibility.  Mumble...

Second, the feedback I received on the paper was *excellent*, and I
will be making many of the recommended changes.  I've already extended
the discussion of "Why Python?" including addressing the weaknesses of
Scheme and Prolog for this application.  I have said more about uses
of CML2 beyond the Linux kernel.  I am working on a discussion of the
politics of CML2 option, but may save that for the stand-up talk
rather than the written paper.  I will try to trim the CML2 language
reference for the final version.

(The reviewer who complained about the lack of references on the SAT 
problem should be pleased to hear that URLs to relevant papers are in
fact included in the masters.  I hope they show in the final version
as rendered for publication.)
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The Constitution is not neutral. It was designed to take the
government off the backs of the people.
	-- Justice William O. Douglas 


From moshez@zadka.site.co.il  Thu Dec 14 12:22:24 2000
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Thu, 14 Dec 2000 14:22:24 +0200 (IST)
Subject: [Python-Dev] Splitting up _cursesmodule
Message-ID: <20001214122224.739EEA82E@darjeeling.zadka.site.co.il>

On Wed, 13 Dec 2000 07:41:54 -0800, Neil Schemenauer <nas@arctrix.com> wrote:

> I don't think that is a very safe bet.  Python 2.0 missed the
> Debian Potato boat.

By a long time -- potato was frozen for a few months when 2.0 came out.

>  I have no idea when Woody is expected to be
> released but I expect it may take longer than that if history is
> any indication.

My bet is that woody starts freezing as soon as 2.4.0 is out. 
Note that once it starts freezing, 2.1 doesn't have a shot
of getting in, regardless of how long it takes to freeze.
OTOH, since in woody time there's a good chance for the "testing"
distribution, a lot more people would be running something
that *can* and *will* upgrade to 2.1 almost as soon as it is
out.
(For the record, most of the Debian users I know run woody on 
their server)
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From jeremy@alum.mit.edu  Thu Dec 14 05:04:43 2000
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 14 Dec 2000 00:04:43 -0500 (EST)
Subject: [Python-Dev] new draft of PEP 227
Message-ID: <14904.21739.804346.650062@bitdiddle.concentric.net>

I've got a new draft of PEP 227.  The terminology and wording are more
convoluted than they need to be.  I'll do at least one revision just
to say things more clearly, but I'd appreciate comments on the
proposed spec if you can read the current draft.

Jeremy



From cgw@fnal.gov  Thu Dec 14 06:03:01 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Thu, 14 Dec 2000 00:03:01 -0600 (CST)
Subject: [Python-Dev] Memory leaks in tupleobject.c
Message-ID: <14904.25237.654143.861733@buffalo.fnal.gov>

I've been running a set of memory-leak tests against the latest Python
and have found that running "test_extcall" leaks memory.  This gave me
a strange sense of deja vu, having fixed this once before...

>From the CVS logs for tupleobject.c:

 revision 2.31
 date: 2000/04/21 21:15:05;  author: guido;  state: Exp;  lines: +59 -16
 Patch by Charles G Waldman to avoid a sneaky memory leak in
 _PyTuple_Resize(). 

 revision 2.47
 date: 2000/10/05 19:36:49;  author: nascheme;  state: Exp;  lines: +24 -86
 Simplify _PyTuple_Resize by not using the tuple free list and dropping
 support for the last_is_sticky flag.  A few hard to find bugs may be
 fixed by this patch since the old code was buggy.

The 2.47 patch seems to have re-introduced the memory leak which was
fixed in 2.31.  Maybe the old code was buggy, but the "right thing"
would have been to fix it, not to throw it away.... if _PyTuple_Resize
simply ignores the tuple free list, memory will be leaked.


From nas@arctrix.com  Wed Dec 13 23:43:43 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 15:43:43 -0800
Subject: [Python-Dev] Memory leaks in tupleobject.c
In-Reply-To: <14904.25237.654143.861733@buffalo.fnal.gov>; from cgw@fnal.gov on Thu, Dec 14, 2000 at 12:03:01AM -0600
References: <14904.25237.654143.861733@buffalo.fnal.gov>
Message-ID: <20001213154343.A18303@glacier.fnational.com>

On Thu, Dec 14, 2000 at 12:03:01AM -0600, Charles G Waldman wrote:
>  date: 2000/10/05 19:36:49;  author: nascheme;  state: Exp;  lines: +24 -86
>  Simplify _PyTuple_Resize by not using the tuple free list and dropping
>  support for the last_is_sticky flag.  A few hard to find bugs may be
>  fixed by this patch since the old code was buggy.
> 
> The 2.47 patch seems to have re-introduced the memory leak which was
> fixed in 2.31.  Maybe the old code was buggy, but the "right thing"
> would have been to fix it, not to throw it away.... if _PyTuple_Resize
> simply ignores the tuple free list, memory will be leaked.

Guilty as charged.  Can you explain how the current code is
leaking memory?  I can see one problem with deallocating size=0
tuples.  Are there any more leaks?

  Neil


From cgw@fnal.gov  Thu Dec 14 06:57:05 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Thu, 14 Dec 2000 00:57:05 -0600 (CST)
Subject: [Python-Dev] Memory leaks in tupleobject.c
In-Reply-To: <20001213154343.A18303@glacier.fnational.com>
References: <14904.25237.654143.861733@buffalo.fnal.gov>
 <20001213154343.A18303@glacier.fnational.com>
Message-ID: <14904.28481.292539.354303@buffalo.fnal.gov>

Neil Schemenauer writes:

 > Guilty as charged.  Can you explain how the current code is
 > leaking memory?  I can see one problem with deallocating size=0
 > tuples.  Are there any more leaks?

Actually, I think I may have spoken too hastily - it's late and I'm
tired and I should be sleeping rather than staring at the screen 
(like I've been doing since 8:30 this morning) - I jumped to
conclusions - I'm not really sure that it was your patch that caused
the leak; all I can say with 100% certainty is that if you run
"test_extcall" in a loop, memory usage goes through the ceiling....
It's not just the cyclic garbage caused by the "saboteur" function
because even with this commented out, the memory leak persists.

I'm actually trying to track down a different memory leak, something
which is currently causing trouble in one of our production servers
(more about this some other time) and just as a sanity check I ran my
little "leaktest.py" script over all the test_*.py modules in the
distribution, and found that test_extcall triggers leaks... having
analyzed and fixed this once before (see the CVS logs for
tupleobject.c), I jumped to conclusions about the reason for its
return.  I'll take a more clear-headed and careful look tomorrow and
post something (hopefully) a little more conclusive.  It may have been
some other change that caused this memory leak to re-appear.  If you
feel inclined to investigate, just do "reload(test.test_extcall)" in a
loop and watch the memory usage with ps or top or what-have-you...
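The loop-and-watch technique can be automated; the leaktest.py script
isn't shown here, but a minimal sketch of the same idea (using the
Unix-only resource module instead of eyeballing ps/top, and a toy
function standing in for the extended-call code test_extcall covers)
looks like this:

```python
import resource

def rss_kb():
    # Peak resident set size so far (kilobytes on Linux).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

def leak_growth(fn, iterations=50000):
    """Crude leak check: run fn in a loop and report RSS growth.
    A function that leaks per call shows steady growth here."""
    before = rss_kb()
    for _ in range(iterations):
        fn()
    return rss_kb() - before

# Exercise the f(*args, **kwds) machinery that test_extcall covers.
def extcall():
    def f(*args, **kwds):
        return len(args) + len(kwds)
    return f(*range(5), a=1, b=2)

growth = leak_growth(extcall)
```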

	 -C



From paulp@ActiveState.com  Thu Dec 14 07:00:21 2000
From: paulp@ActiveState.com (Paul Prescod)
Date: Wed, 13 Dec 2000 23:00:21 -0800
Subject: [Python-Dev] new draft of PEP 227
References: <14904.21739.804346.650062@bitdiddle.concentric.net>
Message-ID: <3A387005.6725DAAE@ActiveState.com>

Jeremy Hylton wrote:
> 
> I've got a new draft of PEP 227.  The terminology and wording are more
> convoluted than they need to be.  I'll do at least one revision just
> to say things more clearly, but I'd appreciate comments on the
> proposed spec if you can read the current draft.

It set me to thinking:

Python should never require declarations. But would it necessarily be a
problem for Python to have a variable declaration syntax? Might not the
existence of declarations simplify some aspects of the proposal and of
backwards compatibility?

Along the same lines, might a new rule make Python code more robust? We
could say that a local can only shadow a global if the local is formally
declared. It's pretty rare that there is a good reason to shadow a
global and Python makes it too easy to do accidentally.

 Paul Prescod


From paulp@ActiveState.com  Thu Dec 14 07:29:35 2000
From: paulp@ActiveState.com (Paul Prescod)
Date: Wed, 13 Dec 2000 23:29:35 -0800
Subject: [Python-Dev] Online help scope
Message-ID: <3A3876DF.5554080C@ActiveState.com>

I think Guido and I are pretty far apart on the scope and requirements
of this online help thing so I'd like some clarification and opinions
from the peanut gallery.

Consider these scenarios

a) Signature

>>> help( dir )
dir([object]) -> list of strings

b) Usage hint

>>> help( dir )
dir([object]) -> list of strings

Return an alphabetized list of names comprising (some of) the attributes
of the given object.  Without an argument, the names in the current scope
are listed.  With an instance argument, only the instance attributes are
returned.  With a class argument, attributes of the base class are not
returned.  For other types or arguments, this may list members or
methods.

c) Complete documentation, paged(man-style)

>>> help( dir )
dir([object]) -> list of strings

Without arguments, return the list of names in the current local symbol
table. With an argument, attempts to return a list of valid attributes
for that object. This information is gleaned from the object's __dict__,
__methods__ and __members__ attributes, if defined. The list is not
necessarily complete; e.g., for classes, attributes defined in base
classes are not included, and for class instances, methods are not
included. The resulting list is sorted alphabetically. For example: 

  >>> import sys
  >>> dir()
  ['sys']
  >>> dir(sys)
  ['argv', 'exit', 'modules', 'path', 'stderr', 'stdin', 'stdout']

d) Complete documentation in a user-chosen hypertext window

>>> help( dir )
(Netscape or lynx pops up)

I'm thinking that maybe we need two functions:

 * help
 * pythondoc

pythondoc("dir") would launch the Python documentation for the "dir"
command.

> That'S What Some People Think.  I Disagree That It Would Be Either
> Feasible Or A Good Idea To Put All Documentation For A Typical Module
> In Its Doc Strings.

Java and Perl people do it regularly. I think that in the greater world
of software development, the inline model has won (or is winning) and I
don't see a compelling reason to fight the tide. There will always be
out-of-line tutorials, discussions, books etc. 

The canonical module documentation could be inline. That improves the
likelihood of it being maintained. The LaTeX documentation is a major
bottleneck and moving to XML or SGML will not help. Programmers do not
want to learn documentation systems or syntaxes. They want to write code
and comments.

> I said above, and I'll say it again: I think the majority of people
> would prefer to use their standard web browser to read the standard
> docs.  It's not worth the effort to try to make those accessible
> through help().  

No matter what we decide on the issue above, reusing the standard
documentation is the only practical way of populating the help system in
the short-term. Right now, today, there is a ton of documentation that
exists only in LaTeX and HTML. Tons of modules have no docstrings.
Keywords have no docstrings. Compare the docstring for
urllib.urlretrieve to the HTML documentation.

In fact, you've given me a good idea: if the HTML is not available
locally, I can access it over the web.

 Paul Prescod


From paulp@ActiveState.com  Thu Dec 14 07:29:53 2000
From: paulp@ActiveState.com (Paul Prescod)
Date: Wed, 13 Dec 2000 23:29:53 -0800
Subject: [Python-Dev] Online help PEP
References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com>
 <3A366A41.1A14EFD4@ActiveState.com> <200012131548.KAA21344@cj20424-a.reston1.va.home.com>
Message-ID: <3A3876F1.D3E65E90@ActiveState.com>

Guido van Rossum wrote:
> 
> Having the repr() overloading invoke the pager is dangerous.  The beta
> version of the license command did this, and it caused some strange
> side effects, e.g. vars(__builtins__) would start reading from input
> and confuse the users.  The new version's repr() returns the desired
> string if it's less than a page, and 'Type license() to see the full
> license text' if the pager would need to be invoked.

I'll add this to the PEP.

> The implied import is a major liability.  If you can do this without
> importing (e.g. by source code inspection), fine.  Otherwise, you
> might issue some kind of message like "you must first import XXX.YYY".

Okay, I'll add to the PEP that an open issue is what strategy to use,
but that we want to avoid implicit import.

> The hurt is code bloat in the interpreter and creeping featurism.  If
> you need command line access to the docs (which may be a reasonable
> thing to ask for, although to me it sounds backwards :-), it's better
> to provide a separate command, e.g. pythondoc.  (Analog to perldoc.)

Okay, I'll add a pythondoc proposal to the PEP.

> Yes.  Please add that option to the PEP.

Done.

> > > What does "demand-loaded" mean in a Python context?
> >
> > When you "touch" the help object, it loads the onlinehelp module which
> > has the real implementation. The thing in __builtins__ is just a
> > lightweight proxy.
> 
> Please suggest an implementation.

In the PEP.
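The lightweight-proxy idea might look something like this sketch (the
"onlinehelp" module name is the PEP's; pydoc stands in here for the
real implementation, and none of this is the PEP's actual code):

```python
class _HelpProxy:
    """Placeholder installed in __builtins__; the heavyweight help
    machinery is imported only when the proxy is first used."""

    def __repr__(self):
        # Touching the object at the prompt stays cheap.
        return "Type help(object) for help about object."

    def __call__(self, obj=None):
        import pydoc   # stand-in for the PEP's onlinehelp module
        if obj is None:
            return repr(self)
        return pydoc.render_doc(obj)  # the expensive import happened above

help_proxy = _HelpProxy()
```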

> Glad You'Re So Agreeable. :)

What happened to your capitalization? elisp gone awry? 

> ...
> To Tell You The Truth, I'M Not Holding My Breath Either. :-)  So your
> code should just dump the doc string on stdout without interpreting it
> in any way (except for paging).

I'll do this for the first version.

> It's buggier than just that.  The output of the pager prints an extra
> "| " at the start of each page except for the first, and the first
> page is a line longer than subsequent pages.

For some reason that I now forget, that code is pretty hairy.

> BTW, another bug: try help(cgi).  It's nice that it gives the default
> value for arguments, but the defaults for FieldStorage.__init__ happen
> to include os.environ.  Its entire value is dumped -- which causes the
> pager to be off (it wraps over about 20 lines for me).  I think you
> may have to truncate long values a bit, e.g. by using the repr module.

Okay. There are a lot of little things we need to figure out. Such as
whether we should print out docstrings for private methods etc.

>...
> I don't know specific tools, but any serious docstring processing tool
> ends up parsing the source code for this very reason, so there's
> probably plenty of prior art.

Okay, I'll look into it.

 Paul


From tim.one@home.com  Thu Dec 14 07:35:00 2000
From: tim.one@home.com (Tim Peters)
Date: Thu, 14 Dec 2000 02:35:00 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <3A387005.6725DAAE@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENPIDAA.tim.one@home.com>

[Paul Prescod]
> ...
> Along the same lines, might a new rule make Python code more robust?
> We could say that a local can only shadow a global if the local is
> formally declared. It's pretty rare that there is a good reason to
> shadow a global and Python makes it too easy to do accidentally.

I've rarely seen problems due to shadowing a global, but have often seen
problems due to shadowing a builtin.  Alas, if this rule were extended to
builtins too-- where it would do the most good --then the names of builtins
would effectively become reserved words (any code shadowing them today would
be broken until declarations were added, and any code working today may
break tomorrow if a new builtin were introduced that happened to have the
same name as a local).



From pf@artcom-gmbh.de  Thu Dec 14 07:42:59 2000
From: pf@artcom-gmbh.de (Peter Funk)
Date: Thu, 14 Dec 2000 08:42:59 +0100 (MET)
Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py)
In-Reply-To: <200012132039.MAA07496@slayer.i.sourceforge.net> from Moshe Zadka at "Dec 13, 2000 12:39:24 pm"
Message-ID: <m146T2Z-000DmFC@artcom0.artcom-gmbh.de>

Hi,

I think the following change is incompatible and will break applications.

At least I have some server-type applications that rely on
'allow_reuse_address' defaulting to 0, because they use the
'address already in use' exception to make sure that exactly one
server process is running on this port.  One of these applications,
which is BTW built on top of Fredrik Lundh's 'xmlrpclib', fails to
work if I change this default in SocketServer.py.
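The single-server check described above works like this minimal sketch
(plain sockets, not the SocketServer code itself): without SO_REUSEADDR
-- i.e. with allow_reuse_address left at 0 -- a second bind to the same
port fails with 'address already in use'.

```python
import socket

def claim_port(port):
    """Bind and listen WITHOUT SO_REUSEADDR, so a second process (or
    socket) trying the same port gets 'address already in use'."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Deliberately NOT calling setsockopt(SO_REUSEADDR, 1); this is
    # what allow_reuse_address = 0 amounts to in SocketServer.
    s.bind(("127.0.0.1", port))
    s.listen(1)
    return s
```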

Would you please explain the reasoning behind this change?

Moshe Zadka:
> *** SocketServer.py	2000/09/01 03:25:14	1.19
> --- SocketServer.py	2000/12/13 20:39:17	1.20
> ***************
> *** 158,162 ****
>       request_queue_size = 5
>   
> !     allow_reuse_address = 0
>   
>       def __init__(self, server_address, RequestHandlerClass):
> --- 158,162 ----
>       request_queue_size = 5
>   
> !     allow_reuse_address = 1
>   
>       def __init__(self, server_address, RequestHandlerClass):

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen)


From paul@prescod.net  Thu Dec 14 07:57:30 2000
From: paul@prescod.net (Paul Prescod)
Date: Wed, 13 Dec 2000 23:57:30 -0800
Subject: [Python-Dev] new draft of PEP 227
References: <LNBBLJKPBEHFEDALKOLCEENPIDAA.tim.one@home.com>
Message-ID: <3A387D6A.782E6A3B@prescod.net>

Tim Peters wrote:
> 
> ...
> 
> I've rarely seen problems due to shadowing a global, but have often seen
> problems due to shadowing a builtin.  

Really?

I think that there are two different issues here. One is consciously
choosing to create a new variable but not understanding that there
already exists a variable by that name. (i.e. str, list).

Another is trying to assign to a global but actually shadowing it. There
is no way that anyone coming from another language is going to consider
this transcript reasonable:

>>> a=5
>>> def show():
...    print a
...
>>> def set(val):
...     a=val
...
>>> a
5
>>> show()
5
>>> set(10)
>>> show()
5

It doesn't seem to make any sense. My solution is to make the assignment
in "set" illegal unless you add a declaration that says: "No, really. I
mean it. Override that sucker." As the PEP points out, overriding is
seldom a good idea so the requirement to declare would be rarely
invoked.

Actually, one could argue that there is no good reason to even *allow*
the shadowing of globals. You can always add an underscore to the end of
the variable name to disambiguate.

> Alas, if this rule were extended to
> builtins too-- where it would do the most good --then the names of builtins
> would effectively become reserved words (any code shadowing them today would
> be broken until declarations were added, and any code working today may
> break tomorrow if a new builtin were introduced that happened to have the
> same name as a local).

I have no good solutions to the accidentally-shadowing-builtins problem.
But I will say that those sorts of problems are typically less subtle:

str = "abcdef"
...
str(5) # You'll get a pretty good error message here!

The "right answer" in terms of namespace theory is to consistently refer
to builtins with a prefix (whether "__builtins__" or "$") but that's
pretty unpalatable from an aesthetic point of view.
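For what it's worth, the prefixed form is already available via the
builtins module (spelled __builtin__ in the Python of this thread;
`builtins` in later versions), which is what makes shadowing recoverable:

```python
import builtins   # __builtin__ at the time of this thread

str = "abcdef"                   # module-level name shadows the builtin
assert builtins.str(5) == "5"    # the builtin is still reachable, prefixed

del str                          # drop the shadow...
assert str(5) == "5"             # ...and the builtin shows through again
```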

 Paul Prescod


From tim.one@home.com  Thu Dec 14 08:41:19 2000
From: tim.one@home.com (Tim Peters)
Date: Thu, 14 Dec 2000 03:41:19 -0500
Subject: [Python-Dev] Online help scope
In-Reply-To: <3A3876DF.5554080C@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEOBIDAA.tim.one@home.com>

[Paul Prescod]
> I think Guido and I are pretty far apart on the scope and requirements
> of this online help thing so I'd like some clarification and opinions
> from the peanut gallery.
>
> Consider these scenarios
>
> a) Signature
> ...
> b) Usage hint
> ...
> c) Complete documentation, paged(man-style)
> ...
> d) Complete documentation in a user-chosen hypertext window
> ...

Guido's style guide has a lot to say about docstrings, suggesting that they
were intended to support two scenarios:  #a+#b together (the first line of a
multi-line docstring), and #c+#d together (the entire docstring).  In this
respect I think Guido was (consciously or not) aping elisp's conventions, up
to but not including the elisp convention for naming the arguments in the
first line of a docstring.  The elisp conventions were very successful
(simple, and useful in practice), so aping them is a good thing.
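The two scenarios fall out of one convention; a tool supporting both
needs nothing fancier than this sketch:

```python
def split_docstring(obj):
    """Return (summary, full) from obj.__doc__: the elisp-style
    'first line' for #a+#b, and the whole string for #c+#d."""
    doc = (obj.__doc__ or "").strip()
    if not doc:
        return "", ""
    summary = doc.splitlines()[0].strip()
    return summary, doc
```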

We've had stalemate ever since:  there isn't a single style of writing
docstrings in practice because no single docstring processor has been
blessed, while no docstring processor can gain momentum before being
blessed.  Every attempt to date has erred by trying to do too much, thus
attracting so much complaint that it can't ever become blessed.  The current
argument over PEP 233 appears more of the same.

The way to break the stalemate is to err on the side of simplicity:  just
cater to the two obvious (first-line vs whole-string) cases, and for
existing docstrings only.  HTML vs plain text is fluff.  Paging vs
non-paging is fluff.  Dumping to stdout vs displaying in a browser is fluff.
Jumping through hoops for functions and modules whose authors didn't bother
to write docstrings is fluff.  Etc.  People fight over fluff until it fills
the air and everyone chokes to death on it <0.9 wink>.

Something dirt simple can get blessed, and once *anything* is blessed, a
million docstrings will bloom.

[Guido]
> That'S What Some People Think.  I Disagree That It Would Be Either
> Feasible Or A Good Idea To Put All Documentation For A Typical Module
> In Its Doc Strings.

I'm with Paul on this one:  that's what module.__doc__ is for, IMO (Javadoc
is great, Eiffel's embedded doc tools are great, Perl POD is great, even
REBOL's interactive help is great).  All Java, Eiffel, Perl and REBOL have
in common that Python lacks is *a* blessed system, no matter how crude.

[back to Paul]
> ...
> No matter what we decide on the issue above, reusing the standard
> documentation is the only practical way of populating the help system
> in the short-term. Right now, today, there is a ton of documentation
> that exists only in LaTeX and HTML. Tons of modules have no docstrings.

Then write tools to automatically create docstrings from the LaTeX and HTML,
but *check in* the results (i.e., add the docstrings so created to the
codebase), and keep the help system simple.

> Keywords have no docstrings.

Neither do integers, but they're obvious too <wink>.



From thomas@xs4all.net  Thu Dec 14 09:13:49 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 14 Dec 2000 10:13:49 +0100
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001214010534.M4396@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 14, 2000 at 01:05:34AM +0100
References: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl>
Message-ID: <20001214101348.N4396@xs4all.nl>

On Thu, Dec 14, 2000 at 01:05:34AM +0100, Thomas Wouters wrote:

> By the way, in woody, there are 52 packages with 'python' in the name, and
> 32 with 'perl' in the name...

Ah, not true, sorry. I shouldn't have posted off-topic stuff after being
awoken by machine-down-alarms ;) That was just what my reasonably-default
install had installed. Debian has what looks like most CPAN modules as
packages, too, so it's closer to a 110/410 spread (python/perl.) Still, not
a bad number :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Thu Dec 14 10:32:58 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 11:32:58 +0100
Subject: [Python-Dev] new draft of PEP 227
References: <14904.21739.804346.650062@bitdiddle.concentric.net>
Message-ID: <3A38A1DA.7EC49149@lemburg.com>

Jeremy Hylton wrote:
> 
> I've got a new draft of PEP 227.  The terminology and wording are more
> convoluted than they need to be.  I'll do at least one revision just
> to say things more clearly, but I'd appreciate comments on the
> proposed spec if you can read the current draft.

The PEP doesn't mention the problems I pointed out about 
breaking the lookup schemes w/r to symbols in methods, classes
and globals.

Please add a comment about this to the PEP + maybe the example
I gave in one the posts to python-dev about it. I consider
the problem serious enough to limit the nested scoping
to lambda functions (or functions in general) only if that's
possible.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Thu Dec 14 10:55:38 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 11:55:38 +0100
Subject: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule)
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl>
Message-ID: <3A38A72A.4011B5BD@lemburg.com>

Thomas Wouters wrote:
> 
> On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote:
> > > I don't think that is a very safe bet.  Python 2.0 missed the Debian
> > > Potato boat.
> >
> > This may have had to do more with the unresolved GPL issues.
> 
> This is very likely. Debian is very licence -- or at least GPL -- aware.
> Which is a pity, really, because I already prefer it over RedHat in all
> other cases (and RedHat is also pretty licence aware, just less piously,
> devoutly, beyond-practicality-IMHO dedicated to the GPL.)
 
About the GPL issue: as I understood Guido's post, RMS still regards
the choice of law clause as being incompatible to the GPL (heck,
doesn't this guy ever think about international trade terms,
the United Nations Convention on Contracts for the International
Sale of Goods
or local law in one of the 200+ countries where you could deploy
GPLed software... is the GPL only meant for US programmers ?).

I am currently rewriting my open source licenses as well, and among
other things I chose to integrate a choice of law clause.
Seeing RMS' view of things, I guess that my license will be regarded
as incompatible with the GPL, which is sad even though I'm in good
company... e.g. the Apache license, the Zope license, etc. Dual
licensing is not possible, as it would reopen the loopholes in the
GPL I tried to fix in my license. Any idea on how to proceed?

Another issue: since Python doesn't link Python scripts, is it
still true that if one (pure) Python package is covered by the GPL, 
then all other packages needed by that application will also fall
under GPL ?

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From gstein@lyra.org  Thu Dec 14 11:57:43 2000
From: gstein@lyra.org (Greg Stein)
Date: Thu, 14 Dec 2000 03:57:43 -0800
Subject: (offtopic) Re: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A38A72A.4011B5BD@lemburg.com>; from mal@lemburg.com on Thu, Dec 14, 2000 at 11:55:38AM +0100
References: <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> <3A38A72A.4011B5BD@lemburg.com>
Message-ID: <20001214035742.Z8951@lyra.org>

On Thu, Dec 14, 2000 at 11:55:38AM +0100, M.-A. Lemburg wrote:
>...
> I am currently rewriting my open source licenses as well and among
> other things I chose to integrate a choice of law clause as well.
> Seeing RMS' view of things, I guess that my license will be regarded
> as incompatible to the GPL which is sad even though I'm in good
> company... e.g. the Apache license, the Zope license, etc. Dual
> licensing is not possible as it would reopen the loop-wholes in the
> GPL I tried to fix in my license. Any idea on how to proceed ?

Only RMS is under the belief that the Apache license is incompatible. It is
either clause 4 or 5 (I forget which) where we state that certain names
(e.g. "Apache") cannot be used in derived products' names and promo
materials. RMS views this as an "additional restriction on redistribution",
which is apparently not allowed by the GPL.

We (the ASF) generally feel he is being a royal pain in the ass with this.
We've sent him a big, long email asking for clarification / resolution, but
haven't heard back (we sent it a month or so ago). Basically, his FUD
creates views such as yours ("the Apache license is incompatible with the
GPL") because people just take his word for it. We plan to put together a
web page to outline our own thoughts and licensing beliefs/philosophy.

We're also planning to rev our license to rephrase/alter the particular
clause, but for logistic purposes (putting the project name in there ties it
to the particular project; we want a generic ASF license that can be applied
to all of the projects without a search/replace).

At this point, the ASF is taking the position of ignoring him and his
controlling attitude(*) and beliefs. There is the outstanding letter to him,
but that doesn't really change our point of view.

Cheers,
-g

(*) for a person espousing freedom, it is rather ironic just how much of a
control freak he is (stemming from a no-compromise position to guarantee
peoples' freedoms, he always wants things done his way)

-- 
Greg Stein, http://www.lyra.org/


From tg@melaten.rwth-aachen.de  Thu Dec 14 13:07:12 2000
From: tg@melaten.rwth-aachen.de (Thomas Gellekum)
Date: 14 Dec 2000 14:07:12 +0100
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: Andrew Kuchling's message of "Wed, 13 Dec 2000 19:26:32 -0500"
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
 <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
 <20001213074423.A30348@kronos.cnri.reston.va.us>
 <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>
 <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
 <20001213192632.A30585@kronos.cnri.reston.va.us>
Message-ID: <kq3dfrkv7j.fsf@cip12.melaten.rwth-aachen.de>

Andrew Kuchling <akuchlin@mems-exchange.org> writes:

> I'm ambivalent about the list_of_panels.  It's a linked list storing
> (PyWindow, PyPanel) pairs.  Probably it should use a dictionary
> instead of implementing a little list, just to reduce the amount of
> code.

I don't like it either, so feel free to shred it. As I said, this is
the first (piece of an) extension module I've written and I thought it
would be easier to implement a little list than to manage a Python
list or such in C.

> So, I suggest we create _curses_panel.c, which would be available as
> curses.panel.  (A panel.py module could then add any convenience
> functions that are required.)
> 
> Thomas, do you want to work on this, or should I?

Just do it. I'll try to add more examples in the meantime.

tg


From fredrik@pythonware.com  Thu Dec 14 13:19:08 2000
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 14 Dec 2000 14:19:08 +0100
Subject: [Python-Dev] fuzzy logic?
Message-ID: <015101c065d0$717d1680$0900a8c0@SPIFF>

here's a simple (but somewhat strange) test program:

def spam():
    a = 1
    if (0):
        global a
        print "global a"
    a = 2

def egg():
    b = 1
    if 0:
        global b
        print "global b"
    b = 2

egg()
spam()

print a
print b

if I run this under 1.5.2, I get:

    2
    Traceback (innermost last):
        File "<stdin>", line 19, in ?
    NameError: b

</F>



From gstein@lyra.org  Thu Dec 14 13:42:11 2000
From: gstein@lyra.org (Greg Stein)
Date: Thu, 14 Dec 2000 05:42:11 -0800
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: <015101c065d0$717d1680$0900a8c0@SPIFF>; from fredrik@pythonware.com on Thu, Dec 14, 2000 at 02:19:08PM +0100
References: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID: <20001214054210.G8951@lyra.org>

I would take a guess that the "if 0:" is optimized away *before* the
inspection for a "global" statement. But the compiler doesn't know how to
optimize away "if (0):", so the global statement remains.

Ah. Just checked. Look at compile.c::com_if_stmt(). There is a call to
"is_constant_false()" in there.

Heh. Looks like is_constant_false() could be made a bit smarter. But the
point is valid: you can make is_constant_false() as smart as you want, and
you'll still end up with "funny" global behavior.

Cheers,
-g

On Thu, Dec 14, 2000 at 02:19:08PM +0100, Fredrik Lundh wrote:
> here's a simple (but somewhat strange) test program:
> 
> def spam():
>     a = 1
>     if (0):
>         global a
>         print "global a"
>     a = 2
> 
> def egg():
>     b = 1
>     if 0:
>         global b
>         print "global b"
>     b = 2
> 
> egg()
> spam()
> 
> print a
> print b
> 
> if I run this under 1.5.2, I get:
> 
>     2
>     Traceback (innermost last):
>         File "<stdin>", line 19, in ?
>     NameError: b
> 
> </F>
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://www.python.org/mailman/listinfo/python-dev

-- 
Greg Stein, http://www.lyra.org/


From mwh21@cam.ac.uk  Thu Dec 14 13:58:24 2000
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 14 Dec 2000 13:58:24 +0000
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: "Fredrik Lundh"'s message of "Thu, 14 Dec 2000 14:19:08 +0100"
References: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID: <m3snnr14vz.fsf@atrus.jesus.cam.ac.uk>

1) Is there anything in the standard library that does the equivalent
   of

import symbol,token

def decode_ast(ast):
    if token.ISTERMINAL(ast[0]):
        return (token.tok_name[ast[0]], ast[1])
    else:
        return (symbol.sym_name[ast[0]],)+tuple(map(decode_ast,ast[1:]))

  so that, eg:

>>> pprint.pprint(decode.decode_ast(parser.expr("0").totuple()))
('eval_input',
 ('testlist',
  ('test',
   ('and_test',
    ('not_test',
     ('comparison',
      ('expr',
       ('xor_expr',
        ('and_expr',
         ('shift_expr',
          ('arith_expr',
           ('term',
            ('factor', ('power', ('atom', ('NUMBER', '0'))))))))))))))),
 ('NEWLINE', ''),
 ('ENDMARKER', ''))

  ?  Should there be?  (Especially if it was a bit better written).

... and Greg's just said everything else I wanted to!

Cheers,
M.

-- 
  please realize that the Common  Lisp community is more than 40 
  years old.  collectively, the community has already been where 
  every clueless newbie  will be going for the next three years.  
  so relax, please.                     -- Erik Naggum, comp.lang.lisp



From guido@python.org  Thu Dec 14 14:51:26 2000
From: guido@python.org (Guido van Rossum)
Date: Thu, 14 Dec 2000 09:51:26 -0500
Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py)
In-Reply-To: Your message of "Thu, 14 Dec 2000 08:42:59 +0100."
 <m146T2Z-000DmFC@artcom0.artcom-gmbh.de>
References: <m146T2Z-000DmFC@artcom0.artcom-gmbh.de>
Message-ID: <200012141451.JAA15637@cj20424-a.reston1.va.home.com>

> I think the following change is incompatible and will break applications.
> 
> At least I have some server type applications that rely on 
> 'allow_reuse_address' defaulting to 0, because they use
> the 'address already in use' exception, to make sure, that exactly one
> server process is running on this port.  One of these applications, 
> which is BTW built on top of Fredrik Lundh's 'xmlrpclib', fails to work,
> if I change this default in SocketServer.py.  
> 
> Would you please explain the reasoning behind this change?

The reason for the patch is that without this, if you kill a TCP server
and restart it right away, you'll get a 'port in use' error -- TCP has
some kind of strange wait period after a connection is closed before
it can be reused.  The patch avoids this error.

As far as I know, with TCP, code using SO_REUSEADDR still cannot bind
to the port when another process is already using it, but for UDP, the
semantics may be different.

Is your server using UDP?

Try this patch if your problem is indeed related to UDP:

*** SocketServer.py	2000/12/13 20:39:17	1.20
--- SocketServer.py	2000/12/14 14:48:16
***************
*** 268,273 ****
--- 268,275 ----
  
      """UDP server class."""
  
+     allow_reuse_address = 0
+ 
      socket_type = socket.SOCK_DGRAM
  
      max_packet_size = 8192

If this works for you, I'll check it in, of course.
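For context, all `allow_reuse_address` controls is whether the server sets
SO_REUSEADDR on its listening socket before binding.  A minimal sketch of the
equivalent direct socket calls (modern Python syntax, not the 2.0-era code
under discussion):

```python
import socket

# Roughly what the server's bind step does when allow_reuse_address is true:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
host, port = s.getsockname()
s.close()
```

With SO_REUSEADDR set, a restarted server can re-bind the port even while the
previous connection lingers in TCP's wait state.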

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@alum.mit.edu  Thu Dec 14 14:52:37 2000
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 14 Dec 2000 09:52:37 -0500 (EST)
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <3A38A1DA.7EC49149@lemburg.com>
References: <14904.21739.804346.650062@bitdiddle.concentric.net>
 <3A38A1DA.7EC49149@lemburg.com>
Message-ID: <14904.57013.371474.691948@bitdiddle.concentric.net>

>>>>> "MAL" == M -A Lemburg <mal@lemburg.com> writes:

  MAL> Jeremy Hylton wrote:
  >>
  >> I've got a new draft of PEP 227.  The terminology and wording are
  >> more convoluted than they need to be.  I'll do at least one
  >> revision just to say things more clearly, but I'd appreciate
  >> comments on the proposed spec if you can read the current draft.

  MAL> The PEP doesn't mention the problems I pointed out about
  MAL> breaking the lookup schemes w/r to symbols in methods, classes
  MAL> and globals.

I believe it does.  There was some discussion on python-dev and
with others in private email about how classes should be handled.

The relevant section of the specification is:

    If a name is used within a code block, but it is not bound there
    and is not declared global, the use is treated as a reference to
    the nearest enclosing function region.  (Note: If a region is
    contained within a class definition, the name bindings that occur
    in the class block are not visible to enclosed functions.)

  MAL> Please add a comment about this to the PEP + maybe the example
  MAL> I gave in one the posts to python-dev about it. I consider the
  MAL> problem serious enough to limit the nested scoping to lambda
  MAL> functions (or functions in general) only if that's possible.

If there was some other concern you had, then I don't know what it
was.  I recall that you had a longish example that raised a NameError
immediately :-).

Jeremy


From mal@lemburg.com  Thu Dec 14 15:02:33 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 16:02:33 +0100
Subject: [Python-Dev] new draft of PEP 227
References: <14904.21739.804346.650062@bitdiddle.concentric.net>
 <3A38A1DA.7EC49149@lemburg.com> <14904.57013.371474.691948@bitdiddle.concentric.net>
Message-ID: <3A38E109.54C07565@lemburg.com>

Jeremy Hylton wrote:
> 
> >>>>> "MAL" == M -A Lemburg <mal@lemburg.com> writes:
> 
>   MAL> Jeremy Hylton wrote:
>   >>
>   >> I've got a new draft of PEP 227.  The terminology and wording are
>   >> more convoluted than they need to be.  I'll do at least one
>   >> revision just to say things more clearly, but I'd appreciate
>   >> comments on the proposed spec if you can read the current draft.
> 
>   MAL> The PEP doesn't mention the problems I pointed out about
>   MAL> breaking the lookup schemes w/r to symbols in methods, classes
>   MAL> and globals.
> 
> I believe it does.  There was some discussion on python-dev and
> with others in private email about how classes should be handled.
> 
> The relevant section of the specification is:
> 
>     If a name is used within a code block, but it is not bound there
>     and is not declared global, the use is treated as a reference to
>     the nearest enclosing function region.  (Note: If a region is
>     contained within a class definition, the name bindings that occur
>     in the class block are not visible to enclosed functions.)

Well hidden ;-)

Honestly, I think that you should make this specific case
more visible to readers of the PEP, since this single detail would
produce most of the problems with nested scopes.

BTW, what about nested classes ? AFAIR, the PEP only talks about
nested functions.

>   MAL> Please add a comment about this to the PEP + maybe the example
>   MAL> I gave in one the posts to python-dev about it. I consider the
>   MAL> problem serious enough to limit the nested scoping to lambda
>   MAL> functions (or functions in general) only if that's possible.
> 
> If there was some other concern you had, then I don't know what it
> was.  I recall that you had a longish example that raised a NameError
> immediately :-).

The idea behind the example should have been clear, though.

x = 1
class C:
   x = 2
   def test(self):
       print x
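Under the rule quoted from the spec, the method's free variable x skips the
class scope entirely.  A runnable sketch of that lookup (modern Python syntax,
where these are now the standard semantics):

```python
x = 1

class C:
    x = 2              # class attribute; not visible as a bare name in methods
    def test(self):
        return x       # resolves to the module-level x, skipping the class block

assert C().test() == 1  # the class binding x = 2 is not consulted
assert C.x == 2         # but it remains reachable as an attribute
```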
  
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fdrake@acm.org  Thu Dec 14 15:09:57 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 14 Dec 2000 10:09:57 -0500 (EST)
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: <m3snnr14vz.fsf@atrus.jesus.cam.ac.uk>
References: <015101c065d0$717d1680$0900a8c0@SPIFF>
 <m3snnr14vz.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <14904.58053.282537.260186@cj42289-a.reston1.va.home.com>

Michael Hudson writes:
 > 1) Is there anything is the standard library that does the equivalent
 >    of

  No, but I have a chunk of code that does in a different way.  Where
in the library do you think it belongs?  The compiler package sounds
like the best place, but that's not installed by default.  (Jeremy, is
that likely to change soon?)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From mwh21@cam.ac.uk  Thu Dec 14 15:47:33 2000
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 14 Dec 2000 15:47:33 +0000
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: "Fred L. Drake, Jr."'s message of "Thu, 14 Dec 2000 10:09:57 -0500 (EST)"
References: <015101c065d0$717d1680$0900a8c0@SPIFF> <m3snnr14vz.fsf@atrus.jesus.cam.ac.uk> <14904.58053.282537.260186@cj42289-a.reston1.va.home.com>
Message-ID: <m3vgsnovhm.fsf@atrus.jesus.cam.ac.uk>

"Fred L. Drake, Jr." <fdrake@acm.org> writes:

> Michael Hudson writes:
>  > 1) Is there anything is the standard library that does the equivalent
>  >    of
> 
>   No, but I have a chunk of code that does in a different way.  

I'm guessing everyone who's played with the parser much does, hence
the suggestion.  I agree my implementation is probably not optimal - I
just threw it together as quickly as I could!

> Where in the library do you think it belongs?  The compiler package
> sounds like the best place, but that's not installed by default.
> (Jeremy, is that likely to change soon?)

Actually, I'd have thought the parser module would be most natural,
but that would probably mean doing the _module.c trick, and it's
probably not worth the bother.  OTOH, it seems that wrapping any given
extension module in a python module is becoming if anything the norm,
so maybe it is.

Cheers,
M.

-- 
  I don't remember any dirty green trousers.
                                             -- Ian Jackson, ucam.chat



From nowonder@nowonder.de  Thu Dec 14 15:50:10 2000
From: nowonder@nowonder.de (Peter Schneider-Kamp)
Date: Thu, 14 Dec 2000 16:50:10 +0100
Subject: [Python-Dev] [PEP-212] new draft
Message-ID: <3A38EC32.210BD1A2@nowonder.de>

In an attempt to revive PEP 212 - Loop counter iteration I have
updated the draft. The HTML version can be found at:

http://python.sourceforge.net/peps/pep-0212.html

I will appreciate any form of comments and/or criticisms.

Peter

P.S.: Now I have posted it - should I update the Post-History?
      Or is that for posts to c.l.py?


From pf@artcom-gmbh.de  Thu Dec 14 15:56:08 2000
From: pf@artcom-gmbh.de (Peter Funk)
Date: Thu, 14 Dec 2000 16:56:08 +0100 (MET)
Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py)
In-Reply-To: <200012141451.JAA15637@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 14, 2000  9:51:26 am"
Message-ID: <m146ajo-000DmFC@artcom0.artcom-gmbh.de>

Hi,

Moshe's checkin indeed makes a lot of sense.  Sorry for the irritation.

Guido van Rossum:
> The reason for the patch is that without this, if you kill a TCP server
> and restart it right away, you'll get a 'port in use' error -- TCP has
> some kind of strange wait period after a connection is closed before
> it can be reused.  The patch avoids this error.
> 
> As far as I know, with TCP, code using SO_REUSEADDR still cannot bind
> to the port when another process is already using it, but for UDP, the
> semantics may be different.
> 
> Is your server using UDP?

No, and I must admit that I didn't test carefully enough:  From
a quick look at my process listing I assumed there were indeed
two server processes running concurrently, which would have broken
the needed mutual exclusion.  But the second process went into
a sleep-and-retry-to-connect loop which I simply forgot about.
This loop was initially built into my server to wait until the
"strange wait period" you mentioned above was over or a certain
number of retries had been exceeded.

I guess I can take this ugly work-around out with Python 2.0 and newer,
since the BaseHTTPServer.py shipped with Python 2.0 already contains an
allow_reuse_address = 1 default in the HTTPServer class.

BTW: I took my old W. Richard Stevens "Unix Network Programming"
from the shelf.  After rereading the rather terse paragraph about
SO_REUSEADDR, I guess the wait period is necessary to make sure that
there is no connect pending from an outside client on this TCP port.
I can't find anything about UDP and REUSE.

Regards, Peter


From guido@python.org  Thu Dec 14 16:17:27 2000
From: guido@python.org (Guido van Rossum)
Date: Thu, 14 Dec 2000 11:17:27 -0500
Subject: [Python-Dev] Online help scope
In-Reply-To: Your message of "Wed, 13 Dec 2000 23:29:35 PST."
 <3A3876DF.5554080C@ActiveState.com>
References: <3A3876DF.5554080C@ActiveState.com>
Message-ID: <200012141617.LAA16179@cj20424-a.reston1.va.home.com>

> I think Guido and I are pretty far apart on the scope and requirements
> of this online help thing so I'd like some clarification and opinions
> from the peanut gallery.

I started replying but I think Tim's said it all.  Let's do something
dead simple.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Thu Dec 14 17:14:01 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 12:14:01 -0500
Subject: [Python-Dev] [PEP-212] new draft
References: <3A38EC32.210BD1A2@nowonder.de>
Message-ID: <14904.65497.940293.975775@anthem.concentric.net>

>>>>> "PS" == Peter Schneider-Kamp <nowonder@nowonder.de> writes:

    PS> P.S.: Now I have posted it - should I update the Post-History?
    PS> Or is that for posts to c.l.py?

Originally, I'd thought of it as tracking the posting history to
c.l.py.  I'm not sure how useful that header is after all -- maybe in
just giving a start into the python-list archives...

-Barry


From tim.one@home.com  Thu Dec 14 17:33:41 2000
From: tim.one@home.com (Tim Peters)
Date: Thu, 14 Dec 2000 12:33:41 -0500
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEOPIDAA.tim.one@home.com>

Note that the behavior of both functions is undefined ("Names listed in a
global statement must not be used in the same code block textually preceding
that global statement", from the Lang Ref, and "if" does not introduce a new
code block in Python's terminology).

But you'll get the same outcome via these trivial variants, which sidestep
that problem:

def spam():
    if (0):
        global a
        print "global a"
    a = 2

def egg():
    if 0:
        global b
        print "global b"
    b = 2

*Now* you can complain <wink>.


> -----Original Message-----
> From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On
> Behalf Of Fredrik Lundh
> Sent: Thursday, December 14, 2000 8:19 AM
> To: python-dev@python.org
> Subject: [Python-Dev] fuzzy logic?
>
>
> here's a simple (but somewhat strange) test program:
>
> def spam():
>     a = 1
>     if (0):
>         global a
>         print "global a"
>     a = 2
>
> def egg():
>     b = 1
>     if 0:
>         global b
>         print "global b"
>     b = 2
>
> egg()
> spam()
>
> print a
> print b
>
> if I run this under 1.5.2, I get:
>
>     2
>     Traceback (innermost last):
>         File "<stdin>", line 19, in ?
>     NameError: b
>
> </F>
>
>



From tim.one@home.com  Thu Dec 14 18:46:09 2000
From: tim.one@home.com (Tim Peters)
Date: Thu, 14 Dec 2000 13:46:09 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule)
In-Reply-To: <3A38A72A.4011B5BD@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPFIDAA.tim.one@home.com>

[MAL]
> About the GPL issue: as I understood Guido's post, RMS still regards
> the choice of law clause as being incompatible to the GPL

Yes.  Actually, I don't know what RMS really thinks -- his public opinions
on legal issues appear to be echoes of what Eben Moglen tells him.  Like his
views or not, Moglen is a tenured law professor.

> (heck, doesn't this guy ever think about international trade terms,
> the United Nations Convention on International Sale of Goods
> or local law in one of the 200+ countries where you could deploy
> GPLed software...

Yes.

> is the GPL only meant for US programmers ?).

No.  Indeed, that's why the GPL is grounded in copyright law, because
copyright law is the most uniform (across countries) body of law we've got.
Most commentary I've seen suggests that the GPL has its *weakest* legal legs
in the US!

> I am currently rewriting my open source licenses as well and among
> other things I chose to integrate a choice of law clause as well.
> Seeing RMS' view of things, I guess that my license will be regarded
> as incompatible to the GPL

Yes.

> which is sad even though I'm in good company... e.g. the Apache
> license, the Zope license, etc. Dual licensing is not possible as
> it would reopen the loop-wholes in the GPL I tried to fix in my
> license. Any idea on how to proceed ?

You can wait to see how the CNRI license turns out, then copy it if it's
successful; you can approach the FSF directly; you can stop trying to do it
yourself and reuse some license that's already been blessed by the FSF; or
you can give up on GPL compatibility (according to the FSF).  I don't see
any other choices.

> Another issue: since Python doesn't link Python scripts, is it
> still true that if one (pure) Python package is covered by the GPL,
> then all other packages needed by that application will also fall
> under GPL ?

Sorry, couldn't make sense of the question.  Just as well, since you should
ask about it on a GNU forum anyway <wink>.



From mal@lemburg.com  Thu Dec 14 20:02:05 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 21:02:05 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <LNBBLJKPBEHFEDALKOLCIEPFIDAA.tim.one@home.com>
Message-ID: <3A39273D.4AE24920@lemburg.com>

Tim Peters wrote:
> 
> [MAL]
> > About the GPL issue: as I understood Guido's post, RMS still regards
> > the choice of law clause as being incompatible to the GPL
> 
> Yes.  Actually, I don't know what RMS really thinks -- his public opinions
> on legal issues appear to be echoes of what Eben Moglen tells him.  Like his
> views or not, Moglen is a tenured law professor

But it's his piece of work, isn't it? He's the one who can change it.
 
> > (heck, doesn't this guy ever think about international trade terms,
> > the United Nations Convention on International Sale of Goods
> > or local law in one of the 200+ countries where you could deploy
> > GPLed software...
> 
> Yes.

Strange, then how come he sees the choice of law clause as a problem:
without explicitly ruling out the applicability of the UN CISG,
this clause is waived by it anyway... at least according to a 
specialist on software law here in Germany.

> > is the GPL only meant for US programmers ?).
> 
> No.  Indeed, that's why the GPL is grounded in copyright law, because
> copyright law is the most uniform (across countries) body of law we've got.
> Most commentary I've seen suggests that the GPL has its *weakest* legal legs
> in the US!

Huh? Just an example: in Germany consumer rights assure a six-month
warranty on everything you buy or obtain in some other way.  Liability
is another issue: there are some very unpleasant laws which render
most of the "no liability" paragraphs in licenses useless in Germany.

Even better: since the license itself is written in English, a
German party could simply consider the license non-binding, since
he or she hasn't agreed to accept a contract in a foreign language.
France has similar interpretations.

> > I am currently rewriting my open source licenses as well and among
> > other things I chose to integrate a choice of law clause as well.
> > Seeing RMS' view of things, I guess that my license will be regarded
> > as incompatible to the GPL
> 
> Yes.
> 
> > which is sad even though I'm in good company... e.g. the Apache
> > license, the Zope license, etc. Dual licensing is not possible as
> > it would reopen the loop-wholes in the GPL I tried to fix in my
> > license. Any idea on how to proceed ?
> 
> You can wait to see how the CNRI license turns out, then copy it if it's
> successful; you can approach the FSF directly; you can stop trying to do it
> yourself and reuse some license that's already been blessed by the FSF; or
> you can give up on GPL compatibility (according to the FSF).  I don't see
> any other choices.

I guess I'll go with the latter.
 
> > Another issue: since Python doesn't link Python scripts, is it
> > still true that if one (pure) Python package is covered by the GPL,
> > then all other packages needed by that application will also fall
> > under GPL ?
> 
> Sorry, couldn't make sense of the question.  Just as well, since you should
> ask about it on a GNU forum anyway <wink>.

Isn't this question (whether the GPL virus applies to byte-code
as well) important to Python programmers too?

Oh well, nevermind... it's still nice to hear that CNRI and RMS
have finally made up their minds to render Python GPL-compatible --
whatever this means ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From cgw@fnal.gov  Thu Dec 14 21:06:43 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Thu, 14 Dec 2000 15:06:43 -0600 (CST)
Subject: [Python-Dev] memory leaks
Message-ID: <14905.13923.659879.100243@buffalo.fnal.gov>

The following code (extracted from test_extcall.py) leaks memory:

class Foo:
   def method(self, arg1, arg2):
        return arg1 + arg2

def f():
    err = None
    try:
        Foo.method(*(1, 2, 3))
    except TypeError, err:
        pass
    del err
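A Python-level way to watch for this class of leak is to compare reference
counts across many failing calls.  A rough sketch using sys.getrefcount (the
exact counts are CPython implementation details, and modern Python no longer
raises for the starred unbound-method call above, so a plain missing-argument
TypeError stands in for it here):

```python
import sys

class Foo:
    def method(self, arg1, arg2):
        return arg1 + arg2

def provoke():
    try:
        Foo.method(1, 2)  # missing an argument -> TypeError
    except TypeError:
        pass

before = sys.getrefcount(Foo)
for _ in range(1000):
    provoke()
after = sys.getrefcount(Foo)
# In a leak-free interpreter the count stays essentially flat; a leak on the
# error path would show up as steady growth proportional to the loop count.
assert abs(after - before) < 10
```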



One-line fix (also posted to Sourceforge):

--- Python/ceval.c	2000/10/30 17:15:19	2.213
+++ Python/ceval.c	2000/12/14 20:54:02
@@ -1905,8 +1905,7 @@
 							 class))) {
 				    PyErr_SetString(PyExc_TypeError,
 	    "unbound method must be called with instance as first argument");
-				    x = NULL;
-				    break;
+				    goto extcall_fail;
 				}
 			    }
 			}



I think that there are a bunch more memory leaks lurking around...
this only fixes one of them.  I'll send more info as I find out what's
going on.



From tim.one@home.com  Thu Dec 14 21:28:09 2000
From: tim.one@home.com (Tim Peters)
Date: Thu, 14 Dec 2000 16:28:09 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A39273D.4AE24920@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEPLIDAA.tim.one@home.com>

I'm not going to argue about the GPL.  Take it up with the FSF!  I will say
that if you do get the FSF's attention, Moglen will have an instant counter
to any objection you're likely to raise -- he's been thinking about this for
10 years, and he's heard it all.  And in our experience, RMS won't commit to
anything before running it past Moglen.

[MAL]
> But it's his [RMS's] piece of work, isn't it ? He's the one who can
> change it.

Akin to saying Python is Guido's piece of work.  Yes, no, kinda, more true
at some times than others, ditto respects.  RMS has consistently said that
any changes for the next version of the GPL will take at least a year, due
to extensive legal review required first.  Would be more clearly true to say
that the first version of the GPL was RMS's alone -- but version 2 came out
in 1991.

> ...
> Strange, then how come he sees the choice of law clause as a problem:
> without explicitely ruling out the applicability of the UN CISC,
> this clause is waived by it anyway... at least according to a
> specialist on software law here in Germany.
> ... [and other "who knows?" objections] ...

Guido quoted the text of your Wed, 06 Sep 2000 14:19:09 +0200 "Re:
[License-py20] Re: GPL incompability as seen from Europe" msg to Moglen, who
dismissed it almost offhandedly as "layman's commentary".  You'll have to
ask him why:  MAL, we're not lawyers.  We're incompetent to have this
discussion -- or at least I am, and Moglen thinks you are too <wink>.

>>> Another issue: since Python doesn't link Python scripts, is it
>>> still true that if one (pure) Python package is covered by the GPL,
>>> then all other packages needed by that application will also fall
>>> under GPL ?

[Tim]
>> Sorry, couldn't make sense of the question.  Just as well,
>> since you should ask about it on a GNU forum anyway <wink>.

[MAL]
> Isn't this question (whether the GPL virus applies to byte-code
> as well) important to Python programmers as well ?

I don't know -- like I said, I couldn't make sense of the question, i.e. I
couldn't figure out what it is you're asking.  I *suspect* it's based on a
misunderstanding of the GPL; for example, gcc is a GPL'ed application that
requires stuff from the OS in order to do its job of compiling, but that
doesn't mean that every OS it runs on falls under the GPL.  The GPL contains
no restrictions on *use*, it restricts only copying, modifying and
distributing (the specific rights granted by copyright law).  I don't see
any way to read the GPL as restricting your ability to distribute a GPL'ed
program P on its own, no matter what the status of the packages that P may
rely upon for operation.

The GPL is also not viral in the sense that it cannot infect an unwitting
victim.  Nothing whatsoever you do or don't do can make *any* other program
Q "fall under" the GPL -- only Q's owner can set the license for Q.  The GPL
purportedly can prevent you from distributing (but not from using) a program
that links with a GPL'ed program, but that doesn't appear to be what you're
asking about.  Or is it?

If you were to put, say, mxDateTime, under the GPL, then yes, I believe the
FSF would claim I could not distribute my program T that uses mxDateTime
unless T were also under the GPL or a GPL-compatible license.  But if
mxDateTime is not under the GPL, then nothing I do with T can magically
change the mxDateTime license to the GPL (although if your mxDateTime
license allows me to redistribute mxDateTime under a different license, then
it allows me to ship a copy of mxDateTime under the GPL).

That said, the whole theory of GPL linking is muddy to me, especially since
the word "link" (and its variants) doesn't appear in the GPL.

> Oh well, nevermind... it's still nice to hear that CNRI and RMS
> have finally made up their minds to render Python GPL-compatible --
> whatever this means ;-)

I'm not sure it means anything yet.  CNRI and the FSF believed they reached
agreement before, but that didn't last after Moglen and Kahn each figured
out what the other was really suggesting.



From mal@lemburg.com  Thu Dec 14 22:25:31 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 23:25:31 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <LNBBLJKPBEHFEDALKOLCEEPLIDAA.tim.one@home.com>
Message-ID: <3A3948DB.9165E404@lemburg.com>

Tim Peters wrote:
> 
> I'm not going to argue about the GPL.  Take it up with the FSF! 

Sorry, I got a bit carried away -- I don't want to take it up
with the FSF, simply because I couldn't care less. What's bugging
me is that this one guy is splitting the OSS world in two even 
though both halves actually want the same thing: software which
you can use for free with full source code. I find that a very 
poor situation.

> I will say
> that if you do get the FSF's attention, Moglen will have an instant counter
> to any objection you're likely to raise -- he's been thinking about this for
> 10 years, and he's heard it all.  And in our experience, RMS won't commit to
> anything before running it past Moglen.
> 
> [MAL]
> > But it's his [RMS's] piece of work, isn't it ? He's the one who can
> > change it.
> 
> Akin to saying Python is Guido's piece of work.  Yes, no, kinda, more true
> at some times than others, ditto respects.  RMS has consistently said that
> any changes for the next version of the GPL will take at least a year, due
> to extensive legal review required first.  Would be more clearly true to say
> that the first version of the GPL was RMS's alone -- but version 2 came out
> in 1991.

Point taken.
 
> > ...
> > Strange, then how come he sees the choice of law clause as a problem:
> > without explicitly ruling out the applicability of the UN CISG,
> > this clause is waived by it anyway... at least according to a
> > specialist on software law here in Germany.
> > ... [and other "who knows?" objections] ...
> 
> Guido quoted the text of your Wed, 06 Sep 2000 14:19:09 +0200 "Re:
> [License-py20] Re: GPL incompatibility as seen from Europe" msg to Moglen, who
> dismissed it almost offhandedly as "layman's commentary".  You'll have to
> ask him why:  MAL, we're not lawyers.  We're incompetent to have this
> discussion -- or at least I am, and Moglen thinks you are too <wink>.

I'm not a lawyer either, but I am able to apply common sense and 
know about German trade laws. Anyway, here is a reference which
covers all the controversial subjects. It's in German, but these
guys qualify as lawyers ;-) ...

	http://www.ifross.de/ifross_html/index.html

There's also a book on the subject in German which covers
all aspects of software licensing. Here's the reference in
case anyone cares:

	Jochen Marly, Softwareüberlassungsverträge
	C.H. Beck, München, 2000
 
> >>> Another issue: since Python doesn't link Python scripts, is it
> >>> still true that if one (pure) Python package is covered by the GPL,
> >>> then all other packages needed by that application will also fall
> >>> under GPL ?
> 
> [Tim]
> >> Sorry, couldn't make sense of the question.  Just as well,
> >> since you should ask about it on a GNU forum anyway <wink>.
> 
> [MAL]
> > Isn't this question (whether the GPL virus applies to byte-code
> > as well) important to Python programmers as well ?
> 
> I don't know -- like I said, I couldn't make sense of the question, i.e. I
> couldn't figure out what it is you're asking.  I *suspect* it's based on a
> misunderstanding of the GPL; for example, gcc is a GPL'ed application that
> requires stuff from the OS in order to do its job of compiling, but that
> doesn't mean that every OS it runs on falls under the GPL.  The GPL contains
> no restrictions on *use*, it restricts only copying, modifying and
> distributing (the specific rights granted by copyright law).  I don't see
> any way to read the GPL as restricting your ability to distribute a GPL'ed
> program P on its own, no matter what the status of the packages that P may
> rely upon for operation.

This is very controversial: if an application Q needs a GPLed 
library P to work, then P and Q form a new whole in the sense of
the GPL. And this even though P wasn't even distributed together
with Q. Don't ask me why, but that's how RMS and folks look at it.

It can be argued that the dynamic linker actually integrates
P into Q, but is the same argument valid for a Python program Q
which relies on a GPLed package P ? (The relationship between
Q and P is one of providing interfaces -- there is no call address
patching required for the setup to work.)

> The GPL is also not viral in the sense that it cannot infect an unwitting
> victim.  Nothing whatsoever you do or don't do can make *any* other program
> Q "fall under" the GPL -- only Q's owner can set the license for Q.  The GPL
> purportedly can prevent you from distributing (but not from using) a program
> that links with a GPL'ed program, but that doesn't appear to be what you're
> asking about.  Or is it?

No. What's viral about the GPL is that you can turn an application
into a GPLed one by merely linking the two together -- that's why
e.g. the libc is distributed under the LGPL which doesn't have this
viral property.
 
> If you were to put, say, mxDateTime, under the GPL, then yes, I believe the
> FSF would claim I could not distribute my program T that uses mxDateTime
> unless T were also under the GPL or a GPL-compatible license.  But if
> mxDateTime is not under the GPL, then nothing I do with T can magically
> change the mxDateTime license to the GPL (although if your mxDateTime
> license allows me to redistribute mxDateTime under a different license, then
> it allows me to ship a copy of mxDateTime under the GPL).
>
> That said, the whole theory of GPL linking is muddy to me, especially since
> the word "link" (and its variants) doesn't appear in the GPL.

True.
 
> > Oh well, nevermind... it's still nice to hear that CNRI and RMS
> > have finally made up their minds to render Python GPL-compatible --
> > whatever this means ;-)
> 
> I'm not sure it means anything yet.  CNRI and the FSF believed they reached
> agreement before, but that didn't last after Moglen and Kahn each figured
> out what the other was really suggesting.

Oh boy...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From greg@cosc.canterbury.ac.nz  Thu Dec 14 23:19:09 2000
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Dec 2000 12:19:09 +1300 (NZDT)
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A3948DB.9165E404@lemburg.com>
Message-ID: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz>

"M.-A. Lemburg" <mal@lemburg.com>:
> if an application Q needs a GPLed 
> library P to work, then P and Q form a new whole in the sense of
> the GPL.

I don't see how Q can *need* any particular library P
to work. The most it can need is some library with
an API which is compatible with P's. So I don't
buy that argument.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Thu Dec 14 23:58:24 2000
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Dec 2000 12:58:24 +1300 (NZDT)
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <3A387005.6725DAAE@ActiveState.com>
Message-ID: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz>

Paul Prescod <paulp@ActiveState.com>:

> We could say that a local can only shadow a global 
> if the local is formally declared.

How do you intend to enforce that? Seems like it would
require a test on every assignment to a local, to make
sure nobody has snuck in a new global since the function
was compiled.

> Actually, one could argue that there is no good reason to 
> even *allow* the shadowing of globals.

If shadowing were completely disallowed, it would make it
impossible to write a completely self-contained function
whose source could be moved from one environment to another
without danger of it breaking. I wouldn't like the language
to have a characteristic like that.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Fri Dec 15 00:06:12 2000
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Dec 2000 13:06:12 +1300 (NZDT)
Subject: [Python-Dev] Online help scope
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEOBIDAA.tim.one@home.com>
Message-ID: <200012150006.NAA02154@s454.cosc.canterbury.ac.nz>

Tim Peters <tim.one@home.com>:

> [Paul Prescod]

> > Keywords have no docstrings.

> Neither do integers, but they're obvious too <wink>.

Oh, I don't know, it could be useful.

>>> help(2)
The first prime number.

>>> help(2147483647)
sys.maxint, the largest Python small integer.

>>> help(42)
The answer to the ultimate question of life, the universe
and everything. See also: ultimate_question.

>>> help("ultimate_question")
[Importing research.mice.earth]
[Calling earth.find_ultimate_question]
This may take about 10 million years, please be patient...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From barry@digicool.com  Fri Dec 15 00:33:16 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 19:33:16 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <3A3948DB.9165E404@lemburg.com>
 <200012142319.MAA02140@s454.cosc.canterbury.ac.nz>
Message-ID: <14905.26316.407495.981198@anthem.concentric.net>

>>>>> "GE" == Greg Ewing <greg@cosc.canterbury.ac.nz> writes:

    GE> I don't see how Q can *need* any particular library P to
    GE> work. The most it can need is some library with an API which
    GE> is compatible with P's. So I don't buy that argument.

It's been my understanding that the FSF's position on this is as
follows.  If the only functional implementation of the API is GPL'd
software then simply writing your code against that API is tantamount
to linking with that software.  Their reasoning is that the clear
intent of the programmer (shut up, Chad) is to combine the program
with GPL code.  As soon as there is a second, non-GPL implementation
of the API, you're fine because while you may not distribute your
program with the GPL'd software linked in, those who receive your
software wouldn't be forced to combine GPL and non-GPL code.

-Barry


From tim.one@home.com  Fri Dec 15 03:01:36 2000
From: tim.one@home.com (Tim Peters)
Date: Thu, 14 Dec 2000 22:01:36 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A3948DB.9165E404@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEAEIEAA.tim.one@home.com>

[MAL]
> Sorry, I got a bit carried away -- I don't want to take it up
> with the FSF, simply because I couldn't care less.

Well, nobody else is able to Pronounce on what the FSF believes or will do.
Which tells me that you're not really interested in playing along with the
FSF here after all -- which we both knew from the start anyway <wink>.

> What's bugging me is that this one guy is splitting the OSS world

There are many people on the FSF bandwagon.  I'm not one of them, but I can
count.

> in two even though both halves actually want the same thing: software
> which you can use for free with full source code. I find that a very
> poor situation.

RMS would not agree that both halves want the same thing; to the contrary,
he's openly contemptuous of the Open Source movement -- which you also knew
from the start.

> [stuff about German law I won't touch with 12-foot schnitzel]

OTOH, a German FSF advocate assured me:

    I also tend to forget that the system of the law works different
    in the US as in Germany. In Germany something that most people
    will believe (called "common grounds") play a role in the court.
    So if you knew, because it is widely known what the GPL means,
    than it is harder to attack that in court.

In the US, when something gets to court it doesn't matter at all what people
believed about it.  Heck, we'll let mass murderers go free if a comma was in
the wrong place in a 1592 statute, or send a kid to jail for life for using
crack cocaine instead of the flavor favored by stockbrokers <wink>.  I hope
the US is unique in that respect, but it does make the GPL weaker here
because even if *everyone* in our country believed the GPL means what RMS
says it means, a US court would give that no weight in its logic-chopping.

>>> Another issue: since Python doesn't link Python scripts, is it
>>> still true that if one (pure) Python package is covered by the GPL,
>>> then all other packages needed by that application will also fall
>>> under GPL ?

> This is very controversial: if an application Q needs a GPLed
> library P to work, then P and Q form a new whole in the sense of
> the GPL. And this even though P wasn't even distributed together
> with Q. Don't ask me why, but that's how RMS and folks look at it.

Understood, but have you reread your question above, which I've said twice I
can't make sense of?  That's not what you were asking about.  Your question
above asks, if anything, the opposite:  the *application* Q is GPL'ed, and
the question above asks whether that means the *Ps* it depends on must also
be GPL'ed.  To the best of my ability, I've answered "NO" to that one, and
"YES" to the question it appears you meant to ask.

> It can be argued that the dynamic linker actually integrates
> P into Q, but is the same argument valid for a Python program Q
> which relies on a GPLed package P ? (The relationship between
> Q and P is one of providing interfaces -- there is no call address
> patching required for the setup to work.)

As before, I believe the FSF will say YES.  Unless there's also a non-GPL'ed
implementation of the same interface that people could use just as well.
See my extended mxDateTime example too.

> ...
> No. What's viral about the GPL is that you can turn an application
> into a GPLed one by merely linking the two together

No, you cannot.  You can link them together all day without any hassle.
What you cannot do is *distribute* it unless the aggregate is first placed
under the GPL (or a GPL-compatible license) too.  If you distribute it
without taking that step, that doesn't turn it into a GPL'ed application
either -- in that case you've simply (& supposedly) violated the license on
P, so your distribution was simply (& supposedly) illegal.  And that is in
fact the end result that people who knowingly use the GPL want (granting
that it appears most people who use the GPL do so unknowing of its
consequences).

> -- that's why e.g. the libc is distributed under the LGPL which
> doesn't have this viral property.

You should read RMS on why glibc is under the LGPL:

    http://www.fsf.org/philosophy/why-not-lgpl.html

It will at least disabuse you of the notion that RMS and you are after the
same thing <wink>.



From paulp@ActiveState.com  Fri Dec 15 04:02:08 2000
From: paulp@ActiveState.com (Paul Prescod)
Date: Thu, 14 Dec 2000 20:02:08 -0800
Subject: [Python-Dev] new draft of PEP 227
References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz>
Message-ID: <3A3997C0.F977AF51@ActiveState.com>

Greg Ewing wrote:
> 
> Paul Prescod <paulp@ActiveState.com>:
> 
> > We could say that a local can only shadow a global
> > if the local is formally declared.
> 
> How do you intend to enforce that? Seems like it would
> require a test on every assignment to a local, to make
> sure nobody has snuck in a new global since the function
> was compiled.

I would expect that all of the checks would be at compile-time. Except
for __dict__ hackery, I think it is doable. Python already keeps track
of all assignments to locals and all assignments to globals in a
function scope. The only addition is keeping track of assignments at a
global scope.
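
[Editorial aside: the per-scope bookkeeping Paul describes is visible today
through the stdlib `symtable` module -- a modern CPython API, not something
proposed in this thread. A minimal sketch:]

```python
import symtable

# Compile-time symbol analysis: the source is analyzed, never executed.
src = "x = 1\ndef f():\n    x = 2\n    return x\n"
mod = symtable.symtable(src, "<demo>", "exec")

# The module scope records its own assignment to x ...
print(mod.lookup("x").is_assigned())   # True

# ... and the function scope independently records x as a local,
# i.e. the assignment inside f() shadows the module-level x.
f = mod.get_children()[0]
print(f.lookup("x").is_local())        # True
print(f.lookup("x").is_global())       # False
```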

> > Actually, one could argue that there is no good reason to
> > even *allow* the shadowing of globals.
> 
> If shadowing were completely disallowed, it would make it
> impossible to write a completely self-contained function
> whose source could be moved from one environment to another
> without danger of it breaking. I wouldn't like the language
> to have a characteristic like that.

That seems like a very esoteric requirement. How often do you have
functions that do not rely *at all* on their environment (other
functions, import statements, global variables)?

When you move code you have to do some rewriting or customizing of the
environment in 94% of the cases. How much effort do you want to spend on
the other 6%? Also, there are tools that are designed to help you move
code without breaking programs (refactoring editors). They can just as
easily handle renaming local variables as adding import statements and
fixing up function calls.

 Paul Prescod


From mal@lemburg.com  Fri Dec 15 10:05:59 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 11:05:59 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz>
Message-ID: <3A39ED07.6B3EE68E@lemburg.com>

Greg Ewing wrote:
> 
> "M.-A. Lemburg" <mal@lemburg.com>:
> > if an application Q needs a GPLed
> > library P to work, then P and Q form a new whole in the sense of
> > the GPL.
> 
> I don't see how Q can *need* any particular library P
> to work. The most it can need is some library with
> an API which is compatible with P's. So I don't
> buy that argument.

It's the view of the FSF, AFAIK. You can't distribute an application
in binary which dynamically links against libreadline (which is GPLed)
on the user's machine, since even though you don't distribute
libreadline the application running on the user's machine is
considered the "whole" in terms of the GPL.

FWIW, I don't agree with that view either, but that's probably
because I'm a programmer and not a lawyer :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Fri Dec 15 10:25:12 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 11:25:12 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <LNBBLJKPBEHFEDALKOLCKEAEIEAA.tim.one@home.com>
Message-ID: <3A39F188.E366B481@lemburg.com>

Tim Peters wrote:
> 
> [Tim and MAL talking about the FSF and their views]
> 
> [Tim and MAL showing off as hobby advocates ;-)]
> 
> >>> Another issue: since Python doesn't link Python scripts, is it
> >>> still true that if one (pure) Python package is covered by the GPL,
> >>> then all other packages needed by that application will also fall
> >>> under GPL ?
> 
> > This is very controversial: if an application Q needs a GPLed
> > library P to work, then P and Q form a new whole in the sense of
> > the GPL. And this even though P wasn't even distributed together
> > with Q. Don't ask me why, but that's how RMS and folks look at it.
> 
> Understood, but have you reread your question above, which I've said twice I
> can't make sense of? 

I know, it was backwards. 

Take an example: I have a program which
wants to process MP3 files in some way. Now because of some stroke
of luck, all Python MP3 modules out there are covered by the GPL.

Now I could write an application which uses a certain interface
and then tell the user to install the MP3 module separately.
As Barry mentioned, this setup will cause distribution of my 
application to be illegal because I could have only done so
by putting the application under the GPL.

> You should read RMS on why glibc is under the LGPL:
> 
>     http://www.fsf.org/philosophy/why-not-lgpl.html
> 
> It will at least disabuse you of the notion that RMS and you are after the
> same thing <wink>.

:-) 

Let's stop this discussion and get back to those cheerful things
like Christmas Bells and Santa Clause... :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From akuchlin@mems-exchange.org  Fri Dec 15 13:27:24 2000
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Fri, 15 Dec 2000 08:27:24 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
Message-ID: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com>

unicodeobject.c contains this code:

                PyErr_Format(PyExc_ValueError,
                            "unsupported format character '%c' (0x%x) "
                            "at index %i",
                            c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat));

c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits,
so '%\u3000' % 1 results in an error message containing "'\000'
(0x3000)".  Is this worth fixing?  I'd say no, since the hex value is
more useful for Unicode strings anyway.  (I still wanted to mention
this little buglet, since I just touched this bit of code.)
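
[Editorial aside: C's %c conversion converts its argument to unsigned char,
so only the low 8 bits of the Py_UNICODE value survive. The arithmetic
behind amk's '\u3000' example, sketched in Python for illustration:]

```python
c = 0x3000  # IDEOGRAPHIC SPACE, a Py_UNICODE value wider than one byte

# %c in C keeps only the low byte of the value it is given:
low_byte = c & 0xFF
print(hex(low_byte))   # 0x0 -- hence the "'\000'" in the error message

# The %x conversion sees the full value, hence the useful "(0x3000)":
print(hex(c))          # 0x3000
```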

--amk



From jack@oratrix.nl  Fri Dec 15 14:26:15 2000
From: jack@oratrix.nl (Jack Jansen)
Date: Fri, 15 Dec 2000 15:26:15 +0100
Subject: [Python-Dev] reuse of address default value (was Re:
 [Python-checkins] CVS: python/dist/src/Lib SocketServer.py)
In-Reply-To: Message by Guido van Rossum <guido@python.org> ,
 Thu, 14 Dec 2000 09:51:26 -0500 , <200012141451.JAA15637@cj20424-a.reston1.va.home.com>
Message-ID: <20001215142616.705993B9B44@snelboot.oratrix.nl>

> The reason for the patch is that without this, if you kill a TCP server
> and restart it right away, you'll get a 'port in use' error -- TCP has
> some kind of strange wait period after a connection is closed before
> it can be reused.  The patch avoids this error.

Well, actually there's a pretty good reason for the "port in use" behaviour: 
the TCP standard more-or-less requires it. A srchost/srcport/dsthost/dstport 
combination should not be reused until the maximum TTL has passed, because 
there may still be "old" retransmissions around. Especially the "open" packets 
are potentially dangerous.

Setting the reuse bit while you're debugging is fine, but setting it in 
general is not a very good idea...
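
[Editorial aside: the "reuse bit" Jack refers to is the SO_REUSEADDR socket
option, the thing the SocketServer patch under discussion turns on. A
minimal illustrative sketch of the server-side setup:]

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Allow rebinding the address while old connections sit in TIME_WAIT.
# Convenient for a server you kill and restart while debugging; see
# Jack's caveat about stale retransmissions before enabling it blindly.
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

srv.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
srv.listen(1)
print(srv.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0)  # True
srv.close()
```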
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 




From guido@python.org  Fri Dec 15 14:31:19 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 09:31:19 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: Your message of "Thu, 14 Dec 2000 20:02:08 PST."
 <3A3997C0.F977AF51@ActiveState.com>
References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz>
 <3A3997C0.F977AF51@ActiveState.com>
Message-ID: <200012151431.JAA19799@cj20424-a.reston1.va.home.com>

> Greg Ewing wrote:
> > 
> > Paul Prescod <paulp@ActiveState.com>:
> > 
> > > We could say that a local can only shadow a global
> > > if the local is formally declared.
> > 
> > How do you intend to enforce that? Seems like it would
> > require a test on every assignment to a local, to make
> > sure nobody has snuck in a new global since the function
> > was compiled.
> 
> I would expect that all of the checks would be at compile-time. Except
> for __dict__ hackery, I think it is doable. Python already keeps track
> of all assignments to locals and all assignments to globals in a
> function scope. The only addition is keeping track of assignments at a
> global scope.
> 
> > > Actually, one could argue that there is no good reason to
> > > even *allow* the shadowing of globals.
> > 
> > If shadowing were completely disallowed, it would make it
> > impossible to write a completely self-contained function
> > whose source could be moved from one environment to another
> > without danger of it breaking. I wouldn't like the language
> > to have a characteristic like that.
> 
> That seems like a very esoteric requirement. How often do you have
> functions that do not rely *at all* on their environment (other
> functions, import statements, global variables).
> 
> When you move code you have to do some rewriting or customizing of the
> environment in 94% of the cases. How much effort do you want to spend on
> the other 6%? Also, there are tools that are designed to help you move
> code without breaking programs (refactoring editors). They can just as
> easily handle renaming local variables as adding import statements and
> fixing up function calls.

Can we cut this out please?  Paul is misguided.  There's no reason to
forbid a local shadowing a global.  All languages with nested scopes
allow this.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Fri Dec 15 16:17:08 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Fri, 15 Dec 2000 11:17:08 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz>
 <3A39ED07.6B3EE68E@lemburg.com>
Message-ID: <14906.17412.221040.895357@anthem.concentric.net>

>>>>> "M" == M  <mal@lemburg.com> writes:

    M> It's the view of the FSF, AFAIK. You can't distribute an
    M> application in binary which dynamically links against
    M> libreadline (which is GPLed) on the user's machine, since even
    M> though you don't distribute libreadline the application running
    M> on the user's machine is considered the "whole" in terms of the
    M> GPL.

    M> FWIW, I don't agree with that view either, but that's probably
    M> because I'm a programmer and not a lawyer :)

I'm not sure I agree with that view either, but mostly because there
is a non-GPL replacement for parts of the readline API:

    http://www.cstr.ed.ac.uk/downloads/editline.html

Don't know anything about it, so it may not be featureful enough for
Python's needs, but if licensing is really a problem, it might be
worth looking into.

-Barry


From paulp@ActiveState.com  Fri Dec 15 16:16:37 2000
From: paulp@ActiveState.com (Paul Prescod)
Date: Fri, 15 Dec 2000 08:16:37 -0800
Subject: [Python-Dev] new draft of PEP 227
References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz>
 <3A3997C0.F977AF51@ActiveState.com> <200012151431.JAA19799@cj20424-a.reston1.va.home.com>
Message-ID: <3A3A43E5.347AAF6C@ActiveState.com>

Guido van Rossum wrote:
> 
> ...
> 
> Can we cut this out please?  Paul is misguided.  There's no reason to
> forbid a local shadowing a global.  All languages with nested scopes
> allow this.

Python is the only one I know of that implicitly shadows without
requiring some form of declaration. JavaScript has it right: reading and
writing of globals are symmetrical. In the rare case that you explicitly
want to shadow, you need a declaration. Python's rule is confusing,
implicit and error causing. In my opinion, of course. If you are
dead-set against explicit declarations then I would say that disallowing
the ambiguous construct is better than silently treating it as a
declaration.

 Paul Prescod


From guido@python.org  Fri Dec 15 16:23:07 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 11:23:07 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: Your message of "Fri, 15 Dec 2000 08:16:37 PST."
 <3A3A43E5.347AAF6C@ActiveState.com>
References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> <3A3997C0.F977AF51@ActiveState.com> <200012151431.JAA19799@cj20424-a.reston1.va.home.com>
 <3A3A43E5.347AAF6C@ActiveState.com>
Message-ID: <200012151623.LAA27630@cj20424-a.reston1.va.home.com>

> Python is the only one I know of that implicitly shadows without
> requiring some form of declaration. JavaScript has it right: reading and
> writing of globals are symmetrical. In the rare case that you explicitly
> want to shadow, you need a declaration. Python's rule is confusing,
> implicit and error causing. In my opinion, of course. If you are
> dead-set against explicit declarations then I would say that disallowing
> the ambiguous construct is better than silently treating it as a
> declaration.

Let's agree to differ.  This will never change.  In Python, assignment
is declaration.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Dec 15 17:01:33 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 12:01:33 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: Your message of "Fri, 15 Dec 2000 08:27:24 EST."
 <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com>
References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com>
Message-ID: <200012151701.MAA28058@cj20424-a.reston1.va.home.com>

> unicodeobject.c contains this code:
> 
>                 PyErr_Format(PyExc_ValueError,
>                             "unsupported format character '%c' (0x%x) "
>                             "at index %i",
>                             c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat));
> 
> c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits,
> so '%\u3000' % 1 results in an error message containing "'\000'
> (0x3000)".  Is this worth fixing?  I'd say no, since the hex value is
> more useful for Unicode strings anyway.  (I still wanted to mention
> this little buglet, since I just touched this bit of code.)

Sounds like the '%c' should just be deleted.
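The truncation itself is easy to check in pure Python (a sketch of the arithmetic, not the C code):

```python
# C's %c keeps only the low 8 bits of the character code, so a
# Py_UNICODE like u'\u3000' prints as '\000' while the hex is right.
c = 0x3000               # IDEOGRAPHIC SPACE
low_byte = c & 0xFF      # all that C's %c would actually print
assert low_byte == 0x00
assert "0x%x" % c == "0x3000"   # the useful part of the message
```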

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bckfnn@worldonline.dk  Fri Dec 15 17:05:42 2000
From: bckfnn@worldonline.dk (Finn Bock)
Date: Fri, 15 Dec 2000 17:05:42 GMT
Subject: [Python-Dev] CWD in sys.path.
Message-ID: <3a3a480b.28490597@smtp.worldonline.dk>

Hi,

I'm trying to understand the initialization of sys.path and especially
if CWD is supposed to be included in sys.path by default. (I understand
the purpose of sys.path[0], that is not the focus of my question).

My setup is Python2.0 on Win2000, no PYTHONHOME or PYTHONPATH envvars.

In this setup, an empty string exists as sys.path[1], but I'm unsure if
this is by careful design or some freak accident. The empty entry is
added because

  HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.0\PythonPath 

does *not* have any subkey. There is a default value, but that value
appears to be ignored. If I add a subkey "foo":

  HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.0\PythonPath\foo 

with a default value of "d:\foo", the CWD is no longer in sys.path.

i:\java\jython.cvs\org\python\util>d:\Python20\python.exe  -S
Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', 'd:\\foo', 'D:\\PYTHON20\\DLLs', 'D:\\PYTHON20\\lib',
'D:\\PYTHON20\\lib\\plat-win', 'D:\\PYTHON20\\lib\\lib-tk',
'D:\\PYTHON20']
>>>

I noticed that some of the PYTHONPATH macros in PC/config.h include
the '.', while others do not.

So, to put it as a question (for jython): Should CWD be included in
sys.path? Are there situations (like embedding) where CWD shouldn't
be in sys.path?

regards,
finn
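A small sketch of what the question amounts to (the helper is hypothetical; the path lists are modeled on the transcripts above):

```python
def imports_search_cwd(path):
    # sys.path[0] is the script's directory ('' in interactive mode);
    # an *additional* empty entry means imports also search the CWD.
    return '' in path[1:]

# No-subkey case: an extra '' shows up as sys.path[1]
assert imports_search_cwd(['', '', 'D:\\PYTHON20\\lib']) is True
# With the "foo" subkey present, the extra '' disappears
assert imports_search_cwd(['', 'd:\\foo', 'D:\\PYTHON20\\lib']) is False
```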


From guido@python.org  Fri Dec 15 17:12:03 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 12:12:03 -0500
Subject: [Python-Dev] CWD in sys.path.
In-Reply-To: Your message of "Fri, 15 Dec 2000 17:05:42 GMT."
 <3a3a480b.28490597@smtp.worldonline.dk>
References: <3a3a480b.28490597@smtp.worldonline.dk>
Message-ID: <200012151712.MAA02544@cj20424-a.reston1.va.home.com>

On Unix, CWD is not in sys.path unless as sys.path[0].

--Guido van Rossum (home page: http://www.python.org/~guido/)


From moshez@zadka.site.co.il  Sat Dec 16 01:43:41 2000
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Sat, 16 Dec 2000 03:43:41 +0200 (IST)
Subject: [Python-Dev] new draft of PEP 227
Message-ID: <20001216014341.5BA97A82E@darjeeling.zadka.site.co.il>

On Fri, 15 Dec 2000 08:16:37 -0800, Paul Prescod <paulp@ActiveState.com> wrote:

> Python is the only one I know of that implicitly shadows without
> requiring some form of declaration.

Perl and Scheme permit implicit shadowing too.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From tismer@tismer.com  Fri Dec 15 16:42:18 2000
From: tismer@tismer.com (Christian Tismer)
Date: Fri, 15 Dec 2000 18:42:18 +0200
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL (Splitting up
 _cursesmodule)
References: <LNBBLJKPBEHFEDALKOLCIEPFIDAA.tim.one@home.com>
Message-ID: <3A3A49EA.5D9418E@tismer.com>


Tim Peters wrote:
...
> > Another issue: since Python doesn't link Python scripts, is it
> > still true that if one (pure) Python package is covered by the GPL,
> > then all other packages needed by that application will also fall
> > under GPL ?
> 
> Sorry, couldn't make sense of the question.  Just as well, since you should
> ask about it on a GNU forum anyway <wink>.

The GNU license is transitive. It automatically extends on other
parts of a project, unless they are identifiable, independent
developments. As soon as a couple of modules are published together,
based upon one GPL-ed module, this propagates. I think this is
what MAL meant?
Anyway, I'd be interested to hear what the GNU forum says.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From akuchlin@mems-exchange.org  Fri Dec 15 18:10:34 2000
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Fri, 15 Dec 2000 13:10:34 -0500
Subject: [Python-Dev] What to do about PEP 229?
Message-ID: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com>

I began writing the fabled fancy setup script described in PEP 229,
and then realized there was duplication going on here.  The code in
setup.py would need to know what libraries, #defines, &c., are needed
by each module in order to check if they're needed and set them.  But
if Modules/Setup can be used to override setup.py's behaviour, then
much of this information would need to be in that file, too; the
details of compiling a module are in two places. 

Possibilities:

1) Setup contains fully-loaded module descriptions, and the setup
   script drops unneeded bits.  For example, the socket module
   requires -lnsl on some platforms.  The Setup file would contain
   "socket socketmodule.c -lnsl" on all platforms, and setup.py would 
   check for an nsl library and only use it if it's there.

   This seems dodgy to me; what if -ldbm is needed on one platform and
   -lndbm on another?

2) Drop setup completely and just maintain setup.py, with some
   different overriding mechanism.  This is more radical.  Adding a
   new module is then not just a matter of editing a simple text file;
   you'd have to modify setup.py, making it more like maintaining an
   autoconf script.
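Possibility 1 could look roughly like this (prune_libs and the recipe dict are hypothetical illustrations, not distutils API):

```python
# Sketch of option 1: Setup carries the fully-loaded recipe
# ("socket socketmodule.c -lnsl"); setup.py prunes what's missing.
def prune_libs(libs, available):
    """Keep only the libraries the platform scan actually found."""
    return [lib for lib in libs if lib in available]

recipe = {"name": "socket", "sources": ["socketmodule.c"],
          "libraries": ["nsl"]}
# Suppose the scan found no libnsl on this platform:
recipe["libraries"] = prune_libs(recipe["libraries"], available=set())
assert recipe["libraries"] == []
# On a platform that has it, the flag survives:
assert prune_libs(["nsl"], available={"nsl"}) == ["nsl"]
```

The dbm/ndbm case is exactly where this sketch falls down: the recipe would need *alternative* libraries, not just droppable ones.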
  
Remember, the underlying goal of PEP 229 is to have the out-of-the-box
Python installation you get from "./configure;make" contain many more
useful modules; right now you wouldn't get zlib, syslog, resource, any
of the DBM modules, PyExpat, &c.  I'm not wedded to using Distutils to
get that, but think that's the only practical way; witness the hackery
required to get the DB module automatically compiled.  

You can also wave your hands in the direction of packagers such as
ActiveState or Red Hat, and say "let them make it compile everything".
But this problem actually inconveniences *me*, since I always build
Python myself and have to extensively edit Setup, so I'd like to fix
the problem.

Thoughts? 

--amk



From nas@arctrix.com  Fri Dec 15 12:03:04 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Fri, 15 Dec 2000 04:03:04 -0800
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <14906.17412.221040.895357@anthem.concentric.net>; from barry@digicool.com on Fri, Dec 15, 2000 at 11:17:08AM -0500
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net>
Message-ID: <20001215040304.A22056@glacier.fnational.com>

On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote:
> I'm not sure I agree with that view either, but mostly because there
> is a non-GPL replacement for parts of the readline API:
> 
>     http://www.cstr.ed.ac.uk/downloads/editline.html

It doesn't work with the current readline module.  It is much
smaller than readline and works just as well in my experience.
Would there be any interest in including a copy with the standard
distribution?  The license is quite nice (X11 type).

  Neil


From nas@arctrix.com  Fri Dec 15 12:14:50 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Fri, 15 Dec 2000 04:14:50 -0800
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012151509.HAA18093@slayer.i.sourceforge.net>; from gvanrossum@users.sourceforge.net on Fri, Dec 15, 2000 at 07:09:46AM -0800
References: <200012151509.HAA18093@slayer.i.sourceforge.net>
Message-ID: <20001215041450.B22056@glacier.fnational.com>

On Fri, Dec 15, 2000 at 07:09:46AM -0800, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Lib
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv18082
> 
> Modified Files:
> 	httplib.py 
> Log Message:
> Get rid of string functions.

Can you explain the logic behind this recent interest in removing
string functions from the standard library?  Is it performance?
Some unicode issue?  I don't have a great attachment to string.py
but I also don't see the justification for the amount of work it
requires.

  Neil


From guido@python.org  Fri Dec 15 19:29:37 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 14:29:37 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Fri, 15 Dec 2000 04:14:50 PST."
 <20001215041450.B22056@glacier.fnational.com>
References: <200012151509.HAA18093@slayer.i.sourceforge.net>
 <20001215041450.B22056@glacier.fnational.com>
Message-ID: <200012151929.OAA03073@cj20424-a.reston1.va.home.com>

> Can you explain the logic behind this recent interest in removing
> string functions from the standard library?  Is it performance?
> Some unicode issue?  I don't have a great attachment to string.py
> but I also don't see the justification for the amount of work it
> requires.

I figure that at *some* point we should start putting our money where
our mouth is, deprecate most uses of the string module, and start
warning about it.  Not in 2.1 probably, given my experience below.

As a realistic test of the warnings module I played with some warnings
about the string module, and then found that most of the std
library modules use it, triggering an extraordinary number of
warnings.  I then decided to experiment with the conversion.  I
quickly found out it's too much work to do manually, so I'll hold off
until someone comes up with a tool that does 99% of the work.

(The selection of std library modules to convert manually was
triggered by something pretty random -- I decided to silence a
particular cron job I was running. :-)
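The kind of warning involved can be sketched with the warnings module (the message text is illustrative, not the actual patch):

```python
import warnings

# Record warnings instead of printing them, so the experiment is silent.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn("the string module is deprecated",
                  DeprecationWarning)

assert len(caught) == 1
assert issubclass(caught[0].category, DeprecationWarning)
```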

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Barrett@stsci.edu  Fri Dec 15 19:32:10 2000
From: Barrett@stsci.edu (Paul Barrett)
Date: Fri, 15 Dec 2000 14:32:10 -0500 (EST)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>
References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>
Message-ID: <14906.17712.830224.481130@nem-srvr.stsci.edu>

Guido,

Here are my comments on PEP 207.  (I've also gone back and read most
of the 1998 discussion.  What a tedious, in terms of time, but
enlightening, in terms of content, discussion that was.)

| - New function:
|
|      PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op)
|
|      This performs the requested rich comparison, returning a Python
|      object or raising an exception.  The 3rd argument must be one of
|      LT, LE, EQ, NE, GT or GE.

I'd much prefer '<', '<=', '=', etc. to LT, LE, EQ, etc.


|    Classes
|
|    - Classes can define new special methods __lt__, __le__, __gt__,
|      __ge__, __eq__, __ne__ to override the corresponding operators.
|      (You gotta love the Fortran heritage.)  If a class overrides
|      __cmp__ as well, it is only used by PyObject_Compare().

Likewise, I'd prefer __less__, __lessequal__, __equal__, etc. to
__lt__, __le__, __eq__, etc.  I'm not keen on the FORTRAN derived
symbolism.  I also find it contrary to Python's heritage of being
clear and concise.  I don't mind typing __lessequal__ (or
__less_equal__) once per class for the additional clarity.


|    - Should we even bother upgrading the existing types?

Isn't this question partly related to the coercion issue and which
type of comparison takes precedence?  And if so, then I would think
the answer would be 'yes'.  Or better still see below my suggestion of
adding poor and rich comparison operators along with matrix-type
operators. 


    - If so, how should comparisons on container types be defined?
      Suppose we have a list whose items define rich comparisons.  How
      should the itemwise comparisons be done?  For example:

        def __lt__(a, b): # a<b for lists
            for i in range(min(len(a), len(b))):
                ai, bi = a[i], b[i]
                if ai < bi: return 1
                if ai == bi: continue
                if ai > bi: return 0
                raise TypeError, "incomparable item types"
            return len(a) < len(b)

      This uses the same sequence of comparisons as cmp(), so it may
      as well use cmp() instead:

        def __lt__(a, b): # a<b for lists
            for i in range(min(len(a), len(b))):
                c = cmp(a[i], b[i])
                if c < 0: return 1
                if c == 0: continue
                if c > 0: return 0
                assert 0 # unreachable
            return len(a) < len(b)

      And now there's not really a reason to change lists to rich
      comparisons.

I don't understand this example.  If a[i] and b[i] define rich
comparisons, then 'a[i] < b[i]' is likely to return a non-boolean
value.  Yet the 'if' statement expects a boolean value.  I don't see
how the above example will work.

This example also makes me think that the proposals for new operators
(ie.  PEP 211 and 225) are a good idea.  The discussion of rich
comparisons in 1998 also lends some support to this.  I can see many
uses for two types of comparison operators (as well as the proposed
matrix-type operators), one set for poor or boolean comparisons and
one for rich or non-boolean comparisons.  For example, numeric arrays
can define both.  Rich comparison operators would return an array of
boolean values, while poor comparison operators return a boolean value
by performing an implied 'and.reduce' operation.  These operators
provide clarity and conciseness, without much change to current Python 
behavior.

 -- Paul


From guido@python.org  Fri Dec 15 19:51:04 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 14:51:04 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: Your message of "Fri, 15 Dec 2000 14:32:10 EST."
 <14906.17712.830224.481130@nem-srvr.stsci.edu>
References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>
 <14906.17712.830224.481130@nem-srvr.stsci.edu>
Message-ID: <200012151951.OAA03219@cj20424-a.reston1.va.home.com>

> Here are my comments on PEP 207.  (I've also gone back and read most
> of the 1998 discussion.  What a tedious, in terms of time, but
> enlightening, in terms of content, discussion that was.)
> 
> | - New function:
> |
> |      PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op)
> |
> |      This performs the requested rich comparison, returning a Python
> |      object or raising an exception.  The 3rd argument must be one of
> |      LT, LE, EQ, NE, GT or GE.
> 
> I'd much prefer '<', '<=', '=', etc. to LT, LE, EQ, etc.

This is only at the C level.  Having to do a string compare is too
slow.  Since some of these are multi-character symbols, a character
constant doesn't suffice (multi-character character constants are not
portable).

> |    Classes
> |
> |    - Classes can define new special methods __lt__, __le__, __gt__,
> |      __ge__, __eq__, __ne__ to override the corresponding operators.
> |      (You gotta love the Fortran heritage.)  If a class overrides
> |      __cmp__ as well, it is only used by PyObject_Compare().
> 
> Likewise, I'd prefer __less__, __lessequal__, __equal__, etc. to
> __lt__, __le__, __eq__, etc.  I'm not keen on the FORTRAN derived
> symbolism.  I also find it contrary to Python's heritage of being
> clear and concise.  I don't mind typing __lessequal__ (or
> __less_equal__) once per class for the additional clarity.

I don't care about Fortran, but you just showed why I think the short
operator names are better: there's less guessing or disagreement about
how they are to be spelled.  E.g. should it be __lessthan__ or
__less_than__ or __less__?

> |    - Should we even bother upgrading the existing types?
> 
> Isn't this question partly related to the coercion issue and which
> type of comparison takes precedence?  And if so, then I would think
> the answer would be 'yes'.

It wouldn't make much of a difference -- comparisons between numbers
of different types would get the same outcome either way.

> Or better still see below my suggestion of
> adding poor and rich comparison operators along with matrix-type
> operators. 
> 
> 
>     - If so, how should comparisons on container types be defined?
>       Suppose we have a list whose items define rich comparisons.  How
>       should the itemwise comparisons be done?  For example:
> 
>         def __lt__(a, b): # a<b for lists
>             for i in range(min(len(a), len(b))):
>                 ai, bi = a[i], b[i]
>                 if ai < bi: return 1
>                 if ai == bi: continue
>                 if ai > bi: return 0
>                 raise TypeError, "incomparable item types"
>             return len(a) < len(b)
> 
>       This uses the same sequence of comparisons as cmp(), so it may
>       as well use cmp() instead:
> 
>         def __lt__(a, b): # a<b for lists
>             for i in range(min(len(a), len(b))):
>                 c = cmp(a[i], b[i])
>                 if c < 0: return 1
>                 if c == 0: continue
>                 if c > 0: return 0
>                 assert 0 # unreachable
>             return len(a) < len(b)
> 
>       And now there's not really a reason to change lists to rich
>       comparisons.
> 
> I don't understand this example.  If a[i] and b[i] define rich
> comparisons, then 'a[i] < b[i]' is likely to return a non-boolean
> value.  Yet the 'if' statement expects a boolean value.  I don't see
> how the above example will work.

Sorry.  I was thinking of list items that contain objects that respond
to the new overloading protocol, but still return Boolean outcomes.
My conclusion is that __cmp__ is just as well.
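A sketch of the problem Paul points out: if __lt__ returns a non-boolean "rich" result, the `if ai < bi:` tests in the list example stop meaning what they appear to mean (the class name is illustrative):

```python
class Elementwise:
    """Toy array-like whose __lt__ returns a rich, per-item result."""
    def __init__(self, data):
        self.data = data
    def __lt__(self, other):
        # a list of per-element outcomes, not a single boolean
        return [a < b for a, b in zip(self.data, other.data)]

a = Elementwise([1, 5])
b = Elementwise([2, 3])
result = a < b
assert result == [True, False]
# 'if a < b:' would test the truth of the *list*, which is True
# merely because the list is non-empty -- not an itemwise verdict.
assert bool(result) is True
```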

> This example also makes me think that the proposals for new operators
> (ie.  PEP 211 and 225) are a good idea.  The discussion of rich
> comparisons in 1998 also lends some support to this.  I can see many
> uses for two types of comparison operators (as well as the proposed
> matrix-type operators), one set for poor or boolean comparisons and
> one for rich or non-boolean comparisons.  For example, numeric arrays
> can define both.  Rich comparison operators would return an array of
> boolean values, while poor comparison operators return a boolean value
> by performing an implied 'and.reduce' operation.  These operators
> provide clarity and conciseness, without much change to current Python 
> behavior.

Maybe.  That can still be decided later.  Right now, adding operators
is not on the table for 2.1 (if only because there are two conflicting
PEPs); adding rich comparisons *is* on the table because it doesn't
change the parser (and because the rich comparisons idea was already
pretty much worked out two years ago).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Fri Dec 15 21:08:02 2000
From: tim.one@home.com (Tim Peters)
Date: Fri, 15 Dec 2000 16:08:02 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012151929.OAA03073@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com>

[Neil Schemenauer]
> Can you explain the logic behind this recent interest in removing
> string functions from the standard library?  Is it performance?
> Some unicode issue?  I don't have a great attachment to string.py
> but I also don't see the justification for the amount of work it
> requires.

[Guido]
> I figure that at *some* point we should start putting our money where
> our mouth is, deprecate most uses of the string module, and start
> warning about it.  Not in 2.1 probably, given my experience below.

I think this begs Neil's questions:  *is* our mouth there <ahem>, and if so,
why?  The only public notice of impending string module deprecation anyone
came up with was a vague note on the 1.6 web page, and one not repeated in
any of the 2.0 release material.

"string" is right up there with "os" and "sys" as a FIM (Frequently Imported
Module), so the required code changes will be massive.  As a user, I don't
see what's in it for me to endure that pain:  the string module functions
work fine!  Nor are they warts in the language, any more than it's a wart that
we say sin(pi) instead of pi.sin().  Keeping the functions around doesn't hurt
anybody that I can see.

> As a realistic test of the warnings module I played with some warnings
> about the string module, and then found that most of the std
> library modules use it, triggering an extraordinary number of
> warnings.  I then decided to experiment with the conversion.  I
> quickly found out it's too much work to do manually, so I'll hold off
> until someone comes up with a tool that does 99% of the work.

Ah, so that's the *easy* way to kill this crusade -- forget I said anything
<wink>.



From Barrett@stsci.edu  Fri Dec 15 21:20:20 2000
From: Barrett@stsci.edu (Paul Barrett)
Date: Fri, 15 Dec 2000 16:20:20 -0500 (EST)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <200012151951.OAA03219@cj20424-a.reston1.va.home.com>
References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>
 <14906.17712.830224.481130@nem-srvr.stsci.edu>
 <200012151951.OAA03219@cj20424-a.reston1.va.home.com>
Message-ID: <14906.33325.5784.118110@nem-srvr.stsci.edu>

>> This example also makes me think that the proposals for new operators
>> (ie.  PEP 211 and 225) are a good idea.  The discussion of rich
>> comparisons in 1998 also lends some support to this.  I can see many
>> uses for two types of comparison operators (as well as the proposed
>> matrix-type operators), one set for poor or boolean comparisons and
>> one for rich or non-boolean comparisons.  For example, numeric arrays
>> can define both.  Rich comparison operators would return an array of
>> boolean values, while poor comparison operators return a boolean value
>> by performing an implied 'and.reduce' operation.  These operators
>> provide clarity and conciseness, without much change to current Python 
>> behavior.
>
> Maybe.  That can still be decided later.  Right now, adding operators
> is not on the table for 2.1 (if only because there are two conflicting
> PEPs); adding rich comparisons *is* on the table because it doesn't
> change the parser (and because the rich comparisons idea was already
> pretty much worked out two years ago).

Yes, it was worked out previously _assuming_ rich comparisons do not
use any new operators.  

But let's stop for a moment and contemplate adding rich comparisons 
along with new comparison operators.  What do we gain?

1. The current boolean operator behavior does not have to change, and
   hence will be backward compatible.

2. It eliminates the need to decide whether or not rich comparisons
   takes precedence over boolean comparisons.

3. The new operators add additional behavior without directly impacting
   current behavior, and their use is unambiguous, at least in
   relation to current Python behavior.  You know by the operator what
   type of comparison will be returned.  This should appease Jim
   Fulton, based on his arguments in 1998 about comparison operators
   always returning a boolean value.

4. Compound objects, such as lists, could implement both rich
   and boolean comparisons.  The boolean comparison would remain as
   is, while the rich comparison would return a list of boolean
   values.  Current behavior doesn't change; just a new feature, which
   you may or may not choose to use, is added.

If we go one step further and add the matrix-style operators along
with the comparison operators, we can provide a consistent user
interface to array/complex operations without changing current Python
behavior.  If a user has no need for these new operators, he doesn't
have to use them or even know about them.  All we've done is made
Python richer without, I believe, making it more complex.  For
example, all element-wise operations could have a ':' appended to
them, e.g. '+:', '<:', etc.; and will define element-wise addition,
element-wise less-than, etc.  The traditional '*', '/', etc. operators
can then be used for matrix operations, which will appease the Matlab
people.
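The rich/poor split for sequences can be sketched in a few lines (the function names are illustrative):

```python
def rich_lt(a, b):
    # "rich" comparison: an array of per-element booleans
    return [x < y for x, y in zip(a, b)]

def poor_lt(a, b):
    # "poor" comparison: one boolean via the implied and.reduce
    return all(rich_lt(a, b))

assert rich_lt([1, 2], [3, 1]) == [True, False]
assert poor_lt([1, 2], [3, 4]) is True
assert poor_lt([1, 2], [3, 1]) is False
```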

Therefore, I don't think rich comparisons and matrix-type operators
should be considered separable.  I really think you should consider
this suggestion.  It appeases many groups by providing a consistent
and clear user interface, without greatly impacting current Python
behavior.

Always-causing-havoc-at-the-last-moment-ly Yours,
Paul

-- 
Dr. Paul Barrett       Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Group
FAX:   410-338-4767    Baltimore, MD 21218


From guido@python.org  Fri Dec 15 21:23:46 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 16:23:46 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Fri, 15 Dec 2000 16:08:02 EST."
 <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com>
Message-ID: <200012152123.QAA03566@cj20424-a.reston1.va.home.com>

> "string" is right up there with "os" and "sys" as a FIM (Frequently
> Imported Module), so the required code changes will be massive.  As
> a user, I don't see what's in it for me to endure that pain: the
> string module functions work fine!  Neither are they warts in the
> language, any more than that we say sin(pi) instead of pi.sin().
> Keeping the functions around doesn't hurt anybody that I can see.

Hm.  I'm not saying that this one will be easy.  But I don't like
having "two ways to do it".  It means more learning, etc. (you know
the drill).  We could have chosen to make the strop module support
Unicode; instead, we chose to give string objects methods and promote
the use of those methods instead of the string module.  (And in a
generous mood, we also supported Unicode in the string module -- by
providing wrappers that invoke string methods.)

If you're saying that we should give users ample time for the
transition, I'm with you.

If you're saying that you think the string module is too prominent to
ever start deprecating its use, I'm afraid we have a problem.

I'd also like to note that using the string module's wrappers incurs
the overhead of a Python function call -- using string methods is
faster.

Finally, I like the look of fields[i].strip().lower() much better than
that of string.lower(string.strip(fields[i])) -- an actual example
from mimetools.py.
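For comparison, the chained form reads left to right (the string-module spelling, string.lower(string.strip(field)), does the same work through two extra Python-level function calls):

```python
field = "  Content-TYPE  "
# method chaining: strip whitespace, then lowercase
assert field.strip().lower() == "content-type"
```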

Ideally, I would like to deprecate the entire string module, so that I
can place a single warning at its top.  This will cause a single
warning to be issued for programs that still use it (no matter how
many times it is imported).  Unfortunately, there are a couple of
things that still need it: string.letters etc., and
string.maketrans().

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gvwilson@nevex.com  Fri Dec 15 21:43:47 2000
From: gvwilson@nevex.com (Greg Wilson)
Date: Fri, 15 Dec 2000 16:43:47 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <14906.33325.5784.118110@nem-srvr.stsci.edu>
Message-ID: <002901c066e0$1b3f13c0$770a0a0a@nevex.com>

Hi, Paul; thanks for your mail.  W.r.t. adding matrix operators to Python,
you may want to take a look at the counter-arguments in PEP 0211 (attached).
Basically, I spoke with the authors of GNU Octave (a GPL'd clone of MATLAB)
about what users really used.  They felt that the only matrix operator that
really mattered was matrix-matrix multiply; other operators (including the
left and right division operators that even experienced MATLAB users often
mix up) were second order at best, and were better handled with methods or
functions.

Thanks,
Greg

p.s. PEP 0225 (also attached) is an alternative to PEP 0211 which would add
most of the MATLAB-ish operators to Python.
[Attachment: pep-0211.txt]

PEP: 211=0A=
Title: Adding New Linear Algebra Operators to Python=0A=
Version: $Revision: 1.5 $=0A=
Author: gvwilson@nevex.com (Greg Wilson)=0A=
Status: Draft=0A=
Type: Standards Track=0A=
Python-Version: 2.1=0A=
Created: 15-Jul-2000=0A=
Post-History:=0A=
=0A=
=0A=
Introduction=0A=
=0A=
    This PEP describes a conservative proposal to add linear algebra=0A=
    operators to Python 2.0.  It discusses why such operators are=0A=
    desirable, and why a minimalist approach should be adopted at this=0A=
    point.  This PEP summarizes discussions held in mailing list=0A=
    forums, and provides URLs for further information, where=0A=
    appropriate.  The CVS revision history of this file contains the=0A=
    definitive historical record.=0A=
=0A=
=0A=
Summary=0A=
=0A=
    Add a single new infix binary operator '@' ("across"), and=0A=
    corresponding special methods "__across__()", "__racross__()", and
    "__iacross__()".  This operator will perform mathematical matrix
    multiplication on NumPy arrays, and generate cross-products when
    applied to built-in sequence types.  No existing operator
    definitions will be changed.


Background

    The first high-level programming language, Fortran, was invented
    to do arithmetic.  While this is now just a small part of
    computing, there are still many programmers who need to express
    complex mathematical operations in code.

    The most influential of Fortran's successors was APL [1].  Its
    author, Kenneth Iverson, designed the language as a notation for
    expressing matrix algebra, and received the 1980 Turing Award for
    his work.

    APL's operators supported both familiar algebraic operations, such
    as vector dot product and matrix multiplication, and a wide range
    of structural operations, such as stitching vectors together to
    create arrays.  Even by programming's standards, APL is
    exceptionally cryptic: many of its symbols did not exist on
    standard keyboards, and expressions have to be read right to left.

    Most subsequent numerical languages, such as Fortran-90, MATLAB,
    and Mathematica, have tried to provide the power of APL without
    the obscurity.  Python's NumPy [2] has most of the features that
    users of such languages expect, but these are provided through
    named functions and methods, rather than overloaded operators.
    This makes NumPy clumsier than most alternatives.

    The author of this PEP therefore consulted the developers of GNU
    Octave [3], an open source clone of MATLAB.  When asked how
    important it was to have infix operators for matrix solution,
    Prof. James Rawlings replied [4]:

        I DON'T think it's a must have, and I do a lot of matrix
        inversion. I cannot remember if its A\b or b\A so I always
        write inv(A)*b instead. I recommend dropping \.

    Rawlings' feedback on other operators was similar.  It is worth
    noting in this context that notations such as "/" and "\" for
    matrix solution were invented by programmers, not mathematicians,
    and have not been adopted by the latter.

    Based on this discussion, and feedback from classes at the US
    national laboratories and elsewhere, we recommend only adding a
    matrix multiplication operator to Python at this time.  If there
    is significant user demand for syntactic support for other
    operations, these can be added in a later release.


Requirements

    The most important requirement is minimal impact on existing
    Python programs and users: the proposal must not break existing
    code (except possibly NumPy).

    The second most important requirement is the ability to handle all
    common cases cleanly and clearly.  There are nine such cases:

        |5 6| *   9   = |45 54|      MS: matrix-scalar multiplication
        |7 8|           |63 72|

          9   * |5 6| = |45 54|      SM: scalar-matrix multiplication
                |7 8|   |63 72|

        |2 3| * |4 5| = |8 15|       VE: vector elementwise multiplication

        |2 3| *  |4|  =   23         VD: vector dot product
                 |5|

         |2|  * |4 5| = | 8 10|      VO: vector outer product
         |3|            |12 15|

        |1 2| * |5 6| = | 5 12|      ME: matrix elementwise multiplication
        |3 4|   |7 8|   |21 32|

        |1 2| * |5 6| = |19 22|      MM: mathematical matrix multiplication
        |3 4|   |7 8|   |43 50|

        |1 2| * |5 6| = |19 22|      VM: vector-matrix multiplication
                |7 8|

        |5 6| *  |1|  =  |17|        MV: matrix-vector multiplication
        |7 8|    |2|     |23|

    Note that 1-dimensional vectors are treated as rows in VM, as
    columns in MV, and as both in VD and VO.  Both are special cases
    of 2-dimensional matrices (Nx1 and 1xN respectively).  We will
    therefore define the new operator only for 2-dimensional arrays,
    and provide an easy (and efficient) way for users to treat
    1-dimensional structures as 2-dimensional.
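
    The case distinctions above can be made concrete with a small
    sketch.  These are illustrative pure-Python helpers over nested
    lists; the names elementwise_mul, dot, and matmul are ours, not
    part of any proposal:

```python
# Sketch only: pure-Python helpers (hypothetical names) illustrating
# three of the nine cases, using nested lists as matrices and flat
# lists as vectors.

def elementwise_mul(a, b):          # VE/ME: combine matching elements
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def dot(u, v):                      # VD: vector dot product
    return sum(x * y for x, y in zip(u, v))

def matmul(a, b):                   # MM: mathematical matrix multiplication
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]
```

    Running these on the sample operands reproduces the table: the
    same pair of 2x2 matrices gives two different products depending
    on which flavor is meant, which is exactly the ambiguity the
    proposal addresses.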

    Third, we must avoid confusion between Python's notation and those
    of MATLAB and Fortran-90.  In particular, mathematical matrix
    multiplication (case MM) should not be represented as '.*', since:

    (a) MATLAB uses prefix-'.' forms to mean 'elementwise', and raw
        forms to mean "mathematical"; and

    (b) even if the Python parser can be taught how to handle dotted
        forms, '1.*A' will still be visually ambiguous.


Proposal

    The meanings of all existing operators will be unchanged.  In
    particular, 'A*B' will continue to be interpreted elementwise.
    This takes care of the cases MS, SM, VE, and ME, and ensures
    minimal impact on existing programs.

    A new operator '@' (pronounced "across") will be added to Python,
    along with special methods "__across__()", "__racross__()", and
    "__iacross__()", with the usual semantics.  (We recommend using
    "@", rather than the times-like "><", because of the ease with
    which the latter could be mis-typed as the inequality "<>".)
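
    Python never grew "__across__()"; the "@" operator eventually
    arrived in Python 3.5 via PEP 465, dispatching to "__matmul__()",
    "__rmatmul__()" and "__imatmul__()" instead.  A minimal sketch of
    the dispatch this paragraph describes, written against that
    later-adopted protocol (the Mat class is hypothetical):

```python
# Illustration of the proposed special-method dispatch, using the
# protocol Python eventually adopted for "@" (PEP 465: __matmul__).
class Mat:
    def __init__(self, rows):
        self.rows = rows

    def __matmul__(self, other):        # plays the role of __across__
        a, b = self.rows, other.rows
        return Mat([[sum(a[i][k] * b[k][j] for k in range(len(b)))
                     for j in range(len(b[0]))] for i in range(len(a))])
```

    With this in place, "Mat([[1, 2], [3, 4]]) @ Mat([[5, 6], [7, 8]])"
    multiplies mathematically while "*" remains free for elementwise
    use, which is the division of labor the PEP proposes.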

    No new operators will be defined to mean "solve a set of linear
    equations", or "invert a matrix".

    (Optional) When applied to sequences, the "@" operator will return
    a tuple of tuples containing the cross-product of their elements
    in left-to-right order:

    >>> [1, 2] @ (3, 4)
    ((1, 3), (1, 4), (2, 3), (2, 4))

    >>> [1, 2] @ (3, 4) @ (5, 6)
    ((1, 3, 5), (1, 3, 6),
     (1, 4, 5), (1, 4, 6),
     (2, 3, 5), (2, 3, 6),
     (2, 4, 5), (2, 4, 6))

    This will require the same kind of special support from the parser
    as chained comparisons (such as "a<b<c<=d").  However, it will
    permit:

    >>> for (i, j) in [1, 2] @ [3, 4]:
    ...     print i, j
    1 3
    1 4
    2 3
    2 4

    as a short-hand for the common nested loop idiom:

    >>> for i in [1, 2]:
    ...    for j in [3, 4]:
    ...        print i, j

    Responses to the 'lockstep loop' questionnaire [5] indicated that
    newcomers would be comfortable with this (so comfortable, in fact,
    that most of them interpreted most multi-loop 'zip' syntaxes [6]
    as implementing single-stage nesting).
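
    This optional cross-product semantics is exactly what the standard
    library's itertools.product later provided; a sketch of the
    equivalence (the across() helper is hypothetical):

```python
from itertools import product

# The proposed "[1, 2] @ (3, 4)" cross-product yields the same tuples,
# in the same left-to-right order, as itertools.product.
def across(*seqs):
    return tuple(product(*seqs))
```

    The chained form "[1, 2] @ (3, 4) @ (5, 6)" corresponds to passing
    all three sequences at once, matching the PEP's second example.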


Alternatives

    01. Don't add new operators.

    Python is not primarily a numerical language; it may not be worth
    complexifying it for this special case.  NumPy's success is proof
    that users can and will use functions and methods for linear
    algebra.  However, support for real matrix multiplication is
    frequently requested, as:

    * functional forms are cumbersome for lengthy formulas, and do not
      respect the operator precedence rules of conventional mathematics;
      and

    * method forms are asymmetric in their operands.

    What's more, the proposed semantics for "@" for built-in sequence
    types would simplify expression of a very common idiom (nested
    loops).  User testing during discussion of 'lockstep loops'
    indicated that both new and experienced users would understand
    this immediately.

    02. Introduce prefixed forms of all existing operators, such as
        "~*" and "~+", as proposed in PEP 0225 [7].

    This proposal would duplicate all built-in mathematical operators
    with matrix equivalents, as in numerical languages such as
    MATLAB.  Our objections to this are:

    * Python is not primarily a numerical programming language.  While
      the (self-selected) participants in the discussions that led to
      PEP 0225 may want all of these new operators, the majority of
      Python users would be indifferent.  The extra complexity they
      would introduce into the language therefore does not seem
      merited.  (See also Rawlings' comments, quoted in the Background
      section, about these operators not being essential.)

    * The proposed syntax is difficult to read (i.e. it fails the "low
      toner" readability test).

    03. Retain the existing meaning of all operators, but create a
        behavioral accessor for arrays, such that:

            A * B

        is elementwise multiplication (ME), but:

            A.m() * B.m()

        is mathematical multiplication (MM).  The method "A.m()" would
        return an object that aliased A's memory (for efficiency), but
        which had a different implementation of __mul__().

    This proposal was made by Moshe Zadka, and is also considered by
    PEP 0225 [7].  Its advantage is that it has no effect on the
    existing implementation of Python: changes are localized in the
    Numeric module.  The disadvantages are:

    * The semantics of "A.m() * B", "A + B.m()", and so on would have
      to be defined, and there is no "obvious" choice for them.

    * Aliasing objects to trigger different operator behavior feels
      less Pythonic than either calling methods (as in the existing
      Numeric module) or using a different operator.  This PEP is
      primarily about look and feel, and about making Python more
      attractive to people who are not already using it.
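
    A minimal sketch of Zadka's accessor idea, with hypothetical names
    and no memory aliasing (a real implementation would alias storage
    rather than copy):

```python
class Array:
    def __init__(self, rows):
        self.rows = rows

    def __mul__(self, other):                     # ME: elementwise
        return Array([[x * y for x, y in zip(ra, rb)]
                      for ra, rb in zip(self.rows, other.rows)])

    def m(self):                                  # behavioral accessor
        return MathView(self.rows)

class MathView(Array):
    def __mul__(self, other):                     # MM: mathematical
        a, b = self.rows, other.rows
        return Array([[sum(a[i][k] * b[k][j] for k in range(len(b)))
                       for j in range(len(b[0]))] for i in range(len(a))])
```

    Note that in this sketch "A.m() * B" also multiplies
    mathematically, because only the left operand's __mul__() is
    consulted; that is one arbitrary answer to the mixed-operand
    question raised in the first disadvantage above.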


Related Proposals

    0207 :  Rich Comparisons

            It may become possible to overload comparison operators
            such as '<' so that an expression such as 'A < B' returns
            an array, rather than a scalar value.

    0209 :  Adding Multidimensional Arrays

            Multidimensional arrays are currently an extension to
            Python, rather than a built-in type.

    0225 :  Elementwise/Objectwise Operators

            A larger proposal that addresses the same subject, but
            which proposes many more additions to the language.


Acknowledgments

    I am grateful to Huaiyu Zhu [8] for initiating this discussion,
    and for some of the ideas and terminology included below.


References

    [1] http://www.acm.org/sigapl/whyapl.htm
    [2] http://numpy.sourceforge.net
    [3] http://bevo.che.wisc.edu/octave/
    [4] http://www.egroups.com/message/python-numeric/4
    [5] http://www.python.org/pipermail/python-dev/2000-July/013139.html
    [6] PEP-0201.txt "Lockstep Iteration"
    [7] http://www.python.org/pipermail/python-list/2000-August/112529.html


Appendix: NumPy

    NumPy will overload "@" to perform mathematical multiplication of
    arrays where shapes permit, and to throw an exception otherwise.
    Its implementation of "@" will treat built-in sequence types as if
    they were column vectors.  This takes care of the cases MM and MV.

    An attribute "T" will be added to the NumPy array type, such that
    "m.T" is:

    (a) the transpose of "m" for a 2-dimensional array;

    (b) the 1xN matrix transpose of "m" if "m" is a 1-dimensional
        array; or

    (c) a runtime error for an array with rank >= 3.

    This attribute will alias the memory of the base object.  NumPy's
    "transpose()" function will be extended to turn built-in sequence
    types into row vectors.  This takes care of the VM, VD, and VO
    cases.  We propose an attribute because:

    (a) the resulting notation is similar to the 'superscript T' (at
        least, as similar as ASCII allows), and

    (b) it signals that the transposition aliases the original object.

    NumPy will define a value "inv", which will be recognized by the
    exponentiation operator, such that "A ** inv" is the inverse of
    "A".  This is similar in spirit to NumPy's existing "newaxis"
    value.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:


PEP: 225
Title: Elementwise/Objectwise Operators
Version: $Revision: 1.2 $
Author: hzhu@users.sourceforge.net (Huaiyu Zhu),
        gregory.lielens@fft.be (Gregory Lielens)
Status: Draft
Type: Standards Track
Python-Version: 2.1
Created: 19-Sep-2000
Post-History:


Introduction

    This PEP describes a proposal to add new operators to Python that
    are useful for distinguishing elementwise and objectwise
    operations, and summarizes discussions of this topic in the news
    group comp.lang.python.  See the Credits and Archives section at
    the end.  Issues discussed here include:

    - Background.
    - Description of the proposed operators and implementation issues.
    - Analysis of alternatives to new operators.
    - Analysis of alternative forms.
    - Compatibility issues.
    - Description of wider extensions and other related ideas.

    A substantial portion of this PEP describes ideas that do not go
    into the proposed extension.  They are presented because the
    extension is essentially syntactic sugar, so its adoption must be
    weighed against various possible alternatives.  While many
    alternatives may be better in some aspects, the current proposal
    appears to be advantageous overall.

    The issues concerning elementwise-objectwise operations extend to
    wider areas than numerical computation.  This document also
    describes how the current proposal may be integrated with more
    general future extensions.


Background

    Python provides six binary infix math operators: + - * / % **,
    hereafter generically represented by "op".  They can be overloaded
    with new semantics for user-defined classes.  However, for objects
    composed of homogeneous elements, such as arrays, vectors and
    matrices in numerical computation, there are two essentially
    distinct flavors of semantics.  The objectwise operations treat
    these objects as points in multidimensional spaces.  The
    elementwise operations treat them as collections of individual
    elements.  These two flavors of operations are often intermixed in
    the same formulas, thereby requiring syntactical distinction.

    Many numerical computation languages provide two sets of math
    operators.  For example, in MatLab, the ordinary op is used for
    objectwise operation while .op is used for elementwise operation.
    In R, op stands for elementwise operation while %op% stands for
    objectwise operation.

    In Python, there are other methods of representation, some of
    which are already used by available numerical packages, such as:

    - function:   mul(a,b)
    - method:     a.mul(b)
    - casting:    a.E*b

    In several aspects these are not as adequate as infix operators.
    More details will be shown later, but the key points are:

    - Readability: Even for moderately complicated formulas, infix
      operators are much cleaner than the alternatives.

    - Familiarity: Users are familiar with ordinary math operators.

    - Implementation: New infix operators will not unduly clutter
      Python syntax.  They will greatly ease the implementation of
      numerical packages.

    While it is possible to assign the current math operators to one
    flavor of semantics, there are simply not enough infix operators
    left to overload for the other flavor.  It is also impossible to
    maintain visual symmetry between these two flavors if one of them
    does not contain symbols for ordinary math operators.
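
    The readability comparison can be seen in miniature.  The helpers
    below are stand-ins with hypothetical names, not any package's
    real API; the point is only the shape of the resulting formulas:

```python
# Hypothetical stand-ins contrasting the representations listed above.
def mul(a, b):                      # function form
    return a * b

class Num:
    def __init__(self, v):
        self.v = v
    def mul(self, other):           # method form
        return Num(self.v * other.v)

# Function form:  mul(mul(a, b), c)   -- prefix nesting replaces precedence
# Method form:    a.mul(b).mul(c)     -- asymmetric in its operands
# Infix form:     a * b * c           -- reads like the mathematics
```

    Even for this three-term product, the prefix and method forms
    obscure the structure that the infix form shows directly.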


Proposed extension

    - Six new binary infix operators ~+ ~- ~* ~/ ~% ~** are added to
      core Python.  They parallel the existing operators + - * / % **.

    - Six augmented assignment operators ~+= ~-= ~*= ~/= ~%= ~**= are
      added to core Python.  They parallel the operators += -= *= /=
      %= **= available in Python 2.0.

    - Operator ~op retains the syntactical properties of operator op,
      including precedence.

    - Operator ~op retains the semantical properties of operator op on
      built-in number types.

    - Operator ~op raises a syntax error on non-number built-in types.
      This is temporary, until the proper behavior can be agreed upon.

    - These operators are overloadable in classes with names that
      prepend "t" (for tilde) to the names of the ordinary math
      operators.  For example, __tadd__ and __rtadd__ work for ~+
      just as __add__ and __radd__ work for +.

    - As with existing operators, the __r*__() methods are invoked
      when the left operand does not provide the appropriate method.

    It is intended that one set of op or ~op is used for elementwise
    operations and the other for objectwise operations, but it is not
    specified which version stands for elementwise and which for
    objectwise operations; the decision is left to applications.

    The proposed implementation is to patch several files relating to
    the tokenizer, parser, grammar and compiler to duplicate the
    functionality of the corresponding existing operators as
    necessary.  All new semantics are to be implemented in the classes
    that overload them.
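
    The naming rule and right-operand fallback can be sketched without
    new syntax.  The dispatcher below is hypothetical: it does by hand
    for "~+" what the interpreter does for "+", trying __tadd__ and
    then falling back to __rtadd__:

```python
# Hypothetical dispatcher mirroring how "a ~+ b" would resolve under
# the naming rule above: try a.__tadd__(b), then b.__rtadd__(a).
def tilde_add(a, b):
    if hasattr(type(a), "__tadd__"):
        return a.__tadd__(b)
    if hasattr(type(b), "__rtadd__"):
        return b.__rtadd__(a)
    raise TypeError("unsupported operand types for ~+")

class Elem:
    def __init__(self, data):
        self.data = data
    def __tadd__(self, other):      # elementwise flavor of +
        return Elem([x + y for x, y in zip(self.data, other.data)])
```

    A class that defines __tadd__ gets the new operator; operands that
    define neither method raise TypeError, just as with the existing
    operators.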

    The symbol ~ is already used in Python as the unary "bitwise not"
    operator.  Currently it is not allowed for binary operators.  The
    new operators are completely backward compatible.


Prototype Implementation

    Greg Lielens implemented the infix ~op as a patch against the
    Python 2.0b1 source [1].

    To allow ~ to be part of binary operators, the tokenizer would
    treat ~+ as one token.  This means that the currently valid
    expression ~+1 would be tokenized as ~+ 1 instead of ~ + 1.  The
    parser would then treat ~+ as the composite of ~ and +.  The
    effect is invisible to applications.

    Notes about the current patch:

    - It does not include the ~op= operators yet.

    - The ~op operators behave the same as op on lists, instead of
      raising exceptions.

    These should be fixed when the final version of this proposal is
    ready.

    - It reserves xor as an infix operator with semantics equivalent
      to:

        def __xor__(a, b):
            if not b: return a
            elif not a: return b
            else: return 0

      This preserves a true value where possible, and otherwise
      preserves the left-hand value if possible.

      This is done so that bitwise operators could be regarded as
      elementwise logical operators in the future (see below).


Alternatives to adding new operators

    The discussions on comp.lang.python and the python-dev mailing
    list explored many alternatives.  Some of the leading alternatives
    are listed here, using the multiplication operator as an example.

    1. Use the function mul(a,b).

       Advantage:
       - No need for new operators.

       Disadvantage:
       - Prefix forms are cumbersome for composite formulas.
       - Unfamiliar to the intended users.
       - Too verbose for the intended users.
       - Unable to use natural precedence rules.

    2. Use the method call a.mul(b).

       Advantage:
       - No need for new operators.

       Disadvantage:
       - Asymmetric in its operands.
       - Unfamiliar to the intended users.
       - Too verbose for the intended users.
       - Unable to use natural precedence rules.

    3. Use "shadow classes".  For the matrix class, define a shadow
       array class accessible through a method .E, so that for
       matrices a and b, a.E*b would be a matrix object that is
       elementwise_mul(a,b).

       Likewise define a shadow matrix class for arrays accessible
       through a method .M, so that for arrays a and b, a.M*b would be
       an array that is matrixwise_mul(a,b).

       Advantage:
       - No need for new operators.
       - Benefits of infix operators with correct precedence rules.
       - Clean formulas in applications.

       Disadvantage:
       - Hard to maintain in current Python because ordinary numbers
         cannot have user-defined class methods; i.e. a.E*b will fail
         if a is a pure number.
       - Difficult to implement, as this will interfere with existing
         method calls, like .T for transpose, etc.
       - Runtime overhead of object creation and method lookup.
       - The shadowing class cannot replace a true class, because it
         does not return its own type.  So there would need to be an M
         class with a shadow E class, and an E class with a shadow M
         class.
       - Unnatural to mathematicians.

    4. Implement matrixwise and elementwise classes with easy casting
       to the other class.  So matrixwise operations for arrays would
       be like a.M*b.M, and elementwise operations for matrices would
       be like a.E*b.E.  For error detection, a.E*b.M would raise
       exceptions.

       Advantage:
       - No need for new operators.
       - Similar to infix notation with correct precedence rules.

       Disadvantage:
       - Similar difficulty due to the lack of user methods on pure
         numbers.
       - Runtime overhead of object creation and method lookup.
       - More cluttered formulas.
       - Switching the flavor of objects to facilitate operators
         becomes persistent.  This introduces long-range context
         dependencies in application code that would be extremely hard
         to maintain.

    5. Use a mini-parser to parse formulas written in an arbitrary
       extension and placed in quoted strings.

       Advantage:
       - Pure Python, without new operators.

       Disadvantage:
       - The actual syntax is within the quoted string, which does not
         resolve the problem itself.
       - Introduces zones of special syntax.
       - Demanding on the mini-parser.

    6. Introduce a single operator, such as @, for matrix
       multiplication.

       Advantage:
       - Introduces fewer operators.

       Disadvantage:
       - The distinctions for operators like + - ** are equally
         important.  Their meanings in matrix-oriented and
         array-oriented packages would be reversed (see below).
       - The new operator occupies a special character.
       - This does not work well with more general object-element
         issues.

    Among these alternatives, the first and second are used in current
    applications to some extent, but have been found inadequate.  The
    third is the favorite for applications, but it would incur huge
    implementation complexity.  The fourth would make application code
    very context-sensitive and hard to maintain.  These two
    alternatives also share significant implementation difficulties
    due to the current type/class split.  The fifth appears to create
    more problems than it would solve.  The sixth does not cover the
    same range of applications.


Alternative forms of infix operators

    Two major forms and several minor variants of new infix operators
    were discussed:

    - Bracketed form:

        (op)
        [op]
        {op}
        <op>
        :op:
        ~op~
        %op%

    - Meta-character form:

        .op
        @op
        ~op

      Alternatively, the meta character is put after the operator.

    - Less consistent variations of these themes.  These are viewed
      unfavorably.  For completeness some are listed here:

        - Use @/ and /@ for left and right division.
        - Use [*] and (*) for outer and inner products.
        - Use a single operator @ for multiplication.

    - Use __call__ to simulate multiplication:
      a(b)  or (a)(b)

    Criteria for choosing among the representations include:

        - No syntactical ambiguities with existing operators.

        - Higher readability in actual formulas.  This makes the
          bracketed forms unfavorable.  See the examples below.

        - Visual similarity to existing math operators.

        - Syntactic simplicity, without blocking possible future
          extensions.

    By these criteria, the overall winner among the bracketed forms
    appears to be {op}, and the clear winner among the meta-character
    forms is ~op.  Comparing the two, ~op appears to be the favorite
    among them all.

    Some analysis follows:

        - The .op form is ambiguous: 1.+a would be different from
          1 .+a.

        - The bracket-type operators are most favorable when standing
          alone, but not in formulas, as they interfere with the
          visual parsing of parentheses for precedence and function
          arguments.  This is so for (op) and [op], and somewhat less
          so for {op} and <op>.

        - The <op> form has the potential to be confused with < > and
          =.

        - The @op form is not favored because @ is visually heavy
          (dense, more like a letter): a@+b is more readily read as
          a@ + b than as a @+ b.

        - On the choice of meta-characters: most existing ASCII
          symbols have already been used.  The only three unused are
          @ $ ?.


Semantics of new operators

    There are convincing arguments for using either set of operators
    as objectwise or elementwise.  Some of them are listed here:

    1. op for element, ~op for object:

       - Consistent with the current multiarray interface of the
         Numeric package.
       - Consistent with some other languages.
       - Perception that elementwise operations are more natural.
       - Perception that elementwise operations are used more
         frequently.

    2. op for object, ~op for element:

       - Consistent with the current linear algebra interface of the
         MatPy package.
       - Consistent with some other languages.
       - Perception that objectwise operations are more natural.
       - Perception that objectwise operations are used more
         frequently.
       - Consistent with the current behavior of operators on lists.
       - Allows ~ to be a general elementwise meta-character in future
         extensions.

    It is generally agreed that:

       - there is no absolute reason to favor one or the other;
       - it is easy to cast from one representation to another in a
         sizable chunk of code, so the other flavor of operators is
         always in the minority;
       - there are other semantic differences that favor the existence
         of array-oriented and matrix-oriented packages, even if their
         operators are unified;
       - whatever decision is taken, code using existing interfaces
         should not be broken for a very long time.

    Therefore not much is lost, and much flexibility is retained, if
    the semantic flavors of these two sets of operators are not
    dictated by the core language.  The application packages are
    responsible for making the most suitable choice.  This is already
    the case for NumPy and MatPy, which use opposite semantics.
    Adding new operators will not break this.  See also the
    observation after subsection 2 in the Examples below.

    The issue of numerical precision was raised, but if the semantics
    are left to the applications, the actual precision decisions
    should also go there.


Examples

    Following are examples of the actual formulas that would appear
    using the various operators or other representations described
    above.

    1. The matrix inversion formula:

       - Using op for object and ~op for element:

         b = a.I - a.I * u / (c.I + v/a*u) * v / a

         b = a.I - a.I * u * (c.I + v*a.I*u).I * v * a.I

       - Using op for element and ~op for object:

         b = a.I @- a.I @* u @/ (c.I @+ v@/a@*u) @* v @/ a

         b = a.I ~- a.I ~* u ~/ (c.I ~+ v~/a~*u) ~* v ~/ a

         b = a.I (-) a.I (*) u (/) (c.I (+) v(/)a(*)u) (*) v (/) a

         b = a.I [-] a.I [*] u [/] (c.I [+] v[/]a[*]u) [*] v [/] a

         b = a.I <-> a.I <*> u </> (c.I <+> v</>a<*>u) <*> v </> a

         b = a.I {-} a.I {*} u {/} (c.I {+} v{/}a{*}u) {*} v {/} a

       Observation: For linear algebra, using op for object is
       preferable.

       Observation: The ~op type operators look better than the (op)
       type in complicated formulas.

       - Using named operators:

         b = a.I @sub a.I @mul u @div (c.I @add v @div a @mul u) @mul v @div a

         b = a.I ~sub a.I ~mul u ~div (c.I ~add v ~div a ~mul u) ~mul v ~div a

       Observation: Named operators are not suitable for math
       formulas.

    2. Plotting a 3d graph:

       - Using op for object and ~op for element:

         z = sin(x~**2 ~+ y~**2);    plot(x,y,z)

       - Using op for element and ~op for object:

         z = sin(x**2 + y**2);   plot(x,y,z)

       Observation: Elementwise operations with broadcasting allow a
       much more efficient implementation than MatLab's.

       Observation: It is useful to have two related classes with the
       semantics of op and ~op swapped.  Using these, the ~op
       operators would only need to appear in chunks of code where the
       other flavor dominates, while maintaining consistent semantics
       of the code.

    3. Using + and - with automatic broadcasting:

         a = b - c;  d = a.T*a

       Observation: This would silently produce hard-to-trace bugs if
       one of b or c is a row vector while the other is a column
       vector.


Miscellaneous issues:

    - Need for the ~+ ~- operators.  The objectwise + - are important
      because they provide the sanity checks of linear algebra.  The
      elementwise + - are important because they allow broadcasting,
      which is very efficient in applications.

    - Left division (solve).  For matrices, a*x is not necessarily
      equal to x*a.  The solution of a*x==b, denoted x=solve(a,b), is
      therefore different from the solution of x*a==b, denoted
      x=div(b,a).  There are discussions about finding a new symbol
      for solve.  [Background: MatLab uses b/a for div(b,a) and a\b
      for solve(a,b).]

      It is recognized that Python provides a better solution without
      requiring a new symbol: the inverse method .I can be made to be
      delayed, so that a.I*b and b*a.I are equivalent to MatLab's a\b
      and b/a.  The implementation is quite simple and the resulting
      application code clean.
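
      A sketch of the delayed-inverse idea, restricted to 2x2 systems
      and pure Python (the class names are ours): a.I returns a
      placeholder, and the multiplication a.I * b solves a x = b
      directly by Cramer's rule rather than ever forming inv(a).

```python
# Sketch: "a.I * b" computes solve(a, b) without materializing inv(a).
class Mat2:
    def __init__(self, rows):
        self.rows = rows

    @property
    def I(self):
        return _DelayedInv(self)        # no work done yet

class _DelayedInv:
    def __init__(self, m):
        self.m = m

    def __mul__(self, b):               # a.I * b  ==  solve(a, b)
        (p, q), (r, s) = self.m.rows
        det = p * s - q * r
        x, y = b
        return [(s * x - q * y) / det, (p * y - r * x) / det]
```

      A full implementation would likewise define __rmul__ so that
      "b * a.I" performs the right-division div(b, a).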

    - Power operator.  Python's use of a**b as pow(a,b) has two
      perceived disadvantages:

      - Most mathematicians are more familiar with a^b for this
        purpose.
      - It results in the long augmented assignment operator ~**=.

      However, this issue is distinct from the main issue here.

    - Additional multiplication operators.  Several forms of
      multiplication are used in (multi-)linear algebra.  Most can be
      seen as variations of multiplication in the linear algebra sense
      (such as the Kronecker product).  But two forms appear to be
      more fundamental: the outer product and the inner product.
      However, their specification includes indices, which can be
      either

      - associated with the operator, or
      - associated with the objects.

      The latter (the Einstein notation) is used extensively on paper,
      and is also the easier one to implement.  By implementing a
      tensor-with-indices class, a general form of multiplication
      would cover both outer and inner products, and specialize to
      linear algebra multiplication as well.  The index rule can be
      defined as class methods, like:

          a = b.i(1,2,-1,-2) * c.i(4,-2,3,-1)   # a_ijkl = b_ijmn c_lnkm

      Therefore one objectwise multiplication is sufficient.
=0A=
    - Bitwise operators. =0A=
=0A=
      - The proposed new math operators use the symbol ~ that is=0A=
        "bitwise not" operator.  This poses no compatibility problem=0A=
        but somewhat complicates implementation.=0A=
=0A=
      - The symbol ^ might be better used for pow than bitwise xor.=0A=
        But this depends on the future of bitwise operators.  It does=0A=
        not immediately impact on the proposed math operator.=0A=
=0A=
      - The symbol | was suggested to be used for matrix solve.  But=0A=
        the new solution of using delayed .I is better in several=0A=
        ways.=0A=
=0A=
      - The current proposal fits in a larger and more general=0A=
        extension that will remove the need for special bitwise=0A=
        operators.  (See elementization below.)=0A=
=0A=
    - Alternative to special operator names used in definition,=0A=
=0A=
          def "+"(a, b)      in place of       def __add__(a, b)=0A=
=0A=
      This appears to require greater syntactical change, and would=0A=
      only be useful when arbitrary additional operators are allowed.=0A=
=0A=
=0A=
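    The delayed-inverse idea under "Left division" can be sketched in
    current Python.  The following is an illustration only: the Mat
    and DelayedInverse names and the 2x2 Cramer-rule solver are
    invented for this sketch and are not part of the proposal.

```python
class DelayedInverse:
    """Stands for a.I; no inverse is formed until a multiplication."""
    def __init__(self, a):
        self.a = a

    def __mul__(self, b):
        # a.I * b  ->  solve a*x == b  (MatLab's a\b)
        return solve(self.a, b)

    def __rmul__(self, b):
        # b * a.I  ->  solve x*a == b  (MatLab's b/a), via transposes
        return transpose(solve(transpose(self.a), transpose(b)))

class Mat:
    """A toy 2x2 matrix type exposing the delayed inverse as .I."""
    def __init__(self, rows):
        self.rows = rows

    def __getattr__(self, name):
        if name == 'I':
            return DelayedInverse(self.rows)
        raise AttributeError(name)

def transpose(m):
    return [list(col) for col in zip(*m)]

def solve(a, b):
    """Solve a*x == b for a 2x2 matrix a and a 2x1 column b (Cramer)."""
    (a11, a12), (a21, a22) = a
    b1, b2 = b[0][0], b[1][0]
    det = a11 * a22 - a12 * a21
    return [[(b1 * a22 - a12 * b2) / det],
            [(a11 * b2 - a21 * b1) / det]]

a = Mat([[2.0, 0.0], [0.0, 4.0]])
print(a.I * [[2.0], [8.0]])   # [[1.0], [2.0]]
print([[2.0, 8.0]] * a.I)     # [[1.0, 2.0]]
```

    With this, a.I*b computes solve(a,b) and b*a.I computes div(b,a),
    without ever forming the inverse explicitly.
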
Impact on general elementization

    The distinction between objectwise and elementwise operations is
    meaningful in other contexts as well, where an object can be
    conceptually regarded as a collection of elements.  It is
    important that the current proposal does not preclude possible
    future extensions.

    One general future extension is to use ~ as a meta operator to
    "elementize" a given operator.  Several examples are listed here:

    1. Bitwise operators.  Currently Python assigns six operators to
       bitwise operations: and (&), or (|), xor (^), complement (~),
       left shift (<<) and right shift (>>), with their own precedence
       levels.

       Among them, the & | ^ ~ operators can be regarded as
       elementwise versions of lattice operators applied to integers
       regarded as bit strings.

           5 and 6                # 6
           5 or 6                 # 5

           5 ~and 6               # 4
           5 ~or 6                # 7

       These can be regarded as general elementwise lattice operators,
       not restricted to bits in integers.

       In order to have the named operators xor and ~xor, it is
       necessary to make xor a reserved word.

    2. List arithmetic.

           [1, 2] + [3, 4]        # [1, 2, 3, 4]
           [1, 2] ~+ [3, 4]       # [4, 6]

           ['a', 'b'] * 2         # ['a', 'b', 'a', 'b']
           'ab' * 2               # 'abab'

           ['a', 'b'] ~* 2        # ['aa', 'bb']
           [1, 2] ~* 2            # [2, 4]

       It is also consistent with the Cartesian product

           [1,2]*[3,4]            # [(1,3),(1,4),(2,3),(2,4)]

    3. List comprehension.

           a = [1, 2]; b = [3, 4]
           ~f(a,b)                # [f(x,y) for x, y in zip(a,b)]
           ~f(a*b)                # [f(x,y) for x in a for y in b]
           a ~+ b                 # [x + y for x, y in zip(a,b)]

    4. Tuple generation (the zip function in Python 2.0)

          [1, 2, 3], [4, 5, 6]   # ([1, 2, 3], [4, 5, 6])
          [1, 2, 3]~,[4, 5, 6]   # [(1, 4), (2, 5), (3, 6)]

    5. Using ~ as a generic elementwise meta-character to replace map

          ~f(a, b)               # map(f, a, b)
          ~~f(a, b)              # map(lambda *x: map(f, *x), a, b)

       More generally,

          def ~f(*x): return map(f, *x)
          def ~~f(*x): return map(~f, *x)
          ...

    6. Elementwise format operator (with broadcasting)

          a = [1, 2, 3, 4, 5]
          print ["%5d "] ~% a
          a = [[1, 2], [3, 4]]
          print ["%5d "] ~~% a

    7. Rich comparison

          [1, 2, 3]  ~< [3, 2, 1]  # [1, 0, 0]
          [1, 2, 3] ~== [3, 2, 1]  # [0, 1, 0]

    8. Rich indexing

          [a, b, c, d] ~[2, 3, 1]  # [c, d, b]

    9. Tuple flattening

          a = (1, 2);  b = (3, 4)
          f(~a, ~b)                # f(1, 2, 3, 4)

    10. Copy operator

          a ~= b                   # a = b.copy()

        There can be specific levels of deep copy

          a ~~= b                  # a = b.copy(2)

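    Most of the elementwise forms listed above can already be written
    out in today's Python, which pins down the intended semantics.  A
    few equivalences, for illustration only:

```python
# Elementwise list arithmetic (the proposed [1, 2] ~+ [3, 4]):
assert [x + y for x, y in zip([1, 2], [3, 4])] == [4, 6]

# Elementwise scaling (the proposed [1, 2] ~* 2):
assert [x * 2 for x in [1, 2]] == [2, 4]

# Cartesian product (one reading of [1, 2] * [3, 4]):
assert [(x, y) for x in [1, 2] for y in [3, 4]] == \
       [(1, 3), (1, 4), (2, 3), (2, 4)]

# Elementwise function application (the proposed ~f(a, b)):
def f(x, y):
    return x + y

assert list(map(f, [1, 2], [3, 4])) == [4, 6]

# Elementwise comparison (the proposed [1, 2, 3] ~< [3, 2, 1]):
assert [int(x < y) for x, y in zip([1, 2, 3], [3, 2, 1])] == [1, 0, 0]

# Tuple generation (the zip function, new in Python 2.0):
assert list(zip([1, 2, 3], [4, 5, 6])) == [(1, 4), (2, 5), (3, 6)]
```

    None of this requires new syntax; the proposal is about making
    such operations concise, not newly possible.
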
    Notes:

    1. There are probably many other similar situations.  This general
       approach seems well suited for most of them, in place of
       several separate extensions for each of them (parallel and
       cross iteration, list comprehension, rich comparison, etc).

    2. The semantics of "elementwise" depends on the application.  For
       example, an element of a matrix is two levels down from the
       list-of-lists point of view.  This requires a more fundamental
       change than the current proposal.  In any case, the current
       proposal will not negatively impact future possibilities of
       this nature.

    Note that this section describes a type of future extension that
    is consistent with the current proposal, but may present
    additional compatibility or other problems.  They are not tied to
    the current proposal.


Impact on named operators

    The discussions made it generally clear that infix operators are a
    scarce resource in Python, not only in numerical computation, but
    in other fields as well.  Several proposals and ideas were put
    forward that would allow infix operators to be introduced in ways
    similar to named functions.  We show here that the current
    extension does not negatively impact future extensions in this
    regard.

    1. Named infix operators.

        Choose a meta character, say @, so that for any identifier
        "opname", the combination "@opname" would be a binary infix
        operator, and

        a @opname b == opname(a,b)

        Other representations mentioned include .name ~name~ :name:
        (.name) %name% and similar variations.  The pure bracket based
        operators cannot be used this way.

        This requires a change in the parser to recognize @opname, and
        parse it into the same structure as a function call.  The
        precedence of all these operators would have to be fixed at
        one level, so the implementation would be different from the
        additional math operators, which keep the precedence of
        existing math operators.

        The currently proposed extension does not limit possible
        future extensions of such form in any way.

    2. More general symbolic operators.

        One additional form of future extension is to use a meta
        character and operator symbols (symbols that cannot be used in
        syntactical structures other than operators).  Suppose @ is
        the meta character.  Then

            a + b,    a @+ b,    a @@+ b,  a @+- b

        would all be operators with a hierarchy of precedence, defined by

            def "+"(a, b)
            def "@+"(a, b)
            def "@@+"(a, b)
            def "@+-"(a, b)

        One advantage compared with named operators is greater
        flexibility for precedences based on either the meta character
        or the ordinary operator symbols.  This also allows operator
        composition.  The disadvantage is that they are more like
        "line noise".  In any case the current proposal does not
        preclude this future possibility.

        These kinds of future extensions may not be necessary when
        Unicode becomes generally available.

        Note that this section discusses compatibility of the proposed
        extension with possible future extensions.  The desirability
        or compatibility of these other extensions themselves is
        specifically not considered here.

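    As an aside, named infix operators can already be approximated in
    current Python by piggybacking on the reflected form of an
    existing operator.  The well-known trick below is purely
    illustrative (the Infix helper is not part of any proposal) but
    gives a flavor of what a @opname b would feel like:

```python
class Infix:
    """Wrap a two-argument function for use as:  a | op | b."""
    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # a | op  ->  an Infix with the left operand bound
        return Infix(lambda right: self.func(left, right))

    def __or__(self, right):
        # (a | op) | b  ->  apply the bound function to b
        return self.func(right)

dot = Infix(lambda a, b: sum(x * y for x, y in zip(a, b)))

print([1, 2, 3] | dot | [4, 5, 6])   # 1*4 + 2*5 + 3*6 == 32
```

    The precedence is fixed at that of |, which mirrors the
    fixed-precedence limitation of parser-level named operators.
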
Credits and archives

    The discussions mostly happened in July and August of 2000 on the
    newsgroup comp.lang.python and the mailing list python-dev.  There
    are altogether several hundred postings, most of which can be
    retrieved from these two pages (by searching for the word
    "operator"):

        http://www.python.org/pipermail/python-list/2000-July/
        http://www.python.org/pipermail/python-list/2000-August/

    The names of the contributors are too numerous to mention here;
    suffice it to say that a large proportion of the ideas discussed
    here are not our own.

    Several key postings (from our point of view) that may help to
    navigate the discussions include:

        http://www.python.org/pipermail/python-list/2000-July/108893.html
        http://www.python.org/pipermail/python-list/2000-July/108777.html
        http://www.python.org/pipermail/python-list/2000-July/108848.html
        http://www.python.org/pipermail/python-list/2000-July/109237.html
        http://www.python.org/pipermail/python-list/2000-July/109250.html
        http://www.python.org/pipermail/python-list/2000-July/109310.html
        http://www.python.org/pipermail/python-list/2000-July/109448.html
        http://www.python.org/pipermail/python-list/2000-July/109491.html
        http://www.python.org/pipermail/python-list/2000-July/109537.html
        http://www.python.org/pipermail/python-list/2000-July/109607.html
        http://www.python.org/pipermail/python-list/2000-July/109709.html
        http://www.python.org/pipermail/python-list/2000-July/109804.html
        http://www.python.org/pipermail/python-list/2000-July/109857.html
        http://www.python.org/pipermail/python-list/2000-July/110061.html
        http://www.python.org/pipermail/python-list/2000-July/110208.html
        http://www.python.org/pipermail/python-list/2000-August/111427.html
        http://www.python.org/pipermail/python-list/2000-August/111558.html
        http://www.python.org/pipermail/python-list/2000-August/112551.html
        http://www.python.org/pipermail/python-list/2000-August/112606.html
        http://www.python.org/pipermail/python-list/2000-August/112758.html

        http://www.python.org/pipermail/python-dev/2000-July/013243.html
        http://www.python.org/pipermail/python-dev/2000-July/013364.html
        http://www.python.org/pipermail/python-dev/2000-August/014940.html

    These are earlier drafts of this PEP:

        http://www.python.org/pipermail/python-list/2000-August/111785.html
        http://www.python.org/pipermail/python-list/2000-August/112529.html
        http://www.python.org/pipermail/python-dev/2000-August/014906.html

    There is an alternative PEP (with official PEP number 211) by Greg
    Wilson, titled "Adding New Linear Algebra Operators to Python".

    Its first (and current) version is at:

        http://www.python.org/pipermail/python-dev/2000-August/014876.html
        http://python.sourceforge.net/peps/pep-0211.html


Additional References

    [1] http://MatPy.sourceforge.net/Misc/index.html


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:




From guido@python.org  Fri Dec 15 21:55:46 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 16:55:46 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: Your message of "Fri, 15 Dec 2000 16:20:20 EST."
 <14906.33325.5784.118110@nem-srvr.stsci.edu>
References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> <14906.17712.830224.481130@nem-srvr.stsci.edu> <200012151951.OAA03219@cj20424-a.reston1.va.home.com>
 <14906.33325.5784.118110@nem-srvr.stsci.edu>
Message-ID: <200012152155.QAA03879@cj20424-a.reston1.va.home.com>

> > Maybe.  That can still be decided later.  Right now, adding operators
> > is not on the table for 2.1 (if only because there are two conflicting
> > PEPs); adding rich comparisons *is* on the table because it doesn't
> > change the parser (and because the rich comparisons idea was already
> > pretty much worked out two years ago).
> 
> Yes, it was worked out previously _assuming_ rich comparisons do not
> use any new operators.  
> 
> But let's stop for a moment and contemplate adding rich comparisons 
> along with new comparison operators.  What do we gain?
> 
> 1. The current boolean operator behavior does not have to change, and
>    hence will be backward compatible.

What incompatibility do you see in the current proposal?

> 2. It eliminates the need to decide whether or not rich comparisons
>    takes precedence over boolean comparisons.

Only if you want different semantics -- that's only an issue for NumPy.

> 3. The new operators add additional behavior without directly impacting 
>    current behavior, and the use of them is unambiguous, at least in
>    relation to current Python behavior.  You know by the operator what 
>    type of comparison will be returned.  This should appease Jim
>    Fulton, based on his arguments in 1998 about comparison operators
>    always returning a boolean value.

As you know, I'm now pretty close to Jim. :-)  He seemed pretty mellow
about this now.

> 4. Compound objects, such as lists, could implement both rich
>    and boolean comparisons.  The boolean comparison would remain as
>    is, while the rich comparison would return a list of boolean
>    values.  Current behavior doesn't change; just a new feature, which
>    you may or may not choose to use, is added.
> 
> If we go one step further and add the matrix-style operators along
> with the comparison operators, we can provide a consistent user
> interface to array/complex operations without changing current Python
> behavior.  If a user has no need for these new operators, he doesn't
> have to use them or even know about them.  All we've done is made
> Python richer without, I believe, making it more complex.  For
> example, all element-wise operations could have a ':' appended to
> them, e.g. '+:', '<:', etc.; and will define element-wise addition,
> element-wise less-than, etc.  The traditional '*', '/', etc. operators
> can then be used for matrix operations, which will appease the Matlab
> people.
> 
> Therefore, I don't think rich comparisons and matrix-type operators
> should be considered separable.  I really think you should consider
> this suggestion.  It appeases many groups while providing a consistent
> and clear user interface, without greatly impacting current Python
> behavior.
> 
> Always-causing-havoc-at-the-last-moment-ly Yours,

I think you misunderstand.  Rich comparisons are mostly about allowing
the separate overloading of <, <=, ==, !=, >, and >=.  This is useful
in its own right.

If you don't want to use this overloading facility for elementwise
comparisons in NumPy, that's fine with me.  Nobody says you have to --
it's just that you *could*.

Read my lips: there won't be *any* new operators in 2.1.

There will be a better way to overload the existing Boolean operators,
and they will be able to return non-Boolean results.  That's useful in
other situations besides NumPy.
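
For a concrete picture, here is a sketch of the per-operator
overloading this enables (the Vec class and the method names used are
illustrative only):

```python
class Vec:
    """A toy sequence with elementwise rich comparisons (illustration)."""
    def __init__(self, data):
        self.data = list(data)

    def __lt__(self, other):
        # < is overloaded on its own, independently of the others ...
        return [int(x < y) for x, y in zip(self.data, other.data)]

    def __eq__(self, other):
        # ... and == may return a non-Boolean result
        return [int(x == y) for x, y in zip(self.data, other.data)]

print(Vec([1, 2, 3]) < Vec([3, 2, 1]))    # [1, 0, 0]
print(Vec([1, 2, 3]) == Vec([3, 2, 1]))   # [0, 1, 0]
```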

Feel free to lobby for elementwise operators -- but based on the
discussion about this subject so far, I don't give it much of a chance
even past Python 2.1.  They would add a lot of baggage to the language
(e.g. the table of operators in all Python books would be about twice
as long) and by far most users don't care about them.  (Read the
intro to 211 for some of the concerns -- this PEP tries to make the
addition palatable by adding exactly *one* new operator.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Dec 15 22:16:34 2000
From: guido@python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 17:16:34 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Fri, 08 Dec 2000 17:58:03 EST."
 <200012082258.RAA02389@cj20424-a.reston1.va.home.com>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
 <200012082258.RAA02389@cj20424-a.reston1.va.home.com>
Message-ID: <200012152216.RAA11098@cj20424-a.reston1.va.home.com>

I've checked in the essential parts of the warnings PEP, and closed
the SF patch.  I haven't checked in the examples in the patch -- it's
too early for that.  But I figured that it's easier to revise the code
once it's checked in.  I'm pretty confident that it works as
advertised.

Still missing is documentation: the warnings module, the new API
functions, and the new command line option should all be documented.
I'll work on that over the holidays.

I consider the PEP done.
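
In rough terms, issuing and catching a warning with the module looks
like this (a sketch using present-day names; the exact 2.1-era API
details may differ):

```python
import warnings

def old_api():
    # stacklevel=2 points the warning at the caller, not at old_api itself
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42

# Record the warning instead of printing it, to show that it fires.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_api()

assert len(caught) == 1
assert issubclass(caught[0].category, DeprecationWarning)
```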

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Fri Dec 15 22:21:24 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:21:24 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com>
Message-ID: <3A3A9964.A6B3DD11@lemburg.com>

Neil Schemenauer wrote:
> 
> On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote:
> > I'm not sure I agree with that view either, but mostly because there
> > is a non-GPL replacement for parts of the readline API:
> >
> >     http://www.cstr.ed.ac.uk/downloads/editline.html
> 
> It doesn't work with the current readline module.  It is much
> smaller than readline and works just as well in my experience.
> Would there be any interest in including a copy with the standard
> distribution?  The license is quite nice (X11 type).

+1 from here -- line editing is simply a very important part of
an interactive prompt and readline is not only big, slow and
full of strange surprises, but also GPLed ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Fri Dec 15 22:24:34 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:24:34 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com>
Message-ID: <3A3A9A22.E9BA9551@lemburg.com>

"A.M. Kuchling" wrote:
> 
> unicodeobject.c contains this code:
> 
>                 PyErr_Format(PyExc_ValueError,
>                             "unsupported format character '%c' (0x%x) "
>                             "at index %i",
>                             c, c, fmt - 1 - PyUnicode_AS_UNICODE(uformat));
> 
> c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits,
> so '%\u3000' % 1 results in an error message containing "'\000'
> (0x3000)".  Is this worth fixing?  I'd say no, since the hex value is
> more useful for Unicode strings anyway.  (I still wanted to mention
> this little buglet, since I just touched this bit of code.)

Why would you want to fix it?  Format characters will always
be ASCII and thus 7-bit -- there's really no need to expand the
set of possibilities beyond 8 bits ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fdrake@acm.org  Fri Dec 15 22:22:34 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 15 Dec 2000 17:22:34 -0500 (EST)
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012152216.RAA11098@cj20424-a.reston1.va.home.com>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
 <200012082258.RAA02389@cj20424-a.reston1.va.home.com>
 <200012152216.RAA11098@cj20424-a.reston1.va.home.com>
Message-ID: <14906.39338.795843.947683@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > Still missing is documentation: the warnings module, the new API
 > functions, and the new command line option should all be documented.
 > I'll work on that over the holidays.

  I've assigned a bug to you in case you forget.  I've given it a
"show-stopper" priority level, so I'll feel good ripping the code out
if you don't get docs written in time.  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From mal@lemburg.com  Fri Dec 15 22:39:18 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:39:18 +0100
Subject: [Python-Dev] What to do about PEP 229?
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com>
Message-ID: <3A3A9D96.80781D61@lemburg.com>

"A.M. Kuchling" wrote:
> 
> I began writing the fabled fancy setup script described in PEP 229,
> and then realized there was duplication going on here.  The code in
> setup.py would need to know what libraries, #defines, &c., are needed
> by each module in order to check if they're needed and set them.  But
> if Modules/Setup can be used to override setup.py's behaviour, then
> much of this information would need to be in that file, too; the
> details of compiling a module are in two places.
> 
> Possibilities:
> 
> 1) Setup contains fully-loaded module descriptions, and the setup
>    script drops unneeded bits.  For example, the socket module
>    requires -lnsl on some platforms.  The Setup file would contain
>    "socket socketmodule.c -lnsl" on all platforms, and setup.py would
>    check for an nsl library and only use if it's there.
> 
>    This seems dodgy to me; what if -ldbm is needed on one platform and
>    -lndbm on another?

Can't distutils try both and then settle for the working combination?

[distutils isn't really ready for auto-configure yet, but Greg
has already provided most of the needed functionality -- it's just
not well integrated into the rest of the build process in version 1.0.1
... BTW, where is Greg? I haven't heard from him in quite a while.]
 
> 2) Drop setup completely and just maintain setup.py, with some
>    different overriding mechanism.  This is more radical.  Adding a
>    new module is then not just a matter of editing a simple text file;
>    you'd have to modify setup.py, making it more like maintaining an
>    autoconf script.

Why not parse Setup and use it as input to distutils setup.py?
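
Such a parse is cheap.  A hypothetical helper (all names invented
here) could turn a Setup-style line into the keyword arguments a
distutils Extension expects:

```python
def parse_setup_line(line):
    """Split a Modules/Setup-style line such as
    'socket socketmodule.c -lnsl' into name, sources, and libraries."""
    name, *rest = line.split()
    return {
        'name': name,
        'sources': [p for p in rest if p.endswith('.c')],
        'libraries': [p[2:] for p in rest if p.startswith('-l')],
    }

info = parse_setup_line('socket socketmodule.c -lnsl')
# {'name': 'socket', 'sources': ['socketmodule.c'], 'libraries': ['nsl']}
```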
 
> Remember, the underlying goal of PEP 229 is to have the out-of-the-box
> Python installation you get from "./configure;make" contain many more
> useful modules; right now you wouldn't get zlib, syslog, resource, any
> of the DBM modules, PyExpat, &c.  I'm not wedded to using Distutils to
> get that, but think that's the only practical way; witness the hackery
> required to get the DB module automatically compiled.
> 
> You can also wave your hands in the direction of packagers such as
> ActiveState or Red Hat, and say "let them work out how to compile everything".
> But this problem actually inconveniences *me*, since I always build
> Python myself and have to extensively edit Setup, so I'd like to fix
> the problem.
> 
> Thoughts?

Nice idea :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Fri Dec 15 22:44:15 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:44:15 +0100
Subject: [Python-Dev] Death to string functions!
References: <200012151509.HAA18093@slayer.i.sourceforge.net>
 <20001215041450.B22056@glacier.fnational.com> <200012151929.OAA03073@cj20424-a.reston1.va.home.com>
Message-ID: <3A3A9EBF.3F9306B6@lemburg.com>

Guido van Rossum wrote:
> 
> > Can you explain the logic behind this recent interest in removing
> string functions from the standard library?  Is it performance?
> > Some unicode issue?  I don't have a great attachment to string.py
> > but I also don't see the justification for the amount of work it
> > requires.
> 
> I figure that at *some* point we should start putting our money where
> our mouth is, deprecate most uses of the string module, and start
> warning about it.  Not in 2.1 probably, given my experience below.
> 
> As a realistic test of the warnings module I played with some warnings
> about the string module, and then found that most of the std
> library modules use it, triggering an extraordinary number of
> warnings.  I then decided to experiment with the conversion.  I
> quickly found out it's too much work to do manually, so I'll hold off
> until someone comes up with a tool that does 99% of the work.

This would also help a lot of programmers out there who are
stuck with 100k LOCs of Python code using string.py ;)
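
Most of such a conversion tool would be a mechanical table of rewrites
from string-module calls to the equivalent string methods (a sketch;
string.maketrans() and the constants are the known holdouts):

```python
# The bulk of the string module maps one-for-one onto string methods:
#
#     string.upper(s)          ->  s.upper()
#     string.strip(s)          ->  s.strip()
#     string.split(s, sep)     ->  s.split(sep)
#     string.join(words, sep)  ->  sep.join(words)   # note: arguments swap

assert "  Hello, world  ".strip().upper() == "HELLO, WORLD"
assert "a,b,c".split(",") == ["a", "b", "c"]
assert "-".join(["a", "b", "c"]) == "a-b-c"
```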

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Fri Dec 15 22:49:01 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:49:01 +0100
Subject: [Python-Dev] Death to string functions!
References: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com> <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
Message-ID: <3A3A9FDD.E6F021AF@lemburg.com>

Guido van Rossum wrote:
> 
> Ideally, I would like to deprecate the entire string module, so that I
> can place a single warning at its top.  This will cause a single
> warning to be issued for programs that still use it (no matter how
> many times it is imported).  Unfortunately, there are a couple of
> things that still need it: string.letters etc., and
> string.maketrans().

Can't we come up with a module similar to unicodedata[.py] ? 

string.py could then still provide the interfaces, but the
implementation would live in stringdata.py

[Perhaps we won't need stringdata by then... Unicode will have
 taken over and the discussion be moot ;-)]

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From thomas@xs4all.net  Fri Dec 15 22:54:25 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 15 Dec 2000 23:54:25 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <20001215040304.A22056@glacier.fnational.com>; from nas@arctrix.com on Fri, Dec 15, 2000 at 04:03:04AM -0800
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com>
Message-ID: <20001215235425.A29681@xs4all.nl>

On Fri, Dec 15, 2000 at 04:03:04AM -0800, Neil Schemenauer wrote:
> On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote:
> > I'm not sure I agree with that view either, but mostly because there
> > is a non-GPL replacement for parts of the readline API:
> > 
> >     http://www.cstr.ed.ac.uk/downloads/editline.html
> 
> It doesn't work with the current readline module.  It is much
> smaller than readline and works just as well in my experience.
> Would there be any interest in including a copy with the standard
> distribution?  The license is quite nice (X11 type).

Definitely +1 from here. Readline reminds me of the cold war, for some
reason. (Actually, multiple reasons ;) I don't have time to do it myself,
unfortunately, or I would. (Looking at editline has been on my TODO list for
a while... :P)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From martin@loewis.home.cs.tu-berlin.de  Sat Dec 16 12:32:30 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 16 Dec 2000 13:32:30 +0100
Subject: [Python-Dev] PEP 226
Message-ID: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de>

I remember earlier discussion on the Python 2.1 release schedule, and
never managed to comment on those.

I believe that Python contributors and maintainers did an enormous
job in releasing Python 2, which took quite some time from everybody's
life. I think it is unrealistic to expect the same amount of
commitment for the next release, especially if that release appears
just a few months after the previous release (that is, one month from
now).

So I'd like to ask the release manager to take that into
account. I'm not quite sure what kind of action I expect; possible
alternatives are:
- declare 2.1 a pure bug fix release only; with a minimal set of new
  features. In particular, don't push for completion of PEPs; everybody
  should then accept that most features that are currently discussed
  will appear in Python 2.2.

- move the schedule for Python 2.1 back (or is it forward?) by, say, a
  few months. This will give people some time to do the things that did
  not get the right amount of attention during the 2.0 release, and will
  still allow work on new and interesting features.

Just my 0.02EUR,

Martin


From guido@python.org  Sat Dec 16 16:38:28 2000
From: guido@python.org (Guido van Rossum)
Date: Sat, 16 Dec 2000 11:38:28 -0500
Subject: [Python-Dev] PEP 226
In-Reply-To: Your message of "Sat, 16 Dec 2000 13:32:30 +0100."
 <200012161232.NAA01779@loewis.home.cs.tu-berlin.de>
References: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de>
Message-ID: <200012161638.LAA13888@cj20424-a.reston1.va.home.com>

> I remember earlier discussion on the Python 2.1 release schedule, and
> never managed to comment on those.
> 
> I believe that Python contributors and maintainers did an enormous
> job in releasing Python 2, which took quite some time from everybody's
> life. I think it is unrealistic to expect the same amount of
> commitment for the next release, especially if that release appears
> just a few months after the previous release (that is, one month from
> now).
> 
> So I'd like to ask the release manager to take that into
> account. I'm not quite sure what kind of action I expect; possible
> alternatives are:
> - declare 2.1 a pure bug fix release only; with a minimal set of new
>   features. In particular, don't push for completion of PEPs; everybody
>   should then accept that most features that are currently discussed
>   will appear in Python 2.2.
> 
> - move the schedule for Python 2.1 back (or is it forward?) by, say, a
>   few months. This will give people some time to do the things that did
>   not get the right amount of attention during the 2.0 release, and will
>   still allow work on new and interesting features.
> 
> Just my 0.02EUR,

You're right -- 2.0 (including 1.6) was a monumental effort, and I'm
grateful to all who contributed.

I don't expect that 2.1 will be anywhere near the same amount of work!

Let's look at what's on the table.

0042  Small Feature Requests                 Hylton
 SD  205  pep-0205.txt  Weak References                        Drake
 S   207  pep-0207.txt  Rich Comparisons                       Lemburg, van Rossum
 S   208  pep-0208.txt  Reworking the Coercion Model           Schemenauer
 S   217  pep-0217.txt  Display Hook for Interactive Use       Zadka
 S   222  pep-0222.txt  Web Library Enhancements               Kuchling
 I   226  pep-0226.txt  Python 2.1 Release Schedule            Hylton
 S   227  pep-0227.txt  Statically Nested Scopes               Hylton
 S   230  pep-0230.txt  Warning Framework                      van Rossum
 S   232  pep-0232.txt  Function Attributes                    Warsaw
 S   233  pep-0233.txt  Python Online Help                     Prescod


From guido@python.org  Sat Dec 16 16:46:32 2000
From: guido@python.org (Guido van Rossum)
Date: Sat, 16 Dec 2000 11:46:32 -0500
Subject: [Python-Dev] PEP 226
In-Reply-To: Your message of "Sat, 16 Dec 2000 13:32:30 +0100."
 <200012161232.NAA01779@loewis.home.cs.tu-berlin.de>
References: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de>
Message-ID: <200012161646.LAA13947@cj20424-a.reston1.va.home.com>

[Oops, I posted a partial edit of this message by mistake before.]

> I remember earlier discussion on the Python 2.1 release schedule, and
> never managed to comment on those.
> 
> I believe that Python contributors and maintainers did an enormous
> job in releasing Python 2, which took quite some time from everybody's
> life. I think it is unrealistic to expect the same amount of
> commitment for the next release, especially if that release appears
> just a few months after the previous release (that is, one month from
> now).
> 
> So I'd like to ask the release manager to take that into
> account. I'm not quite sure what kind of action I expect; possible
> alternatives are:
> - declare 2.1 a pure bug-fix release, with a minimal set of new
>   features. In particular, don't push for completion of PEPs; everybody
>   should then accept that most features that are currently discussed
>   will appear in Python 2.2.
> 
> - move the schedule for Python 2.1 back (or is it forward?) by, say, a
>   few months. This will give people some time to do the things that did
>   not get the right amount of attention during the 2.0 release, and will
>   still allow work on new and interesting features.
> 
> Just my 0.02EUR,

You're right -- 2.0 (including 1.6) was a monumental effort, and I'm
grateful to all who contributed.

I don't expect that 2.1 will be anywhere near the same amount of work!

Let's look at what's on the table.  These are listed as Active PEPs --
under serious consideration for Python 2.1:

> 0042  Small Feature Requests                 Hylton

We can do some of these or leave them.

> 0205  Weak References                        Drake

This one's open.

> 0207  Rich Comparisons                       Lemburg, van Rossum

This is really not that much work -- I would've done it already if I
weren't distracted by the next one.

> 0208  Reworking the Coercion Model           Schemenauer

Neil has most of this under control.  I don't doubt for a second that
it will be finished.

> 0217  Display Hook for Interactive Use       Zadka

Probably a 20-line fix.

> 0222  Web Library Enhancements               Kuchling

Up to Andrew.  If he doesn't get to it, no big deal.

> 0226  Python 2.1 Release Schedule            Hylton

I still think this is realistic -- a release before the conference
seems doable!

> 0227  Statically Nested Scopes               Hylton

This one's got a 50% chance at least.  Jeremy seems motivated to do
it.

> 0230  Warning Framework                      van Rossum

Done except for documentation.

> 0232  Function Attributes                    Warsaw

We need to discuss this more, but it's not much work to implement.

> 0233  Python Online Help                     Prescod

If Paul can control his urge to want to solve everything at once, I
see no reason why this one couldn't find its way into 2.1.

Now, officially the PEP deadline is closed today: the schedule says
"16-Dec-2000: 2.1 PEPs ready for review".  That means that no new PEPs
will be considered for inclusion in 2.1, and PEPs not in the active
list won't be considered either.  But the PEPs in the list above are
all ready for review, even if we don't agree with all of them.

I'm actually more worried about the ever-growing number of bug reports
and submitted patches.  But that's for another time.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Sun Dec 17 00:09:28 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Sat, 16 Dec 2000 19:09:28 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: <3A3A9A22.E9BA9551@lemburg.com>; from mal@lemburg.com on Fri, Dec 15, 2000 at 11:24:34PM +0100
References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> <3A3A9A22.E9BA9551@lemburg.com>
Message-ID: <20001216190928.A6703@kronos.cnri.reston.va.us>

On Fri, Dec 15, 2000 at 11:24:34PM +0100, M.-A. Lemburg wrote:
>Why would you want to fix it ? Format characters will always
>be ASCII and thus 7-bit -- theres really no need to expand the
>set of possibilities beyond 8 bits ;-)

This message is for characters that aren't format characters, which
therefore includes all characters >127.  

--amk



From akuchlin@mems-exchange.org  Sun Dec 17 00:17:39 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Sat, 16 Dec 2000 19:17:39 -0500
Subject: [Python-Dev] What to do about PEP 229?
In-Reply-To: <3A3A9D96.80781D61@lemburg.com>; from mal@lemburg.com on Fri, Dec 15, 2000 at 11:39:18PM +0100
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com>
Message-ID: <20001216191739.B6703@kronos.cnri.reston.va.us>

On Fri, Dec 15, 2000 at 11:39:18PM +0100, M.-A. Lemburg wrote:
>Can't distutils try both and then settle for the working combination ?

I'm worried about subtle problems: what if an unneeded -lfoo drags in
a customized malloc, or has symbols which conflict with some other
library?

>... BTW, where is Greg ? I haven't heard from him in quite a while.]

Still around; he just hasn't been posting much these days.

>Why not parse Setup and use it as input to distutils setup.py ?

That was option 1.  The existing Setup format doesn't really contain
enough intelligence, though; the intelligence is usually in comments
such as "Uncomment the following line for Solaris".  So either the
Setup format is modified (bad, since we'd break existing 3rd-party
packages that still use a Makefile.pre.in), or I give up and just do
everything in a setup.py.

--amk


From guido@python.org  Sun Dec 17 02:38:01 2000
From: guido@python.org (Guido van Rossum)
Date: Sat, 16 Dec 2000 21:38:01 -0500
Subject: [Python-Dev] What to do about PEP 229?
In-Reply-To: Your message of "Sat, 16 Dec 2000 19:17:39 EST."
 <20001216191739.B6703@kronos.cnri.reston.va.us>
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com>
 <20001216191739.B6703@kronos.cnri.reston.va.us>
Message-ID: <200012170238.VAA14466@cj20424-a.reston1.va.home.com>

> >Why not parse Setup and use it as input to distutils setup.py ?
> 
> That was option 1.  The existing Setup format doesn't really contain
> enough intelligence, though; the intelligence is usually in comments
> such as "Uncomment the following line for Solaris".  So either the
> Setup format is modified (bad, since we'd break existing 3rd-party
> packages that still use a Makefile.pre.in), or I give up and just do
> everything in a setup.py.

Forget Setup.  Convert it and be done with it.  There really isn't
enough there to hang on to.  We'll support Setup format (through the
makesetup script and the Misc/Makefile.pre.in file) for 3rd party b/w
compatibility, but we won't need to use it ourselves.  (Too bad for
3rd party documentation that describes the Setup format. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Sun Dec 17 07:34:27 2000
From: tim.one@home.com (Tim Peters)
Date: Sun, 17 Dec 2000 02:34:27 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: <20001216190928.A6703@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEEHIEAA.tim.one@home.com>

[MAL]
> Why would you want to fix it ? Format characters will always
> be ASCII and thus 7-bit -- theres really no need to expand the
> set of possibilities beyond 8 bits ;-)

[AMK]
> This message is for characters that aren't format characters, which
> therefore includes all characters >127.

I'm with the wise man who suggested dropping the %c in this case and just
displaying the hex value.  Although it would be more readable to drop the
%c if and only if the bogus format character isn't printable 7-bit ASCII.
Which is obvious, yes?  A new if/else isn't going to hurt anything.
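A minimal sketch of that rule (illustrative names only, not the actual
getargs.c code):

```python
def format_char_error(ch):
    # Show the bogus format character itself only when it is printable
    # 7-bit ASCII; otherwise fall back to its hex value.
    if 32 <= ch < 127:
        return "unsupported format character '%c' (0x%x)" % (ch, ch)
    return "unsupported format character 0x%x" % ch
```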



From tim.one@home.com  Sun Dec 17 07:57:01 2000
From: tim.one@home.com (Tim Peters)
Date: Sun, 17 Dec 2000 02:57:01 -0500
Subject: [Python-Dev] PEP 226
In-Reply-To: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEHIEAA.tim.one@home.com>

[Martin v. Loewis]
> ...
> - move the schedule for Python 2.1 back (or is it forward?) by, say, a
>   few months. This will give people some time to do the things that did
>   not get the right amount of attention during the 2.0 release, and will
>   still allow work on new and interesting features.

Just a stab in the dark, but is one of your real concerns the spotty state
of Unicode support in the std libraries?  If so, nobody working on the PEPs
Guido identified would be likely to work on improving Unicode support even
if the PEPs vanished.  I don't know how Unicode support is going to improve,
but in the absence of visible work in that direction-- or even A Plan to get
some --I doubt we're going to hold up 2.1 waiting for magic.

no-feature-is-ever-done-ly y'rs  - tim



From tim.one@home.com  Sun Dec 17 08:30:24 2000
From: tim.one@home.com (Tim Peters)
Date: Sun, 17 Dec 2000 03:30:24 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <3A387D6A.782E6A3B@prescod.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEIIEAA.tim.one@home.com>

[Tim]
>> I've rarely seen problems due to shadowing a global, but have often
>> seen problems due to shadowing a builtin.

[Paul Prescod]
> Really?

Yes.

> I think that there are two different issues here. One is consciously
> choosing to create a new variable but not understanding that there
> already exists a variable by that name. (i.e. str, list).

Yes, and that's what I've often seen, typically long after the original code
is written:  someone sticks in some debugging output, or makes a small
change to the implementation, and introduces e.g.

    str = some_preexisting_var + ":"
    yadda(str)

"Suddenly" the program misbehaves in baffling ways.  They're "baffling"
because the errors do not occur on the lines where the changes were made,
and are almost never related to the programmer's intent when making the
changes.
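A contrived sketch of that failure mode (hypothetical code):

```python
def report(lines):
    str = lines[0] + ":"               # innocently shadows the builtin str
    header = str                       # works fine at first...
    # ...much later a small change needs the builtin again:
    return header + str(len(lines))    # TypeError: 'str' object is not callable
```

The traceback points at the last line, far from the innocent-looking
assignment that actually caused it.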

> Another is trying to assign to a global but actually shadowing it.

I've rarely seen that.

> There is no way that anyone coming from another language is going
> to consider this transcript reasonable:

True, but I don't really care:  everyone gets burned once, the better ones
eventually learn to use classes instead of mutating globals, and even the
dull get over it.  It is not, in my experience, an on-going problem for
anyone.  But I still get burned regularly by shadowing builtins.  The burns
are not fatal, however, and I can't think of an ointment less painful than
the blisters.

> >>> a=5
> >>> def show():
> ...    print a
> ...
> >>> def set(val):
> ...     a=val
> ...
> >>> a
> 5
> >>> show()
> 5
> >>> set(10)
> >>> show()
> 5
>
> It doesn't seem to make any sense. My solution is to make the assignment
> in "set" illegal unless you add a declaration that says: "No, really. I
> mean it. Override that sucker." As the PEP points out, overriding is
> seldom a good idea so the requirement to declare would be rarely
> invoked.

I expect it would do less harm to introduce a compile-time warning for
locals that are never referenced (such as the "a" in "set").

> ...
> The "right answer" in terms of namespace theory is to consistently refer
> to builtins with a prefix (whether "__builtins__" or "$") but that's
> pretty unpalatable from an aesthetic point of view.

Right, that's one of the ointments I won't apply to my own code, so wouldn't
think of asking others to either.

WRT mutable globals, people who feel they have to use them would be well
served to adopt a naming convention.  For example, begin each name with "g"
and capitalize the second letter.  This can make global-rich code much
easier to follow (I've done-- and very happily --similar things in
Javascript and C++).



From pf@artcom-gmbh.de  Sun Dec 17 09:59:11 2000
From: pf@artcom-gmbh.de (Peter Funk)
Date: Sun, 17 Dec 2000 10:59:11 +0100 (MET)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 15, 2000  4:23:46 pm"
Message-ID: <m147ab2-000CxUC@artcom0.artcom-gmbh.de>

Hi,

Guido van Rossum:
> If you're saying that you think the string module is too prominent to
> ever start deprecating its use, I'm afraid we have a problem.
 
I strongly believe the string module is too prominent.

> I'd also like to note that using the string module's wrappers incurs
> the overhead of a Python function call -- using string methods is
> faster.

I think most people care more about readability than about run-time
performance.  For people without much OOP experience, the method
syntax hurts readability.
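(For the record, the 2.0 string module functions are mostly thin wrappers
delegating to the new methods, which is where the extra Python-level call
comes from.  A sketch of the idea, not the actual string.py source:)

```python
def lower(s):
    # string.lower(s) boils down to one extra Python-level function
    # call wrapped around the method itself.
    return s.lower()

assert lower("  SPAM  ".strip()) == "spam"
```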

> Finally, I like the look of fields[i].strip().lower() much better than
> that of string.lower(string.strip(fields[i])) -- an actual example
> from mimetools.py.

Hmmmm.... Maybe this is just a matter of taste?  Like my preference
for '<>' instead of '!='?  Personally I still like the old-fashioned
form more, especially if string.join() or string.split() are
involved.

Since Python 1.5.2 will stay around for several years, keeping backward 
compatibility in our Python coding is still a major issue for us.
So we won't change our Python coding style soon, if ever.

> Ideally, I would like to deprecate the entire string module, so that I
[...]
I share Mark Lutz's and Tim Peters' opinion that this crusade will do 
more harm than good to the Python community.  IMO this is a really bad idea.

Just my $0.02, Peter


From martin@loewis.home.cs.tu-berlin.de  Sun Dec 17 11:13:09 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 17 Dec 2000 12:13:09 +0100
Subject: [Python-Dev] PEP 226
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEEHIEAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCOEEHIEAA.tim.one@home.com>
Message-ID: <200012171113.MAA00733@loewis.home.cs.tu-berlin.de>

> Just a stab in the dark, but is one of your real concerns the spotty state
> of Unicode support in the std libraries?  

Not at all. I really responded to amk's message

# All the PEPs for 2.1 are supposed to be complete for Dec. 16, and
# some of those PEPs are pretty complicated.  I'm a bit worried that
# it's been so quiet on python-dev lately, especially after the
# previous two weeks of lively discussion.

I just thought that something was wrong here - contributing to a free
software project ought to be fun for contributors, not a cause for
worries.

There-are-other-things-but-i18n-although-they-are-not-that-interesting y'rs,
Martin


From guido@python.org  Sun Dec 17 14:38:07 2000
From: guido@python.org (Guido van Rossum)
Date: Sun, 17 Dec 2000 09:38:07 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: Your message of "Sun, 17 Dec 2000 03:30:24 EST."
 <LNBBLJKPBEHFEDALKOLCKEEIIEAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCKEEIIEAA.tim.one@home.com>
Message-ID: <200012171438.JAA21603@cj20424-a.reston1.va.home.com>

> I expect it would do less harm to introduce a compile-time warning for
> locals that are never referenced (such as the "a" in "set").

Another warning that would be quite useful (and trap similar cases)
would be "local variable used before set".
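Both warnings would catch variants of the same sketch (names are
illustrative; in current Python the used-before-set case already surfaces
at run time as UnboundLocalError):

```python
a = 5

def set_a(val):
    a = val        # binds a NEW local 'a'; never referenced again

def broken():
    b = a + 1      # looks like a read of the global...
    a = b          # ...but this assignment makes 'a' local to the whole
                   # function, so the read above fails at run time

set_a(10)
assert a == 5      # the global was never touched

raised = False
try:
    broken()
except UnboundLocalError:
    raised = True
assert raised
```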

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun Dec 17 14:40:40 2000
From: guido@python.org (Guido van Rossum)
Date: Sun, 17 Dec 2000 09:40:40 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Sun, 17 Dec 2000 10:59:11 +0100."
 <m147ab2-000CxUC@artcom0.artcom-gmbh.de>
References: <m147ab2-000CxUC@artcom0.artcom-gmbh.de>
Message-ID: <200012171440.JAA21620@cj20424-a.reston1.va.home.com>

> I think most people care more about readability than about run-time
> performance.  For people without much OOP experience, the method
> syntax hurts readability.

I don't believe one bit of this.  By that standard, we would do better
to define a new module "list" and start writing list.append(L, x) for
L.append(x).

> I share Mark Lutz's and Tim Peters' opinion that this crusade will do 
> more harm than good to the Python community.  IMO this is a really bad
> idea.

You are entitled to your opinion, but given that your arguments seem
very weak I will continue to ignore it (except to argue with you :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Sun Dec 17 16:17:12 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Sun, 17 Dec 2000 11:17:12 -0500
Subject: [Python-Dev] Death to string functions!
References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
 <m147ab2-000CxUC@artcom0.artcom-gmbh.de>
Message-ID: <14908.59144.321167.419762@anthem.concentric.net>

>>>>> "PF" == Peter Funk <pf@artcom-gmbh.de> writes:

    PF> Hmmmm.... Maybe this is just a matter of taste?  Like my
    PF> preference for '<>' instead of '!='?  Personally I still like
    PF> the old-fashioned form more, especially if string.join() or
    PF> string.split() are involved.

Hey cool!  I prefer <> over != too, but I also (not surprisingly)
strongly prefer string methods over string module functions.

TOOWTDI-MA-ly y'rs,
-Barry


From gvwilson@nevex.com  Sun Dec 17 16:25:17 2000
From: gvwilson@nevex.com (Greg Wilson)
Date: Sun, 17 Dec 2000 11:25:17 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <14908.59144.321167.419762@anthem.concentric.net>
Message-ID: <000201c06845$f1afdb40$770a0a0a@nevex.com>

+1 on deprecating string functions.  Every Python book and tutorial
(including mine) emphasizes Python's simplicity and lack of Perl-ish
redundancy; the more we practice what we preach, the more persuasive
this argument is.

Greg (who admittedly only has a few thousand lines of Python to maintain)


From pf@artcom-gmbh.de  Sun Dec 17 17:40:06 2000
From: pf@artcom-gmbh.de (Peter Funk)
Date: Sun, 17 Dec 2000 18:40:06 +0100 (MET)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012171440.JAA21620@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 17, 2000  9:40:40 am"
Message-ID: <m147hn4-000CxUC@artcom0.artcom-gmbh.de>

[string.function(S, ...) vs. S.method(...)]

Guido van Rossum:
> I don't believe one bit of this.  By that standard, we would do better
> to define a new module "list" and start writing list.append(L, x) for
> L.append(x).

List objects have only a few methods, but strings have many.  Some of
them have names that clash easily with the method names of other kinds
of objects.  Since there are no type declarations in Python, looking
at the code in isolation and seeing a line
	i = string.index(some_parameter, sub)
tells at first glance that some_parameter should be a string object,
even if the doc string of this function is too terse.  However, in
	i = some_parameter.index(sub)
it could be a list, a database or whatever.

> You are entitled to your opinion, but given that your arguments seem
> very weak I will continue to ignore it (except to argue with you :-).

I see.  But given that the string module won't go away any time soon,
I guess I have a lot of time either to think of some stronger
arguments or to finally get accustomed to that new style of coding.
But since we have to keep compatibility with Python 1.5.2 for at
least the next two years, chances for the latter are bad.

Regards and have a nice vacation, Peter


From mwh21@cam.ac.uk  Sun Dec 17 18:18:24 2000
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 17 Dec 2000 18:18:24 +0000
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: Thomas Wouters's message of "Fri, 15 Dec 2000 23:54:25 +0100"
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> <20001215235425.A29681@xs4all.nl>
Message-ID: <m3hf42q5cf.fsf@atrus.jesus.cam.ac.uk>

Thomas Wouters <thomas@xs4all.net> writes:

> On Fri, Dec 15, 2000 at 04:03:04AM -0800, Neil Schemenauer wrote:
> > On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote:
> > > I'm not sure I agree with that view either, but mostly because there
> > > is a non-GPL replacement for parts of the readline API:
> > > 
> > >     http://www.cstr.ed.ac.uk/downloads/editline.html
> > 
> > It doesn't work with the current readline module.  It is much
> > smaller than readline and works just as well in my experience.
> > Would there be any interest in including a copy with the standard
> > distribution?  The license is quite nice (X11 type).
> 
> Definitely +1 from here. Readline reminds me of the cold war, for
> some reason. (Actually, multiple reasons ;) I don't have time to do
> it myself, unfortunately, or I would. (Looking at editline has been
> on my TODO list for a while... :P)

It wouldn't be particularly hard to rewrite editline in Python (we
have termios & the terminal handling functions in curses - and even
ioctl if we get really keen).

I've been hacking on my own Python line reader on and off for a while;
it's still pretty buggy, but if you're feeling brave you could look at:

http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.0.0.tar.gz

To try it out, unpack it, cd into the ./pyrl directory and try:

>>> import foo # sorry
>>> foo.test_loop()

It sort of imitates the Python command prompt, except that it doesn't
actually execute the code you type.

You need a recent _cursesmodule.c for it to work.

Cheers,
M.

-- 
41. Some programming languages manage to absorb change, but 
    withstand progress.
  -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html



From thomas@xs4all.net  Sun Dec 17 18:30:38 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Sun, 17 Dec 2000 19:30:38 +0100
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <000201c06845$f1afdb40$770a0a0a@nevex.com>; from gvwilson@nevex.com on Sun, Dec 17, 2000 at 11:25:17AM -0500
References: <14908.59144.321167.419762@anthem.concentric.net> <000201c06845$f1afdb40$770a0a0a@nevex.com>
Message-ID: <20001217193038.C29681@xs4all.nl>

On Sun, Dec 17, 2000 at 11:25:17AM -0500, Greg Wilson wrote:

> +1 on deprecating string functions.

How wonderfully ambiguous! Do you mean string methods, or the string module?
:)

FWIW, I agree that in time, the string module should be deprecated. But I
also think that 'in time' should be a considerable timespan. Don't deprecate
it before everything it provides is available through some other means. Wait
a bit longer than that, even, before calling it deprecated -- that scares
people off. And then keep it for practically forever (until Py3K) just to
support old code. And don't forget to document it as 'deprecated' everywhere,
not just in one minor release note.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tismer@tismer.com  Sun Dec 17 17:38:31 2000
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 17 Dec 2000 19:38:31 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com>
Message-ID: <3A3CFA17.ED26F51A@tismer.com>

This is a multi-part message in MIME format.
--------------0B643A01C67D836AADED505B
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Old topic: {}.popitem() (was Re: {}.first[key,value,item] ...)

Christian Tismer wrote:
> 
> Fredrik Lundh wrote:
> >
> > christian wrote:
> > > That algorithm is really a gem which you should know,
> > > so let me try to explain it.
> >
> > I think someone just won the "brain exploder 2000" award ;-)

<snip>

> As you might have guessed, I didn't do this just for fun.
> It is the old game of explaining what is there, convincing
> everybody that you at least know what you are talking about,
> and then three days later coming up with an improved
> application of the theory.
> 
> Today is Monday, 2 days left. :-)

Ok, today is Sunday, I had no time to finish this.
But now it is here.

                  ===========================
                  =====    Claim:       =====
                  ===========================

-  Dictionary access time can be improved with a minimal change -

On the hash() function:
All objects are supposed to provide a hash function which is
as good as possible.
Good means providing a wide range of different hash values for
different keys.

Problem: There are hash functions which are "good" in this sense,
but they do not spread their randomness uniformly over the
32 bits.

Example: Integers use their own value as hash.
This is ok, as far as the integers are uniformly distributed.
But if they all contain a high power of two, for instance,
the low bits give a very bad hash function.

Take a dictionary with the integers range(1000) as keys and access
all entries.  Then use a dictionary with the same integers shifted
left by 16.
Access time is slowed down by a factor of 100, since every
access is a linear search now.
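The collision pattern is easy to see directly: small non-negative ints
hash to themselves, so a sketch that only inspects the masked hash values
shows it (dict internals vary across versions; this is just the initial
slot computation):

```python
mask = 2 ** 10 - 1                         # a table with 1024 slots
shifted = set((i << 16) & mask for i in range(1000))
assert shifted == {0}                      # all 1000 keys collide in slot 0
plain = set(i & mask for i in range(1000))
assert len(plain) == 1000                  # unshifted keys spread out fine
```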

This is not an urgent problem, although applications exist
where this can play a role (memory addresses for instance
can have high factors of two when people do statistics on
page accesses...)

While this is not a big problem, it is ugly enough to
think of a solution.

Solution 1:
-------------
Try to involve more bits of the hash value by doing extra
shuffling, either 
a) in the dictlook function, or
b) in the hash generation itself.

I believe neither can really be justified for such a rare problem.
But how about changing the existing solution in a way that
an improvement is gained without extra cost?

Solution 2: (*the* solution)
----------------------------
Some people may remember what I wrote about re-hashing
functions through the multiplicative group GF(2^n)*,
and I don't want to repeat this here.
The simple idea can be summarized quickly:

The original algorithm uses multiplication by polynomials,
and it is guaranteed that these re-hash values are jittering
through all possible nonzero patterns of the n bits.

Observation: We are using an operation of a finite field.
This means that the inverse of multiplication also exists!

Old algorithm (multiplication):
      shift the index left by 1
      if index > mask:
          xor the index with the generator polynomial

New algorithm (division):
      if low bit of index set:
          xor the index with the generator polynomial
      shift the index right by 1
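Both recurrences really do visit every nonzero pattern once per cycle; a
quick check for a tiny 3-bit table (8+3 is the n=3 entry of the polynomial
table in the attached dictest.py):

```python
def rehash_cycle(start, poly, mask, division=False):
    # Follow the re-hash recurrence until the start value recurs,
    # recording every index visited along the way.
    seen, incr = [], start
    while True:
        if division:          # new algorithm: divide (walk in reverse)
            if incr & 1:
                incr ^= poly
            incr >>= 1
        else:                 # old algorithm: multiply
            incr <<= 1
            if incr > mask:
                incr ^= poly
        seen.append(incr)
        if incr == start:
            return seen

mask, poly = 7, 8 + 3
old = rehash_cycle(1, poly, mask)
new = rehash_cycle(1, poly, mask, division=True)
assert sorted(old) == sorted(new) == list(range(1, 8))
```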

What does this mean? Not so much, we are just cycling through
our bit patterns in reverse order.

But now for the big difference.

First change:    We change from multiplication to division.
Second change:   We do not mask the hash value before!

The second change is what I was after: By not masking the
hash value when computing the initial index, all the
existing bits in the hash come into play.

This can be seen like a polynomial division, but the initial
remainder (the hash value) was not normalized. After a number
of steps, all the extra bits are wheeled into our index,
but not wasted by masking them off. That gives our re-hash
some more randomness. When all the extra bits are sucked
in, the guaranteed single-visit cycle begins. There cannot
be more than 27 extra cycles in the worst case (dict size
= 32, so there are 27 bits to consume).

I do not expect any bad effect from this modification.

Here some results, dictionaries have 1000 entries:

timing for strings              old=  5.097 new= 5.088
timing for bad integers (<<10)  old=101.540 new=12.610
timing for bad integers (<<16)  old=571.210 new=19.220

On strings, both algorithms behave the same.
On numbers, they differ dramatically.
While the current algorithm is 110 times slower on a worst-case
dict (quadratic behavior), the new algorithm pays a little for the
extra cycles, but is only 4 times slower.

Alternative implementation:
The above approach is conservative in the sense that it
tries not to slow down the current implementation in any
way. An alternative would be to consume all of the extra
bits at once. But this would add an extra "warmup" loop
like this to the algorithm:

    while index > mask:
        if low bit of index set:
            xor the index with the generator polynomial
        shift the index right by 1

This is of course a very good digest of the higher bits,
since it is a polynomial division and not just some
bit xor-ing which might give quite predictable cancellations,
therefore it is "the right way" in my sense.
It might be cheap, but it would add over 20 cycles to every
small dict.  I therefore don't think it is worth doing.

Personally, I prefer the solution to merge the bits during
the actual lookup, since it suffices to get access time
from quadratic down to logarithmic.

Attached is a direct translation of the relevant parts
of dictobject.c into Python, with both algorithms
implemented.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
--------------0B643A01C67D836AADED505B
Content-Type: text/plain; charset=us-ascii;
 name="dictest.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="dictest.py"

## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]


class NULL: pass

class Dictionary:
    dummy = "<dummy key>"
    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # even the shifting may not be worth it.
            incr = _hash ^ (_hash >> 3)
###### TO HERE
            
        if (not incr):
            incr = mask
        while 1:
            ep = ep0[(i+incr)&mask]
            if (ep[me_key] is NULL) :
                if (freeslot != NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                 (ep[me_hash] == _hash and
                  cmp(ep[me_key], key) == 0)) :
                return ep

            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots

        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))      

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots

        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
##      /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def timing(func, args=None, n=1, **keywords) :
	import time
	time=time.time
	appl=apply
	if args is None: args = ()
	if type(args) != type(()) : args=(args,)
	rep=range(n)
	dummyarg = ("",)
	dummykw = {}
	dummyfunc = len
	if keywords:
		before=time()
		for i in rep: res=appl(dummyfunc, dummyarg, dummykw)
		empty = time()-before
		before=time()
		for i in rep: res=appl(func, args, keywords)
	else:
		before=time()
		for i in rep: res=appl(dummyfunc, dummyarg)
		empty = time()-before
		before=time()
		for i in rep: res=appl(func, args)
	after = time()
	return round(after-before-empty,4), res

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts():
	d1 = Dictionary()   # original
	d2 = Dictionary(1)  # other rehash
	for i in range(1000):
		s = str(i) * 5
		d1[s] = d2[s] = i
	return d1, d2

def badnum_dicts():
	d1 = Dictionary()   # original
	d2 = Dictionary(1)  # other rehash
	for i in range(1000):
		bad = i << 16
		d1[bad] = d2[bad] = i
	return d1, d2

def do_test(dict, keys, n):
	t0 = timing(nulltest, (keys, dict), n)[0]
	t1 = timing(test, (keys, dict), n)[0]
	return t1-t0

if __name__ == "__main__":
	sdold, sdnew = string_dicts()
	bdold, bdnew = badnum_dicts()
	print "timing for strings old=%.3f new=%.3f" % (
		  do_test(sdold, sdold.keys(), 100),
		  do_test(sdnew, sdnew.keys(), 100) )
	print "timing for bad integers old=%.3f new=%.3f" % (
		  do_test(bdold, bdold.keys(), 10) *10,
		  do_test(bdnew, bdnew.keys(), 10) *10)

		  
"""
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610
"""
--------------0B643A01C67D836AADED505B--
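
[Editor's note: the "division" update in lookdict (xor in the polynomial when
incr is odd, then shift right) only works as a probe sequence if it visits
every nonzero n-bit value before repeating.  A minimal standalone check of
that property, not part of Christian's script, against a few entries of the
posted polys table:]

```python
# Verify that dividing by x in GF(2^n) -- the "new algorithm" branch of
# lookdict -- cycles through all 2^n - 1 nonzero values before returning
# to the start.  The poly values are taken from the posted table.
def cycle_length(poly):
    incr = 1
    steps = 0
    while True:
        if incr & 1:
            incr ^= poly       # same as: incr = incr ^ mp.ma_poly
        incr >>= 1             # divide by x
        steps += 1
        if incr == 1:
            return steps

for size, poly in [(4, 4 + 3), (8, 8 + 3), (16, 16 + 3), (32, 32 + 5)]:
    assert cycle_length(poly) == size - 1   # full cycle through GF(2^n)-{0}
print("all tested polynomials give a full cycle")
```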



From fdrake@acm.org  Sun Dec 17 18:49:58 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Sun, 17 Dec 2000 13:49:58 -0500 (EST)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <20001217193038.C29681@xs4all.nl>
References: <14908.59144.321167.419762@anthem.concentric.net>
 <000201c06845$f1afdb40$770a0a0a@nevex.com>
 <20001217193038.C29681@xs4all.nl>
Message-ID: <14909.2774.158973.760077@cj42289-a.reston1.va.home.com>

Thomas Wouters writes:
 > FWIW, I agree that in time, the string module should be deprecated. But I
 > also think that 'in time' should be a considerable timespan. Don't deprecate

  *If* most functions in the string module are going to be deprecated,
that should be done *now*, so that the documentation will include the
appropriate warning to users.  When they should actually be removed is
another matter, and I think Guido is sufficiently aware of their
widespread use and won't remove them too quickly -- his creation of
Python isn't the reason he's *accepted* as BDFL, it just made it a
possibility.  He's had to actually *earn* the BDFL position, I think.
  With regard to converting the standard library to string methods:
that needs to be done as part of the deprecation.  The code in the
library is commonly used as example code, and should be good example
code wherever possible.

 > support old code. And don't forget to document it 'deprecated' everywhere,
 > not just one minor release note.

  When Guido tells me exactly what is deprecated, the documentation
will be updated with proper deprecation notices in the appropriate
places.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From tismer@tismer.com  Sun Dec 17 18:10:07 2000
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 17 Dec 2000 20:10:07 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com>
Message-ID: <3A3D017F.62AD599F@tismer.com>

This is a multi-part message in MIME format.
--------------D1825E07B23FE5AC1D48DB49
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit



Christian Tismer wrote:

...
(my timings)
Attached is the updated script with the timings mentioned
in the last posting. Sorry, I posted an older version before.

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
--------------D1825E07B23FE5AC1D48DB49
Content-Type: text/plain; charset=us-ascii;
 name="dictest.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="dictest.py"

## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]


class NULL: pass

class Dictionary:
    dummy = "<dummy key>"
    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # even the shifting may not be worth it.
            incr = _hash ^ (_hash >> 3)
###### TO HERE
            
        if (not incr):
            incr = mask
        while 1:
            ep = ep0[(i+incr)&mask]
            if (ep[me_key] is NULL) :
                if (freeslot != NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                 (ep[me_hash] == _hash and
                  cmp(ep[me_key], key) == 0)) :
                return ep

            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots

        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))      

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots

        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
##      /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def timing(func, args=None, n=1, **keywords) :
	import time
	time=time.time
	appl=apply
	if args is None: args = ()
	if type(args) != type(()) : args=(args,)
	rep=range(n)
	dummyarg = ("",)
	dummykw = {}
	dummyfunc = len
	if keywords:
		before=time()
		for i in rep: res=appl(dummyfunc, dummyarg, dummykw)
		empty = time()-before
		before=time()
		for i in rep: res=appl(func, args, keywords)
	else:
		before=time()
		for i in rep: res=appl(dummyfunc, dummyarg)
		empty = time()-before
		before=time()
		for i in rep: res=appl(func, args)
	after = time()
	return round(after-before-empty,4), res

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts():
	d1 = Dictionary()   # original
	d2 = Dictionary(1)  # other rehash
	for i in range(1000):
		s = str(i) * 5
		d1[s] = d2[s] = i
	return d1, d2

def badnum_dicts():
	d1 = Dictionary()   # original
	d2 = Dictionary(1)  # other rehash
	shift = 10
	if EXTREME:
		shift = 16
	for i in range(1000):
		bad = i << shift
		d1[bad] = d2[bad] = i
	return d1, d2

def do_test(dict, keys, n):
	t0 = timing(nulltest, (keys, dict), n)[0]
	t1 = timing(test, (keys, dict), n)[0]
	return t1-t0

EXTREME=1

if __name__ == "__main__":
	sdold, sdnew = string_dicts()
	bdold, bdnew = badnum_dicts()
	print "timing for strings old=%.3f new=%.3f" % (
		  do_test(sdold, sdold.keys(), 100),
		  do_test(sdnew, sdnew.keys(), 100) )
	print "timing for bad integers old=%.3f new=%.3f" % (
		  do_test(bdold, bdold.keys(), 10) *10,
		  do_test(bdnew, bdnew.keys(), 10) *10)
  
"""
Results with a shift of 10 (EXTREME=0):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""
--------------D1825E07B23FE5AC1D48DB49--
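
[Editor's note: what the "bad integers" numbers measure is worth spelling
out.  For keys of the form i << 16, hash(i) == i, so every key lands in the
same slot of a small table, and the old *masked* increment collapses as
well.  A hypothetical sketch, separate from the script above:]

```python
# Why i << 16 keys are pathological for the old rehash: with a table of
# 1024 slots, all key information sits above the mask, so the masked
# increment is always 0 (and then forced to `mask`) -- every key probes
# the table in exactly the same order.  The unmasked increment of the
# new algorithm keeps the high bits and stays distinct per key.
size = 1024
mask = size - 1
keys = [i << 16 for i in range(1000)]      # hash(i) == i for these ints

old_incrs = set()
new_incrs = set()
for h in keys:
    masked = (h ^ (h >> 3)) & mask         # old algorithm: masked
    old_incrs.add(masked or mask)          # 0 falls back to mask
    new_incrs.add(h ^ (h >> 3))            # new algorithm: no mask

print(len(old_incrs), len(new_incrs))      # -> 1 1000
```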



From lutz@rmi.net  Sun Dec 17 19:09:47 2000
From: lutz@rmi.net (Mark Lutz)
Date: Sun, 17 Dec 2000 12:09:47 -0700
Subject: [Python-Dev] Death to string functions!
References: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com>  <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
Message-ID: <001f01c0685c$ef555200$7bdb5da6@vaio>

As a longstanding Python advocate and user, I find this
thread disturbing, and feel compelled to add a few words:

> > [Tim wrote:]
> > "string" is right up there with "os" and "sys" as a FIM (Frequently
> > Imported Module), so the required code changes will be massive.  As
> > a user, I don't see what's in it for me to endure that pain: the
> > string module functions work fine!  Neither are they warts in the
> > language, any more than that we say sin(pi) instead of pi.sin().
> > Keeping the functions around doesn't hurt anybody that I can see.
> 
> [Guido wrote:]
> Hm.  I'm not saying that this one will be easy.  But I don't like
> having "two ways to do it".  It means more learning, etc. (you know
> the drill).

But with all due respect, there are already _lots_ of places
in Python that provide at least two ways to do something.
Why be so strict on this one alone?

Consider lambda and def; tuples and lists; map and for 
loops; the loop else and boolean exit flags; and so on.  
The notion of Python forcing a single solution is largely a 
myth. And as someone who makes a living teaching this 
stuff, I can tell you that none of the existing redundancies 
prevent anyone from learning Python.

More to the point, many of those shiny new features added 
to 2.0 fall squarely into this category too, and are completely 
redundant with other tools.  Consider list comprehensions
and simple loops; extended print statements and sys.std* 
assignments; augmented assignment statements and simpler
ones.  Eliminating redundancy at a time when we're also busy
introducing it seems a tough goal to sell.

I understand the virtues of aesthetics too, but removing the
string module seems an incredibly arbitrary application of it.  


> If you're saying that you think the string module is too prominent to
> ever start deprecating its use, I'm afraid we have a problem.
>
> [...]
> Ideally, I'd like to deprecate the entire string module, so that I
> can place a single warning at its top.  This will cause a single
> warning to be issued for programs that still use it (no matter how
> many times it is imported).

And to me, this seems the real crux of the matter. For a 
decade now, the string module _has_ been the right way to
do it.  And today, half a million Python developers absolutely
rely on it as an essential staple in their toolbox.  What could 
possibly be wrong with keeping it around for backward 
compatibility, albeit as a less recommended option?

If almost every Python program ever written suddenly starts 
issuing warning messages, then I think we do have a problem
indeed.  Frankly, a Python that changes without regard to its
user base seems an ominous thing to me.  And keep in mind 
that I like Python; others will look much less generously upon
a tool that seems inclined to rip the rug out from under its users.
Trust me on this; I've already heard the rumblings out there.

So please: can we keep string around?  Like it or not, we're 
way past the point of removing such core modules.
Such a radical change might pass in a future non-backward-
compatible Python mutation; I'm not sure such a different
system will still be "Python", but that's a topic for another day.

All IMHO, of course,
--Mark Lutz  (http://www.rmi.net/~lutz)




From tim.one@home.com  Sun Dec 17 19:50:55 2000
From: tim.one@home.com (Tim Peters)
Date: Sun, 17 Dec 2000 14:50:55 -0500
Subject: [Python-Dev] SourceForge SSH silliness
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFCIEAA.tim.one@home.com>

Starting last night, I get this msg whenever I update Python code w/
CVSROOT=:ext:tim_one@cvs.python.sourceforge.net:/cvsroot/python:

"""
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the host key has just been changed.
Please contact your system administrator.
Add correct host key in C:\Code/.ssh/known_hosts to get rid of this message.
Password authentication is disabled to avoid trojan horses.
"""

This is SourceForge's doing, and is permanent (they've changed keys on their
end).  Here's a link to a thread that may or may not make sense to you:

http://sourceforge.net/forum/forum.php?forum_id=52867

Deleting the sourceforge entries from my .ssh/known_hosts file worked for
me.  But everyone in the thread above who tried it says that they haven't
been able to get scp working again (I haven't tried it yet ...).
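
[Editor's note: the fix Tim describes can be scripted; a hypothetical sketch
(the sample entries and host substring are illustrative, not taken from any
real known_hosts file):]

```python
# Drop the stale SourceForge host keys from an OpenSSH known_hosts file,
# which is all "deleting the sourceforge entries" amounts to.  Each line
# starts with the hostname field ("hostname[,alias...] ..."), so we only
# match the substring against that first field.
def drop_hosts(lines, needle):
    return [ln for ln in lines if needle not in ln.split(" ")[0]]

sample = [
    "cvs.python.sourceforge.net 1024 35 1234567\n",
    "example.org 1024 35 7654321\n",
]
kept = drop_hosts(sample, "sourceforge")
print(len(kept))   # -> 1
```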



From paulp@ActiveState.com  Sun Dec 17 20:04:27 2000
From: paulp@ActiveState.com (Paul Prescod)
Date: Sun, 17 Dec 2000 12:04:27 -0800
Subject: [Python-Dev] Pragmas and warnings
Message-ID: <3A3D1C4B.8F08A744@ActiveState.com>

A couple of other threads started me to thinking that there are a couple
of things missing from our warnings framework. 

Many languages have pragmas that allow you to turn warnings on and off in
code. For instance, I should be able to put a pragma at the top of a
module that uses string functions to say: "I know that this module
doesn't adhere to the latest Python conventions. Please don't warn me
about it." I should also be able to put a declaration that says: "I'm
really paranoid about shadowing globals and builtins. Please warn me
when I do that."

Batch and visual linters could also use the declarations to customize
their behaviors.

And of course we have a stack of other features that could use pragmas:

 * type signatures
 * Unicode syntax declarations
 * external object model language binding hints
 * ...

A case could be made that warning pragmas could use a totally different
syntax from "user-defined" pragmas. I don't care much.

 Paul


From thomas@xs4all.net  Sun Dec 17 21:00:08 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Sun, 17 Dec 2000 22:00:08 +0100
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEFCIEAA.tim.one@home.com>; from tim.one@home.com on Sun, Dec 17, 2000 at 02:50:55PM -0500
References: <LNBBLJKPBEHFEDALKOLCCEFCIEAA.tim.one@home.com>
Message-ID: <20001217220008.D29681@xs4all.nl>

On Sun, Dec 17, 2000 at 02:50:55PM -0500, Tim Peters wrote:
> Starting last night, I get this msg whenever I update Python code w/
> CVSROOT=:ext:tim_one@cvs.python.sourceforge.net:/cvsroot/python:

> """
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> @       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
> Someone could be eavesdropping on you right now (man-in-the-middle attack)!
> It is also possible that the host key has just been changed.
> Please contact your system administrator.
> Add correct host key in C:\Code/.ssh/known_hosts to get rid of this message.
> Password authentication is disabled to avoid trojan horses.
> """

> This is SourceForge's doing, and is permanent (they've changed keys on their
> end).  Here's a link to a thread that may or may not make sense to you:

> http://sourceforge.net/forum/forum.php?forum_id=52867

> Deleting the sourceforge entries from my .ssh/known_hosts file worked for
> me.  But everyone in the thread above who tried it says that they haven't
> been able to get scp working again (I haven't tried it yet ...).

What sourceforge did was switch Linux distributions, and upgrade. The switch
doesn't really matter for the SSH problem, because recent Debian and recent
RedHat releases both use a new ssh, the OpenBSD ssh implementation.
Apparently, it isn't entirely backwards compatible with old versions of
F-secure ssh. For one thing, it doesn't support the 'idea' cypher. This
might or might not be your problem; if it is, you should get a relatively
clear message such as 'cypher type 'idea' not supported'. You should be
able to pass the '-c' option to scp/ssh to use
a different cypher, like 3des (aka triple-des.) Or maybe the windows
versions have a menu to configure that kind of thing :) 

Another possible problem is that it might not have good support for older
protocol versions. The 'current' protocol version, at least for 'ssh1', is
1.5. The one message on the sourceforge thread above that actually mentions
a version in the *cough* bugreport is using an older ssh that only supports
protocol version 1.4. Since that particular version of F-secure ssh has
known problems (why else would they release 16 more versions?) I'd suggest
anyone with problems first try a newer version. I hope that doesn't break
WinCVS, but it would suck if it did :P

If that doesn't work, which is entirely possible, it might be an honest bug
in the OpenBSD ssh that Sourceforge is using. If anyone cared, we could do a
bit of experimenting with the openssh-2.0 betas installed by Debian woody
(unstable) to see if the problem occurs there as well.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From greg@cosc.canterbury.ac.nz  Sun Dec 17 23:05:41 2000
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 18 Dec 2000 12:05:41 +1300 (NZDT)
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <20001216014341.5BA97A82E@darjeeling.zadka.site.co.il>
Message-ID: <200012172305.MAA02512@s454.cosc.canterbury.ac.nz>

Moshe Zadka <moshez@zadka.site.co.il>:

> Perl and Scheme permit implicit shadowing too.

But Scheme always requires declarations!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From martin@loewis.home.cs.tu-berlin.de  Sun Dec 17 23:45:56 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 18 Dec 2000 00:45:56 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID: <200012172345.AAA00877@loewis.home.cs.tu-berlin.de>

> But with all due respect, there are already _lots_ of places in
> Python that provide at least two ways to do something.

Exactly. My favourite one here is string exceptions, which are quite
analogous to the string module.

At some time, there were only string exceptions. Then, instance
exceptions were added, some releases later they were considered the
better choice, so the standard library was converted to use them.
Still, there is no sign whatsoever that anybody plans to deprecate
string exceptions.

I believe the string module will become less important over
time.  Comparing it with string exceptions, that may well take 5 years.
It seems there are two ways of "deprecation": a loud "we will remove
that, change your code", and a silent "strings have methods"
(i.e. don't mention the module when educating people). The latter
approach requires educators to agree that the module is
"uninteresting", and people to really not use it once they find out it
exists.

I think deprecation should be only attempted once there is a clear
sign that people don't use it massively for new code anymore. Removal
should only occur if keeping the module causes more pain than removing it.

Regards,
Martin



From skip@mojam.com (Skip Montanaro)  Sun Dec 17 23:55:10 2000
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Sun, 17 Dec 2000 17:55:10 -0600 (CST)
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
Message-ID: <14909.21086.92774.940814@beluga.mojam.com>

I executed cvs update today (removing the sourceforge machines from
.ssh/known_hosts worked fine for me, btw) followed by a configure and a make
clean.  The last step failed with this output:

    ...
    make[1]: Entering directory `/home/beluga/skip/src/python/dist/src/Modules'
    Makefile.pre.in:20: *** missing separator.  Stop.
    make[1]: Leaving directory `/home/beluga/skip/src/python/dist/src/Modules'
    make: [clean] Error 2 (ignored)

I found the following at line 20 of Modules/Makefile.pre.in:

    @SET_CXX@

I then tried a cvs annotate on that file but saw that line 20 had been there
since rev 1.60 (16-Dec-99).  I then checked the top-level Makefile.in
thinking something must have changed in the clean target recently, but cvs
annotate shows no recent changes there either:

    1.1          (guido    24-Dec-93): clean:		localclean
    1.1          (guido    24-Dec-93): 		-for i in $(SUBDIRS); do \
    1.74         (guido    19-May-98): 		    if test -d $$i; then \
    1.24         (guido    20-Jun-96): 			(echo making clean in subdirectory $$i; cd $$i; \
    1.4          (guido    01-Aug-94): 			 if test -f Makefile; \
    1.4          (guido    01-Aug-94): 			 then $(MAKE) clean; \
    1.4          (guido    01-Aug-94): 			 else $(MAKE) -f Makefile.*in clean; \
    1.4          (guido    01-Aug-94): 			 fi); \
    1.74         (guido    19-May-98): 		    else true; fi; \
    1.1          (guido    24-Dec-93): 		done

Make distclean succeeded so I tried the following:

    make distclean
    ./configure
    make clean

but the last step still failed.  Any idea why make clean is now failing (for
me)?  Can anyone else reproduce this problem?

Skip


From greg@cosc.canterbury.ac.nz  Mon Dec 18 00:02:32 2000
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 18 Dec 2000 13:02:32 +1300 (NZDT)
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: <3A3A9A22.E9BA9551@lemburg.com>
Message-ID: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>

"M.-A. Lemburg" <mal@lemburg.com>:

> Format characters will always
> be ASCII and thus 7-bit -- theres really no need to expand the
> set of possibilities beyond 8 bits ;-)

But the error message is being produced because the
character is NOT a valid format character. One of the
reasons for that might be because it's not in the
7-bit range!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From MarkH@ActiveState.com  Mon Dec 18 06:02:27 2000
From: MarkH@ActiveState.com (Mark Hammond)
Date: Mon, 18 Dec 2000 17:02:27 +1100
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
In-Reply-To: <14909.21086.92774.940814@beluga.mojam.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPIEJJCMAA.MarkH@ActiveState.com>

> I found the following at line 20 of Modules/Makefile.pre.in:
>
>     @SET_CXX@

I don't have time to investigate this specific problem, but I definitely had
problems with SET_CXX around 6 months back.  That was while trying to build an
external C++ application, so it may be different.  My message and other
followups at the time implied no one really knew, and everyone agreed it was
likely SET_CXX was broken :-(  I even referenced the CVS checkin that I
thought broke it.

Mark.



From mal@lemburg.com  Mon Dec 18 09:58:37 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 10:58:37 +0100
Subject: [Python-Dev] Pragmas and warnings
References: <3A3D1C4B.8F08A744@ActiveState.com>
Message-ID: <3A3DDFCD.34AB05B2@lemburg.com>

Paul Prescod wrote:
> 
> A couple of other threads started me to thinking that there are a couple
> of things missing from our warnings framework.
> 
> Many languages have pragmas that allow you turn warnings on and off in
> code. For instance, I should be able to put a pragma at the top of a
> module that uses string functions to say: "I know that this module
> doesn't adhere to the latest Python conventions. Please don't warn me
> about it." I should also be able to put a declaration that says: "I'm
> really paranoid about shadowing globals and builtins. Please warn me
> when I do that."
> 
> Batch and visual linters could also use the declarations to customize
> their behaviors.
> 
> And of course we have a stack of other features that could use pragmas:
> 
>  * type signatures
>  * Unicode syntax declarations
>  * external object model language binding hints
>  * ...
> 
> A case could be made that warning pragmas could use a totally different
> syntax from "user-defined" pragmas. I don't care much.

There was a long thread about this some months ago. We agreed
to add a new keyword to the language (I think it was "define")
with a very simple syntax that can be interpreted
at compile time to modify the behaviour of the compiler, e.g.

define <identifier> = <literal>

There was also a discussion about allowing limited forms of
expressions instead of the constant literal.

define source_encoding = "utf-8"

was the original motivation for this, but (as always ;) the
usefulness for other application areas was quickly recognized, e.g.
to enable compilation in optimization mode on a per module
basis.

PS: "define" is perhaps not obscure enough as keyword...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Mon Dec 18 10:04:08 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:04:08 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>
Message-ID: <3A3DE118.3355896D@lemburg.com>

Greg Ewing wrote:
> 
> "M.-A. Lemburg" <mal@lemburg.com>:
> 
> > Format characters will always
> > be ASCII and thus 7-bit -- there's really no need to expand the
> > set of possibilities beyond 8 bits ;-)
> 
> But the error message is being produced because the
> character is NOT a valid format character. One of the
> reasons for that might be because it's not in the
> 7-bit range!

True. 

I think removing %c completely in that case is the right
solution (in case you don't want to convert the Unicode char
using the default encoding to a string first).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Mon Dec 18 10:09:16 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:09:16 +0100
Subject: [Python-Dev] What to do about PEP 229?
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com> <20001216191739.B6703@kronos.cnri.reston.va.us>
Message-ID: <3A3DE24C.DA0B2F6C@lemburg.com>

Andrew Kuchling wrote:
> 
> On Fri, Dec 15, 2000 at 11:39:18PM +0100, M.-A. Lemburg wrote:
> >Can't distutils try both and then settle for the working combination ?
> 
> I'm worried about subtle problems; what if an unneeded -lfoo drags in
> a customized malloc, or has symbols which conflict with some other
> library.

In that case, I think the user will have to decide. setup.py should
then default to not integrating the module in question and issue
a warning telling the user what to look for and how to call setup.py
in order to add the right combination of libs.
 
> >... BTW, where is Greg ? I haven't heard from him in quite a while.]
> 
> Still around; he just hasn't been posting much these days.

Good to know :)
 
> >Why not parse Setup and use it as input to distutils setup.py ?
> 
> That was option 1.  The existing Setup format doesn't really contain
> enough intelligence, though; the intelligence is usually in comments
> such as "Uncomment the following line for Solaris".  So either the
> Setup format is modified (bad, since we'd break existing 3rd-party
> packages that still use a Makefile.pre.in), or I give up and just do
> everything in a setup.py.

I would still like a simple input to setup.py -- one that doesn't
require hacking setup.py just to enable a few more modules.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fredrik@effbot.org  Mon Dec 18 10:15:26 2000
From: fredrik@effbot.org (Fredrik Lundh)
Date: Mon, 18 Dec 2000 11:15:26 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz> <3A3DE118.3355896D@lemburg.com>
Message-ID: <004a01c068db$72403170$3c6340d5@hagrid>

mal wrote:

> > But the error message is being produced because the
> > character is NOT a valid format character. One of the
> > reasons for that might be because it's not in the
> > 7-bit range!
> 
> True. 
> 
> I think removing %c completely in that case is the right
> solution (in case you don't want to convert the Unicode char
> using the default encoding to a string first).

how likely is it that a human programmer will use a bad formatting
character that's not in the ASCII range?

-1 on removing it -- people shouldn't have to learn the octal ASCII
table just to be able to fix trivial typos.

+1 on mapping the character back to a string in the same way as
"repr" -- that is, print ASCII characters as is, map anything else to
an octal escape.

+0 on leaving it as it is, or mapping non-printables to "?".
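in Python terms, the repr-style mapping would look roughly like
this (a hypothetical sketch, not actual interpreter code -- the real
message formatting lives in C):

```python
def format_char(ch):
    # hypothetical sketch of the repr-style mapping: printable ASCII
    # passes through as-is, everything else becomes an octal escape
    code = ord(ch)
    if 32 <= code < 127:
        return ch
    return "\\%03o" % code

format_char("d")     # 'd' -- trivial typos stay readable
format_char("\xe9")  # '\\351' -- a non-ASCII char shows up as an escape
```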

</F>



From mal@lemburg.com  Mon Dec 18 10:34:02 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:34:02 +0100
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com>
Message-ID: <3A3DE81A.4B825D89@lemburg.com>

> Here are some results, dictionaries have 1000 entries:
> 
> timing for strings              old=  5.097 new= 5.088
> timing for bad integers (<<10)  old=101.540 new=12.610
> timing for bad integers (<<16)  old=571.210 new=19.220

Even though I think concentrating on string keys would provide more
performance boost for Python in general, I think you have a point
there. +1 from here.
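The "bad integers" rows above are easy to reproduce: small ints hash
to their own value, and a power-of-two table selects a slot by masking
off the high bits, so shifted keys all collapse into one bucket (a toy
demonstration of the effect, not dictobject.c itself):

```python
# hash(i) == i for small ints; a power-of-two table picks the slot
# with hash & (size - 1), which discards exactly the bits that vary
# in keys of the form i << 16
size = 1024
plain   = {i         & (size - 1) for i in range(1000)}
shifted = {(i << 16) & (size - 1) for i in range(1000)}
len(plain)    # 1000 distinct slots -- no first-probe collisions
len(shifted)  # 1 -- every key lands in the same bucket
```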

BTW, would changing the hash function on strings from the simple
XOR scheme to something a little smarter help improve the performance
too (e.g. most strings used in programming never use the 8-th
bit) ?

I also think that we could inline the string compare function
in dictobject:lookdict_string to achieve even better performance.
Currently it uses a function which doesn't trigger compiler
inlining.

And finally: I think a generic PyString_Compare() API would
be useful in a lot of places where strings are being compared
(e.g. dictionaries and keyword parameters). Unicode already
has such an API (along with dozens of other useful APIs which
are not available for strings).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Mon Dec 18 10:41:38 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:41:38 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz> <3A3DE118.3355896D@lemburg.com> <004a01c068db$72403170$3c6340d5@hagrid>
Message-ID: <3A3DE9E2.77FF0FA9@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> 
> > > But the error message is being produced because the
> > > character is NOT a valid format character. One of the
> > > reasons for that might be because it's not in the
> > > 7-bit range!
> >
> > True.
> >
> > I think removing %c completely in that case is the right
> > solution (in case you don't want to convert the Unicode char
> > using the default encoding to a string first).
> 
> how likely is it that a human programmer will use a bad formatting
> character that's not in the ASCII range?

Not very likely... the most common case of this error is probably
the use of % as percent sign in a formatting string. The next
character in those cases is usually whitespace.
 
> -1 on removing it -- people shouldn't have to learn the octal ASCII
> table just to be able to fix trivial typos.
> 
> +1 on mapping the character back to a string in the same way as
> "repr" -- that is, print ASCII characters as is, map anything else to
> an octal escape.
> 
> +0 on leaving it as it is, or mapping non-printables to "?".

Agreed.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tismer@tismer.com  Mon Dec 18 11:08:34 2000
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 18 Dec 2000 13:08:34 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com> <3A3DE81A.4B825D89@lemburg.com>
Message-ID: <3A3DF032.5F86AD15@tismer.com>


"M.-A. Lemburg" wrote:
> 
> > Here are some results, dictionaries have 1000 entries:
> >
> > timing for strings              old=  5.097 new= 5.088
> > timing for bad integers (<<10)  old=101.540 new=12.610
> > timing for bad integers (<<16)  old=571.210 new=19.220
> 
> Even though I think concentrating on string keys would provide more
> performance boost for Python in general, I think you have a point
> there. +1 from here.
> 
> BTW, would changing the hash function on strings from the simple
> XOR scheme to something a little smarter help improve the performance
> too (e.g. most strings used in programming never use the 8-th
> bit) ?

Yes, it would. I spent the rest of last night doing more
accurate tests, also refined the implementation (using
longs for the shifts etc.), and switched from timing to
trip counting, i.e. the dict counts every round through the
re-hash. That showed two things:
- The bits used from the string hash are not well distributed.
- Using a "warmup wheel" on the hash to suck all the bits in
  gives hashes of the same quality as random numbers.

I will publish some results later today.
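Trip counting can be modelled in a few lines -- this is a hypothetical
toy table using the 5*j + 1 + perturb probe recurrence, not the actual
dictobject.c instrumentation:

```python
def trips(keys, size):
    """Toy open-addressing table (power-of-two size) that counts every
    round through the re-hash -- the 'trip count' described above."""
    mask = size - 1
    slots = [None] * size
    total = 0
    for k in keys:
        h = hash(k)
        perturb = h
        j = h & mask
        while slots[j] is not None:
            total += 1          # one more trip through the re-hash
            perturb >>= 5
            j = (5 * j + 1 + perturb) & mask
        slots[j] = k
    return total

trips(list(range(512)), 1024)               # 0 trips: low bits distinct
trips([i << 16 for i in range(512)], 1024)  # many hundreds of trips
```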

> I also think that we could inline the string compare function
> in dictobject:lookdict_string to achieve even better performance.
> Currently it uses a function which doesn't trigger compiler
> inlining.

Sure!

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From guido@python.org  Mon Dec 18 14:20:22 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 09:20:22 -0500
Subject: [Python-Dev] The Dictionary Gem is polished!
In-Reply-To: Your message of "Sun, 17 Dec 2000 19:38:31 +0200."
 <3A3CFA17.ED26F51A@tismer.com>
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com>
 <3A3CFA17.ED26F51A@tismer.com>
Message-ID: <200012181420.JAA25063@cj20424-a.reston1.va.home.com>

> Problem: There are hash functions which are "good" in this sense,
> but they do not spread their randomness uniformly over the
> 32 bits.
> 
> Example: Integers use their own value as hash.
> This is ok, as far as the integers are uniformly distributed.
> But if they all contain a high power of two, for instance,
> the low bits give a very bad hash function.
> 
> Take a dictionary with integers range(1000) as keys and access
> all entries. Then use a dictionary with the integers shifted
> left by 16.
> Access time is slowed down by a factor of 100, since every
> access is a linear search now.

Ai.  I think what happened is this: long ago, the hash table sizes
were primes, or at least not powers of two!

I'll leave it to the more mathematically-inclined to judge your
solution...
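A quick check confirms the recollection: a prime table size keeps the
high bits in play via the modulus, while a power of two discards them
(toy arithmetic, assuming hash(i) == i for small ints):

```python
p = 1021  # a prime near 1024
len({(i << 16) % p    for i in range(1000)})  # 1000 -- no collisions,
                                              # since gcd(2**16, p) == 1
len({(i << 16) % 1024 for i in range(1000)})  # 1 -- all in one bucket
```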

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Dec 18 14:52:35 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 09:52:35 -0500
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
In-Reply-To: Your message of "Sun, 17 Dec 2000 17:55:10 CST."
 <14909.21086.92774.940814@beluga.mojam.com>
References: <14909.21086.92774.940814@beluga.mojam.com>
Message-ID: <200012181452.JAA04372@cj20424-a.reston1.va.home.com>

> Make distclean succeeded so I tried the following:
> 
>     make distclean
>     ./configure
>     make clean
> 
> but the last step still failed.  Any idea why make clean is now failing (for
> me)?  Can anyone else reproduce this problem?

Yes.  I don't understand it, but this takes care of it:

    make distclean
    ./configure
    make Makefiles		# <--------- !!!
    make clean

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Dec 18 14:54:20 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 09:54:20 -0500
Subject: [Python-Dev] Pragmas and warnings
In-Reply-To: Your message of "Mon, 18 Dec 2000 10:58:37 +0100."
 <3A3DDFCD.34AB05B2@lemburg.com>
References: <3A3D1C4B.8F08A744@ActiveState.com>
 <3A3DDFCD.34AB05B2@lemburg.com>
Message-ID: <200012181454.JAA04394@cj20424-a.reston1.va.home.com>

> There was a long thread about this some months ago. We agreed
> to add a new keyword to the language (I think it was "define")

I don't recall agreeing.  :-)

This is PEP material.  For 2.2, please!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Mon Dec 18 14:56:33 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 15:56:33 +0100
Subject: [Python-Dev] Pragmas and warnings
References: <3A3D1C4B.8F08A744@ActiveState.com>
 <3A3DDFCD.34AB05B2@lemburg.com> <200012181454.JAA04394@cj20424-a.reston1.va.home.com>
Message-ID: <3A3E25A1.DFD2BDBF@lemburg.com>

Guido van Rossum wrote:
> 
> > There was a long thread about this some months ago. We agreed
> > to add a new keyword to the language (I think it was "define")
> 
> I don't recall agreeing.  :-)

Well, maybe it was a misinterpretation on my part... you said
something like "add a new keyword and live with the consequences".
AFAIR, of course :-)

> This is PEP material.  For 2.2, please!

Ok.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@python.org  Mon Dec 18 15:15:26 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 10:15:26 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Sun, 17 Dec 2000 12:09:47 MST."
 <001f01c0685c$ef555200$7bdb5da6@vaio>
References: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com> <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
 <001f01c0685c$ef555200$7bdb5da6@vaio>
Message-ID: <200012181515.KAA04571@cj20424-a.reston1.va.home.com>

[Mark Lutz]
> So please: can we keep string around?  Like it or not, we're
> way past the point of removing such core modules.

Of course we're keeping string around.  I already said that for
backwards compatibility reasons it would not disappear before Py3K.

I think there's a misunderstanding about the meaning of deprecation,
too.  That word doesn't mean to remove a feature.  It doesn't even
necessarily mean to warn every time a feature is used.  It just means
(to me) that at some point in the future the feature will change or
disappear, there's a new and better way to do it, and that we
encourage users to start using the new way, to save them from work
later.

In my mind, there's no reason to start emitting warnings about every
deprecated feature.  The warnings are only needed late in the
deprecation cycle.  PEP 5 says "There must be at least a one-year
transition period between the release of the transitional version of
Python and the release of the backwards incompatible version."

Can we now stop getting all bent out of shape over this?  String
methods *are* recommended over equivalent string functions.  Those
string functions *are* already deprecated, in the informal sense
(i.e. just that it is recommended to use string methods instead).
This *should* (take notice, Fred!) be documented per 2.1.  We won't
however be issuing run-time warnings about the use of string functions
until much later.  (Lint-style tools may start warning sooner --
that's up to the author of the lint tool to decide.)

Note that I believe Java makes a useful distinction that PEP 5 misses:
it defines both deprecated features and obsolete features.
*Deprecated* features are simply features for which a better
alternative exists.  *Obsolete* features are features that are only
being kept around for backwards compatibility.  Deprecated features
may also be (and usually are) *obsolescent*, meaning they will become
obsolete in the future.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Dec 18 15:22:09 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 10:22:09 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Mon, 18 Dec 2000 00:45:56 +0100."
 <200012172345.AAA00877@loewis.home.cs.tu-berlin.de>
References: <200012172345.AAA00877@loewis.home.cs.tu-berlin.de>
Message-ID: <200012181522.KAA04597@cj20424-a.reston1.va.home.com>

> At some time, there were only string exceptions. Then, instance
> exceptions were added, some releases later they were considered the
> better choice, so the standard library was converted to use them.
> Still, there is no sign whatsoever that anybody plans to deprecate
> string exceptions.

Now there is: I hereby state that I officially deprecate string
exceptions.  Py3K won't support them, and it *may* even require that
all exception classes are derived from Exception.

> I believe the string module will get less importance over
> time. Comparing it with string exception, that may be well 5 years.
> It seems there are two ways of "deprecation": a loud "we will remove
> that, change your code", and a silent "strings have methods"
> (i.e. don't mention the module when educating people). The latter
> approach requires educators to agree that the module is
> "uninteresting", and people to really not use once they find out it
> exists.

Exactly.  This is what I hope will happen.  I certainly hope that Mark
Lutz has already started teaching string methods!

> I think deprecation should be only attempted once there is a clear
> sign that people don't use it massively for new code anymore.

Right.  So now we're on the first step: get the word out!

> Removal should only occur if keeping the module [is] less pain than
> maintaining it.

Exactly.  Guess where the string module falls today. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Barrett@stsci.edu  Mon Dec 18 16:50:49 2000
From: Barrett@stsci.edu (Paul Barrett)
Date: Mon, 18 Dec 2000 11:50:49 -0500 (EST)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
Message-ID: <14910.16431.554136.374725@nem-srvr.stsci.edu>

Guido van Rossum writes:
 > > 
 > > 1. The current boolean operator behavior does not have to change, and
 > >    hence will be backward compatible.
 > 
 > What incompatibility do you see in the current proposal?

You have to choose between using rich comparisons or boolean
comparisons.  You can't use both for the same (rich/complex) object.

 > > 2. It eliminates the need to decide whether or not rich comparisons
 > >    takes precedence over boolean comparisons.
 > 
 > Only if you want different semantics -- that's only an issue for NumPy.

No. I think NumPy is the tip of the iceberg, when discussing new
semantics.  Most users don't consider these broader semantic issues,
because Python doesn't give them the opportunity to do so.  I can see
possible scenarios of using both boolean and non-boolean comparisons
for Python lists and dictionaries in addition to NumPy.

I chose to use Python because it provides a richer framework than
other languages.  When Python fails to provide such benefits, I'll
move to another language.  I moved from PERL to Python because the
multi-dimensional array syntax is vastly better in Python than PERL,
though as a novice I don't have to know that it exists.  What I'm
proposing here is in a similar vein.

 > > 3. The new operators add additional behavior without directly impacting 
 > >    current behavior and the use of them is unambiguous, at least in
 > >    relation to current Python behavior.  You know by the operator what 
 > >    type of comparison will be returned.  This should appease Jim
 > >    Fulton, based on his arguments in 1998 about comparison operators
 > >    always returning a boolean value.
 > 
 > As you know, I'm now pretty close to Jim. :-)  He seemed pretty mellow
 > about this now.

Yes, I would hope so!

It appears though that you misunderstand me.  My point was that I tend
to agree with Jim Fulton's arguments for a limited interpretation of
the current comparison operators.  I too expect them to return a
boolean result.  I have never felt comfortable using such comparison
operators in an array context, e.g. as in the array language, IDL. It
just looks wrong.  So my suggestion is to create new ones whose
implicit meaning is to provide element-wise or rich comparison
behavior.  And to add similar behavior for the other operators for
consistency.

Can someone provide an example in mathematics where comparison
 > > operators are used in a non-boolean, i.e. rich comparison, context.
If so, this might shut me up!

 > > 4. Compound objects, such as lists, could implement both rich
 > >    and boolean comparisons.  The boolean comparison would remain as
 > >    is, while the rich comparison would return a list of boolean
 > >    values.  Current behavior doesn't change; just a new feature, which
 > >    you may or may not choose to use, is added.
 > > 
 > > If we go one step further and add the matrix-style operators along
 > > with the comparison operators, we can provide a consistent user
 > > interface to array/complex operations without changing current Python
 > > behavior.  If a user has no need for these new operators, he doesn't
 > > have to use them or even know about them.  All we've done is made
 > > Python richer, but I believe with making it more complex.  For

Phrase should be: "but I believe without making it more complex.".
                                 -------

 > > example, all element-wise operations could have a ':' appended to
 > > them, e.g. '+:', '<:', etc.; and will define element-wise addition,
 > > element-wise less-than, etc.  The traditional '*', '/', etc. operators
 > > can then be used for matrix operations, which will appease the Matlab
 > > people.
 > > 
 > > Therefore, I don't think rich comparisons and matrix-type operators
 > > should be considered separable.  I really think you should consider
 > > this suggestion.  It appeases many groups while providing a consistent 
 > > and clear user interface, while greatly impacting current Python
 > > behavior. 

The last phrase should read: "while not greatly impacting current
                                    ---
Python behavior."

 > > 
 > > Always-causing-havoc-at-the-last-moment-ly Yours,
 > 
 > I think you misunderstand.  Rich comparisons are mostly about allowing
 > the separate overloading of <, <=, ==, !=, >, and >=.  This is useful
 > in its own light.

No, I do understand.  I've read most of the early discussions on this
issue and one of those issues was about having to choose between
boolean and rich comparisons and what should take precedence, when
both may be appropriate.  I'm suggesting an alternative here.

 > If you don't want to use this overloading facility for elementwise
 > comparisons in NumPy, that's fine with me.  Nobody says you have to --
 > it's just that you *could*.

Yes, I understand.

 > Read my lips: there won't be *any* new operators in 2.1.

OK, I didn't expect this to make it into 2.1.

 > There will be a better way to overload the existing Boolean operators,
 > and they will be able to return non-Boolean results.  That's useful in
 > other situations besides NumPy.

Yes, I agree, this should be done anyway.  I'm just not sure that the
implicit meaning that these comparison operators are being given is
the best one.  I'm just looking for ways to incorporate rich
comparisons into a broader framework, numpy just currently happens to
be the primary example of this proposal.

Assuming the current comparison operator overloading is already
implemented and has been used to implement rich comparisons for some
objects, then my rich comparison proposal would cause confusion.  This 
is what I'm trying to avoid.

 > Feel free to lobby for elementwise operators -- but based on the
 > discussion about this subject so far, I don't give it much of a chance
 > even past Python 2.1.  They would add a lot of baggage to the language
 > (e.g. the table of operators in all Python books would be about twice
 > as long) and by far the most users don't care about them.  (Read the
 > intro to PEP 211 for some of the concerns -- this PEP tries to make the
 > addition palatable by adding exactly *one* new operator.)

So!  Introductory books don't have to discuss these additional
operators.  I don't have to know about XML and socket modules to start
using Python effectively, nor do I have to know about 'zip' or list
comprehensions.  These additions decrease code size and increase
efficiency, but don't really add any expressive power beyond what a
'for' loop can already do.

I'll try to convince myself that this suggestion is crazy and not
bother you with this issue for awhile.

Cheers,
Paul



From guido@python.org  Mon Dec 18 17:18:11 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 12:18:11 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: Your message of "Mon, 18 Dec 2000 11:50:49 EST."
 <14910.16431.554136.374725@nem-srvr.stsci.edu>
References: <14910.16431.554136.374725@nem-srvr.stsci.edu>
Message-ID: <200012181718.MAA14030@cj20424-a.reston1.va.home.com>

Paul Barrett:
>  > > 1. The current boolean operator behavior does not have to change, and
>  > >    hence will be backward compatible.

Guido van Rossum:
>  > What incompatibility do you see in the current proposal?

Paul Barrett:
> You have to choose between using rich comparisons or boolean
> comparisons.  You can't use both for the same (rich/complex) object.

Sure.  I thought that the NumPy folks were happy with this.  Certainly
two years ago they seemed to be.

>  > > 2. It eliminates the need to decide whether or not rich comparisons
>  > >    takes precedence over boolean comparisons.
>  > 
>  > Only if you want different semantics -- that's only an issue for NumPy.
> 
> No. I think NumPy is the tip of the iceberg, when discussing new
> semantics.  Most users don't consider these broader semantic issues,
> because Python doesn't give them the opportunity to do so.  I can see
> possible scenarios of using both boolean and non-boolean comparisons
> for Python lists and dictionaries in addition to NumPy.

That's the same argument that has been made for new operators all
along.  I've explained already why they are not on the table for 2.1.

> I chose to use Python because it provides a richer framework than
> other languages.  When Python fails to provide such benefits, I'll
> move to another language.  I moved from PERL to Python because the
> multi-dimensional array syntax is vastly better in Python than PERL,
> though as a novice I don't have to know that it exists.  What I'm
> proposing here is in a similar vein.
> 
>  > > 3. The new operators add additional behavior without directly impacting 
>  > >    current behavior and the use of them is unambiguous, at least in
>  > >    relation to current Python behavior.  You know by the operator what 
>  > >    type of comparison will be returned.  This should appease Jim
>  > >    Fulton, based on his arguments in 1998 about comparison operators
>  > >    always returning a boolean value.
>  > 
>  > As you know, I'm now pretty close to Jim. :-)  He seemed pretty mellow
>  > about this now.
> 
> Yes, I would hope so!
> 
> It appears though that you misunderstand me.  My point was that I tend
> to agree with Jim Fulton's arguments for a limited interpretation of
> the current comparison operators.  I too expect them to return a
> boolean result.  I have never felt comfortable using such comparison
> operators in an array context, e.g. as in the array language, IDL. It
> just looks wrong.  So my suggestion is to create new ones whose
> implicit meaning is to provide element-wise or rich comparison
> behavior.  And to add similar behavior for the other operators for
> consistency.
> 
> Can someone provide an example in mathematics where comparison
> operators are used in a non-boolean, i.e. rich comparison, context.
> If so, this might shut me up!

Not me (I no longer consider myself a mathematician :-).  Why are you
requiring an example from math though?

Again, you will be able to make this argument to the NumPy folks when
they are ready to change the meaning of A<B to mean an array of
Booleans rather than a single Boolean.  Since you're part of the
design team for NumPy NG, you're in a pretty good position to hold out
for elementwise operators!

However, what would you do if elementwise operators were turned down
for ever (which is a realistic possibility)?

In the mean time, I see no harm in *allowing* the comparison operators
to be overridden to return something else besides a Boolean.  Someone
else may find this useful.  (Note that standard types won't use this
new freedom, so I'm not imposing this on anybody -- I'm only giving a
new option.)
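
[A minimal sketch of what that freedom permits -- my illustration, not from the thread; the class and names are invented.  A type overrides a comparison operator to return something other than a single Boolean, here a list of per-element results:]

```python
class Vec:
    """Toy sequence whose "<" returns per-element Booleans (invented example)."""
    def __init__(self, data):
        self.data = list(data)
    def __lt__(self, other):
        # elementwise comparison instead of a single Boolean answer
        return [a < b for a, b in zip(self.data, other.data)]

print(Vec([1, 5, 3]) < Vec([2, 4, 3]))   # [True, False, False]
```

Standard types would keep their usual behavior; only a type that opts in returns such results.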

>  > > 4. Compound objects, such as lists, could implement both rich
>  > >    and boolean comparisons.  The boolean comparison would remain as
>  > >    is, while the rich comparison would return a list of boolean
>  > >    values.  Current behavior doesn't change; just a new feature, which
>  > >    you may or may not choose to use, is added.
>  > > 
>  > > If we go one step further and add the matrix-style operators along
>  > > with the comparison operators, we can provide a consistent user
>  > > interface to array/complex operations without changing current Python
>  > > behavior.  If a user has no need for these new operators, he doesn't
>  > > have to use them or even know about them.  All we've done is made
>  > > Python richer, but I believe with making it more complex.  For
> 
> Phrase should be: "but I believe without making it more complex.".
>                                  -------
> 
>  > > example, all element-wise operations could have a ':' appended to
>  > > them, e.g. '+:', '<:', etc.; and will define element-wise addition,
>  > > element-wise less-than, etc.  The traditional '*', '/', etc. operators
>  > > can then be used for matrix operations, which will appease the Matlab
>  > > people.
>  > > 
>  > > Therefore, I don't think rich comparisons and matrix-type operators
>  > > should be considered separable.  I really think you should consider
>  > > this suggestion.  It appeases many groups while providing a consistent 
>  > > and clear user interface, while greatly impacting current Python
>  > > behavior. 
> 
> The last phrase should read: "while not greatly impacting current
>                                     ---
> Python behavior."

I don't see any argument for elementwise operators here that I haven't
heard before, and AFAIK it's all in the two PEPs.

>  > > Always-causing-havoc-at-the-last-moment-ly Yours,
>  > 
>  > I think you misunderstand.  Rich comparisons are mostly about allowing
>  > the separate overloading of <, <=, ==, !=, >, and >=.  This is useful
>  > in its own light.
> 
> No, I do understand.  I've read most of the early discussions on this
> issue and one of those issues was about having to choose between
> boolean and rich comparisons and what should take precedence, when
> both may be appropriate.  I'm suggesting an alternative here.

Note that Python doesn't decide which should take precedence.  The
implementer of an individual extension type decides what his
comparison operators will return.

>  > If you don't want to use this overloading facility for elementwise
>  > comparisons in NumPy, that's fine with me.  Nobody says you have to --
>  > it's just that you *could*.
> 
> Yes, I understand.
> 
>  > Read my lips: there won't be *any* new operators in 2.1.
> 
> OK, I didn't expect this to make it into 2.1.
> 
>  > There will be a better way to overload the existing Boolean operators,
>  > and they will be able to return non-Boolean results.  That's useful in
>  > other situations besides NumPy.
> 
> Yes, I agree, this should be done anyway.  I'm just not sure that the
> implicit meaning that these comparison operators are being given is
> the best one.  I'm just looking for ways to incorporate rich
> comparisons into a broader framework, numpy just currently happens to
> be the primary example of this proposal.
> 
> Assuming the current comparison operator overloading is already
> implemented and has been used to implement rich comparisons for some
> objects, then my rich comparison proposal would cause confusion.  This 
> is what I'm trying to avoid.

AFAIK, rich comparisons haven't been used anywhere to return
non-Boolean results.

>  > Feel free to lobby for elementwise operators -- but based on the
>  > discussion about this subject so far, I don't give it much of a chance
>  > even past Python 2.1.  They would add a lot of baggage to the language
>  > (e.g. the table of operators in all Python books would be about twice
>  > as long) and by far the most users don't care about them.  (Read the
>  > intro to PEP 211 for some of the concerns -- this PEP tries to make the
>  > addition palatable by adding exactly *one* new operator.)
> 
> So!  Introductory books don't have to discuss these additional
> operators.  I don't have to know about XML and socket modules to start
> using Python effectively, nor do I have to know about 'zip' or list
> comprehensions.  These additions decrease the code size and increase
> efficiency, but don't really add any new expressive power that can't
> already be done by a 'for' loop.
> 
> I'll try to convince myself that this suggestion is crazy and not
> bother you with this issue for awhile.

Happy holidays nevertheless. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Mon Dec 18 18:38:13 2000
From: tim.one@home.com (Tim Peters)
Date: Mon, 18 Dec 2000 13:38:13 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <14910.16431.554136.374725@nem-srvr.stsci.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEGLIEAA.tim.one@home.com>

[Paul Barrett]
> ...
> Can someone provide an example in mathematics where comparison
> operators are used in a non-boolean, ie. rich comparison, context.
> If so, this might shut me up!

By my informal accounting, over the years there have been more requests for
three-outcome comparison operators than for elementwise ones, although the
three-outcome lobby isn't organized so is less visible.  It's a natural
request for anyone working with partial orderings (a < b -> one of {yes, no,
unordered}).  Another large group of requests comes from people working with
variants of fuzzy logic, where it's desired that the comparison operators be
definable to return floats (intuitively corresponding to the probability
that the stated relation "is true").  Another desire comes from the symbolic
math camp, which would like to be able to-- as is possible for "+", "*",
etc --define "<" so that e.g. "x < y" returns an object capturing that
somebody *asked* for "x < y"; they're not interested in numeric or Boolean
results so much as symbolic expressions.  "<" is used for all these things
in the literature too.
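
[A toy sketch of the symbolic-math case -- mine, not from the thread; all names are invented.  "<" captures the expression instead of answering it:]

```python
class Sym:
    """A symbol whose "<" builds an expression object rather than a Boolean."""
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name
    def __lt__(self, other):
        return Expr(self, '<', other)

class Expr:
    """Records that somebody *asked* for "left < right"."""
    def __init__(self, left, op, right):
        self.left, self.op, self.right = left, op, right
    def __repr__(self):
        return '(%r %s %r)' % (self.left, self.op, self.right)

x, y = Sym('x'), Sym('y')
print(x < y)   # prints (x < y) -- an expression, not True/False
```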

Whatever.  "<" and friends are just collections of pixels.  Add 300 new
operator symbols, and people will want to redefine all of them at will too.

draw-a-line-in-the-sand-and-the-wind-blows-it-away-ly y'rs  - tim



From tim.one@home.com  Mon Dec 18 20:37:13 2000
From: tim.one@home.com (Tim Peters)
Date: Mon, 18 Dec 2000 15:37:13 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHBIEAA.tim.one@home.com>

[Guido]
> ...
> If you're saying that we should give users ample time for the
> transition, I'm with you.

Then we're with each other, for suitably large values of "ample" <wink>.

> If you're saying that you think the string module is too prominent to
> ever start deprecating its use, I'm afraid we have a problem.

We may.  Time will tell.  It needs a conversion tool, else I think it's
unsellable.

> ...
> I'd also like to note that using the string module's wrappers incurs
> the overhead of a Python function call -- using string methods is
> faster.
>
> Finally, I like the look of fields[i].strip().lower() much better than
> that of string.lower(string.strip(fields[i])) -- an actual example
> from mimetools.py.

I happen to like string methods better myself; I don't think that's at issue
(except that loads of people apparently don't like "join" as a string
method -- idiots <wink>).

The issue to me is purely breaking old code someday -- "string" is in very
heavy use, and unlike as when deprecating regex in favor of re (either pre
or especially sre), string methods aren't orders of magnitude better than
the old way; and also unlike regex-vs-re it's not the case that the string
module has become unmaintainable (to the contrary, string.py has become
trivial).  IOW, this one would be unprecedented fiddling.

> ...
> Note that I believe Java makes a useful distinction that PEP 5 misses:
> it defines both deprecated features and obsolete features.
> *Deprecated* features are simply features for which a better
> alternative exists.  *Obsolete* features are features that are only
> being kept around for backwards compatibility.  Deprecated features
> may also be (and usually are) *obsolescent*, meaning they will become
> obsolete in the future.

I agree it would be useful to define these terms, although those particular
definitions appear to be missing the most important point from the user's
POV (not a one says "going away someday").  A Google search on "java
obsolete obsolescent deprecated" doesn't turn up anything useful, so I doubt
the usages you have in mind come from Java (it has "deprecated", but doesn't
appear to have any well-defined meaning for the others).

In keeping with the religious nature of the battle-- and religion offers
precise terms for degrees of damnation! --I suggest:

    struggling -- a supported feature; the initial state of
        all features; may transition to Anathematized

    anathematized -- this feature is now cursed, but is supported;
        may transition to Condemned or Struggling; intimacy with
        Anathematized features is perilous

    condemned -- a feature scheduled for crucifixion; may transition
        to Crucified, Anathematized (this transition is called "a pardon"),
        or Struggling (this transition is called "a miracle"); intimacy
        with Condemned features is suicidal

    crucified -- a feature that is no longer supported; may transition
        to Resurrected

    resurrected -- a once-Crucified feature that is again supported;
        may transition to Condemned, Anathematized or Struggling;
        although since Resurrection is a state of grace, there may be
        no point in human time at which a feature is identifiably
        Resurrected (i.e., it may *appear*, to the unenlightened, that
        a feature moved directly from Crucified to Anathematized or
        Struggling or Condemned -- although saying so out loud is heresy).



From tismer@tismer.com  Mon Dec 18 22:58:03 2000
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 18 Dec 2000 23:58:03 +0100
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com>
 <3A3CFA17.ED26F51A@tismer.com> <200012181420.JAA25063@cj20424-a.reston1.va.home.com>
Message-ID: <3A3E967B.BE404114@tismer.com>


Guido van Rossum wrote:
[me, expanding on hashes, integers,and how to tame them cheaply]

> Ai.  I think what happened is this: long ago, the hash table sizes
> were primes, or at least not powers of two!

At some time I will wake up and they tell me that I'm reducible :-)

> I'll leave it to the more mathematically-inclined to judge your
> solution...

I love small lists! - ciao - chris

+1   (being a member, hopefully)

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From greg@cosc.canterbury.ac.nz  Mon Dec 18 23:04:42 2000
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 19 Dec 2000 12:04:42 +1300 (NZDT)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEGLIEAA.tim.one@home.com>
Message-ID: <200012182304.MAA02642@s454.cosc.canterbury.ac.nz>

[Paul Barrett]
> ...
> Can someone provide an example in mathematics where comparison
> operators are used in a non-boolean, ie. rich comparison, context.
> If so, this might shut me up!

Not exactly mathematical, but some day I'd like to create
a database access module which lets you say things like

  mydb = OpenDB("inventory")
  parts = mydb.parts
  tuples = mydb.retrieve(parts.name, parts.number).where(parts.quantity >= 42)

Of course, to really make this work I need to be able
to overload "and" and "or" as well, but that's a whole
'nother PEP...
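
[A toy version of the overloading described above -- my sketch, with invented names: a column object whose ">=" returns a condition object rather than a Boolean:]

```python
class Column:
    """A database column; comparisons build query conditions (toy sketch)."""
    def __init__(self, name):
        self.name = name
    def __ge__(self, value):
        return Condition(self.name, '>=', value)

class Condition:
    def __init__(self, column, op, value):
        self.column, self.op, self.value = column, op, value
    def matches(self, row):
        # a real module would translate self into SQL instead
        if self.op == '>=':
            return row[self.column] >= self.value
        raise ValueError(self.op)

quantity = Column('quantity')
cond = quantity >= 42                    # a Condition object, not True/False
print(cond.matches({'quantity': 50}))    # True
print(cond.matches({'quantity': 7}))     # False
```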

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@python.org  Mon Dec 18 23:32:51 2000
From: guido@python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 18:32:51 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: Your message of "Tue, 19 Dec 2000 12:04:42 +1300."
 <200012182304.MAA02642@s454.cosc.canterbury.ac.nz>
References: <200012182304.MAA02642@s454.cosc.canterbury.ac.nz>
Message-ID: <200012182332.SAA18456@cj20424-a.reston1.va.home.com>

> Not exactly mathematical, but some day I'd like to create
> a database access module which lets you say things like
> 
>   mydb = OpenDB("inventory")
>   parts = mydb.parts
>   tuples = mydb.retrieve(parts.name, parts.number).where(parts.quantity >= 42)
> 
> Of course, to really make this work I need to be able
> to overload "and" and "or" as well, but that's a whole
> 'nother PEP...

Believe it or not, in 1998 we already had a suggestion for overloading
these too.  This is hinted at in David Ascher's proposal (the Appendix
of PEP 208) where objects could define __boolean_and__ to overload
x<y<z.  It doesn't get all the details right: it's not enough to check
if the left operand is true, since that leaves 'or' out in the cold,
but a different test (i.e. the presence of __boolean_and__) would
work.

I am leaving this out of the current PEP because the bytecode you have
to generate for this is very hairy.  A simple expression like ``f()
and g()'' would become something like:

  outcome = f()
  if hasattr(outcome, '__boolean_and__'):
      outcome = outcome.__boolean_and__(g())
  elif outcome:
      outcome = g()

The problem I have with this is that the code to evaluate g() has to
be generated twice!  In general, g() could be an arbitrary expression.
We can't evaluate g() ahead of time, because it should not be
evaluated at all when outcome is false and doesn't define
__boolean_and__().

For the same reason the current PEP doesn't support x<y<z when x<y
doesn't return a Boolean result; a similar solution would be possible.
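
[For what it's worth, the double-generation problem can be illustrated in plain Python by passing the right operand as a thunk, so its code exists only once and runs only when needed.  This is my sketch, not part of the proposal; `boolean_and` and `Symbolic` are invented names:]

```python
def boolean_and(left, right_thunk):
    # mirrors the pseudocode above, but the right operand is a thunk:
    # its code appears once and is evaluated at most once, lazily
    if hasattr(left, '__boolean_and__'):
        return left.__boolean_and__(right_thunk())
    elif left:
        return right_thunk()
    else:
        return left

class Symbolic:
    """Toy type that captures 'and' instead of short-circuiting."""
    def __boolean_and__(self, other):
        return ('and', self, other)

print(boolean_and(0, lambda: 1/0))     # 0 -- the thunk never runs
print(boolean_and(5, lambda: 'rhs'))   # 'rhs'
s = Symbolic()
print(boolean_and(s, lambda: 42))      # ('and', <Symbolic ...>, 42)
```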

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Tue Dec 19 00:09:35 2000
From: tim.one@home.com (Tim Peters)
Date: Mon, 18 Dec 2000 19:09:35 -0500
Subject: [Python-Dev] RE: The Dictionary Gem is polished!
In-Reply-To: <3A3CFA17.ED26F51A@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHJIEAA.tim.one@home.com>

Sounds good to me!  It's a very cheap way to get the high bits into play.

>        i = (~_hash) & mask

The ~ here seems like pure superstition to me (and the comments in the C
code don't justify it at all -- I added a nag of my own about that the last
time I checked in dictobject.c -- and see below for a bad consequence of
doing ~).

>            # note that we do not mask!
> >            # even the shifting may not be worth it.
>            incr = _hash ^ (_hash >> 3)

The shifting was just another cheap trick to get *some* influence from the
high bits.  It's very limited, though.  Toss it (it appears to be from the
"random operations yield random results" <wink/sigh> matchbook school of
design).

[MAL]
> BTW, would changing the hash function on strings from the simple
> XOR scheme to something a little smarter help improve the performance
> too (e.g. most strings used in programming never use the 8-th
> bit) ?

Don't understand -- the string hash uses multiplication:

    x = (1000003*x) ^ *p++;

in a loop.  Replacing "^" there by "+" should yield slightly better results.
As is, string hashes are a lot like integer hashes, in that "consecutive"
strings

   J001
   J002
   J003
   J004
   ...

yield hashes very close together in value.  But, because the current dict
algorithm uses ~ on the full hash but does not use ~ on the initial
increment, (~hash)+incr too often yields the same result for distinct hashes
(i.e., there's a systematic (but weak) form of clustering).
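
[A quick sketch of that clustering -- mine, simplified from the C loop quoted above; the seed and 32-bit wraparound details are approximate:]

```python
def str_hash(s):
    # rough Python rendering of x = (1000003*x) ^ *p++ with a shifted seed
    x = ord(s[0]) << 7 if s else 0
    for c in s:
        x = (1000003 * x) ^ ord(c)
        x &= 0xFFFFFFFF  # simulate 32-bit C arithmetic
    return x ^ len(s)

hashes = [str_hash('J%03d' % i) for i in range(1, 5)]  # 'J001' .. 'J004'
print(hashes)
# the spread is tiny: only the low bits differ between consecutive strings
print(max(hashes) - min(hashes))
```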

Note that Python is doing something very unusual:  hashes are *usually*
designed to yield an approximation to randomness across all bits.  But
Python's hashes never achieve that.  This drives theoreticians mad (like the
fellow who originally came up with the GF idea), but tends to work "better
than random" in practice (e.g., a truly random hash function would almost
certainly produce many collisions when fed a fat range of consecutive
integers but still less than half the table size; but Python's trivial
"identity" integer hash produces no collisions in that common case).

[Christian]
> - The bits used from the string hash are not well distributed
> - using a "warmup wheel" on the hash to suck all bits in
>   gives the same quality of hashes like random numbers.

See above and be very cautious:  none of Python's hash functions produce
well-distributed bits, and-- in effect --that's why Python dicts often
perform "better than random" on common data.  Even what you've done so far
appears to provide marginally worse statistics for Guido's favorite kind of
test case ("worse" in two senses:  total number of collisions (a measure of
amortized lookup cost), and maximum collision chain length (a measure of
worst-case lookup cost)):

   d = {}
   for i in range(N):
       d[repr(i)] = i

check-in-one-thing-then-let-it-simmer-ly y'rs  - tim



From tismer@tismer.com  Tue Dec 19 01:16:27 2000
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 02:16:27 +0100
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <Pine.LNX.4.10.10012180848140.20569-100000@akbar.nevex.com>
Message-ID: <3A3EB6EB.C79A3896@tismer.com>




Greg Wilson wrote:
> 
> > > > Here some results, dictionaries have 1000 entries:
> > I will publish some results later today.
> 
> In Doctor Dobb's Journal, right? :-)  We'd *really* like this article...

Well, the results are not so bad:

I stopped testing computation time for the Python dictionary
implementation, in favor of "trips". How many trips does
the re-hash take in a dictionary?

Tests were run for dictionaries of size 1000, 2000, 3000, 4000.

Dictionary 1 consists of i, formatted as string.
Dictionary 2 consists of strings containing the binary of i.
Dictionary 3 consists of random numbers.
Dictionary 4 consists of i << 16.

Algorithms:
old is the original dictionary algorithm implemented
in Python (probably quite correct now, using longs :-)

new is the proposed incremental bits-suck-in-division
algorithm.

new2 is a version of new, where all extra bits of the
hash function are wheeled in in advance. The computation
time of this is not negligible, so please use this result
for reference only.


Here the results:
(bad integers (old) not computed for n > 1000)

"""
D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293 new=302 new2=221
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=499500 new=13187 new2=999
trips for random integers old=377 new=371 new2=393
trips for windows names old=230 new=207 new2=200
N=2000
trips for strings old=1093 new=1109 new2=786
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=26455 new2=1999
trips for random integers old=691 new=710 new2=741
trips for windows names old=503 new=542 new2=564
N=3000
trips for strings old=810 new=839 new2=609
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=38681 new2=2999
trips for random integers old=762 new=740 new2=735
trips for windows names old=712 new=711 new2=691
N=4000
trips for strings old=1850 new=1843 new2=1375
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=52994 new2=3999
trips for random integers old=1440 new=1450 new2=1414
trips for windows names old=1449 new=1434 new2=1457

D:\crml_doc\platf\py>
"""

Interpretation:
---------------
Short numeric strings show a slightly too high trip number.
This means that the hash() function could be enhanced.
But the effect would be below 10 percent compared to
random hashes, therefore not worth it.

Binary representations of numbers as strings still create
perfect hash numbers.

Bad integers (complete hash clash due to high power of 2)
are handled fairly well by the new algorithm. "new2"
shows that they can be brought down to nearly perfect
hashes just by applying the "hash melting wheel".

Windows names are mostly upper case, and rather verbose.
They appear to perform nearly as well as random numbers.
This means: The Python string hash function is very good
for a wide area of applications.

In Summary: I would try to modify the string hash function
slightly for short strings, but only if this does not
negatively affect the results above.

Summary of summary:
There is no really low hanging fruit in string hashing.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com

## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]
polys = map(long, polys)

class NULL: pass

class Dictionary:
    dummy = "<dummy key>"
    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg
        mp.warmup = newalg>1
        mp.trips = 0

    def getTrips(self):
        trips = self.trips
        self.trips = 0
        return trips

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # the shifting is worth it in the incremental case.

            ## added after posting to python-dev:
            uhash = _hash & 0xffffffffl
            if mp.warmup:
                incr = uhash
                mask2 = 0xffffffffl ^ mask
                while mask2 > mask:
                    if (incr & 1):
                        incr = incr ^ mp.ma_poly
                    incr = incr >> 1
                    mask2 = mask2>>1
                # this loop *can* be sped up by tables
                # with precomputed multiple shifts.
                # But I'm not sure if it is worth it at all.
            else:
                incr = uhash ^ (uhash >> 3)

###### TO HERE
            
        if (not incr):
            incr = mask
        while 1:
            mp.trips = mp.trips+1
            
            ep = ep0[int((i+incr)&mask)]
            if (ep[me_key] is NULL) :
                if (freeslot is not NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                 (ep[me_hash] == _hash and
                  cmp(ep[me_key], key) == 0)) :
                return ep

            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots

        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))      

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots

        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
##      /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    for i in range(n):
        s = str(i) #* 5
        #s = chr(i%256) + chr(i>>8)##
        d1[s] = d2[s] = d3[s] = i
    return d1, d2, d3

def istring_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    for i in range(n):
        s = chr(i%256) + chr(i>>8)
        d1[s] = d2[s] = d3[s] = i
    return d1, d2, d3

def random_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    from whrandom import randint
    import sys
    keys = []
    for i in range(n):
        keys.append(randint(0, sys.maxint-1))
    for i in keys:
        d1[i] = d2[i] = d3[i] = i
    return d1, d2, d3

def badnum_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(n):
        bad = i << shift
        d2[bad] = d3[bad] = i
        if n <= 1000: d1[bad] = i
    return d1, d2, d3

def names_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    import win32con
    keys = win32con.__dict__.keys()
    if len(keys) < n:
        keys = []
    for s in keys[:n]:
        d1[s] = d2[s] = d3[s] = s
    return d1, d2, d3

def do_test(dict):
    keys = dict.keys()
    dict.getTrips() # reset
    test(keys, dict)
    return dict.getTrips()

EXTREME=1

if __name__ == "__main__":

    for N in (1000,2000,3000,4000):    

        sdold, sdnew, sdnew2 = string_dicts(N)
        idold, idnew, idnew2 = istring_dicts(N)
        bdold, bdnew, bdnew2 = badnum_dicts(N)
        rdold, rdnew, rdnew2 = random_dicts(N)
        ndold, ndnew, ndnew2 = names_dicts(N)

        print "N=%d" %N        
        print "trips for strings old=%d new=%d new2=%d" % tuple(
            map(do_test, (sdold, sdnew, sdnew2)) )
        print "trips for bin strings old=%d new=%d new2=%d" % tuple(
            map(do_test, (idold, idnew, idnew2)) )
        print "trips for bad integers old=%d new=%d new2=%d" % tuple(
            map(do_test, (bdold, bdnew, bdnew2)))
        print "trips for random integers old=%d new=%d new2=%d" % tuple(
            map(do_test, (rdold, rdnew, rdnew2)))
        print "trips for windows names old=%d new=%d new2=%d" % tuple(
            map(do_test, (ndold, ndnew, ndnew2)))

"""
Results with a shift of 10 (EXTREME=0):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""





From tismer@tismer.com  Tue Dec 19 01:51:32 2000
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 02:51:32 +0100
Subject: [Python-Dev] Re: The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCGEHJIEAA.tim.one@home.com>
Message-ID: <3A3EBF23.750CF761@tismer.com>


Tim Peters wrote:
> 
> Sounds good to me!  It's a very cheap way to get the high bits into play.

That's what I wanted to hear. It's also the reason why I try
to stay conservative: Just do an obviously useful bit, but
do not break any of the inherent benefits, like those
"better than random" amenities.
Python's dictionary algorithm appears to be "near perfect"
and of the "never touch but very carefully, or redo it completely" kind.
I tried the tightrope walk of just adding a tiny topping.

> >        i = (~_hash) & mask

Yes that stuff was 2 hours last nite :-)
I just decided to not touch it. Arbitrary crap!
Although an XOR with hash >> number of mask bits
would perform much better (in many cases but not all).
Anyway, simple shifting cannot solve general bit
distribution problems. Nor can I :-)

> The ~ here seems like pure superstition to me (and the comments in the C
> code don't justify it at all -- I added a nag of my own about that the last
> time I checked in dictobject.c -- and see below for a bad consequence of
> doing ~).
> 
> >            # note that we do not mask!
> >            # even the shifting may not be worth it.
> >            incr = _hash ^ (_hash >> 3)
> 
> The shifting was just another cheap trick to get *some* influence from the
> high bits.  It's very limited, though.  Toss it (it appears to be from the
> "random operations yield random results" <wink/sigh> matchbook school of
> design).

Now, comment it out, and you see my new algorithm perform much worse.
I just kept it since it had an advantage on "my case". (bad guy I know).
And I wanted to have an argument for my change to get accepted.
"No cost, just profit, nearly the same" was what I tried to sell.

> [MAL]
> > BTW, would changing the hash function on strings from the simple
> > XOR scheme to something a little smarter help improve the performance
> > too (e.g. most strings used in programming never use the 8-th
> > bit) ?
> 
> Don't understand -- the string hash uses multiplication:
> 
>     x = (1000003*x) ^ *p++;
> 
> in a loop.  Replacing "^" there by "+" should yield slightly better results.

For short strings, this prime has bad influence on the low bits,
making it perform suboptimally for small dicts.
See the new2 algo which funnily corrects for that.
The reason is obvious: Just look at the bit pattern
of 1000003:  '0xf4243'

Without giving proof, this smells like bad bit distribution on small
strings to me. You smell it too, right?

> As is, string hashes are a lot like integer hashes, in that "consecutive"
> strings
> 
>    J001
>    J002
>    J003
>    J004
>    ...
> 
> yield hashes very close together in value. 

A bad generator in that case. I'll look for a better one.

> But, because the current dict
> algorithm uses ~ on the full hash but does not use ~ on the initial
> increment, (~hash)+incr too often yields the same result for distinct hashes
> (i.e., there's a systematic (but weak) form of clustering).

You name it.

> Note that Python is doing something very unusual:  hashes are *usually*
> designed to yield an approximation to randomness across all bits.  But
> Python's hashes never achieve that.  This drives theoreticians mad (like the
> fellow who originally came up with the GF idea), but tends to work "better
> than random" in practice (e.g., a truly random hash function would almost
> certainly produce many collisions when fed a fat range of consecutive
> integers but still less than half the table size; but Python's trivial
> "identity" integer hash produces no collisions in that common case).

A good reason to be careful with changes (ahem).

> [Christian]
> > - The bits used from the string hash are not well distributed
> > - using a "warmup wheel" on the hash to suck all bits in
> >   gives the same quality of hashes like random numbers.
> 
> See above and be very cautious:  none of Python's hash functions produce
> well-distributed bits, and-- in effect --that's why Python dicts often
> perform "better than random" on common data.  Even what you've done so far
> appears to provide marginally worse statistics for Guido's favorite kind of
> test case ("worse" in two senses:  total number of collisions (a measure of
> amortized lookup cost), and maximum collision chain length (a measure of
> worst-case lookup cost)):
> 
>    d = {}
>    for i in range(N):
>        d[repr(i)] = i

Nah, I did quite a lot of tests, and the trip number shows a
variation of about 10%, with no clear winner between old and new.
This is just the inherent randomness.

> check-in-one-thing-then-let-it-simmer-ly y'rs  - tim

This is why I'm thinking of being even more conservative:
Try to use a division wheel, but with the inverses
of the original primitive roots, just in order to
get at Guido's results :-)

making-perfect-hashes-of-interneds-still-looks-promising - ly y'rs

   - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From greg@cosc.canterbury.ac.nz  Tue Dec 19 03:07:56 2000
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 19 Dec 2000 16:07:56 +1300 (NZDT)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <200012182332.SAA18456@cj20424-a.reston1.va.home.com>
Message-ID: <200012190307.QAA02663@s454.cosc.canterbury.ac.nz>

> The problem I have with this is that the code to evaluate g() has to
> be generated twice!

I have an idea how to fix that. There need to be two methods,
__boolean_and_1__ and __boolean_and_2__. The first operand
is evaluated and passed to __boolean_and_1__. If it returns
a result, that becomes the result of the expression, and the
second operand is short-circuited.

If __boolean_and_1__ raises a NeedOtherOperand exception
(or there is no __boolean_and_1__ method), the second operand 
is evaluated, and both operands are passed to __boolean_and_2__.

The bytecode would look something like

        <evaluate first operand>
        BOOLEAN_AND_1 label
        <evaluate second operand>
        BOOLEAN_AND_2
label:  ...

I'll make a PEP out of this one day if I get enthusiastic
enough.
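Greg's proposal can be mocked up in present-day Python; the method and
exception names follow his sketch, and `boolean_and` below stands in for
what the two bytecodes would do (all names hypothetical):

```python
class NeedOtherOperand(Exception):
    """Raised by __boolean_and_1__ to request the second operand."""

class Pair:
    # Toy operand type that overloads "and" to build a tuple, so it
    # can never decide the result from the first operand alone.
    def __init__(self, v):
        self.v = v
    def __boolean_and_1__(self):
        raise NeedOtherOperand
    def __boolean_and_2__(self, other):
        return (self.v, other.v)

def boolean_and(first, second_thunk):
    # Models the BOOLEAN_AND_1 / BOOLEAN_AND_2 bytecodes: the second
    # operand is evaluated (the thunk called) only on demand, so
    # short-circuiting is preserved for types that want it.
    try:
        return first.__boolean_and_1__()
    except NeedOtherOperand:
        return first.__boolean_and_2__(second_thunk())

print(boolean_and(Pair(1), lambda: Pair(2)))   # (1, 2)
```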

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tim.one@home.com  Tue Dec 19 04:55:33 2000
From: tim.one@home.com (Tim Peters)
Date: Mon, 18 Dec 2000 23:55:33 -0500
Subject: [Python-Dev] The Dictionary Gem is polished!
In-Reply-To: <3A3EB6EB.C79A3896@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEIAIEAA.tim.one@home.com>

Something else to ponder:  my tests show that the current ("old") algorithm
performs much better (somewhat worse than "new2" == new algorithm + warmup)
if incr is simply initialized like so instead:

        if mp.oldalg:
            incr = (_hash & 0xffffffffL) % (mp.ma_size - 1)

That's another way to get all the bits to contribute to the result.  Note
that a mod by size-1 is analogous to "casting out nines" in decimal:  it's
the same as breaking hash into fixed-sized pieces from the right (10 bits
each if size=2**10, etc), adding the pieces together, and repeating that
process until only one piece remains.  IOW, it's a degenerate form of
division, but works well all the same.  It didn't improve over that when I
tried a mod by the largest prime less than the table size (which suggests
we're sucking all we can out of the *probe* sequence given a sometimes-poor
starting index).
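Tim's "casting out nines" analogy is easy to check: because
2**k = 1 (mod 2**k - 1), repeatedly summing the k-bit pieces of a hash
is congruent to the hash mod (size - 1).  A quick sketch (the helper
name is mine, not from the thread):

```python
def chunk_sum_mod(h, size):
    # Repeatedly split h into log2(size)-bit pieces and add them --
    # "casting out nines" generalized to base `size` (a power of 2).
    bits = size.bit_length() - 1
    mask = size - 1
    while h > mask:
        total = 0
        while h:
            total += h & mask
            h >>= bits
        h = total
    return h

size = 2 ** 10
for h in (12345, 987654321, 2 ** 31 - 1):
    # The repeated piece-sum agrees with a mod by size - 1.
    assert chunk_sum_mod(h, size) % (size - 1) == h % (size - 1)
```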

However, it's subject to the same weak clustering phenomenon as the old
method due to the ill-advised "~hash" operation in computing the initial
index.  If ~ is also thrown away, it's as good as new2 (here I've tossed out
the "windows names", and "old" == existing algorithm except (a) get rid of ~
when computing index and (b) do mod by size-1 when computing incr):

N=1000
trips for strings old=230 new=261 new2=239
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=999 new=13212 new2=999
trips for random integers old=399 new=421 new2=410
N=2000
trips for strings old=787 new=1066 new2=827
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=26481 new2=1999
trips for random integers old=652 new=733 new2=650
N=3000
trips for strings old=547 new=760 new2=617
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=38701 new2=2999
trips for random integers old=724 new=743 new2=768
N=4000
trips for strings old=1311 new=1657 new2=1315
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=53014 new2=3999
trips for random integers old=1476 new=1587 new2=1493

The new and new2 values differ in minor ways from the ones you posted
because I got rid of the ~ (the ~ has a bad interaction with "additive"
means of computing incr, because the ~ tends to move the index in the
opposite direction, and these moves in opposite directions tend to cancel
out when computing incr+index the first time).

too-bad-mod-is-expensive!-ly y'rs  - tim



From tim.one@home.com  Tue Dec 19 05:50:01 2000
From: tim.one@home.com (Tim Peters)
Date: Tue, 19 Dec 2000 00:50:01 -0500
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: <20001217220008.D29681@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIDIEAA.tim.one@home.com>

[Tim]
> Starting last night, I get this msg whenever I update Python code w/
> CVSROOT=:ext:tim_one@cvs.python.sourceforge.net:/cvsroot/python:
>
> """
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> @       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
> Someone could be eavesdropping on you right now
> (man-in-the-middle attack)!
> It is also possible that the host key has just been changed.
> Please contact your system administrator.
> Add correct host key in C:\Code/.ssh/known_hosts to get rid of
> this message.
> Password authentication is disabled to avoid trojan horses.
> """
>
> This is SourceForge's doing, and is permanent (they've changed
> keys on their end). ...

[Thomas Wouters]
> What sourceforge did was switch Linux distributions, and upgrade.
> The switch doesn't really matter for the SSH problem, because recent
> Debian and recent RedHat releases both use a new ssh, the OpenBSD
> ssh implementation.
> Apparently, it isn't entirely backwards compatible to old versions of
> F-secure ssh. For one thing, it doesn't support the 'idea' cypher. This
> might or might not be your problem; if it is, you should get a decent
> message that gives a relatively clear message such as 'cypher type 'idea'
> not supported'.
> ... [and quite a bit more] ...

I hope you're feeling better today <wink>.  "The problem" was the one the
warning msg spelled out:  "It is also possible that the host key has just been
changed.".  SF changed keys.  That's the whole banana right there.  Deleting
the sourceforge keys from known_hosts fixed it (== convinced ssh to install
new SF keys the next time I connected).



From tim.one@home.com  Tue Dec 19 05:58:45 2000
From: tim.one@home.com (Tim Peters)
Date: Tue, 19 Dec 2000 00:58:45 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <200012171438.JAA21603@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIEIEAA.tim.one@home.com>

[Tim]
> I expect it would do less harm to introduce a compile-time warning for
> locals that are never referenced (such as the "a" in "set").

[Guido]
> Another warning that would be quite useful (and trap similar cases)
> would be "local variable used before set".

Java elevated that last one to a compile-time error, via its "definite
assignment" rules:  you not only have to make sure a local is bound before
reference, you have to make it *obvious* to the compiler that it's bound
before reference.  I think this is a Good Thing, because with intense
training, people can learn to think like a compiler too <wink>.

Seriously, in several of the cases where gcc warned about "maybe used before
set" in the Python implementation, the warnings were bogus but it was
non-trivial to deduce that.  Such code is very brittle under modification,
and the definite assignment rules make that path to error a non-starter.

Example:

def f(N):
    if N > 0:
        for i in range(N):
            if i == 0:
                j = 42
            else:
                f2(i)
    elif N <= 0:
        j = 24
    return j

It's a Crime Against Humanity to make the code reader *deduce* that j is
always bound by the time "return" is executed.




From guido@python.org  Tue Dec 19 06:08:14 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 01:08:14 -0500
Subject: [Python-Dev] Error: syncmail script missing
Message-ID: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>

I just checked in the documentation for the warnings module.  (Check
it out!)

When I ran "cvs commit" in the Doc directory, it said, amongst other
things:

    sh: /cvsroot/python/CVSROOT/syncmail: No such file or directory

I suppose this may be a side effect of the transition to new hardware
of the SourceForge CVS archive.  (Which, by the way, has dramatically
improved the performance of typical CVS operations -- I am no longer
afraid to do a cvs diff or cvs log in Emacs, or to do a cvs update
just to be sure.)

Could some of the Powers That Be (Fred or Barry :-) check into what
happened to the syncmail script?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Tue Dec 19 06:10:04 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 19 Dec 2000 01:10:04 -0500 (EST)
Subject: [Python-Dev] Error: syncmail script missing
In-Reply-To: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
Message-ID: <14910.64444.662460.48236@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > Could some of the Powers That Be (Fred or Barry :-) check into what
 > happened to the syncmail script?

  We've seen this before, but I'm not sure what it was.  Barry, do you
recall?  Had the Python interpreter landed in a different directory?
Or perhaps the location of the CVS repository is different, so
syncmail isn't where loginfo says.
  Tomorrow... scp to SF appears broken as well.  ;(


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From tim.one@home.com  Tue Dec 19 06:16:15 2000
From: tim.one@home.com (Tim Peters)
Date: Tue, 19 Dec 2000 01:16:15 -0500
Subject: [Python-Dev] Error: syncmail script missing
In-Reply-To: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIFIEAA.tim.one@home.com>

[Guido]
> I just checked in the documentation for the warnings module.  (Check
> it out!)

Everyone should note that this means Guido will be taking his traditional
post-release vacation almost immediately <wink -- but he is about to
leave!>.

> When I ran "cvs commit" in the Doc directory, it said, amongst other
> things:
>
>     sh: /cvsroot/python/CVSROOT/syncmail: No such file or directory
>
> I suppose this may be a side effect of the transition to new hardware
> of the SourceForge CVS archive.

The lack of checkin mail was first noted on a Jython list.  Finn wisely
replied that he'd just sit back and wait for the CPython people to figure
out how to fix it.

> ...
> Could some of the Powers That Be (Fred or Barry :-) check into what
> happened to the syncmail script?

Don't worry, I'll do my part by nagging them in your absence <wink>.  Bon
holiday voyage!



From cgw@fnal.gov  Tue Dec 19 06:32:15 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Tue, 19 Dec 2000 00:32:15 -0600 (CST)
Subject: [Python-Dev] cycle-GC question
Message-ID: <14911.239.12288.546710@buffalo.fnal.gov>

The following program:

import rexec
while 1:
      x = rexec.RExec()
      del x

leaks memory at a fantastic rate.

It seems clear (?) that this is due to the call to "set_rexec" at
rexec.py:140, which creates a circular reference between the `rexec'
and `hooks' objects.  (There's even a nice comment to that effect).

I'm curious however as to why the spiffy new cyclic-garbage collector
doesn't pick this up?

Just-wondering-ly y'rs,
		  cgw


From tim_one@email.msn.com  Tue Dec 19 09:24:18 2000
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 19 Dec 2000 04:24:18 -0500
Subject: [Python-Dev] RE: The Dictionary Gem is polished!
In-Reply-To: <3A3EBF23.750CF761@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEIKIEAA.tim_one@email.msn.com>

[Christian Tismer]
> ...
> For short strings, this prime has bad influence on the low bits,
> making it perform suboptimally for small dicts.
> See the new2 algo which funnily corrects for that.
> The reason is obvious: Just look at the bit pattern
> of 1000003:  '0xf4243'
>
> Without giving proof, this smells like bad bit distribution on small
> strings to me. You smell it too, right?
> ...

[Tim]
> As is, string hashes are a lot like integer hashes, in that
> "consecutive" strings
>
>    J001
>    J002
>    J003
>    J004
>    ...
>
> yield hashes very close together in value.

[back to Christian]
> A bad generator in that case. I'll look for a better one.

Not necessarily!  It's for that same reason "consecutive strings" can have
"better than random" behavior today.  And consecutive strings-- like
consecutive ints --are a common case.
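A rough sketch of the 2.0-era string hash (the loop Tim quotes, with a
32-bit mask added so it runs unchanged anywhere) makes the closeness
visible; identical prefixes cancel, so the hashes differ only where the
last characters differ:

```python
def old_string_hash(s):
    # Sketch of CPython 2.0's string hash: seed from the first char,
    # then x = (1000003*x) ^ c per character, finally XOR the length.
    x = ord(s[0]) << 7
    for c in s:
        x = (1000003 * x) ^ ord(c)
        x &= 0xFFFFFFFF                 # emulate 32-bit C arithmetic
    return x ^ len(s)

for s in ("J001", "J002", "J003", "J004"):
    print(s, hex(old_string_hash(s)))

# Equal prefixes mean only the low bits can differ:
assert old_string_hash("J001") ^ old_string_hash("J002") == ord("1") ^ ord("2")
```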

Here are the numbers for the synthesized string cases:

N=1000
trips for strings old=293 new=302 new2=221
trips for bin strings old=0 new=0 new2=0
N=2000
trips for strings old=1093 new=1109 new2=786
trips for bin strings old=0 new=0 new2=0
N=3000
trips for strings old=810 new=839 new2=609
trips for bin strings old=0 new=0 new2=0
N=4000
trips for strings old=1850 new=1843 new2=1375
trips for bin strings old=0 new=0 new2=0

Here they are again, after doing nothing except changing the "^" to "+" in
the string hash, i.e. replacing

		x = (1000003*x) ^ *p++;
by
		x = (1000003*x) + *p++;

N=1000
trips for strings old=140 new=127 new2=108
trips for bin strings old=0 new=0 new2=0
N=2000
trips for strings old=480 new=434 new2=411
trips for bin strings old=0 new=0 new2=0
N=3000
trips for strings old=821 new=857 new2=631
trips for bin strings old=0 new=0 new2=0
N=4000
trips for strings old=1892 new=1852 new2=1476
trips for bin strings old=0 new=0 new2=0

The first two sizes are dramatically better, the last two a wash.  If you
want to see a real disaster, replace the "+" with "*" <wink>:

N=1000
trips for strings old=71429 new=6449 new2=2048
trips for bin strings old=81187 new=41117 new2=41584
N=2000
trips for strings old=26882 new=9300 new2=6103
trips for bin strings old=96018 new=46932 new2=42408

I got tired of waiting at that point ...

suspecting-a-better-string-hash-is-hard-to-find-ly y'rs  - tim




From martin@loewis.home.cs.tu-berlin.de  Tue Dec 19 11:58:17 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 12:58:17 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>

> I agree it would be useful to define these terms, although those
> particular definitions appear to be missing the most important point
> from the user's POV (not a one says "going away someday").

PEP 4 says

# Usage of a module may be `deprecated', which means that it may be
# removed from a future Python release.

Proposals for better wording are welcome (and yes, I still have to get
the comments that I got into the document).

Regards,
Martin


From guido@python.org  Tue Dec 19 14:48:47 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 09:48:47 -0500
Subject: [Python-Dev] cycle-GC question
In-Reply-To: Your message of "Tue, 19 Dec 2000 00:32:15 CST."
 <14911.239.12288.546710@buffalo.fnal.gov>
References: <14911.239.12288.546710@buffalo.fnal.gov>
Message-ID: <200012191448.JAA28737@cj20424-a.reston1.va.home.com>

> The following program:
> 
> import rexec
> while 1:
>       x = rexec.RExec()
>       del x
> 
> leaks memory at a fantastic rate.
> 
> It seems clear (?) that this is due to the call to "set_rexec" at
> rexec.py:140, which creates a circular reference between the `rexec'
> and `hooks' objects.  (There's even a nice comment to that effect).
> 
> I'm curious however as to why the spiffy new cyclic-garbage collector
> doesn't pick this up?

Me too.  I turned on gc debugging (gc.set_debug(077) :-) and got
messages suggesting that it is not collecting everything.  The
output looks like this:

   .
   .
   .
gc: collecting generation 0...
gc: objects in each generation: 764 6726 89174
gc: done.
gc: collecting generation 1...
gc: objects in each generation: 0 8179 89174
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 764 0 97235
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 757 747 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 764 1386 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 757 2082 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 764 2721 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 757 3417 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 764 4056 97184
gc: done.
   .
   .
   .

With the third number growing each time a "generation 1" collection is
done.

Maybe Neil can shed some light?  The gc.garbage list is empty.

This is about as much as I know about the GC stuff...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From petrilli@amber.org  Tue Dec 19 15:25:18 2000
From: petrilli@amber.org (Christopher Petrilli)
Date: Tue, 19 Dec 2000 10:25:18 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Dec 19, 2000 at 12:58:17PM +0100
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>
Message-ID: <20001219102518.A14288@trump.amber.org>

So I was thinking about this whole thing, and wondering why it was
that seeing things like:

     " ".join(aList)

bugged me to no end, while:

     aString.lower()

didn't seem to look wrong. I finally put my finger on it, and I
haven't seen anyone mention it, so I guess I'll do so.  To me, the
concept of "join" on a string is just not quite kosher, instead it
should be something like this:

     aList.join(" ")

or if you want it without the indirection:

     ['item', 'item', 'item'].join(" ")

Now *THAT* looks right to me.  The example of a join method on a
string just doesn't quite gel in my head, and I did some thinking and
digging, and well, when I pulled up my Smalltalk browser, things like
join are done on Collections, not on Strings.  You're joining the
collection, not the string.

Perhaps in a rush to move some things that were "string related" in
the string module into methods on the strings themselves (something I
whole-heartedly support), we moved a few too many things
there---things that semantically don't really belong as methods on a
string object.

How this gets resolved, I don't know... but I know a lot of people
have looked at the string methods---and they each keep coming back to
1 or 2 that bug them... and I think it's those that really aren't
methods of a string, but instead something that operates with strings, 
but expects other things.

Chris
-- 
| Christopher Petrilli
| petrilli@amber.org


From guido@python.org  Tue Dec 19 15:37:15 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 10:37:15 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Tue, 19 Dec 2000 10:25:18 EST."
 <20001219102518.A14288@trump.amber.org>
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>
 <20001219102518.A14288@trump.amber.org>
Message-ID: <200012191537.KAA28909@cj20424-a.reston1.va.home.com>

> So I was thinking about this whole thing, and wondering why it was
> that seeing things like:
> 
>      " ".join(aList)
> 
> bugged me to no end, while:
> 
>      aString.lower()
> 
> didn't seem to look wrong. I finally put my finger on it, and I
> haven't seen anyone mention it, so I guess I'll do so.  To me, the
> concept of "join" on a string is just not quite kosher, instead it
> should be something like this:
> 
>      aList.join(" ")
> 
> or if you want it without the indirection:
> 
>      ['item', 'item', 'item'].join(" ")
> 
> Now *THAT* looks right to me.  The example of a join method on a
> string just doesn't quite gel in my head, and I did some thinking and
> digging, and well, when I pulled up my Smalltalk browser, things like
> join are done on Collections, not on Strings.  You're joining the
> collection, not the string.
> 
> Perhaps in a rush to move some things that were "string related" in
> the string module into methods on the strings themselves (something I
> whole-heartedly support), we moved a few too many things
> there---things that semantically don't really belong as methods on a
> string object.
> 
> How this gets resolved, I don't know... but I know a lot of people
> have looked at the string methods---and they each keep coming back to
> 1 or 2 that bug them... and I think it's those that really aren't
> methods of a string, but instead something that operates with strings, 
> but expects other things.

Boy, are you stirring up a can of worms that we've been through many
times before!  Nothing you say hasn't been said at least a hundred
times before, on this list as well as on c.l.py.

The problem is that if you want to make this a method on lists, you'll
also have to make it a method on tuples, and on arrays, and on NumPy
arrays, and on any user-defined type that implements the sequence
protocol...  That's just not reasonable to expect.

There really seem to be only two possibilities that don't have this
problem: (1) make it a built-in, or (2) make it a method on strings.

We chose (2) for uniformity, and to avoid the potential confusion with
os.path.join(), which is sometimes imported as a local.  If
" ".join(L) bugs you, try this:

    space = " "	 # This could be a global
     .
     .
     .
    s = space.join(L)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Tue Dec 19 15:46:55 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 10:46:55 -0500
Subject: [Python-Dev] Death to string functions!
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>
 <20001219102518.A14288@trump.amber.org>
Message-ID: <14911.33519.764029.306876@anthem.concentric.net>

>>>>> "CP" == Christopher Petrilli <petrilli@amber.org> writes:

    CP> So I was thinking about this whole thing, and wondering why it
    CP> was that seeing things like:

    CP>      " ".join(aList)

    CP> bugged me to no end, while:

    CP>      aString.lower()

    CP> didn't seem to look wrong. I finally put my finger on it, and
    CP> I haven't seen anyone mention it, so I guess I'll do so.

Actually, it has been debated to death. ;)  This looks better:

    SPACE = ' '
    SPACE.join(aList)

That reads well to me ("space-join this list") and that's how I always
write it.  That said, there are certainly lots of people who agree
with you.  You can't put join() on sequences though, until you have
builtin base-classes, or interfaces, or protocols or some such
construct, because otherwise you'd have to add it to EVERY sequence,
including classes that act like sequences.

One idea that I believe has merit is to consider adding join() to the
builtins, probably with a signature like:

    join(aList, aString) -> aString

This horse has been whacked pretty good too, but I don't remember
seeing a patch or a pronouncement.
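For concreteness, the builtin Barry sketches would be a one-liner
delegating to the string method (signature as he gives it; this is
purely illustrative, not a proposed implementation):

```python
def join(aList, aString=" "):
    # join(aList, aString) -> aString, per the suggested signature.
    # Delegating to the string method keeps it working for any
    # sequence of strings, not just lists.
    return aString.join(aList)

assert join(["a", "b", "c"], "-") == "a-b-c"
assert join(("x", "y")) == "x y"
```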

-Barry


From nas@arctrix.com  Tue Dec 19 08:53:36 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 19 Dec 2000 00:53:36 -0800
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <200012191448.JAA28737@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Dec 19, 2000 at 09:48:47AM -0500
References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com>
Message-ID: <20001219005336.A303@glacier.fnational.com>

On Tue, Dec 19, 2000 at 09:48:47AM -0500, Guido van Rossum wrote:
> > import rexec
> > while 1:
> >       x = rexec.RExec()
> >       del x
> > 
> > leaks memory at a fantastic rate.
> > 
> > It seems clear (?) that this is due to the call to "set_rexec" at
> > rexec.py:140, which creates a circular reference between the `rexec'
> > and `hooks' objects.  (There's even a nice comment to that effect).

Line 140 is not the only place a circular reference is created.
There is another one which is trickier to find:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        m.__builtins__ = self.modules['__builtin__']
        return m

If the module being added is __builtin__ then m.__builtins__ = m.
The GC currently doesn't track modules.  I guess it should.  It
might be possible to avoid this circular reference but I don't
know enough about how RExec works.  Would something like:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        if mname != '__builtin__':
            m.__builtins__ = self.modules['__builtin__']
        return m
    
do the trick?
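For reference, cycles through ordinary instances are exactly what the
collector does handle; a minimal sanity check in today's Python
(unrelated to the module-tracking gap Neil identifies):

```python
import gc

class Node:
    pass

a, b = Node(), Node()
a.partner, b.partner = b, a     # create a two-object reference cycle
del a, b                        # unreachable now, but refcounts are nonzero
collected = gc.collect()        # the cycle detector reclaims them
assert collected >= 2
```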

  Neil


From fredrik@effbot.org  Tue Dec 19 15:39:49 2000
From: fredrik@effbot.org (Fredrik Lundh)
Date: Tue, 19 Dec 2000 16:39:49 +0100
Subject: [Python-Dev] Death to string functions!
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> <20001219102518.A14288@trump.amber.org>
Message-ID: <008301c069d3$76560a20$3c6340d5@hagrid>

"Christopher Petrilli" wrote:
> didn't seem to look wrong. I finally put my finger on it, and I
> haven't seen anyone mention it, so I guess I'll do so.  To me, the
> concept of "join" on a string is just not quite kosher, instead it
> should be something like this:
> 
>      aList.join(" ")
> 
> or if you want it without the indirection:
> 
>      ['item', 'item', 'item'].join(" ")
> 
> Now *THAT* looks right to me.

why do we keep coming back to this?

aString.join can do anything string.join can do, but aList.join
cannot.  if you don't understand why, check the archives.
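The asymmetry Fredrik alludes to: the method on the *separator* already
accepts any sequence of strings (in today's Python, any iterable), while
a hypothetical list method would cover lists only.  A quick illustration:

```python
def parts():
    # join on the separator works for generators too -- a list
    # method could not, without re-implementing it on every type.
    yield "a"
    yield "b"

assert "-".join(["x", "y"]) == "x-y"      # list
assert "-".join(("x", "y")) == "x-y"      # tuple
assert "-".join(parts()) == "a-b"         # arbitrary iterable
```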

</F>



From martin@loewis.home.cs.tu-berlin.de  Tue Dec 19 15:44:48 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 16:44:48 +0100
Subject: [Python-Dev] cycle-GC question
Message-ID: <200012191544.QAA11408@loewis.home.cs.tu-berlin.de>

> It seems clear (?) that this is due to the call to "set_rexec" at
> rexec.py:140, which creates a circular reference between the `rexec'
> and `hooks' objects.  (There's even a nice comment to that effect).

It's not all that clear that *this* is the cycle. In fact, it is not.

> I'm curious however as to why the spiffy new cyclic-garbage
> collector doesn't pick this up?

It's an interesting problem, so I spent this afternoon investigating
it. I soon found that I need a tool, so I introduced a new function
gc.getreferents which, when given an object, returns a list of objects
referring to that object. The patch for that feature is in

http://sourceforge.net/patch/?func=detailpatch&patch_id=102925&group_id=5470

Applying that function recursively, I can get output that looks
like this:

<rexec.RExec instance at 0x81f5dcc>
 <method RExec.r_import of RExec instance at 0x81f5dcc>
  dictionary 0x81f4f24
 <method RExec.r_reload of RExec instance at 0x81f5dcc>
  dictionary 0x81f4f24 (seen)
 <method RExec.r_open of RExec instance at 0x81f5dcc>
  dictionary 0x81f4f24 (seen)
 <method RExec.r_exc_info of RExec instance at 0x81f5dcc>
  dictionary 0x8213bc4
 dictionary 0x820869c
  <rexec.RHooks instance at 0x8216cbc>
   dictionary 0x820866c
    <rexec.RExec instance at 0x81f5dcc> (seen)
   dictionary 0x8213bf4
    <ihooks.FancyModuleLoader instance at 0x81f7464>
     dictionary 0x820866c (seen)
     dictionary 0x8214144
      <ihooks.ModuleImporter instance at 0x8214124>
       dictionary 0x820866c (seen)

Each indentation level shows the objects which refer to the object
one level out, e.g. the dictionary 0x820869c refers to the RExec instance,
and the RHooks instance refers to that dictionary. Clearly, the
dictionary 0x820869c is the RHooks' __dict__, and the reference
belongs to the 'rexec' key in that dictionary.

The recursion stops only when an object has been seen before (so it's a
cycle, or another non-tree graph), or if there are no referents (the
lists created to do the iteration are ignored).
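(In current Python the same capability is spelled gc.get_referrers --
confusingly, today's gc.get_referents goes in the other direction. A
minimal sketch of the walk done by hand above:)

```python
import gc

class Node:
    pass

n = Node()
holder = {'key': n}
# gc.get_referrers returns the objects referring to its argument,
# i.e. one level of the recursive traversal shown above.
referrers = gc.get_referrers(n)
assert any(r is holder for r in referrers)
```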

So it appears that the r_import method is referenced from some
dictionary, but that dictionary is not referenced anywhere???

Checking the actual structures shows that rexec creates a __builtin__
module, which has a dictionary that has an __import__ key. So the
reference to the method comes from the __builtin__ module, which in
turn is referenced as the RExec's .modules attribute, giving another
cycle.

However, module objects don't participate in garbage
collection. Therefore, gc.getreferents cannot traverse a module, and
the garbage collector won't find a cycle involving a garbage module.
I just submitted a bug report,

http://sourceforge.net/bugs/?func=detailbug&bug_id=126345&group_id=5470

which suggests that modules should also participate in garbage
collection.
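(Modules were indeed made to participate in GC in later releases; on a
modern interpreter a cycle through a bound method, of the kind traced
above, is found and freed -- a sketch:)

```python
import gc
import weakref

class Owner:
    def method(self):
        pass

o = Owner()
# instance -> __dict__ -> bound method -> instance: a cycle, analogous
# to RExec holding a module whose dict holds RExec.r_import.
o.table = {'__import__': o.method}
r = weakref.ref(o)
del o
gc.collect()
assert r() is None   # the collector broke the cycle
```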

Regards,
Martin


From guido@python.org  Tue Dec 19 16:01:46 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 11:01:46 -0500
Subject: [Python-Dev] cycle-GC question
In-Reply-To: Your message of "Tue, 19 Dec 2000 00:53:36 PST."
 <20001219005336.A303@glacier.fnational.com>
References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com>
 <20001219005336.A303@glacier.fnational.com>
Message-ID: <200012191601.LAA29015@cj20424-a.reston1.va.home.com>

> might be possible to avoid this circular reference but I don't
> know enough about how RExec works.  Would something like:
> 
>     def add_module(self, mname):
>         if self.modules.has_key(mname):
>             return self.modules[mname]
>         self.modules[mname] = m = self.hooks.new_module(mname)
>         if mname != '__builtin__':
>             m.__builtins__ = self.modules['__builtin__']
>         return m
>     
> do the trick?

That's certainly a good thing to do (__builtin__ has no business
having a __builtins__!), but (in my feeble experiment) it doesn't make
the leaks go away.

Note that almost every module participates heavily in cycles: whenever
you define a function f(), f.func_globals is the module's __dict__,
which also contains a reference to f.  Similar for classes, with an
extra hop via the class object and its __dict__.
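(The function/module cycle is easy to see; note func_globals is spelled
__globals__ in Python 3:)

```python
# A module-level function references the module's __dict__, and that
# dict in turn contains the function: an automatic cycle.
def f():
    pass

assert f.__globals__['f'] is f   # dict -> f -> dict
```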

--Guido van Rossum (home page: http://www.python.org/~guido/)


From cgw@fnal.gov  Tue Dec 19 16:06:06 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Tue, 19 Dec 2000 10:06:06 -0600 (CST)
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <20001219005336.A303@glacier.fnational.com>
References: <14911.239.12288.546710@buffalo.fnal.gov>
 <200012191448.JAA28737@cj20424-a.reston1.va.home.com>
 <20001219005336.A303@glacier.fnational.com>
Message-ID: <14911.34670.664178.418523@buffalo.fnal.gov>

Neil Schemenauer writes:
 > 
 > Line 140 is not the only place a circular reference is created.
 > There is another one which is trickier to find:
 > 
 >     def add_module(self, mname):
 >         if self.modules.has_key(mname):
 >             return self.modules[mname]
 >         self.modules[mname] = m = self.hooks.new_module(mname)
 >         m.__builtins__ = self.modules['__builtin__']
 >         return m
 > 
 > If the module being added is __builtin__ then m.__builtins__ = m.
 > The GC currently doesn't track modules.  I guess it should.  It
 > might be possible to avoid this circular reference but I don't
 > know enough about how RExec works.  Would something like:
 > 
 >     def add_module(self, mname):
 >         if self.modules.has_key(mname):
 >             return self.modules[mname]
 >         self.modules[mname] = m = self.hooks.new_module(mname)
 >         if mname != '__builtin__':
 >             m.__builtins__ = self.modules['__builtin__']
 >         return m
 >     
 > do the trick?

No... if you change "add_module" in exactly the way you suggest
(without worrying about whether it breaks the functionality of rexec!)
and run the test

while 1:
      rexec.RExec()

you will find that it still leaks memory at a prodigious rate.

So, (unless there is yet another module-level cyclic reference) I
don't think this theory explains the problem.
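(The self-reference Neil pointed out can still be reproduced in
isolation -- a sketch, with types.ModuleType standing in for
hooks.new_module; names are illustrative only:)

```python
import types

modules = {}

def add_module(mname):
    # mirrors RExec.add_module: the module is registered *before*
    # __builtins__ is looked up, so __builtin__ finds itself
    if mname in modules:
        return modules[mname]
    modules[mname] = m = types.ModuleType(mname)
    m.__builtins__ = modules['__builtin__']
    return m

b = add_module('__builtin__')
assert b.__builtins__ is b   # m.__builtins__ == m: a one-node cycle
```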


From martin@loewis.home.cs.tu-berlin.de  Tue Dec 19 16:07:04 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 17:07:04 +0100
Subject: [Python-Dev] cycle-GC question
Message-ID: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>

> There is another one which is trickier to find:
[__builtin__.__builtins__ == __builtin__]

> Would something like:
[do not add __builtins__ to __builtin__]
> work?

No, because there is another one that is even trickier to find :-)

>>> print r
<rexec.RExec instance at 0x81f7cac>
>>> print r.modules['__builtin__'].open.im_self
<rexec.RExec instance at 0x81f7cac>

Please see my other message; I think modules should be gc'ed.

Regards,
Martin


From nas@arctrix.com  Tue Dec 19 09:24:29 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 19 Dec 2000 01:24:29 -0800
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Dec 19, 2000 at 05:07:04PM +0100
References: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>
Message-ID: <20001219012429.A520@glacier.fnational.com>

On Tue, Dec 19, 2000 at 05:07:04PM +0100, Martin v. Loewis wrote:
> I think modules should be gc'ed.

I agree.  It's easy to do.  If no one does it over Christmas I will
do it before 2.1 is released.

  Neil


From tismer@tismer.com  Tue Dec 19 15:48:58 2000
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 17:48:58 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCMEIAIEAA.tim.one@home.com>
Message-ID: <3A3F836A.DEDF1011@tismer.com>


Tim Peters wrote:
> 
> Something else to ponder:  my tests show that the current ("old") algorithm
> performs much better (somewhat worse than "new2" == new algorithm + warmup)
> if incr is simply initialized like so instead:
> 
>         if mp.oldalg:
>             incr = (_hash & 0xffffffffL) % (mp.ma_size - 1)

Sure. I did this as well, but didn't consider a division,
since it is said to be too slow. But this is very platform
dependent; on Pentiums it might not be noticeable.

> That's another way to get all the bits to contribute to the result.  Note
> that a mod by size-1 is analogous to "casting out nines" in decimal:  it's
> the same as breaking hash into fixed-sized pieces from the right (10 bits
> each if size=2**10, etc), adding the pieces together, and repeating that
> process until only one piece remains.  IOW, it's a degenerate form of
> division, but works well all the same.  It didn't improve over that when I
> tried a mod by the largest prime less than the table size (which suggests
> we're sucking all we can out of the *probe* sequence given a sometimes-poor
> starting index).

Again, I tried this too. Instead of the largest nearby prime I used
the nearest prime. Remarkably, the nearest prime is identical
to the primitive element in a lot of cases.
But there was no improvement over the modulus.
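(Tim's "casting out nines" analogy can be checked directly: since
2**k == 1 (mod 2**k - 1), summing the k-bit chunks of a hash preserves
its remainder -- a sketch:)

```python
def mod_by_chunks(h, k):
    # Repeatedly add k-bit pieces of h; each step preserves the value
    # mod (2**k - 1), because 2**k is congruent to 1 there.
    m = (1 << k) - 1
    while h > m:
        h = (h & m) + (h >> k)
    return 0 if h == m else h

h = 0xDEADBEEF
assert mod_by_chunks(h, 10) == h % ((1 << 10) - 1)
```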

> 
> However, it's subject to the same weak clustering phenomenon as the old
> method due to the ill-advised "~hash" operation in computing the initial
> index.  If ~ is also thrown away, it's as good as new2 (here I've tossed out
> the "windows names", and "old" == existing algorithm except (a) get rid of ~
> when computing index and (b) do mod by size-1 when computing incr):
...
> The new and new2 values differ in minor ways from the ones you posted
> because I got rid of the ~ (the ~ has a bad interaction with "additive"
> means of computing incr, because the ~ tends to move the index in the
> opposite direction, and these moves in opposite directions tend to cancel
> out when computing incr+index the first time).

Remarkable.

> too-bad-mod-is-expensive!-ly y'rs  - tim

Yes. The wheel is cheapest yet.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From just@letterror.com  Tue Dec 19 17:11:55 2000
From: just@letterror.com (Just van Rossum)
Date: Tue, 19 Dec 2000 18:11:55 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID: <l03102804b6653dd31c4e@[193.78.237.123]>

Barry wrote:
>Actually, it has been debated to death. ;)  This looks better:
>
>    SPACE = ' '
>    SPACE.join(aList)
>
>That reads good to me ("space-join this list") and that's how I always
>write it.

I just did a quick scan through the 1.5.2 library, and _most_
occurrences of string.join() are used with a string constant
for the second argument. There is a whole bunch of one-arg
string.join()'s, too. Recommending replacing all of these (not
to mention all the code "out there") with named constants seems
plain silly.

Sure, " ".join() is the most "logical" choice for Python as it
stands, but it's definitely not the most intuitive, as evidenced
by the number of times this comes up on c.l.py: to many people
it simply "looks wrong". Maybe this is the deal: joiner.join()
makes a whole lot of sense from an _implementation_ standpoint,
but a whole lot less as a public interface.

It's easy to explain why join() can't be a method of sequences
(in Python), but that alone doesn't justify a string method.
string.join() is not unlike map() and friends:
map() wouldn't be so bad as a sequence method, but that isn't
practical for exactly the same reasons: so it's a builtin.
(And not a function method...)

So, making join() a builtin makes a whole lot of sense. Not
doing this because people sometimes use a local reference to
os.path.join seems awfully backward. Hm, maybe joiner.join()
could become a special method: joiner.__join__(), that way
other objects could define their own implementation for
join(). (Hm, wouldn't be the worst thing for, say, a file
path object...)
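(A sketch of how such a joiner.__join__ hook might behave -- all names
hypothetical, this is not an adopted design:)

```python
def join(joiner, seq):
    # A builtin-style join() that defers to the joiner's own hook,
    # falling back to the ordinary string method.
    special = getattr(type(joiner), '__join__', None)
    if special is not None:
        return special(joiner, seq)
    return joiner.join(seq)

class FilePath:
    def __join__(self, parts):
        return "/".join(parts)

assert join(" ", ["a", "b"]) == "a b"
assert join(FilePath(), ["usr", "lib"]) == "usr/lib"
```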

Just




From barry@digicool.com  Tue Dec 19 17:20:07 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 12:20:07 -0500
Subject: [Python-Dev] Death to string functions!
References: <l03102804b6653dd31c4e@[193.78.237.123]>
Message-ID: <14911.39111.710940.342986@anthem.concentric.net>

>>>>> "JvR" == Just van Rossum <just@letterror.com> writes:

    JvR> Recommending replacing all of these (not to mention all the
    JvR> code "out there") with named constants seems plain silly.

Until there's a tool to do the migration, I don't (personally)
recommend wholesale migration.  For new code I write though, I usually
do it the way I described (which is intuitive to me, but then so is
moving your fingers at a blinding speed up and down 5 long strips of
metal to cause low bowel-tickling rumbly noises).

    JvR> So, making join() a builtin makes a whole lot of sense. Not
    JvR> doing this because people sometimes use a local reference to
    JvR> os.path.join seems awfully backward.

I agree.  Have we agreed on the semantics and signature of builtin
join() though?  Is it just string.join() stuck in builtins?

-Barry


From fredrik@effbot.org  Tue Dec 19 17:25:49 2000
From: fredrik@effbot.org (Fredrik Lundh)
Date: Tue, 19 Dec 2000 18:25:49 +0100
Subject: [Python-Dev] Death to string functions!
References: <l03102804b6653dd31c4e@[193.78.237.123]> <14911.39111.710940.342986@anthem.concentric.net>
Message-ID: <012901c069e0$bd724fb0$3c6340d5@hagrid>

Barry wrote:
>     JvR> So, making join() a builtin makes a whole lot of sense. Not
>     JvR> doing this because people sometimes use a local reference to
>     JvR> os.path.join seems awfully backward.
> 
> I agree.  Have we agreed on the semantics and signature of builtin
> join() though?  Is it just string.join() stuck in builtins?

+1

(let's leave the __join__ slot and other super-generalized
variants for 2.2)

</F>



From thomas@xs4all.net  Tue Dec 19 17:54:34 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 19 Dec 2000 18:54:34 +0100
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEIDIEAA.tim.one@home.com>; from tim.one@home.com on Tue, Dec 19, 2000 at 12:50:01AM -0500
References: <20001217220008.D29681@xs4all.nl> <LNBBLJKPBEHFEDALKOLCIEIDIEAA.tim.one@home.com>
Message-ID: <20001219185434.E29681@xs4all.nl>

On Tue, Dec 19, 2000 at 12:50:01AM -0500, Tim Peters wrote:

> [Thomas Wouters]
> > What sourceforge did was switch Linux distributions, and upgrade.
> > ... [and quite a bit more] ...

I hope you're feeling better today <wink>.  "The problem" was the one the
warning msg spelled out:  "It is also possible that the host key has just been
> changed.".  SF changed keys.  That's the whole banana right there.  Deleting
> the sourceforge keys from known_hosts fixed it (== convinced ssh to install
> new SF keys the next time I connected).

Well, if you'd read the thread <wink>, you'd notice that other people had
problems even after that. I'm glad you're not one of them, though :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From barry@digicool.com  Tue Dec 19 18:22:19 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 13:22:19 -0500
Subject: [Python-Dev] Error: syncmail script missing
References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCIEIFIEAA.tim.one@home.com>
Message-ID: <14911.42843.284822.935268@anthem.concentric.net>

Folks,

Python wasn't installed on the new SF CVS machine, which was why
syncmail was broken.  My thanks to the SF guys for quickly remedying
this situation!

Please give it a test.
-Barry


From barry@digicool.com  Tue Dec 19 18:23:32 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 13:23:32 -0500
Subject: [Python-Dev] Error: syncmail script missing
References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCIEIFIEAA.tim.one@home.com>
 <14911.42843.284822.935268@anthem.concentric.net>
Message-ID: <14911.42916.573600.922606@anthem.concentric.net>

>>>>> "BAW" == Barry A Warsaw <barry@digicool.com> writes:

    BAW> Python wasn't installed on the new SF CVS machine, which was
    BAW> why syncmail was broken.  My thanks to the SF guys for
    BAW> quickly remedying this situation!

BTW, it's currently Python 1.5.2.


From tismer@tismer.com  Tue Dec 19 17:34:14 2000
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 19:34:14 +0200
Subject: [Python-Dev] Re: The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCGEHJIEAA.tim.one@home.com>
Message-ID: <3A3F9C16.562F9D9F@tismer.com>

This is a multi-part message in MIME format.
--------------F2E36624A7D999AC873AD6CE
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Again...

Tim Peters wrote:
> 
> Sounds good to me!  It's a very cheap way to get the high bits into play.
...
> [Christian]
> > - The bits used from the string hash are not well distributed
> > - using a "warmup wheel" on the hash to suck all bits in
> >   gives the same quality of hashes like random numbers.
> 
> See above and be very cautious:  none of Python's hash functions produce
> well-distributed bits, and-- in effect --that's why Python dicts often
> perform "better than random" on common data.  Even what you've done so far
> appears to provide marginally worse statistics for Guido's favorite kind of
> test case ("worse" in two senses:  total number of collisions (a measure of
> amortized lookup cost), and maximum collision chain length (a measure of
> worst-case lookup cost)):
> 
>    d = {}
>    for i in range(N):
>        d[repr(i)] = i

I will look into this.

> check-in-one-thing-then-let-it-simmer-ly y'rs  - tim

Are you saying I should check the thing in? Really?


In another reply to this message I was saying
"""
This is why I think to be even more conservative:
Try to use a division wheel, but with the inverses
of the original primitive roots, just in order to
get at Guido's results :-)
"""

This was a religious desire, but such an inverse cannot exist.
Well, all inverses exist, but it is an error to think
that they can produce similar bit patterns. Changing the
root means changing the whole system, since we have just
a *representation* of a group, via polynomial coefficients.

A simple example which renders my thought useless is this:
There is no general pattern that can turn a physical right
shift into a left shift, for all bit combinations.

Anyway, how can I produce a nearly complete scheme like today
with the same "cheaper than random" properties?

Ok, we have to stick with the given polynomials to stay
compatible, and we also have to shift left.
How do we then rotate the random bits in?
Well, we can in fact do a rotation of the whole index, moving
the highest bit into the lowest.
Too bad that this isn't supported in C. It is a native
machine instruction on X86 machines.

We would then have:

                incr = ROTATE_LEFT(incr, 1)
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
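Spelled out in portable code (a Python sketch; C compilers usually
reduce the shift/or pair to a single ROL instruction on x86):

```python
def rotate_left32(x, n=1):
    # 32-bit rotate: bits shifted out at the top re-enter at the bottom
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

assert rotate_left32(0x80000000) == 1
assert rotate_left32(1) == 2
```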

The effect is similar to the "old" algorithm: bits are shifted
left. Only if the hash happens to have high bits do they appear
in the modulus.
On the current "faster than random" cases, I assume that
high bits in the hash are less likely than low bits, so it is
more likely that an entry finds its good place in the dict
before bits are rotated in. Hence the "good" cases would be kept.

I did all tests again, now including maximum trip length, and
added a "rotate-left" version as well:

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293/9 new=302/7 new2=221/7 rot=278/5
trips for bad integers old=499500/999 new=13187/31 new2=999/1 rot=16754/31
trips for random integers old=360/8 new=369/8 new2=358/6 rot=356/7
trips for windows names old=230/5 new=207/7 new2=200/5 rot=225/5
N=2000
trips for strings old=1093/11 new=1109/10 new2=786/6 rot=1082/8
trips for bad integers old=0/0 new=26455/32 new2=1999/1 rot=33524/34
trips for random integers old=704/7 new=686/8 new2=685/7 rot=693/7
trips for windows names old=503/8 new=542/9 new2=564/6 rot=529/7
N=3000
trips for strings old=810/5 new=839/6 new2=609/5 rot=796/5
trips for bad integers old=0/0 new=38681/36 new2=2999/1 rot=49828/38
trips for random integers old=708/5 new=723/7 new2=724/5 rot=722/6
trips for windows names old=712/6 new=711/5 new2=691/5 rot=738/9
N=4000
trips for strings old=1850/9 new=1843/8 new2=1375/11 rot=1848/10
trips for bad integers old=0/0 new=52994/39 new2=3999/1 rot=66356/38
trips for random integers old=1395/9 new=1397/8 new2=1435/9 rot=1394/13
trips for windows names old=1449/8 new=1434/8 new2=1457/11 rot=1513/9

D:\crml_doc\platf\py>

Concerning trip length, rotate is better than old in most cases.
Random integers seem to withstand any of these procedures.
For bad integers, rot naturally takes more trips than new, since
the path to the bits is longer.

All in all I don't see more than marginal differences between
the approaches, and I tend to stick with "new", since it is
cheapest to implement.
(It does not cost anything and might instead be a little cheaper
for some compilers, since it does not reference the mask variable.)

I'd say let's do the patch   --   ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
--------------F2E36624A7D999AC873AD6CE
Content-Type: text/plain; charset=us-ascii;
 name="dictest.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="dictest.py"

## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]
polys = map(long, polys)

class NULL: pass

class Dictionary:
    dummy = "<dummy key>"
    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg
        mp.warmup = newalg==2
        mp.rotleft = newalg==3
        mp.trips = 0
        mp.tripmax = 0

    def getTrips(self):
        trips, tripmax = self.trips, self.tripmax
        self.trips = self.tripmax = 0
        return trips, tripmax

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # the shifting is worth it in the incremental case.

            ## added after posting to python-dev:
            uhash = _hash & 0xffffffffl
            if mp.warmup:
                incr = uhash
                mask2 = 0xffffffffl ^ mask
                while mask2 > mask:
                    if (incr & 1):
                        incr = incr ^ mp.ma_poly
                    incr = incr >> 1
                    mask2 = mask2>>1
                # this loop *can* be sped up by tables
                # with precomputed multiple shifts.
                # But I'm not sure if it is worth it at all.
            else:
                incr = uhash ^ (uhash >> 3)

###### TO HERE
            
        if (not incr):
            incr = mask

        triplen = 0            
        while 1:
            mp.trips = mp.trips+1
            triplen = triplen+1
            if triplen > mp.tripmax:
                mp.tripmax = triplen
            
            ep = ep0[int((i+incr)&mask)]
            if (ep[me_key] is NULL) :
                if (freeslot is not NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                 (ep[me_hash] == _hash and
                  cmp(ep[me_key], key) == 0)) :
                return ep

            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            elif mp.rotleft:
                if incr &0x80000000L:
                    incr = (incr << 1) | 1
                else:
                    incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots

        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))      

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots

        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
##      /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = str(i) #* 5
        #s = chr(i%256) + chr(i>>8)##
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def istring_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = chr(i%256) + chr(i>>8)
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def random_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    from whrandom import randint
    import sys
    keys = []
    for i in range(n):
        keys.append(randint(0, sys.maxint-1))
    for i in keys:
        d1[i] = d2[i] = d3[i] = d4[i] = i
    return d1, d2, d3, d4

def badnum_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(n):
        bad = i << 16
        d2[bad] = d3[bad] = d4[bad] = i
        if n <= 1000: d1[bad] = i
    return d1, d2, d3, d4

def names_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    import win32con
    keys = win32con.__dict__.keys()
    if len(keys) < n:
        keys = []
    for s in keys[:n]:
        d1[s] = d2[s] = d3[s] = d4[s] = s
    return d1, d2, d3, d4

def do_test(dict):
    keys = dict.keys()
    dict.getTrips() # reset
    test(keys, dict)
    return "%d/%d" % dict.getTrips()

EXTREME=1

if __name__ == "__main__":

    for N in (1000,2000,3000,4000):    

        sdold, sdnew, sdnew2, sdrot = string_dicts(N)
        #idold, idnew, idnew2, idrot = istring_dicts(N)
        bdold, bdnew, bdnew2, bdrot = badnum_dicts(N)
        rdold, rdnew, rdnew2, rdrot = random_dicts(N)
        ndold, ndnew, ndnew2, ndrot = names_dicts(N)
        fmt = "old=%s new=%s new2=%s rot=%s"
        print "N=%d" %N        
        print ("trips for strings "+fmt) % tuple(
            map(do_test, (sdold, sdnew, sdnew2, sdrot)) )
        #print ("trips for bin strings "+fmt) % tuple(
        #    map(do_test, (idold, idnew, idnew2, idrot)) )
        print ("trips for bad integers "+fmt) % tuple(
            map(do_test, (bdold, bdnew, bdnew2, bdrot)))
        print ("trips for random integers "+fmt) % tuple(
            map(do_test, (rdold, rdnew, rdnew2, rdrot)))
        print ("trips for windows names "+fmt) % tuple(
            map(do_test, (ndold, ndnew, ndnew2, ndrot)))

"""
Results with a shift of 10 (EXTREME=0):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""

--------------F2E36624A7D999AC873AD6CE--



From just@letterror.com  Tue Dec 19 18:46:18 2000
From: just@letterror.com (Just van Rossum)
Date: Tue, 19 Dec 2000 19:46:18 +0100
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <14911.39111.710940.342986@anthem.concentric.net>
References: <l03102804b6653dd31c4e@[193.78.237.123]>
Message-ID: <l03102806b6655cbd62fa@[193.78.237.123]>

At 12:20 PM -0500 19-12-2000, Barry A. Warsaw wrote:
>I agree.  Have we agreed on the semantics and signature of builtin
>join() though?  Is it just string.join() stuck in builtins?

Yep. I'm with /F that further generalization can be done later. Oh, does
this mean that "".join() becomes deprecated? (Nice test case for the
warning framework...)

Just




From barry@digicool.com  Tue Dec 19 18:56:45 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 13:56:45 -0500
Subject: [Python-Dev] Death to string functions!
References: <l03102804b6653dd31c4e@[193.78.237.123]>
 <l03102806b6655cbd62fa@[193.78.237.123]>
Message-ID: <14911.44909.414520.788073@anthem.concentric.net>

>>>>> "JvR" == Just van Rossum <just@letterror.com> writes:

    JvR> Oh, does this mean that "".join() becomes deprecated?

Please, no.



From guido@python.org  Tue Dec 19 18:56:39 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 13:56:39 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Tue, 19 Dec 2000 13:56:45 EST."
 <14911.44909.414520.788073@anthem.concentric.net>
References: <l03102804b6653dd31c4e@[193.78.237.123]> <l03102806b6655cbd62fa@[193.78.237.123]>
 <14911.44909.414520.788073@anthem.concentric.net>
Message-ID: <200012191856.NAA30524@cj20424-a.reston1.va.home.com>

> >>>>> "JvR" == Just van Rossum <just@letterror.com> writes:
> 
>     JvR> Oh, does this mean that "".join() becomes deprecated?
> 
> Please, no.

No.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From just@letterror.com  Tue Dec 19 19:15:19 2000
From: just@letterror.com (Just van Rossum)
Date: Tue, 19 Dec 2000 20:15:19 +0100
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <14911.44909.414520.788073@anthem.concentric.net>
References: <l03102804b6653dd31c4e@[193.78.237.123]>
 <l03102806b6655cbd62fa@[193.78.237.123]>
Message-ID: <l03102808b665629fc5bf@[193.78.237.123]>

At 1:56 PM -0500 19-12-2000, Barry A. Warsaw wrote:
>>>>>> "JvR" == Just van Rossum <just@letterror.com> writes:
>
>    JvR> Oh, does this mean that "".join() becomes deprecated?
>
>Please, no.

And keep two non-deprecated ways to do the same thing? I'm not saying it
should be removed, just that the powers that be declare that _one_ of them
is the preferred way.

And-if-that-one-isn't-builtin-join()-I-don't-know-why-to-even-bother y'rs
-- Just




From greg@cosc.canterbury.ac.nz  Tue Dec 19 22:35:05 2000
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 20 Dec 2000 11:35:05 +1300 (NZDT)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012191537.KAA28909@cj20424-a.reston1.va.home.com>
Message-ID: <200012192235.LAA02763@s454.cosc.canterbury.ac.nz>

Guido:

> Boy, are you stirring up a can of worms that we've been through many
> times before!  Nothing you say hasn't been said at least a hundred
> times before, on this list as well as on c.l.py.

And I'll wager you'll continue to hear them said at regular intervals
for a long time to come, because you've done something which a lot of
people feel very strongly was a mistake, and they have some very
rational arguments as to why it was a mistake, whereas you don't seem
to have any arguments to the contrary which those people are likely to
find convincing.

> There really seem to be only two possibilities that don't have this
> problem: (1) make it a built-in, or (2) make it a method on strings.

False dichotomy. Some other possibilities:

(3) Use an operator.

(4) Leave it in the string module! Really, I don't see what
would be so bad about that. You still need somewhere to put
all the string-related constants, so why not keep the string
module for those, plus the few functions that don't have
any other obvious place?

> If " ".join(L) bugs you, try this:
>
>    space = " "	 # This could be a global
>     .
>     .
>     .
>    s = space.join(L)

Surely you must realise that this completely fails to
address Mr. Petrilli's concern?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From akuchlin@mems-exchange.org  Wed Dec 20 14:40:58 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 20 Dec 2000 09:40:58 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <E148ZW1-0000xd-00@usw-sf-web3.sourceforge.net>; from noreply@sourceforge.net on Tue, Dec 19, 2000 at 07:02:05PM -0800
References: <E148ZW1-0000xd-00@usw-sf-web3.sourceforge.net>
Message-ID: <20001220094058.A17623@kronos.cnri.reston.va.us>

On Tue, Dec 19, 2000 at 07:02:05PM -0800, noreply@sourceforge.net wrote:
>Date: 2000-Dec-19 19:02
>By: tim_one

>Unrelated to your patch but in the same area: the other msg, "ord()
>expected string or Unicode character", doesn't read right.  The type
>names in question are "string" and "unicode":
>
>>>> type("")
><type 'string'>
>>>> type(u"")
><type 'unicode'>
>>>>
>
>"character" is out of place, or not in enough places.  Just thought I'd mention that, since *you're* so cute!

Is it OK to refer to 8-bit strings under that name?  
How about "expected an 8-bit string or Unicode string" when the object passed to ord() isn't of the right type?

Similarly, when the value is of the right type but has length>1,
the message is "ord() expected a character, length-%d string found".  
Should that be "length-%d (string / unicode) found"?

And should the type names be changed to '8-bit string'/'Unicode
string', maybe?

--amk


From barry@digicool.com  Wed Dec 20 15:39:30 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Wed, 20 Dec 2000 10:39:30 -0500
Subject: [Python-Dev] IGNORE - this is only a test
Message-ID: <14912.53938.280864.596141@anthem.concentric.net>

Testing the new MX for python.org...


From fdrake@acm.org  Wed Dec 20 16:57:09 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 20 Dec 2000 11:57:09 -0500 (EST)
Subject: [Python-Dev] scp with SourceForge
Message-ID: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>

  I've not been able to get scp to work with SourceForge since they
upgraded their machines.  ssh works fine.  Is this related to the
protocol mismatch problem that was discussed earlier?  My ssh tells me
"SSH Version OpenSSH-1.2.2, protocol version 1.5.", and the remote
sshd is sending its version as "Remote protocol version 1.99, remote
software version OpenSSH_2.2.0p1".
  Was there a reasonable way to deal with this?  I'm running
Linux-Mandrake 7.1 with very little customization or extra stuff.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From tismer@tismer.com  Wed Dec 20 16:31:00 2000
From: tismer@tismer.com (Christian Tismer)
Date: Wed, 20 Dec 2000 18:31:00 +0200
Subject: [Python-Dev] Re: The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCGEHJIEAA.tim.one@home.com> <3A3F9C16.562F9D9F@tismer.com>
Message-ID: <3A40DEC4.5F659E8E@tismer.com>


Christian Tismer wrote:
...
When talking about left rotation, an error crept in. Sorry!

> We would then have:
> 
>                 incr = ROTATE_LEFT(incr, 1)
>                 if (incr > mask):
>                     incr = incr ^ mp.ma_poly

If incr contains the high bits of the hash, then the
above must be replaced by

                incr = ROTATE_LEFT(incr, 1)
                if (incr & (mask+1)):
                    incr = incr ^ mp.ma_poly

or the multiplicative group is not guaranteed to be
generated, obviously.
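In Python terms the corrected update is a multiply-by-x step in GF(2**nbits). A minimal sketch (names are illustrative, not the actual dictobject.c identifiers; it assumes poly includes its x**nbits bit, so the XOR immediately clears the overflow and a plain left shift can stand in for the rotate):

```python
def next_incr(incr, poly, nbits):
    # One probe step: multiply incr by x in GF(2**nbits).
    # Here 1 << nbits == mask + 1, so this is the test quoted above.
    incr <<= 1
    if incr & (1 << nbits):
        incr ^= poly          # reduce modulo the polynomial
    return incr
```

With the degree-3 polynomial x**3 + x + 1 (0b1011), starting from 1 this cycles through all seven nonzero values before returning to 1, which is exactly the "multiplicative group is generated" property at stake.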

This doesn't change my results, rotating right is still my choice.

ciao - chris

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293/9 new=302/7 new2=221/7 rot=272/8
trips for bad integers old=499500/999 new=13187/31 new2=999/1 rot=16982/27
trips for random integers old=339/9 new=337/7 new2=343/10 rot=342/8
trips for windows names old=230/5 new=207/7 new2=200/5 rot=225/6
N=2000
trips for strings old=1093/11 new=1109/10 new2=786/6 rot=1090/9
trips for bad integers old=0/0 new=26455/32 new2=1999/1 rot=33985/31
trips for random integers old=747/10 new=733/7 new2=734/7 rot=728/8
trips for windows names old=503/8 new=542/9 new2=564/6 rot=521/11
N=3000
trips for strings old=810/5 new=839/6 new2=609/5 rot=820/6
trips for bad integers old=0/0 new=38681/36 new2=2999/1 rot=50985/26
trips for random integers old=709/4 new=728/5 new2=767/5 rot=711/6
trips for windows names old=712/6 new=711/5 new2=691/5 rot=727/7
N=4000
trips for strings old=1850/9 new=1843/8 new2=1375/11 rot=1861/9
trips for bad integers old=0/0 new=52994/39 new2=3999/1 rot=67986/26
trips for random integers old=1584/9 new=1606/8 new2=1505/9 rot=1579/8
trips for windows names old=1449/8 new=1434/8 new2=1457/11 rot=1476/7
-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From tim.one@home.com  Wed Dec 20 19:52:40 2000
From: tim.one@home.com (Tim Peters)
Date: Wed, 20 Dec 2000 14:52:40 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>

[Fred L. Drake, Jr.]
>   I've not been able to get scp to work with SourceForge since they
> upgraded their machines.  ssh works fine.

Same here.  In particular, I can use ssh to log in to shell.sourceforge.net,
but attempts to scp there act like this (breaking long lines by hand with
\n\t):

> scp -v pep-0042.html
	tim_one@shell.sourceforge.net:/home/groups/python/htdocs/peps
Executing: host shell.sourceforge.net, user tim_one,
	command scp -v -t /home/groups/python/htdocs/peps
SSH Version 1.2.14 [winnt-4.0-x86], protocol version 1.4.
Standard version.  Does not use RSAREF.
ssh_connect: getuid 0 geteuid 0 anon 0
Connecting to shell.sourceforge.net [216.136.171.201] port 22.
Connection established.
Remote protocol version 1.99, remote software version OpenSSH_2.2.0p1
Waiting for server public key.
Received server public key (768 bits) and host key (1024 bits).
Host 'shell.sourceforge.net' is known and matches the host key.
Initializing random; seed file C:\Code/.ssh/random_seed
IDEA not supported, using 3des instead.
Encryption type: 3des
Sent encrypted session key.
Received encrypted confirmation.
Trying RSA authentication with key 'sourceforge'
Server refused our key.
Doing password authentication.
Password:  **** here tim enteredth his password ****
Sending command: scp -v -t /home/groups/python/htdocs/peps
Entering interactive session.

And there it sits forever.  Several others report the same symptom on SF
forums, and assorted unresolved SF Support and Bug reports.  We don't know
what your symptom is!

> Is this related to the protocol mismatch problem that was discussed
> earlier?

Doubt it.  Most commentators pin the blame elsewhere.

> ...
>   Was there a reasonable way to deal with this?

A new note was added to

http://sourceforge.net/support/?func=detailsupport&support_id=110235&group_i
d=1

today, including:

"""
Re: Shell server

We're also aware of the number of problems on the shell server with respect
to restricitive permissions on some programs - and sourcing of shell
environments.  We're also aware of the troubles with scp and transferring
files.  As a work around, we recommend either editing files on the shell
server, or scping files to the shell server from external hosts to the shell
server, whilst logged in to the shell server.
"""

So there you go:  scp files to the shell server from external hosts to the
shell server whilst logged in to the shell server <wink>.

Is scp working for *anyone*???



From fdrake@acm.org  Wed Dec 20 20:17:58 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 20 Dec 2000 15:17:58 -0500 (EST)
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>
References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>
Message-ID: <14913.5110.271684.107030@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > And there it sits forever.  Several others report the same symptom on SF
 > forums, and assorted unresolved SF Support and Bug reports.  We don't know
 > what your symptom is!

  Exactly the same.

 > So there you go:  scp files to the shell server from external hosts to the
 > shell server whilst logged in to the shell server <wink>.

  Yeah, that really helps.... NOT!  All I want to be able to do is
post a new development version of the documentation.  ;-(


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From bckfnn@worldonline.dk  Wed Dec 20 20:23:33 2000
From: bckfnn@worldonline.dk (Finn Bock)
Date: Wed, 20 Dec 2000 20:23:33 GMT
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
Message-ID: <3a411449.5247545@smtp.worldonline.dk>

[Fred L. Drake]

>  I've not been able to get scp to work with SourceForge since they
>upgraded their machines.  ssh works fine.  Is this related to the
>protocol mismatch problem that was discussed earlier?  My ssh tells me
>"SSH Version OpenSSH-1.2.2, protocol version 1.5.", and the remote
>sshd is sending its version as "Remote protocol version 1.99, remote
>software version OpenSSH_2.2.0p1".
>  Was there a reasonable way to deal with this?  I'm running
>Linux-Mandrake 7.1 with very little customization or extra stuff.

I managed to update the jython website by logging into the shell machine
by ssh and doing a ftp back to my machine (using the IP number). That
isn't exactly reasonable, but I was desperate.

regards,
finn


From tim.one@home.com  Wed Dec 20 20:42:11 2000
From: tim.one@home.com (Tim Peters)
Date: Wed, 20 Dec 2000 15:42:11 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14913.5110.271684.107030@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENBIEAA.tim.one@home.com>

[Tim]
> So there you go:  scp files to the shell server from external
> hosts to the shell server whilst logged in to the shell server <wink>.

[Fred]
>   Yeah, that really helps.... NOT!  All I want to be able to do is
> post a new development version of the documentation.  ;-(

All I want to do is make a measly change to a PEP -- I'm afraid it doesn't
ask how trivial your intents are.  If some suck^H^H^H^Hdeveloper admits that
scp works for them, maybe we can mail them stuff and have *them* copy it
over.

no-takers-so-far-though-ly y'rs  - tim



From barry@digicool.com  Wed Dec 20 20:49:00 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Wed, 20 Dec 2000 15:49:00 -0500
Subject: [Python-Dev] scp with SourceForge
References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>
Message-ID: <14913.6972.934625.840781@anthem.concentric.net>

>>>>> "TP" == Tim Peters <tim.one@home.com> writes:

    TP> So there you go: scp files to the shell server from external
    TP> hosts to the shell server whilst logged in to the shell server
    TP> <wink>.

Psheesh, /that/ was obvious.  Did you even have to ask?

    TP> Is scp working for *anyone*???

Nope, same thing happens to me; it just hangs.
-Barry


From tim.one@home.com  Wed Dec 20 20:53:38 2000
From: tim.one@home.com (Tim Peters)
Date: Wed, 20 Dec 2000 15:53:38 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14913.6972.934625.840781@anthem.concentric.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKENEIEAA.tim.one@home.com>

[Tim, quoting a bit of immortal SF support prose]
>     TP> So there you go: scp files to the shell server from external
>     TP> hosts to the shell server whilst logged in to the shell server
>     TP> <wink>.

[Barry]
> Psheesh, /that/ was obvious.  Did you even have to ask?

Actually, isn't this easy to do on Linux?  That is, run an ssh server
(whatever) on your home machine, log in to the SF shell (which everyone
seems able to do), then

   scp  whatever  your_home_IP_address:your_home_path

from the SF shell?  Heck, I can even get that to work on Windows, except I
don't know how to set up anything on my end to accept the connection <wink>.

>     TP> Is scp working for *anyone*???

> Nope, same thing happens to me; it just hangs.

That's good to know -- since nobody else mentioned this, Fred probably
figured he was unique.

not-that-he-isn't-it's-just-that-he's-not-ly y'rs  - tim



From fdrake@acm.org  Wed Dec 20 20:52:10 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 20 Dec 2000 15:52:10 -0500 (EST)
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKENEIEAA.tim.one@home.com>
References: <14913.6972.934625.840781@anthem.concentric.net>
 <LNBBLJKPBEHFEDALKOLCKENEIEAA.tim.one@home.com>
Message-ID: <14913.7162.824838.63143@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > Actually, isn't this easy to do on Linux?  That is, run an ssh server
 > (whatever) on your home machine, log in to the SF shell (which everyone
 > seems able to do), then
 > 
 >    scp  whatever  your_home_IP_address:your_home_path
 > 
 > from the SF shell?  Heck, I can even get that to work on Windows, except I
 > don't know how to set up anything on my end to accept the connection <wink>.

  Err, yes, that's easy to do, but... that means putting your private
key on SourceForge.  They're a great bunch of guys, but they can't
have my private key!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From tim.one@home.com  Wed Dec 20 21:06:07 2000
From: tim.one@home.com (Tim Peters)
Date: Wed, 20 Dec 2000 16:06:07 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14913.7162.824838.63143@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENGIEAA.tim.one@home.com>

[Fred]
>   Err, yes, that's easy to do, but... that means putting your private
> key on SourceForge.  They're a great bunch of guys, but they can't
> have my private key!

So generate a unique one-shot key pair for the life of the copy.  I can do
that for you on Windows if you lack a real OS <snort>.



From thomas@xs4all.net  Wed Dec 20 22:59:49 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 20 Dec 2000 23:59:49 +0100
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>; from tim.one@home.com on Wed, Dec 20, 2000 at 02:52:40PM -0500
References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>
Message-ID: <20001220235949.F29681@xs4all.nl>

On Wed, Dec 20, 2000 at 02:52:40PM -0500, Tim Peters wrote:

> So there you go:  scp files to the shell server from external hosts to the
> shell server whilst logged in to the shell server <wink>.

> Is scp working for *anyone*???

Not for me, anyway. And I'm not just saying that to avoid scp-duty :) And
I'm using the same ssh version, which works fine on all other machines. It
probably has to do with the funky setup Sourceforge uses. (Try looking at
'df' and 'cat /proc/mounts', and comparing the two -- you'll see what I mean
:) That also means I'm not tempted to try and reproduce it, obviously :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tim.one@home.com  Thu Dec 21 03:24:12 2000
From: tim.one@home.com (Tim Peters)
Date: Wed, 20 Dec 2000 22:24:12 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012192235.LAA02763@s454.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEOEIEAA.tim.one@home.com>

[Guido]
>> Boy, are you stirring up a can of worms that we've been through many
>> times before!  Nothing you say hasn't been said at least a hundred
>> times before, on this list as well as on c.l.py.

[Greg Ewing]
> And I'll wager you'll continue to hear them said at regular intervals
> for a long time to come, because you've done something which a lot of
> people feel very strongly was a mistake, and they have some very
> rational arguments as to why it was a mistake, whereas you don't seem
> to have any arguments to the contrary which those people are likely to
> find convincing.

Then it's a wash:  Guido doesn't find their arguments convincing either, and
ties favor the status quo even in the absence of BDFLness.

>> There really seem to be only two possibilities that don't have this
>> problem: (1) make it a built-in, or (2) make it a method on strings.

> False dichotomy. Some other possibilities:
>
> (3) Use an operator.

Oh, that's likely <wink>.

> (4) Leave it in the string module! Really, I don't see what
> would be so bad about that. You still need somewhere to put
> all the string-related constants, so why not keep the string
> module for those, plus the few functions that don't have
> any other obvious place?

Guido said he wants to deprecate the entire string module, so that Python
can eventually warn on the mere presence of "import string".  That's what he
said when I earlier ranted in favor of keeping the string module around.

My guess is that making it a builtin is the only alternative that stands any
chance at this point.

>> If " ".join(L) bugs you, try this:
>>
>>    space = " "	 # This could be a global
>>     .
>>     .
>>     .
>>    s = space.join(L)

> Surely you must realise that this completely fails to
> address Mr. Petrilli's concern?

Don't know about Guido, but I don't realize that, and we haven't heard back
from Charles.  His objections were raised the first day " ".join was
suggested, space.join was suggested almost immediately after, and that
latter suggestion did seem to pacify at least several objectors.  Don't know
whether it makes Charles happier, but since it *has* made others happier in
the past, it's not unreasonable to imagine that Charles might like it too.

if-we're-to-be-swayed-by-his-continued-outrage-afraid-it-will-
    have-to-come-from-him-ly y'rs  - tim



From tim.one@home.com  Thu Dec 21 07:44:19 2000
From: tim.one@home.com (Tim Peters)
Date: Thu, 21 Dec 2000 02:44:19 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <20001220094058.A17623@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEOMIEAA.tim.one@home.com>

[Andrew Kuchling]
> Is it OK to refer to 8-bit strings under that name?
> How about "expected an 8-bit string or Unicode string", when the
> object passed to ord() isn't of the right type.
>
> Similarly, when the value is of the right type but has length>1,
> the message is "ord() expected a character, length-%d string found".
> Should that be "length-%d (string / unicode) found)"
>
> And should the type names be changed to '8-bit string'/'Unicode
> string', maybe?

Actually, upon reflection I think it was a mistake to add all these "or
Unicode" clauses to the error msgs to begin with.  Python used to have only
one string type, we're saying that's also a hope for the future, and in the
meantime I know I'd have no trouble understanding "string" as including both
8-bit strings and Unicode strings.

So we should say "8-bit string" or "Unicode string" when *only* one of those
is allowable.  So

    "ord() expected string ..."

instead of (even a repaired version of)

    "ord() expected string or Unicode character ..."

but-i'm-not-even-motivated-enough-to-finish-this-sig-



From tim.one@home.com  Thu Dec 21 08:52:54 2000
From: tim.one@home.com (Tim Peters)
Date: Thu, 21 Dec 2000 03:52:54 -0500
Subject: [Python-Dev] RE: The Dictionary Gem is polished!
In-Reply-To: <3A3F9C16.562F9D9F@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPAIEAA.tim.one@home.com>

[Christian Tismer]
> Are you saying I should check the thing in? Really?

Of course.  The first thing you talked about showed a major improvement in
some bad cases, did no harm in the others, and both results were more than
just plausible -- they made compelling sense and were backed by simulation.
So why not check it in?  It's a clear net win!

Stuff since then has been a spattering of maybe-good maybe-bad maybe-neutral
ideas that hasn't gotten anywhere conclusive.  What I want to avoid is
another "Unicode compression" scenario, where we avoid grabbing a clear win
for months just because it may not be the best possible of all conceivable
compression schemes -- and then mistakes get made in a last-second rush to
get *any* improvement.

Checking in a clear improvement today does not preclude checking in a better
one next week <wink>.

> ...
> Ok, we have to stick with the given polynomials to stay
> compatible,

Na, feel free to explore that too, if you like.  It really should get some
study!  The polys there now are utterly arbitrary:  of all polys that happen
to be irreducible and that have x as a primitive root in the induced
multiplicative group, these are simply the smallest when viewed as binary
integers.  That's because they were *found* by trying all odd binary ints
with odd parity (even ints and ints with even parity necessarily correspond
to reducible polys), starting with 2**N+3 and going up until finding the
first one that was both irreducible and had x as a primitive root.  There's
no theory at all that I know of to say that any such poly is any better for
this purpose than any other.  And they weren't tested for that either --
they're simply the first ones "that worked at all" in a brute search.
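That brute search is easy to reconstruct. In this sketch (my reconstruction, not the script actually used), x generates the full multiplicative group iff repeated multiplication by x modulo the candidate first returns to 1 after exactly 2**nbits - 1 steps; a full period also rules out reducible candidates, so one test covers both conditions:

```python
def order_of_x(poly, nbits):
    # Multiply by x modulo 'poly' until we return to 1; the step
    # count is the multiplicative order of x.  Returns 0 if the
    # cycle never reaches 1 (x is not invertible mod poly).
    v = 1
    for i in range(1, 2 ** nbits):
        v <<= 1
        if v & (1 << nbits):
            v ^= poly
        if v == 1:
            return i
    return 0

def find_poly(nbits):
    # Odd candidates upward from 2**nbits + 3, as described above;
    # even-parity and reducible candidates simply fail the test.
    full = 2 ** nbits - 1
    poly = (1 << nbits) + 3
    while order_of_x(poly, nbits) != full:
        poly += 2
    return poly
```

For instance find_poly(3) gives 11 (x**3 + x + 1) and find_poly(5) gives 37 (x**5 + x**2 + 1), both classic primitive polynomials; the first candidate 2**5 + 3 = 35 is skipped because x**5 + x + 1 factors.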

Besides, Python's "better than random" dict behavior-- when it obtains! --is
almost entirely due to that its hash functions produce distinct starting
indices more often than a random hash function would.  The contribution of
the GF-based probe sequence in case of collision is to avoid the terrible
behavior most other forms of probe sequence would cause given that Python's
hash functions also tend to fill solid contiguous slices of the table more
often than would a random hash function.

[stuff about rotation]
> ...
> Too bad that this isn't supported in C. It is a native
> machine instruction on X86 machines.

Guido long ago rejected hash functions based on rotation for this reason;
he's not likely to approve of rotations more in the probe sequence <wink>.

A similar frustration is that almost all modern CPUs have a fast instruction to
get at the high 32 bits of a 32x32->64 bit multiply:  another way to get the
high bits of the hash code into play is to multiply the 32-bit hash code by
a 32-bit constant (see Knuth for "Fibonacci hashing" details), and take the
least-significant N bits of the *upper* 32 bits of the 64-bit product as the
initial table index.  If the constant is chosen correctly, this defines a
permutation on the space of 32-bit unsigned ints, and can be very effective
at "scrambling" arithmetic progressions (which Python's hash functions often
produce).  But C doesn't give a decent way to get at that either.
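Python's unbounded ints at least make the trick easy to sketch (2654435769 is the usual Knuth multiplier, floor(2**32 / phi); this is an illustration, not interpreter code):

```python
KNUTH = 2654435769   # 0x9E3779B9; odd, so h -> (h * KNUTH) % 2**32
                     # permutes the 32-bit unsigned ints

def fib_index(h, nbits):
    # Full 64-bit product of two 32-bit values, then the low
    # nbits of the product's upper 32 bits, as described above.
    prod = (h & 0xFFFFFFFF) * KNUTH
    return (prod >> 32) & ((1 << nbits) - 1)
```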

> ...
> On the current "faster than random" cases, I assume that
> high bits in the hash are less likely than low bits,

I'm not sure what this means.  As the comment in dictobject.c says, it's
common for Python's hash functions to return a result with lots of leading
zeroes.  But the lookup currently applies ~ to those first (which is a bad
idea -- see earlier msgs), so the actual hash that gets *used* often has
lots of leading ones.

> so it is more likely that an entry finds its good place in the dict,
> before bits are rotated in. hence the "good" cases would be kept.

I can agree with this easily if I read the above as asserting that in the
very good cases today, the low bits of hashes (whether or not ~ is applied)
vary more than the high bits.

> ...
> Random integers seem to withstand any of these procedures.

If you wanted to, you could *define* random this way <wink>.

> ...
> I'd say let's do the patch   --   ciao - chris

full-circle-ly y'rs  - tim



From mal@lemburg.com  Thu Dec 21 11:16:27 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 21 Dec 2000 12:16:27 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
References: <LNBBLJKPBEHFEDALKOLCKEOMIEAA.tim.one@home.com>
Message-ID: <3A41E68B.6B12CD71@lemburg.com>

Tim Peters wrote:
> 
> [Andrew Kuchling]
> > Is it OK to refer to 8-bit strings under that name?
> > How about "expected an 8-bit string or Unicode string", when the
> > object passed to ord() isn't of the right type.
> >
> > Similarly, when the value is of the right type but has length>1,
> > the message is "ord() expected a character, length-%d string found".
> > Should that be "length-%d (string / unicode) found)"
> >
> > And should the type names be changed to '8-bit string'/'Unicode
> > string', maybe?
> 
> Actually, upon reflection I think it was a mistake to add all these "or
> Unicode" clauses to the error msgs to begin with.  Python used to have only
> one string type, we're saying that's also a hope for the future, and in the
> meantime I know I'd have no trouble understanding "string" as including both
> 8-bit strings and Unicode strings.
> 
> So we should say "8-bit string" or "Unicode string" when *only* one of those
> is allowable.  So
> 
>     "ord() expected string ..."
> 
> instead of (even a repaired version of)
> 
>     "ord() expected string or Unicode character ..."

I think this has to do with understanding that there are two
string types in Python 2.0 -- a novice won't notice this until
she sees the error message.

My understanding is similar to yours, "string" should mean
"any string object" and in cases where the difference between
8-bit string and Unicode matters, these should be referred to
as "8-bit string" and "Unicode string".

Still, I think it is a good idea to make people aware of the
possibility of passing Unicode objects to these functions, so
perhaps the idea of adding both possibilities to error messages
is not such a bad idea for 2.1. The next phases would be converting
all messages back to "string" and then convert all strings to
Unicode ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From akuchlin@mems-exchange.org  Thu Dec 21 18:37:19 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 21 Dec 2000 13:37:19 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEOMIEAA.tim.one@home.com>; from tim.one@home.com on Thu, Dec 21, 2000 at 02:44:19AM -0500
References: <20001220094058.A17623@kronos.cnri.reston.va.us> <LNBBLJKPBEHFEDALKOLCKEOMIEAA.tim.one@home.com>
Message-ID: <20001221133719.B11880@kronos.cnri.reston.va.us>

On Thu, Dec 21, 2000 at 02:44:19AM -0500, Tim Peters wrote:
>So we should say "8-bit string" or "Unicode string" when *only* one of those
>is allowable.  So

OK... how about this patch?

Index: bltinmodule.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v
retrieving revision 2.185
diff -u -r2.185 bltinmodule.c
--- bltinmodule.c	2000/12/20 15:07:34	2.185
+++ bltinmodule.c	2000/12/21 18:36:54
@@ -1524,13 +1524,14 @@
 		}
 	} else {
 		PyErr_Format(PyExc_TypeError,
-			     "ord() expected string or Unicode character, " \
+			     "ord() expected string of length 1, but " \
 			     "%.200s found", obj->ob_type->tp_name);
 		return NULL;
 	}
 
 	PyErr_Format(PyExc_TypeError, 
-		     "ord() expected a character, length-%d string found",
+		     "ord() expected a character, "
+                     "but string of length %d found",
 		     size);
 	return NULL;
 }
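For anyone who wants to see the two branches this diff touches, here is a
small sketch from the Python side (the exact message wording is
version-dependent, so it only checks which calls raise; the helper name is
made up for illustration):

```python
# Sketch: exercising ord()'s two error paths discussed in the patch above.
# The precise TypeError wording varies across Python versions; these calls
# only demonstrate which inputs trigger which branch.
def describe_ord_error(obj):
    """Call ord() and return the TypeError message, or None on success."""
    try:
        ord(obj)
        return None
    except TypeError as exc:
        return str(exc)

print(describe_ord_error({}))    # non-string argument -> first branch
print(describe_ord_error("ab"))  # wrong-length string -> second branch
print(describe_ord_error("a"))   # valid one-character string: no error
```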


From thomas@xs4all.net  Fri Dec 22 15:21:43 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 22 Dec 2000 16:21:43 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>; from noreply@sourceforge.net on Fri, Dec 22, 2000 at 07:07:03AM -0800
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
Message-ID: <20001222162143.A5515@xs4all.nl>

On Fri, Dec 22, 2000 at 07:07:03AM -0800, noreply@sourceforge.net wrote:

>   * Guido-style:  8-column hard-tab indents.
>   * New style:  4-column space-only indents.

Hm, I must have missed this... Is 'new style' the preferred style, as its
name suggests, or is Guido mounting a rebellion to adhere to the One True
Style (or rather his own version of it, which just has the * in pointer
type declarations wrong ? :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From fdrake@acm.org  Fri Dec 22 15:31:21 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 10:31:21 -0500 (EST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <20001222162143.A5515@xs4all.nl>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
 <20001222162143.A5515@xs4all.nl>
Message-ID: <14915.29641.806901.661707@cj42289-a.reston1.va.home.com>

Thomas Wouters writes:
 > Hm, I must have missed this... Is 'new style' the preferred style, as its
 > name suggests, or is Guido mounting a rebellion to adhere to the One True
 > Style (or rather his own version of it, which just has the * in pointer
 > type declarations wrong ? :)

  Guido has grudgingly granted that new code in the "New style" is
acceptable, mostly because many people complain that "Guido style"
causes too much code to get scrunched up on the right margin.  The
"New style" is more like the recommendations for Python code as well,
so it's easier for Python programmers to read (Tabs are hard to read
clearly! ;).


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From cgw@fnal.gov  Fri Dec 22 15:43:45 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Fri, 22 Dec 2000 09:43:45 -0600 (CST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add
 panel support
In-Reply-To: <14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
 <20001222162143.A5515@xs4all.nl>
 <14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
Message-ID: <14915.30385.201343.360880@buffalo.fnal.gov>

Fred L. Drake, Jr. writes:
 > 
 >   Guido has grudgingly granted that new code in the "New style" is
 > acceptable, mostly because many people complain that "Guido style"
 > causes too much code to get scrunched up on the right margin.

I am reminded of Linus Torvalds comments on this subject (see
/usr/src/linux/Documentation/CodingStyle):

  Now, some people will claim that having 8-character indentations
  makes the code move too far to the right, and makes it hard to read
  on a 80-character terminal screen.  The answer to that is that if
  you need more than 3 levels of indentation, you're screwed anyway,
  and should fix your program.



From fdrake@acm.org  Fri Dec 22 15:58:56 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 10:58:56 -0500 (EST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add
 panel support
In-Reply-To: <14915.30385.201343.360880@buffalo.fnal.gov>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
 <20001222162143.A5515@xs4all.nl>
 <14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
 <14915.30385.201343.360880@buffalo.fnal.gov>
Message-ID: <14915.31296.56181.260479@cj42289-a.reston1.va.home.com>

Charles G Waldman writes:
 > I am reminded of Linus Torvalds comments on this subject (see
 > /usr/src/linux/Documentation/CodingStyle):

  The catch, of course, is Python/ceval.c, where breaking it up can
hurt performance.  People scream when you do things like that....


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From cgw@fnal.gov  Fri Dec 22 16:07:47 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Fri, 22 Dec 2000 10:07:47 -0600 (CST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add
 panel support
In-Reply-To: <14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
 <20001222162143.A5515@xs4all.nl>
 <14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
 <14915.30385.201343.360880@buffalo.fnal.gov>
 <14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
Message-ID: <14915.31827.250987.283364@buffalo.fnal.gov>

Fred L. Drake, Jr. writes:
 > 
 >   The catch, of course, is Python/cevel.c, where breaking it up can
 > hurt performance.  People scream when you do things like that....

Quoting again from the same source:

 Use helper functions with descriptive names (you can ask the compiler
 to in-line them if you think it's performance-critical, and it will
 probably do a better job of it than you would have done).

But I should have pointed out that I was quoting the great Linus
mostly for entertainment/cultural value, and was not really trying to
add fuel to the fire.  In other words, a message that I thought was
amusing, but probably shouldn't have sent ;-)


From fdrake@acm.org  Fri Dec 22 16:20:52 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 11:20:52 -0500 (EST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add
 panel support
In-Reply-To: <14915.31827.250987.283364@buffalo.fnal.gov>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
 <20001222162143.A5515@xs4all.nl>
 <14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
 <14915.30385.201343.360880@buffalo.fnal.gov>
 <14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
 <14915.31827.250987.283364@buffalo.fnal.gov>
Message-ID: <14915.32612.252115.562296@cj42289-a.reston1.va.home.com>

Charles G Waldman writes:
 > But I should have pointed out that I was quoting the great Linus
 > mostly for entertainment/cultural value, and was not really trying to
 > add fuel to the fire.  In other words, a message that I thought was
 > amusing, but probably shouldn't have sent ;-)

  I understood the intent; I think he's really got a point.  There are
a few places in Python where it would really help to break things up!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From fredrik@effbot.org  Fri Dec 22 16:33:37 2000
From: fredrik@effbot.org (Fredrik Lundh)
Date: Fri, 22 Dec 2000 17:33:37 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net><20001222162143.A5515@xs4all.nl><14915.29641.806901.661707@cj42289-a.reston1.va.home.com><14915.30385.201343.360880@buffalo.fnal.gov><14915.31296.56181.260479@cj42289-a.reston1.va.home.com><14915.31827.250987.283364@buffalo.fnal.gov> <14915.32612.252115.562296@cj42289-a.reston1.va.home.com>
Message-ID: <004b01c06c34$f08151c0$e46940d5@hagrid>

Fred wrote:
>   I understood the intent; I think he's really got a point.  There are
> a few places in Python where it would really help to break things up!

if that's what you want, maybe you could start by
putting the INLINE stuff back again? <halfwink>

(if C/C++ compatibility is a problem, put it inside a
cplusplus ifdef, and mark it as "for internal use only.
don't use inline on public interfaces")

</F>



From fdrake@acm.org  Fri Dec 22 16:36:15 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 11:36:15 -0500 (EST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <004b01c06c34$f08151c0$e46940d5@hagrid>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
 <20001222162143.A5515@xs4all.nl>
 <14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
 <14915.30385.201343.360880@buffalo.fnal.gov>
 <14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
 <14915.31827.250987.283364@buffalo.fnal.gov>
 <14915.32612.252115.562296@cj42289-a.reston1.va.home.com>
 <004b01c06c34$f08151c0$e46940d5@hagrid>
Message-ID: <14915.33535.520957.215310@cj42289-a.reston1.va.home.com>

Fredrik Lundh writes:
 > if that's what you want, maybe you could start by
 > putting the INLINE stuff back again? <halfwink>

  I could not see the value in the inline stuff that configure was
setting up, and still don't.

 > (if C/C++ compatibility is a problem, put it inside a
 > cplusplus ifdef, and mark it as "for internal use only.
 > don't use inline on public interfaces")

  We should be able to come up with something reasonable, but I don't
have time right now, and my head isn't currently wrapped around C
compilers.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From akuchlin@mems-exchange.org  Fri Dec 22 18:01:43 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 22 Dec 2000 13:01:43 -0500
Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>; from noreply@sourceforge.net on Fri, Dec 22, 2000 at 07:07:03AM -0800
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
Message-ID: <20001222130143.B7127@newcnri.cnri.reston.va.us>

On Fri, Dec 22, 2000 at 07:07:03AM -0800, noreply@sourceforge.net wrote:
>  * Guido-style:  8-column hard-tab indents.
>  * New style:  4-column space-only indents.
>  * _curses style: 2 column indents.
>
>I'd prefer "New style", myself.

New style it is.  (Barry, is the "python" style in cc-mode.el going to
be changed to new style, or a "python2" style added?)

I've been wanting to reformat _cursesmodule.c to match the Python
style for some time.  Probably I'll do that a little while after the
panel module has settled down a bit.

Fred, did you look at the use of the CObject for exposing the API?
Did that look reasonable?  Also, should py_curses.h go in the Include/
subdirectory instead of Modules/?

--amk


From fredrik@effbot.org  Fri Dec 22 18:03:43 2000
From: fredrik@effbot.org (Fredrik Lundh)
Date: Fri, 22 Dec 2000 19:03:43 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net><20001222162143.A5515@xs4all.nl><14915.29641.806901.661707@cj42289-a.reston1.va.home.com><14915.30385.201343.360880@buffalo.fnal.gov><14915.31296.56181.260479@cj42289-a.reston1.va.home.com><14915.31827.250987.283364@buffalo.fnal.gov><14915.32612.252115.562296@cj42289-a.reston1.va.home.com><004b01c06c34$f08151c0$e46940d5@hagrid> <14915.33535.520957.215310@cj42289-a.reston1.va.home.com>
Message-ID: <006701c06c41$896a1a00$e46940d5@hagrid>

Fred wrote:
>  > if that's what you want, maybe you could start by
>  > putting the INLINE stuff back again? <halfwink>
> 
>   I could not see the value in the inline stuff that configure was
> setting up, and still don't.

the INLINE stuff guarantees that "inline" is defined to be
whatever directive the compiler uses for explicit inlining.
quoting the autoconf docs:

    If the C compiler supports the keyword inline,
    do nothing. Otherwise define inline to __inline__
    or __inline if it accepts one of those, otherwise
    define inline to be empty

as a result, you can always use "inline" in your code, and
have it do the right thing on all compilers that support
explicit inlining (all modern C compilers, in practice).

:::

to deal with people compiling Python with a C compiler, but
linking it with a C++ compiler, the config.h.in file could be
written as:

/* Define "inline" to be whatever the C compiler calls it.
    To avoid problems when mixing C and C++, make sure
    to only use "inline" for internal interfaces. */
#ifndef __cplusplus
#undef inline
#endif

</F>



From akuchlin@mems-exchange.org  Fri Dec 22 19:40:15 2000
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Fri, 22 Dec 2000 14:40:15 -0500
Subject: [Python-Dev] PEP 222 draft
Message-ID: <200012221940.OAA01936@207-172-57-45.s45.tnt2.ann.va.dialup.rcn.com>

I've completed a draft of PEP 222 (sort of -- note the XXX comments in
the text for things that still need to be resolved).  This is being
posted to python-dev, python-web-modules, and
python-list/comp.lang.python, to get comments on the proposed
interface.  I'm on all three lists, but would prefer to see followups
on python-list/comp.lang.python, so if you can reply there, please do
so.

--amk

Abstract

    This PEP proposes a set of enhancements to the CGI development
    facilities in the Python standard library.  Enhancements might be
    new features, new modules for tasks such as cookie support, or
    removal of obsolete code.

    The intent is to incorporate the proposals emerging from this
    document into Python 2.1, due to be released in the first half of
    2001.


Open Issues

    This section lists changes that have been suggested, but about
    which no firm decision has yet been made.  In the final version of
    this PEP, this section should be empty, as all the changes should
    be classified as accepted or rejected.

    cgi.py: We should not be told to create our own subclass just so
    we can handle file uploads. As a practical matter, I have yet to
    find the time to do this right, so I end up reading cgi.py's temp
    file into, at best, another file. Some of our legacy code actually
    reads it into a second temp file, then into a final destination!
    And even if we did, that would mean creating yet another object
    with its __init__ call and associated overhead.

    cgi.py: Currently, query data with no `=' are ignored.  Even if
    keep_blank_values is set, queries like `...?value=&...' are
    returned with blank values but queries like `...?value&...' are
    completely lost.  It would be great if such data were made
    available through the FieldStorage interface, either as entries
    with None as values, or in a separate list.
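    (For concreteness, here is the keep_blank_values half of that behavior,
    shown with the modern urllib.parse home of these functions rather than
    the cgi module of the 2.0 era; handling of bare `value&` names changed
    in later versions, so only the `value=` case is illustrated.)

```python
from urllib.parse import parse_qs  # cgi.parse_qs in the Python 2.x era

# With keep_blank_values off (the default), a name with '=' but no
# value is silently dropped from the result:
print(parse_qs('a=1&value='))                          # {'a': ['1']}

# With it on, the blank value survives as an empty string:
print(parse_qs('a=1&value=', keep_blank_values=True))  # includes 'value'
```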

    Utility function: build a query string from a list of 2-tuples

    Dictionary-related utility classes: NoKeyErrors (returns an empty
    string, never a KeyError), PartialStringSubstitution (returns 
    the original key string, never a KeyError)
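    (The two utility ideas above might look roughly like this; the names
    and signatures are hypothetical sketches written against modern Python
    for concreteness, not part of any proposed interface.)

```python
# Hypothetical sketches of the utilities suggested above; names and
# signatures are illustrative only.
from urllib.parse import quote_plus  # lived in urllib in the 2.x era

def build_query(pairs):
    """Build a query string from a list of (name, value) 2-tuples."""
    return '&'.join('%s=%s' % (quote_plus(str(k)), quote_plus(str(v)))
                    for k, v in pairs)

class NoKeyErrors(dict):
    """Dict variant returning an empty string instead of raising KeyError."""
    def __missing__(self, key):  # __missing__ is a post-2.5 mechanism
        return ''

print(build_query([('q', 'python dev'), ('page', 2)]))  # q=python+dev&page=2
d = NoKeyErrors(a='1')
print(d['missing'])  # '' rather than a KeyError
```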


    
New Modules

    This section lists details about entire new packages or modules
    that should be added to the Python standard library.

    * fcgi.py : A new module adding support for the FastCGI protocol.
      Robin Dunn's code needs to be ported to Windows, though.

Major Changes to Existing Modules

    This section lists details of major changes to existing modules,
    whether in implementation or in interface.  The changes in this
    section therefore carry greater degrees of risk, either in
    introducing bugs or a backward incompatibility.

    The cgi.py module would be deprecated.  (XXX A new module or
    package name hasn't been chosen yet: 'web'?  'cgilib'?)

Minor Changes to Existing Modules

    This section lists details of minor changes to existing modules.
    These changes should have relatively small implementations, and
    have little risk of introducing incompatibilities with previous
    versions.


Rejected Changes

    The changes listed in this section were proposed for Python 2.1,
    but were rejected as unsuitable.  For each rejected change, a
    rationale is given describing why the change was deemed
    inappropriate.

    * An HTML generation module is not part of this PEP.  Several such
      modules exist, ranging from HTMLgen's purely programming
      interface to ASP-inspired simple templating to DTML's complex
      templating.  There's no indication of which templating module to
      enshrine in the standard library, and that probably means that
      no module should be so chosen.

    * cgi.py: Allowing a combination of query data and POST data.
      This doesn't seem to be standard at all, and therefore is
      dubious practice.

Proposed Interface

    XXX open issues: naming convention (studlycaps or
    underline-separated?); need to look at the cgi.parse*() functions
    and see if they can be simplified, too.

    Parsing functions: carry over most of the parse* functions from
    cgi.py
    
    # The Response class borrows most of its methods from Zope's
    # HTTPResponse class.
    
    class Response:
        """
        Attributes:
        status: HTTP status code to return
        headers: dictionary of response headers
        body: string containing the body of the HTTP response
        """
        
        def __init__(self, status=200, headers={}, body=""):
            pass
    
        def setStatus(self, status, reason=None):
            "Set the numeric HTTP response code"
            pass
    
        def setHeader(self, name, value):
            "Set an HTTP header"
            pass
    
        def setBody(self, body):
            "Set the body of the response"
            pass
    
        def setCookie(self, name, value,
                      path = '/',  
                      comment = None, 
                      domain = None, 
                      max_age = None,
                      expires = None,
                      secure = 0
                      ):
            "Set a cookie"
            pass
    
        def expireCookie(self, name):
            "Remove a cookie from the user"
            pass
    
        def redirect(self, url):
            "Redirect the browser to another URL"
            pass
    
        def __str__(self):
            "Convert entire response to a string"
            pass
    
        def dump(self):
            "Return a string representation useful for debugging"
            pass
            
        # XXX methods for specific classes of error: serverError, badRequest, etc.?
    
    
    class Request:
    
        """
        Attributes: 

        XXX should these be dictionaries, or dictionary-like objects?
        .headers : dictionary containing HTTP headers
        .cookies : dictionary of cookies
        .fields  : data from the form
        .env     : environment dictionary
        """
        
        def __init__(self, environ=os.environ, stdin=sys.stdin,
                     keep_blank_values=1, strict_parsing=0):
            """Initialize the request object, using the provided environment
            and standard input."""
            pass
    
        # Should people just use the dictionaries directly?
        def getHeader(self, name, default=None):
            pass
    
        def getCookie(self, name, default=None):
            pass
    
        def getField(self, name, default=None):
            "Return field's value as a string (even if it's an uploaded file)"
            pass
            
        def getUploadedFile(self, name):
            """Returns a file object that can be read to obtain the contents
            of an uploaded file.  XXX should this report an error if the 
            field isn't actually an uploaded file?  Or should it wrap
            a StringIO around simple fields for consistency?
            """
            
        def getURL(self, n=0, query_string=0):
            """Return the URL of the current request, chopping off 'n' path
            components from the right.  Eg. if the URL is
            "http://foo.com/bar/baz/quux", n=2 would return
            "http://foo.com/bar".  Does not include the query string (if
            any)
            """

        def getBaseURL(self, n=0):
            """Return the base URL of the current request, adding 'n' path
            components to the end to recreate more of the whole URL.  
            
            Eg. if the request URL is
            "http://foo.com/q/bar/baz/qux", n=0 would return
            "http://foo.com/", and n=2 "http://foo.com/q/bar".
            
            Returned URL does not include the query string, if any.
            """
        
        def dump(self):
            "String representation suitable for debugging output"
            pass
    
        # Possibilities?  I don't know if these are worth doing in the 
        # basic objects.
        def getBrowser(self):
            "Returns Mozilla/IE/Lynx/Opera/whatever"
    
        def isSecure(self):
            "Return true if this is an SSLified request"
            

    # Module-level function        
    def wrapper(func, logfile=sys.stderr):
        """
        Calls the function 'func', passing it the arguments
        (request, response, logfile).  Exceptions are trapped and
        sent to the file 'logfile'.  
        """
        # This wrapper will detect if it's being called from the command-line,
        # and if so, it will run in a debugging mode; name=value pairs 
        # can be entered on standard input to set field values.
        # (XXX how to do file uploads in this syntax?)

    
Copyright
    
    This document has been placed in the public domain.



From tim.one@home.com  Fri Dec 22 19:31:07 2000
From: tim.one@home.com (Tim Peters)
Date: Fri, 22 Dec 2000 14:31:07 -0500
Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support)
In-Reply-To: <20001222162143.A5515@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECIIFAA.tim.one@home.com>

[Thomas Wouters]
>>   * Guido-style:  8-column hard-tab indents.
>>   * New style:  4-column space-only indents.
>
> Hm, I must have missed this... Is 'new style' the preferred style, as
> its name suggests, or is Guido mounting a rebellion to adhere to the
> One True Style (or rather his own version of it, which just has
> the * in pointer type declarations wrong ? :)

Every time this comes up wrt C code,

1. Fred repeats that he thinks Guido caved in (but doesn't supply a
reference to anything saying so).

2. Guido repeats that he prefers old-style (but in a wishy-washy way that
leaves it uncertain (*)).

3. Fredrik and/or I repeat a request for a BDFL Pronouncement.

4. And there the thread ends.

It's *very* hard to find this history in the Python-Dev archives because
these threads always have subject lines like this one originally had ("RE:
[Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel
support").

Fred already did the #1 bit in this thread.

You can consider this msg the repeat of #3.

Since Guido is out of town, we can skip #2 and go straight to #4 early
<wink>.


(*) Two examples of #2 from this year:

Subject: Re: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/
Modules mmapmodule.c,2.1,2.2
From: Guido van Rossum <guido@python.org>
Date: Fri, 31 Mar 2000 07:10:45 -0500
> Can we change the 8-space-tab rule for all new C code that goes in?  I
> know that we can't practically change existing code right now, but for
> new C code, I propose we use no tab characters, and we use a 4-space
> block indentation.

Actually, this one was formatted for 8-space indents but using 4-space
tabs, so in my editor it looked like 16-space indents!

Given that we don't want to change existing code, I'd prefer to stick
with 1-tab 8-space indents.



Subject: Re: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules
linuxaudiodev.c,2.2,2.3
From: Guido van Rossum <guido@beopen.com>
Date: Sat, 08 Jul 2000 09:39:51 -0500

> Aren't tabs preferred as C-source indents, instead of 4-spaces ? At
> least, that's what I see in Python/*.c and Object/*.c, but I only
> vaguely recall it from the style document...

Yes, you're right.



From fredrik@effbot.org  Fri Dec 22 20:37:35 2000
From: fredrik@effbot.org (Fredrik Lundh)
Date: Fri, 22 Dec 2000 21:37:35 +0100
Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support)
References: <LNBBLJKPBEHFEDALKOLCAECIIFAA.tim.one@home.com>
Message-ID: <00e201c06c57$052fff00$e46940d5@hagrid>

> 3. Fredrik and/or I repeat a request for a BDFL Pronouncement.

and.

</F>



From akuchlin@mems-exchange.org  Fri Dec 22 21:09:47 2000
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Fri, 22 Dec 2000 16:09:47 -0500
Subject: [Python-Dev] Reviving the bookstore
Message-ID: <200012222109.QAA02737@207-172-57-45.s45.tnt2.ann.va.dialup.rcn.com>

Since the PSA isn't doing anything for us any longer, I've been
working on reviving the bookstore at a new location with a new
affiliate code.  

A draft version is up at its new home,
http://www.kuchling.com/bookstore/ .  Please take a look and offer
comments.  Book authors, please take a look at the entry for your book
and let me know about any corrections.  Links to reviews of books
would also be really welcomed.

I'd like to abolish having book listings with no description or
review, so if you notice a book that you've read has no description,
please feel free to submit a description and/or review.

--amk


From tim.one@home.com  Sat Dec 23 07:15:59 2000
From: tim.one@home.com (Tim Peters)
Date: Sat, 23 Dec 2000 02:15:59 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <3A41E68B.6B12CD71@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDJIFAA.tim.one@home.com>

[Tim]
> ...
> So we should say "8-bit string" or "Unicode string" when *only*
> one of those is allowable.  So
>
>     "ord() expected string ..."
>
> instead of (even a repaired version of)
>
>     "ord() expected string or Unicode character ..."

[MAL]
> I think this has to do with understanding that there are two
> string types in Python 2.0 -- a novice won't notice this until
> she sees the error message.

Except that this error msg has nothing to do with how many string types
there are:  they didn't pass *any* flavor of string when they get this msg.
At the time they pass (say) a float to ord(), that there are currently two
flavors of string is more information than they need to know.

> My understanding is similar to yours, "string" should mean
> "any string object" and in cases where the difference between
> 8-bit string and Unicode matters, these should be referred to
> as "8-bit string" and "Unicode string".

In that happy case of universal harmony, the msg above should say just
"string" and leave it at that.

> Still, I think it is a good idea to make people aware of the
> possibility of passing Unicode objects to these functions,

Me too.

> so perhaps the idea of adding both possibilities to error messages
> is not such a bad idea for 2.1.

But not that.  The user is trying to track down their problem.  Advertising
an irrelevant (to their problem) distinction at that time of crisis is
simply spam.

    TypeError:  ord() requires an 8-bit string or a Unicode string.
                On the other hand, you'd be surprised to discover
                all the things you can pass to chr():  it's not just
                ints.  Long ints are also accepted, by design, and
                due to an obscure bug in the Python internals, you
                can also pass floats, which get truncated to ints.

> The next phases would be converting all messages back to "string"
> and then convert all strings to Unicode ;-)

Then we'll save a lot of work by skipping the need for the first half of
that -- unless you're volunteering to do all of it <wink>.




From tim.one@home.com  Sat Dec 23 07:16:29 2000
From: tim.one@home.com (Tim Peters)
Date: Sat, 23 Dec 2000 02:16:29 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <20001221133719.B11880@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEDKIFAA.tim.one@home.com>

[Tim]
> So we should say "8-bit string" or "Unicode string" when *only*
> one of those is allowable.

[Andrew]
> OK... how about this patch?

+1 from me.  And maybe if you offer to send a royalty to Marc-Andre each
time it's printed, he'll back down from wanting to use the error msgs as a
billboard <wink>.

> Index: bltinmodule.c
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v
> retrieving revision 2.185
> diff -u -r2.185 bltinmodule.c
> --- bltinmodule.c	2000/12/20 15:07:34	2.185
> +++ bltinmodule.c	2000/12/21 18:36:54
> @@ -1524,13 +1524,14 @@
>  		}
>  	} else {
>  		PyErr_Format(PyExc_TypeError,
> -			     "ord() expected string or Unicode character, " \
> +			     "ord() expected string of length 1, but " \
>  			     "%.200s found", obj->ob_type->tp_name);
>  		return NULL;
>  	}
>
>  	PyErr_Format(PyExc_TypeError,
> -		     "ord() expected a character, length-%d string found",
> +		     "ord() expected a character, "
> +                     "but string of length %d found",
>  		     size);
>  	return NULL;
>  }



From barry@digicool.com  Sat Dec 23 16:43:37 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Sat, 23 Dec 2000 11:43:37 -0500
Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
 <20001222130143.B7127@newcnri.cnri.reston.va.us>
Message-ID: <14916.54841.418495.194558@anthem.concentric.net>

>>>>> "AK" == Andrew Kuchling <akuchlin@cnri.reston.va.us> writes:

    AK> New style it is.  (Barry, is the "python" style in cc-mode.el
    AK> going to be changed to new style, or a "python2" style added?)

There should probably be a second style added to cc-mode.el.  I
haven't maintained that package in a long time, but I'll work out a
patch and send it to the current maintainer.  Let's call it
"python2".

-Barry



From cgw@fnal.gov  Sat Dec 23 17:09:57 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Sat, 23 Dec 2000 11:09:57 -0600 (CST)
Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <14916.54841.418495.194558@anthem.concentric.net>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
 <20001222130143.B7127@newcnri.cnri.reston.va.us>
 <14916.54841.418495.194558@anthem.concentric.net>
Message-ID: <14916.56421.370499.762023@buffalo.fnal.gov>

Barry A. Warsaw writes:

 > There should probably be a second style added to cc-mode.el.  I
 > haven't maintained that package in a long time, but I'll work out a
 > patch and send it to the current maintainer.  Let's call it
 > "python2".

Maybe we should wait for the BDFL's pronouncement?


From barry@digicool.com  Sat Dec 23 19:24:42 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Sat, 23 Dec 2000 14:24:42 -0500
Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
 <20001222130143.B7127@newcnri.cnri.reston.va.us>
 <14916.54841.418495.194558@anthem.concentric.net>
 <14916.56421.370499.762023@buffalo.fnal.gov>
Message-ID: <14916.64506.56351.443287@anthem.concentric.net>

>>>>> "CGW" == Charles G Waldman <cgw@fnal.gov> writes:

    CGW> Barry A. Warsaw writes:

    >> There should probably be a second style added to cc-mode.el.  I
    >> haven't maintained that package in a long time, but I'll work
    >> out a patch and send it to the current maintainer.  Let's call
    >> it "python2".

    CGW> Maybe we should wait for the BDFL's pronouncement?

Sure, at least before submitting a patch.  Here's the simple one liner
you can add to your .emacs file to play with the new style in the
meantime.

-Barry

(c-add-style "python2" '("python" (c-basic-offset . 4)))



From tim.one@home.com  Sun Dec 24 04:04:47 2000
From: tim.one@home.com (Tim Peters)
Date: Sat, 23 Dec 2000 23:04:47 -0500
Subject: [Python-Dev] PEP 208 and __coerce__
In-Reply-To: <20001209033006.A3737@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEEMIFAA.tim.one@home.com>

[Neil Schemenauer
 Saturday, December 09, 2000 6:30 AM]

> While working on the implementation of PEP 208, I discovered that
> __coerce__ has some surprising properties.  Initially I
> implemented __coerce__ so that the numeric operation currently
> being performed was called on the values returned by __coerce__.
> This caused test_class to blow up due to code like this:
>
>     class Test:
>         def __coerce__(self, other):
>             return (self, other)
>
> The 2.0 "solves" this by not calling __coerce__ again if the
> objects returned by __coerce__ are instances.

If C.__coerce__ doesn't *know* it can do the full job, it should return
None.   This is what's documented, too:  a coerce method should return a
pair consisting of objects of the same type, or return None.
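Roughly, a coerce method that follows the documented contract looks like
the sketch below (the Meters class is invented purely for illustration):

```python
class Meters:
    """Toy numeric wrapper; illustrative only."""
    def __init__(self, value):
        self.value = float(value)

    def __coerce__(self, other):
        # Promote plain numbers to Meters so the numeric method
        # sees two operands it fully understands.
        if isinstance(other, (int, float)):
            return self, Meters(other)
        # Can't do the full job: return None and let Python give
        # the other operand's coercion a chance instead.
        return None
```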

It's always going to be somewhat clumsy since what you really want is double
(or, in the case of pow, sometimes triple) dispatch.

Now there's a deliberate cheat that may not have gotten documented
comprehensibly:  when __coerce__ returns a pair, Python does not check to
verify both elements are of the same class.  That's because "a pair
consisting of objects of the same type" is often not what you *want* from
coerce.  For example, if I've got a matrix class M, then in

    M() + 42

I really don't want M.__coerce__ "promoting" 42 to a multi-gigabyte matrix
matching the shape and size of M().  M.__add__ can deal with that much more
efficiently if it gets 42 directly.  OTOH, M.__coerce__ may want to coerce
types other than scalar numbers to conform to the shape and size of self, or
fiddle self to conform to some other type.

What Python accepts back from __coerce__ has to be flexible enough to allow
all of those without further interference from the interpreter (just ask MAL
<wink>:  the *real* problem in practice is making coerce more of a help than
a burden to the end user; outside of int->long->float->complex (which is
itself partly broken, because long->float can lose precision or even fail
outright), "coercion to a common type" is almost never quite right; note
that C99 introduces distinct imaginary and complex types, because even
auto-conversion of imaginary->complex can be a royal PITA!).

> This has the effect of making code like:
>
>     class A:
>         def __coerce__(self, other):
>             return B(), other
>
>     class B:
>         def __coerce__(self, other):
>             return 1, other
>
>     A() + 1
>
> fail to work in the expected way.

I have no idea how you expected that to work.  Neither coerce() method looks
reasonable:  they don't follow the rules for coerce methods.  If A thinks it
needs to create a B() and have coercion "start over from scratch" with that,
then it should do so explicitly:

    class A:
        def __coerce__(self, other):
            return coerce(B(), other)

> The question is: how should __coerce__ work?

This can't be answered by a one-liner:  the intended behavior is documented
by a complex set of rules at the bottom of Lang Ref 3.3.6 ("Emulating
numeric types").  Alternatives should be written up as a diff against those
rules, which Guido worked hard on in years past -- more than once, too
<wink>.



From esr@thyrsus.com  Mon Dec 25 09:17:23 2000
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 25 Dec 2000 04:17:23 -0500
Subject: [Python-Dev] Tkinter support under RH 7.0?
Message-ID: <20001225041723.A9567@thyrsus.com>

I just upgraded to Red Hat 7.0 and installed Python 2.0.  Anybody have
a recipe for making Tkinter support work in this environment?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Government is not reason, it is not eloquence, it is force; like fire, a
troublesome servant and a fearful master. Never for a moment should it be left
to irresponsible action."
	-- George Washington, in a speech of January 7, 1790


From thomas@xs4all.net  Mon Dec 25 10:59:45 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 25 Dec 2000 11:59:45 +0100
Subject: [Python-Dev] Tkinter support under RH 7.0?
In-Reply-To: <20001225041723.A9567@thyrsus.com>; from esr@thyrsus.com on Mon, Dec 25, 2000 at 04:17:23AM -0500
References: <20001225041723.A9567@thyrsus.com>
Message-ID: <20001225115945.A25820@xs4all.nl>

On Mon, Dec 25, 2000 at 04:17:23AM -0500, Eric S. Raymond wrote:

> I just upgraded to Red Hat 7.0 and installed Python 2.0.  Anybody have
> a recipe for making Tkinter support work in this environment?

I installed Python 2.0 + Tkinter both from the BeOpen rpms and later
from source (for various reasons) and both were a breeze. I didn't really
use the 2.0+tkinter rpm version until I needed Numpy and various other
things and had to revert to the self-compiled version, but it seemed to work
fine.

As far as I can recall, there's only two things you have to keep in mind:
the tcl/tk version that comes with RedHat 7.0 is 8.3, so you have to adjust
the Tkinter section of Modules/Setup accordingly, and some of the
RedHat-supplied scripts stop working because they use deprecated modules (at
least 'rand') and use the socket.socket call wrong.
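For the record, the Modules/Setup edit amounts to something like the
fragment below -- the exact flags and paths vary per system, and these
lines are from memory and illustrative only:

```
# Uncomment the _tkinter lines and bump the version numbers
# to match the tcl/tk 8.3 shipped with RedHat 7.0:
_tkinter _tkinter.c tkappinit.c -DWITH_APPINIT \
	-I/usr/include -I/usr/X11R6/include \
	-L/usr/X11R6/lib \
	-ltk8.3 -ltcl8.3 -lX11
```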

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From esr@thyrsus.com  Wed Dec 27 19:37:50 2000
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 27 Dec 2000 14:37:50 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
Message-ID: <20001227143750.A26894@thyrsus.com>

I have 2.0 up and running on RH7.0, compiled from sources.  In the process, 
I discovered a couple of issues:

1. The curses module is commented out in the default Modules/Setup
file.  This is not good, as it may lead careless distribution builders
to ship Python 2.0s that will not be able to support the curses front
end in CML2.  Supporting CML2 (and thus getting Python the "design
win" of being involved in the Linux kernel build) was the major point
of integrating the curses module into the Python core.  It is possible
that one little "#" may have blown that.

2. The default Modules/Setup file assumes that various Tkinter-related libraries
are in /usr/local.  But /usr would be a more appropriate choice under most
circumstances.  Most Linux users now install their Tcl/Tk stuff from RPMs
or .deb packages that place the binaries and libraries under /usr.  Under
most other Unixes (e.g. Solaris) they were there to begin with.

3. The configure machinery could be made to deduce more about the contents
of Modules/Setup than it does now.  In particular, it's silly that the person
building Python has to fill in the locations of X libraries when 
configure is in principle perfectly capable of finding them.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Our society won't be truly free until "None of the Above" is always an option.


From guido@digicool.com  Wed Dec 27 21:04:27 2000
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 27 Dec 2000 16:04:27 -0500
Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support)
In-Reply-To: Your message of "Fri, 22 Dec 2000 14:31:07 EST."
 <LNBBLJKPBEHFEDALKOLCAECIIFAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAECIIFAA.tim.one@home.com>
Message-ID: <200012272104.QAA22278@cj20424-a.reston1.va.home.com>

> 2. Guido repeats that he prefers old-style (but in a wishy-washy way that
> leaves it uncertain (*)).

OK, since a pronouncement is obviously needed, here goes: Python C
source code should be indented using tabs only.

Exceptions:

(1) If 3rd party code is already written using a different style, it
    can stay that way, especially if it's a large volume that would be
    hard to reformat.  But only if it is consistent within a file or
    set of files (e.g. a 3rd party patch will have to conform to the
    prevailing style in the patched file).

(2) Occasionally (e.g. in ceval.c) there is code that's very deeply
    nested.  I will allow 4-space indents for the innermost nesting
    levels here.

Other C whitespace nits:

- Always place spaces around assignment operators, comparisons, &&, ||.

- No space between function name and left parenthesis.

- Always a space between a keyword ('if', 'for' etc.) and left paren.

- No space inside parentheses, brackets etc.

- No space before a comma or semicolon.

- Always a space after a comma (and semicolon, if not at end of line).

- Use ``return x;'' instead of ``return(x)''.
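A short fragment showing the rules above in one place (the function
itself is made up; tabs shown as 8-column indents):

```c
/* Tab-indented, per the pronouncement above. */
static int
sum_range(int lo, int hi)
{
	int total = 0;
	int i;

	/* Space after 'for'; no space inside the parens. */
	for (i = lo; i < hi; i++) {
		if (i % 2 == 0 && total < 1000)
			total += i;
		else
			total -= i;
	}
	return total;	/* not return(total) */
}
```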

--Guido van Rossum (home page: http://www.python.org/~guido/)


From cgw@fnal.gov  Wed Dec 27 22:17:31 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Wed, 27 Dec 2000 16:17:31 -0600 (CST)
Subject: [Python-Dev] sourceforge: problems with bug list?
Message-ID: <14922.27259.456364.750295@buffalo.fnal.gov>

Is it just me, or is anybody else getting this error when trying to
access the bug list?

 > An error occured in the logger. ERROR: pg_atoi: error in "5470/":
 > can't parse "/" 




From akuchlin@mems-exchange.org  Wed Dec 27 22:39:35 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 27 Dec 2000 17:39:35 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001227143750.A26894@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 27, 2000 at 02:37:50PM -0500
References: <20001227143750.A26894@thyrsus.com>
Message-ID: <20001227173935.A25605@kronos.cnri.reston.va.us>

On Wed, Dec 27, 2000 at 02:37:50PM -0500, Eric S. Raymond wrote:
>1. The curses module is commented out in the default Modules/Setup
>file.  This is not good, as it may lead careless distribution builders

It always has been commented out.  Good distributions ship with most
of the available modules enabled; I can't say if RH7.0 counts as a
good distribution or not (still on 6.2).

>3. The configure machinery could be made to deduce more about the contents
>of Modules/Setup than it does now.  In particular, it's silly that the person

This is the point of PEP 229 and patch #102588, which uses a setup.py
script to build extension modules.  (I need to upload an updated
version of the patch which actually includes setup.py -- thought I did
that, but apparently not...)  The patch is still extremely green,
though, but I think it's the best course; witness the tissue of
hackery required to get the bsddb module automatically detected and
built.

--amk


From guido@digicool.com  Wed Dec 27 22:54:26 2000
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 27 Dec 2000 17:54:26 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: Your message of "Fri, 22 Dec 2000 10:58:56 EST."
 <14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net> <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov>
 <14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
Message-ID: <200012272254.RAA22931@cj20424-a.reston1.va.home.com>

> Charles G Waldman writes:
>  > I am reminded of Linus Torvalds' comments on this subject (see
>  > /usr/src/linux/Documentation/CodingStyle):

Fred replied:
>   The catch, of course, is Python/ceval.c, where breaking it up can
> hurt performance.  People scream when you do things like that....

Funny, Jeremy is doing just that, and it doesn't seem to be hurting
performance at all.  See

 http://sourceforge.net/patch/?func=detailpatch&patch_id=102337&group_id=5470

(though this is not quite finished).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@thyrsus.com  Wed Dec 27 23:05:46 2000
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 27 Dec 2000 18:05:46 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001227173935.A25605@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Wed, Dec 27, 2000 at 05:39:35PM -0500
References: <20001227143750.A26894@thyrsus.com> <20001227173935.A25605@kronos.cnri.reston.va.us>
Message-ID: <20001227180546.A4365@thyrsus.com>

Andrew Kuchling <akuchlin@mems-exchange.org>:
> >1. The curses module is commented out in the default Modules/Setup
> >file.  This is not good, as it may lead careless distribution builders
> 
> It always has been commented out.  Good distributions ship with most
> of the available modules enabled; I can't say if RH7.0 counts as a
> good distribution or not (still on 6.2).

I think this needs to change.  If curses is a core facility now, the
default build should treat it as one.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

If a thousand men were not to pay their tax-bills this year, that would
... [be] the definition of a peaceable revolution, if any such is possible.
	-- Henry David Thoreau


From tim.one@home.com  Thu Dec 28 00:44:29 2000
From: tim.one@home.com (Tim Peters)
Date: Wed, 27 Dec 2000 19:44:29 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Misc python-mode.el,3.108,3.109
In-Reply-To: <E14BKaD-0004JB-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKFIFAA.tim.one@home.com>

[Barry Warsaw]
> Modified Files:
> 	python-mode.el
> Log Message:
> (python-font-lock-keywords): Add highlighting of `as' as a keyword,
> but only in "import foo as bar" statements (including optional
> preceding `from' clause).

Oh, that's right, try to make IDLE look bad, will you?  I've got half a mind
to take up the challenge.  Unfortunately, I only have half a mind in total,
so you may get away with this backstabbing for a while <wink>.



From thomas@xs4all.net  Thu Dec 28 09:53:31 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 28 Dec 2000 10:53:31 +0100
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001227143750.A26894@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 27, 2000 at 02:37:50PM -0500
References: <20001227143750.A26894@thyrsus.com>
Message-ID: <20001228105331.A6042@xs4all.nl>

On Wed, Dec 27, 2000 at 02:37:50PM -0500, Eric S. Raymond wrote:
> I have 2.0 up and running on RH7.0, compiled from sources.  In the process, 
> I discovered a couple of issues:

> 1. The curses module is commented out in the default Modules/Setup
> file.  This is not good, as it may lead careless distribution builders
> to ship Python 2.0s that will not be able to support the curses front
> end in CML2.  Supporting CML2 (and thus getting Python the "design
> win" of being involved in the Linux kernel build) was the major point
> of integrating the curses module into the Python core.  It is possible
> that one little "#" may have blown that.

Note that Tkinter is off by default too. And readline. And ssl. And the use
of shared libraries. We *can't* enable the cursesmodule by default, because
we don't know what the system's curses library is called. We'd have to
auto-detect that before we can enable it (and lots of other modules)
automatically, and that's a lot of work. I personally favour autoconf for
the job, but since amk is already busy on using distutils, I'm not going to
work on that.

> 2. The default Modules/Setup file assumes that various Tkinter-related libraries
> are in /usr/local.  But /usr would be a more appropriate choice under most
> circumstances.  Most Linux users now install their Tcl/Tk stuff from RPMs
> or .deb packages that place the binaries and libraries under /usr.  Under
> most other Unixes (e.g. Solaris) they were there to begin with.

This is nonsense. The line above it specifically states 'edit to reflect
where your Tcl/Tk headers are'. And aside from the issue of whether they are
usually found in /usr (I don't believe so, not even on Solaris, but 'my'
Solaris box doesn't even have tcl/tk), /usr/local is a perfectly sane
choice, since /usr is already included in the include-path, but /usr/local
usually is not.

> 3. The configure machinery could be made to deduce more about the contents
> of Modules/Setup than it does now.  In particular, it's silly that the person
> building Python has to fill in the locations of X libraries when 
> configure is in principle perfectly capable of finding them.

In principle, I agree. It's a lot of work, though. For instance, Debian
stores the Tcl/Tk headers in /usr/include/tcl<version>, which means you can
compile for more than one tcl version, by just changing your include path
and the library you link with. And there are undoubtedly several other
variants out there.

Should we really make the Setup file default to Linux, and leave other
operating systems in the dark about what it might be on their system ? I
think people with Linux and without clue are the least likely people to
compile their own Python, since Linux distributions already come with a
decent enough Python. And, please, let's assume the people assembling those
know how to read ?

Maybe we just need a HOWTO document covering Setup ?

(Besides, won't this all be fixed when CML2 comes with a distribution, Eric ?
They'll *have* to have working curses/tkinter then :-)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From MarkH@ActiveState.com  Thu Dec 28 12:34:09 2000
From: MarkH@ActiveState.com (Mark Hammond)
Date: Thu, 28 Dec 2000 23:34:09 +1100
Subject: [Python-Dev] Fwd: try...else
Message-ID: <3A4B3341.5010707@ActiveState.com>

Spotted on c.l.python.  Although Pythonwin is mentioned, python.exe 
gives the same results - as does 1.5.2.

Seems a reasonable question...

[Also, if Robin hasn't been invited to join us here, I think it could 
make some sense...]

Mark.
-------- Original Message --------
Subject: try...else
Date: Fri, 22 Dec 2000 18:02:27 +0000
From: Robin Becker <robin@jessikat.fsnet.co.uk>
Newsgroups: comp.lang.python

I had expected that in try: except: else
the else clause always got executed, but it seems not for return

PythonWin 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on
win32.Portions Copyright 1994-2000 Mark Hammond (MarkH@ActiveState.com)
- see 'Help/About PythonWin' for further copyright information.
 >>> def bang():
....     try:
....             return 'return value'
....     except:
....             print 'bang failed'
....     else:
....             print 'bang succeeded'
....
  >>> bang()
'return value'
 >>>

is this a 'feature' or bug. The 2.0 docs seem not to mention
return/continue except for try finally.
-- 
Robin Becker



From mal@lemburg.com  Thu Dec 28 14:45:49 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Dec 2000 15:45:49 +0100
Subject: [Python-Dev] Fwd: try...else
References: <3A4B3341.5010707@ActiveState.com>
Message-ID: <3A4B521D.4372224A@lemburg.com>

Mark Hammond wrote:
> 
> Spotted on c.l.python.  Although Pythonwin is mentioned, python.exe
> gives the same results - as does 1.5.2.
> 
> Seems a reasonable question...
> 
> [Also, if Robin hasn't been invited to join us here, I think it could
> make some sense...]
> 
> Mark.
> -------- Original Message --------
> Subject: try...else
> Date: Fri, 22 Dec 2000 18:02:27 +0000
> From: Robin Becker <robin@jessikat.fsnet.co.uk>
> Newsgroups: comp.lang.python
> 
> I had expected that in try: except: else
> the else clause always got executed, but it seems not for return

I think Robin mixed up try...finally with try...except...else.
The finally clause is executed even in case an exception occurred.

He does have a point however that 'return' will bypass 
try...else and try...finally clauses. I don't think we can change
that behaviour, though, as it would break code.
 
> PythonWin 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on
> win32.Portions Copyright 1994-2000 Mark Hammond (MarkH@ActiveState.com)
> - see 'Help/About PythonWin' for further copyright information.
>  >>> def bang():
> ....     try:
> ....             return 'return value'
> ....     except:
> ....             print 'bang failed'
> ....     else:
> ....             print 'bang succeeded'
> ....
>   >>> bang()
> 'return value'
>  >>>
> 
> is this a 'feature' or bug. The 2.0 docs seem not to mention
> return/continue except for try finally.
> --
> Robin Becker
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://www.python.org/mailman/listinfo/python-dev

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@digicool.com  Thu Dec 28 15:04:23 2000
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 28 Dec 2000 10:04:23 -0500
Subject: [Python-Dev] chomp()?
Message-ID: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>

Someone just posted a patch to implement s.chomp() as a string method:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103029&group_id=5470

Pseudo code (for those not aware of the Perl function by that name):

def chomp(s):
    if s[-2:] == '\r\n':
        return s[:-2]
    if s[-1:] == '\r' or s[-1:] == '\n':
        return s[:-1]
    return s

I.e. it removes a trailing \r\n, \r, or \n.

Any comments?  Is this needed given that we have s.rstrip() already?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Thu Dec 28 15:30:57 2000
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 28 Dec 2000 10:30:57 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: Your message of "Wed, 27 Dec 2000 14:37:50 EST."
 <20001227143750.A26894@thyrsus.com>
References: <20001227143750.A26894@thyrsus.com>
Message-ID: <200012281530.KAA26049@cj20424-a.reston1.va.home.com>

Eric,

I think your recent posts have shown a worldview that's a bit too
Eric-centered. :-)

Not all the world is Linux.  CML2 isn't the only Python application
that matters.  Python world domination is not a goal.  There is no
Eric conspiracy! :-)

That said, I think that the future is bright: Anderw is already
working on a much more intelligent configuration manager.

I believe it would be a mistake to enable curses by default using the
current approach to module configuration: it doesn't compile out of
the box on every platform, and you wouldn't believe how much email I
get from clueless Unix users trying to build Python when there's a
problem like that in the distribution.

So I'd rather wait for Andrew's work.  You could do worse than help
him with that, to further your goal!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Thu Dec 28 15:41:23 2000
From: fdrake@acm.org (Fred L. Drake)
Date: Thu, 28 Dec 2000 10:41:23 -0500
Subject: [Python-Dev] chomp()?
In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <web-403062@digicool.com>

On Thu, 28 Dec 2000 10:04:23 -0500, Guido
<guido@digicool.com> wrote:
 > Someone just posted a patch to implement s.chomp() as a
 > string method:
...
 > Any comments?  Is this needed given that we have
 > s.rstrip() already?

  I've always considered this a different operation from
rstrip().  When you intend to be as surgical in your changes
as possible, it is important *not* to use rstrip().
  I don't feel strongly that it needs to be implemented in
C, though I imagine people who do a lot of string processing
feel otherwise.  It's just hard to beat the performance
difference if you are doing this a lot.
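To make the difference concrete, using Guido's pseudo-code for chomp:

```python
def chomp(s):
    # Remove exactly one trailing line terminator, nothing more.
    if s[-2:] == '\r\n':
        return s[:-2]
    if s[-1:] == '\r' or s[-1:] == '\n':
        return s[:-1]
    return s

# chomp is surgical; rstrip is greedy about trailing whitespace:
# chomp('a\n\n')    -> 'a\n'
# 'a\n\n'.rstrip()  -> 'a'
# chomp('a  \n')    -> 'a  '   (trailing blanks preserved)
# 'a  \n'.rstrip()  -> 'a'
```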


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From barry@digicool.com  Thu Dec 28 16:00:36 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 28 Dec 2000 11:00:36 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Misc python-mode.el,3.108,3.109
References: <E14BKaD-0004JB-00@usw-pr-cvs1.sourceforge.net>
 <LNBBLJKPBEHFEDALKOLCGEKFIFAA.tim.one@home.com>
Message-ID: <14923.25508.668453.186209@anthem.concentric.net>

>>>>> "TP" == Tim Peters <tim.one@home.com> writes:

    TP> [Barry Warsaw]
    >> Modified Files: python-mode.el Log Message:
    >> (python-font-lock-keywords): Add highlighting of `as' as a
    >> keyword, but only in "import foo as bar" statements (including
    >> optional preceding `from' clause).

    TP> Oh, that's right, try to make IDLE look bad, will you?  I've
    TP> got half a mind to take up the challenge.  Unfortunately, I
    TP> only have half a mind in total, so you may get away with this
    TP> backstabbing for a while <wink>.

With my current network (un)connectivity, I feel like a nuclear sub
which can only surface once a month to receive low frequency orders
from some remote antenna farm out in New Brunswick.  Just think of me
as a rogue commander who tries to do as much damage as possible when
he's not joyriding in the draft-wake of giant squids.

rehoming-all-remaining-missiles-at-the-Kingdom-of-Timbotia-ly y'rs,
-Barry



From esr@thyrsus.com  Thu Dec 28 16:01:54 2000
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 11:01:54 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <200012281530.KAA26049@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 10:30:57AM -0500
References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com>
Message-ID: <20001228110154.D32394@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> Not all the world is Linux.  CML2 isn't the only Python application
> that matters.  Python world domination is not a goal.  There is no
> Eric conspiracy! :-)

Perhaps I misunderstood you, then.  I thought you considered CML2 a
potentially important design win, and that was why curses didn't get
dropped from the core.  Have you changed your mind about this?

If Python world domination is not a goal then I can only conclude that
you haven't had your morning coffee yet :-).

There's a more general question here about what it means for something
to be in the core language.  Developers need to have a clear,
bright-line picture of what they can count on to be present.  To me
this implies that it's the job of the Python maintainers to make sure
that a facility declared "core" by its presence in the standard
library documentation is always present, for maximum "batteries are
included" effect.  

Yes, dealing with cross-platform variations in linking curses is a
pain -- but dealing with that kind of pain so the Python user doesn't
have to is precisely our job.  Or so I understand it, anyway.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Conservatism is the blind and fear-filled worship of dead radicals.


From moshez@zadka.site.co.il  Thu Dec 28 16:51:32 2000
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: 28 Dec 2000 16:51:32 -0000
Subject: [Python-Dev] chomp()?
In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <20001228165132.8025.qmail@stimpy.scso.com>

On Thu, 28 Dec 2000, Guido van Rossum <guido@digicool.com> wrote:

> Someone just posted a patch to implement s.chomp() as a string method:
...
> Any comments?  Is this needed given that we have s.rstrip() already?

Yes.

i=0
for line in fileinput.input():
	print '%d: %s' % (i, line.chomp())
	i += 1

I want that operation to be invertible by

sed 's/^[0-9]*: //'


From guido@digicool.com  Thu Dec 28 17:08:18 2000
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 28 Dec 2000 12:08:18 -0500
Subject: [Python-Dev] scp to sourceforge
Message-ID: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>

I've seen a thread on this but there was no conclusive answer, so I'm
reopening this.

I can't SCP updated PEPs to the SourceForge machine.  The "pep2html.py
-i" command just hangs.  I can ssh into shell.sourceforge.net just
fine, but scp just hangs.  "scp -v" prints a bunch of things
suggesting that it can authenticate itself just fine, ending with
these three lines:

  cj20424-a.reston1.va.home.com: RSA authentication accepted by server.
  cj20424-a.reston1.va.home.com: Sending command: scp -v -t .
  cj20424-a.reston1.va.home.com: Entering interactive session.

and then nothing.  It just sits there.

Would somebody please figure out a way to update the PEPs?  It's kind
of pathetic to see the website not have the latest versions...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From moshez@zadka.site.co.il  Thu Dec 28 16:28:07 2000
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: 28 Dec 2000 16:28:07 -0000
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <3A4B521D.4372224A@lemburg.com>
References: <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com>
Message-ID: <20001228162807.7229.qmail@stimpy.scso.com>

On Thu, 28 Dec 2000, "M.-A. Lemburg" <mal@lemburg.com> wrote:

> He does have a point however that 'return' will bypass 
> try...else and try...finally clauses. I don't think we can change
> that behaviour, though, as it would break code.

It doesn't bypass try..finally:

>>> def foo():
...     try:
...             print "hello"
...             return
...     finally:
...             print "goodbye"
...
>>> foo()
hello
goodbye



From guido@digicool.com  Thu Dec 28 16:43:26 2000
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 28 Dec 2000 11:43:26 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: Your message of "Thu, 28 Dec 2000 11:01:54 EST."
 <20001228110154.D32394@thyrsus.com>
References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com>
 <20001228110154.D32394@thyrsus.com>
Message-ID: <200012281643.LAA26687@cj20424-a.reston1.va.home.com>

> Guido van Rossum <guido@digicool.com>:
> > Not all the world is Linux.  CML2 isn't the only Python application
> > that matters.  Python world domination is not a goal.  There is no
> > Eric conspiracy! :-)
> 
> Perhaps I misunderstood you, then.  I thought you considered CML2 a
> potentially important design win, and that was why curses didn't get
> dropped from the core.  Have you changed your mind about this?

Supporting CML2 was one of the reasons to keep curses in the core, but
not the only one.  Linux kernel configuration is so far removed from
my daily use of computers that I don't have a good way to judge its
importance in the grand scheme of things.  Since you obviously
consider it very important, and since I generally trust your judgement
(except on the issue of firearms :-), your plea for keeping, and
improving, curses support in the Python core made a difference in my
decision.  And don't worry, I don't expect to change that decision
-- though I personally still find it curious that curses is so important.
I find curses-style user interfaces pretty pathetic, and wished that
Linux migrated to a real GUI for configuration.  (And the linuxconf
approach does *not* qualify as a real GUI. :-)

> If Python world domination is not a goal then I can only conclude that
> you haven't had your morning coffee yet :-).

Sorry to disappoint you, Eric.  I gave up coffee years ago. :-)

I was totally serious though: my personal satisfaction doesn't come
from Python world domination.  Others seem to have that goal, and if it
doesn't inconvenience me too much I'll play along, but in the end I've
got some goals in my personal life that are much more important.

> There's a more general question here about what it means for something
> to be in the core language.  Developers need to have a clear,
> bright-line picture of what they can count on to be present.  To me
> this implies that it's the job of the Python maintainers to make sure
> that a facility declared "core" by its presence in the standard
> library documentation is always present, for maximum "batteries are
> included" effect.  

We do the best we can.  Using the current module configuration system,
it's a one-character edit to enable curses if you need it.  With
Andrew's new scheme, it will be automatic.

> Yes, dealing with cross-platform variations in linking curses is a
> pain -- but dealing with that kind of pain so the Python user doesn't
> have to is precisely our job.  Or so I understand it, anyway.

So help Andrew: http://python.sourceforge.net/peps/pep-0229.html

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Thu Dec 28 16:52:36 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Dec 2000 17:52:36 +0100
Subject: [Python-Dev] Fwd: try...else
References: <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com>
Message-ID: <3A4B6FD3.9B576E9A@lemburg.com>

Moshe Zadka wrote:
> 
> On Thu, 28 Dec 2000, "M.-A. Lemburg" <mal@lemburg.com> wrote:
> 
> > He does have a point however that 'return' will bypass
> > try...else and try...finally clauses. I don't think we can change
> > that behaviour, though, as it would break code.
> 
> It doesn't bypass try..finally:
> 
> >>> def foo():
> ...     try:
> ...             print "hello"
> ...             return
> ...     finally:
> ...             print "goodbye"
> ...
> >>> foo()
> hello
> goodbye

Hmm, that must have changed between Python 1.5 and more recent
versions:

Python 1.5:
>>> def f():
...     try:
...             return 1
...     finally:
...             print 'finally'
... 
>>> f()
1
>>> 

Python 2.0:
>>> def f():
...     try:
...             return 1
...     finally:
...             print 'finally'
... 
>>> f()
finally
1
>>>

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From moshez@stimpy.scso.com  Thu Dec 28 16:59:32 2000
From: moshez@stimpy.scso.com (Moshe Zadka)
Date: 28 Dec 2000 16:59:32 -0000
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <3A4B6FD3.9B576E9A@lemburg.com>
References: <3A4B6FD3.9B576E9A@lemburg.com>, <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com>
Message-ID: <20001228165932.8143.qmail@stimpy.scso.com>

On Thu, 28 Dec 2000 17:52:36 +0100, "M.-A. Lemburg" <mal@lemburg.com> wrote:

[about try..finally not playing well with return]
> Hmm, that must have changed between Python 1.5 and more recent
> versions:

I posted a 1.5.2 test. So it changed between 1.5 and 1.5.2?


From esr@thyrsus.com  Thu Dec 28 17:20:48 2000
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 12:20:48 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001228105331.A6042@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 10:53:31AM +0100
References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl>
Message-ID: <20001228122048.A1381@thyrsus.com>

Thomas Wouters <thomas@xs4all.net>:
> > 1. The curses module is commented out in the default Modules/Setup
> > file.  This is not good, as it may lead careless distribution builders
> > to ship Python 2.0s that will not be able to support the curses front
> > end in CML2.  Supporting CML2 (and thus getting Python the "design
> > win" of being involved in the Linux kernel build) was the major point
> > of integrating the curses module into the Python core.  It is possible
> > that one little "#" may have blown that.
> 
> Note that Tkinter is off by default too. And readline. And ssl. And the use
> of shared libraries.

IMO ssl isn't an issue because it's not documented as being in the standard
module set.  Readline is a minor issue because raw_input()'s functionality
changes somewhat if it's not linked, but I think we can live with this -- the
change isn't visible to calling programs.  

Hm.  It appears tkinter isn't documented in the standard set of modules 
either.  Interesting.  Technically this means I don't have a problem with
it not being built in by default, but I think there is a problem here...

My more general point is that right now Python has three classes of modules:

1. Documented as being in the core and built in by default.
2. Not documented as being in the core and not built in by default.
3. Documented as being in the core but not built in by default.

My more general claim is that the existence of class 3 is a problem,
because it compromises the "batteries are included" effect -- it means
Python users don't have a bright-line test for what will be present in
every Python (or at least every Python on an operating system
theoretically feature-compatible with theirs).

My struggle to get CML2 adopted brings this problem into particularly
sharp focus because the kernel group is allergic to big footprints or
having to download extension modules to do a build.  But the issue is
really broader than that.  I think we ought to be migrating stuff out
of class 3 into class 1 where possible and to class 2 only where
unavoidable.

>         We *can't* enable the cursesmodule by default, because
> we don't know what the system's curses library is called. We'd have to
> auto-detect that before we can enable it (and lots of other modules)
> automatically, and that's a lot of work. I personally favour autoconf for
> the job, but since amk is already busy on using distutils, I'm not going to
> work on that.

Yes, we need to do a lot more autodetection -- this is a large part of
my point.  I have nothing against distutils, but I don't see how it
solves this problem unless we assume that we'll always have Python
already available on any platform where we're building Python.

I'm willing to put my effort where my mouth is on this.  I have a lot
of experience with autoconf; I'm willing to write some of these nasty
config tests.

> > 2.The default Modules/Setup file assumes that various Tkinter-related libraries
> > are in /usr/local.  But /usr would be a more appropriate choice under most
> > circumstances.  Most Linux users now install their Tcl/Tk stuff from RPMs
> > or .deb packages that place the binaries and libraries under /usr.  Under
> > most other Unixes (e.g. Solaris) they were there to begin with.
> 
> This is nonsense. The line above it specifically states 'edit to reflect
> where your Tcl/Tk headers are'. And aside from the issue whether they are
> usually found in /usr (I don't believe so, not even on Solaris, but 'my'
> Solaris box doesn't even have tcl/tk,) /usr/local is a perfectly sane
> choice, since /usr is already included in the include-path, but /usr/local
> usually is not.

Is it?  That is not clear from the comment.  Perhaps this is just a 
documentation problem.  I'll look again.
 
> > 3. The configure machinery could be made to deduce more about the contents
> > of Modules/Setup than it does now.  In particular, it's silly that the
person building Python has to fill in the locations of X libraries when 
> > configure is in principle perfectly capable of finding them.
> 
> In principle, I agree. It's a lot of work, though. For instance, Debian
> stores the Tcl/Tk headers in /usr/include/tcl<version>, which means you can
> compile for more than one tcl version, by just changing your include path
> and the library you link with. And there are undoubtedly several other
> variants out there.

As I said to Guido, I think it is exactly our job to deal with this sort
of grottiness.  One of Python's major selling points is supposed to be
cross-platform consistency of the API.  If we fail to do what you're
describing, we're failing to meet Python users' reasonable expectations
for the language.

> Should we really make the Setup file default to Linux, and leave other
> operating systems in the dark about what it might be on their system ? I
> think people with Linux and without clue are the least likely people to
> compile their own Python, since Linux distributions already come with a
decent enough Python. And, please, let's assume the people assembling those
> know how to read ?

Please note that I am specifically *not* advocating making the build defaults
Linux-centric.  That's not my point at all.

> Maybe we just need a HOWTO document covering Setup ?

That would be a good idea.

> (Besides, won't this all be fixed when CML2 comes with a distribution, Eric ?
> They'll *have* to have working curses/tkinter then :-)

I'm concerned that it will work the other way around, that CML2 won't happen
if the core does not reliably include these facilities.  In itself CML2 
not happening wouldn't be the end of the world of course, but I'm pushing on
this because I think the larger issue of class 3 modules is actually important
to the health of Python and needs to be attacked seriously.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The Bible is not my book, and Christianity is not my religion.  I could never
give assent to the long, complicated statements of Christian dogma.
	-- Abraham Lincoln


From cgw@fnal.gov  Thu Dec 28 17:36:06 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Thu, 28 Dec 2000 11:36:06 -0600 (CST)
Subject: [Python-Dev] chomp()?
In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <14923.31238.65155.496546@buffalo.fnal.gov>

Guido van Rossum writes:
 > Someone just posted a patch to implement s.chomp() as a string method:
 > I.e. it removes a trailing \r\n, \r, or \n.
 > 
 > Any comments?  Is this needed given that we have s.rstrip() already?

-1 from me.  P=NP (Python is not Perl).  "Chomp" is an excessively cute name.
And like you said, this is too much like "rstrip" to merit a separate
method.
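(To make the difference concrete -- this is just a sketch of the proposed
behaviour as Guido described it, contrasted with rstrip(); it is not the
patch itself.  chomp() removes at most one trailing line ending, while
rstrip() strips all trailing whitespace:)

```python
def chomp(s):
    # Perl-style: remove at most ONE trailing line ending.
    if s.endswith('\r\n'):
        return s[:-2]
    if s.endswith('\r') or s.endswith('\n'):
        return s[:-1]
    return s

print(repr(chomp('line\n\n')))    # 'line\n' -- only one ending removed
print(repr(chomp('line\r\n')))    # 'line'
print(repr('line\n\n'.rstrip()))  # 'line'   -- all trailing whitespace gone
```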




From esr@thyrsus.com  Thu Dec 28 17:41:17 2000
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 12:41:17 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <200012281643.LAA26687@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 11:43:26AM -0500
References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com> <20001228110154.D32394@thyrsus.com> <200012281643.LAA26687@cj20424-a.reston1.va.home.com>
Message-ID: <20001228124117.B1381@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> Supporting CML2 was one of the reasons to keep curses in the core, but
> not the only one.  Linux kernel configuration is so far removed from
> my daily use of computers that I don't have a good way to judge its
> importance in the grand scheme of things.  Since you obviously
> consider it very important, and since I generally trust your judgement
> (except on the issue of firearms :-), your plea for keeping, and
> improving, curses support in the Python core made a difference in my
> decision.  And don't worry, I don't expect to change that decision
> -- though I personally still find it curious that curses is so important.
> I find curses-style user interfaces pretty pathetic, and wished that
> Linux migrated to a real GUI for configuration.  (And the linuxconf
> approach does *not* qualify as a real GUI. :-)

Thank you, that makes your priorities much clearer.

Actually I agree with you that curses interfaces are mostly pretty
pathetic.  A lot of people still like them, though, because they tend
to be fast and lightweight.  Then, too, a really well-designed curses
interface can in fact be good enough that the usability gain from
GUIizing is marginal.  My favorite examples of this are mutt and slrn.
The fact that GUI programs have failed to make much headway against
this is not simply due to user conservatism, it's genuinely hard to
see how a GUI interface could be made significantly better.

And unfortunately, there is a niche where it is still important to
support curses interfacing independently of anyone's preferences in
interface style -- early in the system-configuration process before
one has bootstrapped to the point where X is reliably available.  I
hasten to add that this is not just *my* problem -- one of your
more important Python constituencies in a practical sense is 
the guys who maintain Red Hat's installer.

> I was totally serious though: my personal satisfaction doesn't come
> from Python world domination.  Others seem to have that goal, and if it
> doesn't inconvenience me too much I'll play along, but in the end I've
> got some goals in my personal life that are much more important.

There speaks the new husband :-).  OK.  So what *do* you want from Python?

Personally, BTW, my goal is not exactly Python world domination either
-- it's that the world should be dominated by the language that has
the least tendency to produce grotty fragile code (remember that I
tend to obsess about the software-quality problem :-)).  Right now
that's Python.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The people of the various provinces are strictly forbidden to have in their
possession any swords, short swords, bows, spears, firearms, or other types
of arms. The possession of unnecessary implements makes difficult the
collection of taxes and dues and tends to foment uprisings.
        -- Toyotomi Hideyoshi, dictator of Japan, August 1588


From mal@lemburg.com  Thu Dec 28 17:43:13 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Dec 2000 18:43:13 +0100
Subject: [Python-Dev] chomp()?
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <3A4B7BB1.F09660ED@lemburg.com>

Guido van Rossum wrote:
> 
> Someone just posted a patch to implement s.chomp() as a string method:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=103029&group_id=5470
> 
> Pseudo code (for those not aware of the Perl function by that name):
> 
> def chomp(s):
>     if s[-2:] == '\r\n':
>         return s[:-2]
>     if s[-1:] == '\r' or s[-1:] == '\n':
>         return s[:-1]
>     return s
> 
> I.e. it removes a trailing \r\n, \r, or \n.
> 
> Any comments?  Is this needed given that we have s.rstrip() already?

We already have .splitlines() which does the above (remove
line breaks) not only for a single line, but for many lines at once.

Even better: .splitlines() also does the right thing for Unicode.
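(A quick illustration -- .splitlines() handles \r\n, \r and \n uniformly,
across any number of lines:)

```python
# One string containing all three line-ending conventions:
text = 'one\r\ntwo\rthree\n'
print(text.splitlines())  # ['one', 'two', 'three']
```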

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Thu Dec 28 19:06:33 2000
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Dec 2000 20:06:33 +0100
Subject: [Python-Dev] Fwd: try...else
References: <3A4B6FD3.9B576E9A@lemburg.com>, <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com> <20001228165932.8143.qmail@stimpy.scso.com>
Message-ID: <3A4B8F39.58C64EFB@lemburg.com>

Moshe Zadka wrote:
> 
> On Thu, 28 Dec 2000 17:52:36 +0100, "M.-A. Lemburg" <mal@lemburg.com> wrote:
> 
> [about try..finally not playing well with return]
> > Hmm, that must have changed between Python 1.5 and more recent
> > versions:
> 
> I posted a 1.5.2 test. So it changed between 1.5 and 1.5.2?

Sorry, false alarm: there was a bug in my patched 1.5 version.
The original 1.5 version does not show the described behaviour.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From thomas@xs4all.net  Thu Dec 28 20:21:15 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 28 Dec 2000 21:21:15 +0100
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <3A4B521D.4372224A@lemburg.com>; from mal@lemburg.com on Thu, Dec 28, 2000 at 03:45:49PM +0100
References: <3A4B3341.5010707@ActiveState.com> <3A4B521D.4372224A@lemburg.com>
Message-ID: <20001228212115.C1811@xs4all.nl>

On Thu, Dec 28, 2000 at 03:45:49PM +0100, M.-A. Lemburg wrote:

> > I had expected that in try: except: else
> > the else clause always got executed, but it seems not for return

> I think Robin mixed up try...finally with try...except...else.
> The finally clause is executed even in case an exception occurred.

(MAL and I already discussed this in private mail: Robin did mean
try/except/else, and 'finally' already executes when returning directly from
the 'try' block, even in Python 1.5)

> He does have a point however that 'return' will bypass 
> try...else and try...finally clauses. I don't think we can change
> that behaviour, though, as it would break code.

This code:

try:
   return
except:
   pass
else:
   print "returning"

will indeed not print 'returning', but I believe it's by design. I'm against
changing it, in any case, and not just because it'd break code :) If you
want something that always executes, use a 'finally'. Or don't return from
the 'try', but return in the 'else' clause. 

The 'except' clause is documented to execute if a matching exception occurs,
and 'else' if no exception occurs. Maybe the intent of the 'else' clause
would be clearer if it was documented to 'execute if the try: clause
finishes without an exception being raised' ? The 'else' clause isn't
executed when you 'break' or (after applying my continue-in-try patch ;)
'continue' out of the 'try', either.
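A small sketch makes the point (the names here are illustrative):

```python
log = []

def f():
    try:
        return 'from try'
    except ValueError:
        pass
    else:
        log.append('else')      # never reached: return leaves the try early
    finally:
        log.append('finally')   # always runs, even on return

result = f()
print(result, log)  # from try ['finally']
```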

Robin... Did I already reply to this, on python-list or to you directly ? I
distinctly remember writing that post, but I'm not sure if it arrived. Maybe
I didn't send it after all, or maybe something on mail.python.org is
detaining it ?

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Thu Dec 28 18:19:06 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 28 Dec 2000 19:19:06 +0100
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001228122048.A1381@thyrsus.com>; from esr@thyrsus.com on Thu, Dec 28, 2000 at 12:20:48PM -0500
References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> <20001228122048.A1381@thyrsus.com>
Message-ID: <20001228191906.F1281@xs4all.nl>

On Thu, Dec 28, 2000 at 12:20:48PM -0500, Eric S. Raymond wrote:

> My more general point is that right now Python has three classes of
> modules:

> 1. Documented as being in the core and built in by default.
> 2. Not documented as being in the core and not built in by default.
> 3. Documented as being in the core but not built in by default.

> My more general claim is that the existence of class 3 is a problem,
> because it compromises the "batteries are included" effect -- it means
> Python users don't have a bright-line test for what will be present in
> every Python (or at least every Python on an operating system
> theoretically feature-compatible with theirs).

It depends on your definition of 'being in the core'. Some of the things
that are 'in the core' are simply not possible on all platforms. So if you
want really portable code, you don't want to use them. Other features are
available on all systems that matter [to you], so you don't really care
about it, just use them, and at best document that they need feature X.

There is also the subtle difference between a Python user and a Python
compiler/assembler (excuse my overloading of the terms, but you know what I
mean). People who choose to compile their own Python should realize that
they might disable or misconfigure some parts of it. I personally trust most
people that assemble OS distributions to compile a proper Python binary +
modules, but I think a HOWTO isn't a bad idea -- unless we autoconf
everything.

> I think we ought to be migrating stuff out
> of class 3 into class 1 where possible and to class 2 only where
> unavoidable.

[ and ]

> I'm willing to put my effort where my mouth is on this.  I have a lot
> of experience with autoconf; I'm willing to write some of these nasty
> config tests.

[ and ]

> As I said to Guido, I think it is exactly our job to deal with this sort
> of grottiness.  One of Python's major selling points is supposed to be
> cross-platform consistency of the API.  If we fail to do what you're
> describing, we're failing to meet Python users' reasonable expectations
> for the language.

[ and ]

> Please note that I am specifically *not* advocating making the build defaults
> Linux-centric.  That's not my point at all.

I apologize for the tone of my previous post, and the above snippet. I'm not
trying to block progress here ;) I'm actually all for autodetecting as much
as possible, and more than willing to put effort into it as well (as long as
it's deemed useful, and isn't supplanted by a distutils variant weeks
later.) And I personally have my doubts about the distutils variant, too,
but that's partly because I have little experience with distutils. If we can
work out a deal where both autoconf and distutils are an option, I'm happy
to write a few, if not all, autoconf tests for the currently disabled
modules.

So, Eric, let's split the work. I'll do Tkinter if you do curses. :)

However, I'm also keeping those oddball platforms that just don't support
some features in mind. If you want truly portable code, you have to work at
it. I think it's perfectly okay to say "your Python needs to have the curses
module or the tkinter module compiled in -- contact your administrator if it
has neither". There will still be platforms that don't have curses, or
syslog, or crypt(), though hopefully none of them will be Linux.

Oh, and I also apologize for possibly duplicating what has already been said
by others. I haven't seen anything but this post (which was CC'd to me
directly) since I posted my reply to Eric, due to the ululating bouts of
delay on mail.python.org. Maybe DC should hire some *real* sysadmins,
instead of those silly programmer-kniggits ? >:->

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mwh21@cam.ac.uk  Thu Dec 28 18:27:48 2000
From: mwh21@cam.ac.uk (Michael Hudson)
Date: Thu, 28 Dec 2000 18:27:48 +0000 (GMT)
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <3A4B521D.4372224A@lemburg.com>
Message-ID: <Pine.SOL.4.21.0012281820240.3801-100000@yellow.csi.cam.ac.uk>

On Thu, 28 Dec 2000, M.-A. Lemburg wrote:

> I think Robin mixed up try...finally with try...except...else.

I think so too.

> The finally clause is executed even in case an exception occurred.
> 
> He does have a point however that 'return' will bypass 
> try...else and try...finally clauses. I don't think we can change
> that behaviour, though, as it would break code.

return does not skip finally clauses[1].  In my not especially humble
opinion, the current behaviour is the Right Thing.  I'd have to think for
a moment before saying what Robin's example would print, but I think the
alternative would disturb me far more.

Cheers,
M.

[1] In fact the flow of control on return is very similar to that of an
    exception - ooh, look at that implementation...



From esr@thyrsus.com  Thu Dec 28 19:17:51 2000
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 14:17:51 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001228191906.F1281@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 07:19:06PM +0100
References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> <20001228122048.A1381@thyrsus.com> <20001228191906.F1281@xs4all.nl>
Message-ID: <20001228141751.B2528@thyrsus.com>

Thomas Wouters <thomas@xs4all.net>:
> > My more general claim is that the existence of class 3 is a problem,
> > because it compromises the "batteries are included" effect -- it means
> > Python users don't have a bright-line test for what will be present in
> > every Python (or at least every Python on an operating system
> > theoretically feature-compatible with theirs).
> 
> It depends on your definition of 'being in the core'. Some of the things
> that are 'in the core' are simply not possible on all platforms. So if you
> want really portable code, you don't want to use them. Other features are
> available on all systems that matter [to you], so you don't really care
> about it, just use them, and at best document that they need feature X.

I understand.  We can't, for example, guarantee to duplicate the Windows-
specific stuff in the Unix port (nor would we want to in most cases :-)).
However, I think "we build in curses/Tkinter everywhere the corresponding
libraries exist" is a guarantee we can and should make.  Similarly for
other modules presently in class 3.
 
> There is also the subtle difference between a Python user and a Python
> compiler/assembler (excuse my overloading of the terms, but you know what I
> mean).

Yes.  We have three categories here:

1. People who use python for applications (what I've been calling users)
2. People who configure Python binary packages for distribution (what
   you call a "compiler/assembler" and I think of as a "builder").
3. People who hack Python itself.

Problem is that "developer" is very ambiguous in this context...

>           People who choose to compile their own Python should realize that
> they might disable or misconfigure some parts of it. I personally trust most
> people that assemble OS distributions to compile a proper Python binary +
> modules, but I think a HOWTO isn't a bad idea -- unless we autoconf
> everything.

I'd like to see both things happen (HOWTO and autoconfing) and am willing to
work on both.

> I apologize for the tone of my previous post, and the above snippet.

No offense taken at all, I assure you.

>                                                                    I'm not
> trying to block progress here ;) I'm actually all for autodetecting as much
> as possible, and more than willing to put effort into it as well (as long as
> it's deemed useful, and isn't supplanted by a distutils variant weeks
> later.) And I personally have my doubts about the distutils variant, too,
> but that's partly because I have little experience with distutils. If we can
> work out a deal where both autoconf and distutils are an option, I'm happy
> to write a few, if not all, autoconf tests for the currently disabled
> modules.

I admit I'm not very clear on the scope of what distutils is supposed to
handle, and how.  Perhaps amk can enlighten us?

> So, Eric, let's split the work. I'll do Tkinter if you do curses. :)

You've got a deal.  I'll start looking at the autoconf code.  I've already
got a fair idea how to do this.
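Something along these lines, probably (a sketch from memory; the library
names to probe and the exact macro arguments would need verifying against
real systems):

```
dnl Hypothetical configure fragment: find which curses library exists.
dnl AC_CHECK_LIB(library, function, action-if-found, action-if-not-found)
AC_CHECK_LIB(ncurses, initscr, [CURSES_LIBS="-lncurses"],
  [AC_CHECK_LIB(curses, initscr, [CURSES_LIBS="-lcurses -ltermcap"])])
AC_SUBST(CURSES_LIBS)
```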
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

No one who's seen it in action can say the phrase "government help" without
either laughing or crying.


From tim.one@home.com  Fri Dec 29 02:59:53 2000
From: tim.one@home.com (Tim Peters)
Date: Thu, 28 Dec 2000 21:59:53 -0500
Subject: [Python-Dev] scp to sourceforge
In-Reply-To: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEMKIFAA.tim.one@home.com>

[Guido]
> I've seen a thread on this but there was no conclusive answer, so I'm
> reopening this.

It hasn't budged an inch since then:  your "Entering interactive session"
problem is the same one everyone has; it gets reported on SF's bug and/or
support managers at least daily; SF has not fixed it yet; these days they
don't even respond to scp bug reports anymore; the cause appears to be SF's
custom sfshell, and only SF can change that; the only known workarounds are
to (a) modify files on SF directly (they suggest vi <wink>), or (b) initiate
scp *from* SF, using your local machine as a server (if you can do that -- I
cannot, or at least haven't succeeded).



From martin@loewis.home.cs.tu-berlin.de  Thu Dec 28 22:52:02 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 28 Dec 2000 23:52:02 +0100
Subject: [Python-Dev] curses in the core?
Message-ID: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>

> If curses is a core facility now, the default build should tread it
> as one.
...
> IMO ssl isn't an issue because it's not documented as being in the
> standard module set.
...
> 3. Documented as being in the core but not built in by default.
> My more general claim is that the existence of class 3 is a problem

In the case of curses, I believe there is a documentation error in the
2.0 documentation. The curses packages is listed under "Generic
Operating System Services". I believe this is wrong, it should be listed
as "Unix Specific Services".

Unless I'm mistaken, the curses module is not available on the Mac and
on Windows. With that change, the curses module would then fall into
Eric's category 2 (Not documented as being in the core and not built
in by default).
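(Applications that want to run everywhere can of course guard the import;
a minimal sketch of the usual idiom:)

```python
# Treat curses as optional, falling back gracefully where it is
# absent (e.g. on Windows or the Mac, as discussed above).
try:
    import curses
    HAVE_CURSES = True
except ImportError:
    curses = None
    HAVE_CURSES = False

print(HAVE_CURSES)
```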

That documentation change should be carried out even if curses is
autoconfigured; after all, autoconf is used on Unix only.

Regards,
Martin

P.S. The "Python Library Reference" content page does not mention the
word "core" at all, except as part of asyncore...


From thomas@xs4all.net  Thu Dec 28 22:58:25 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 28 Dec 2000 23:58:25 +0100
Subject: [Python-Dev] scp to sourceforge
In-Reply-To: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 12:08:18PM -0500
References: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>
Message-ID: <20001228235824.E1811@xs4all.nl>

On Thu, Dec 28, 2000 at 12:08:18PM -0500, Guido van Rossum wrote:

> I've seen a thread on this but there was no conclusive answer, so I'm
> reopening this.

Actually there was: it's all SourceForge's fault. (At least that's my
professional opinion ;) They honestly have a strange setup, though how
strange and to what end I cannot tell.

> Would somebody please figure out a way to update the PEPs?  It's kind
> of pathetic to see the website not have the latest versions...

The way to update the peps is by ssh'ing into shell.sourceforge.net, and
then scp'ing the files from your work repository to the htdocs/peps
directory. That is, until SF fixes the scp problem. This method works (I
just updated all PEPs to up-to-date CVS versions) but it's a bit cumbersome.
And it only works if you have ssh access to your work environment. And it's
damned hard to script; I tried playing with a single ssh command that did
all the work, but between shell weirdness, scp weirdness and a genuine bash
bug I couldn't figure it out.

I assume that SF is aware of the severity of this problem, and is working on
something akin to a fix or workaround. Until then, I can do an occasional
update of the PEPs, for those that can't themselves.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Thu Dec 28 23:05:28 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 29 Dec 2000 00:05:28 +0100
Subject: [Python-Dev] scp to sourceforge
In-Reply-To: <20001228235824.E1811@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 11:58:25PM +0100
References: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> <20001228235824.E1811@xs4all.nl>
Message-ID: <20001229000528.F1811@xs4all.nl>

On Thu, Dec 28, 2000 at 11:58:25PM +0100, Thomas Wouters wrote:
> On Thu, Dec 28, 2000 at 12:08:18PM -0500, Guido van Rossum wrote:

> > Would somebody please figure out a way to update the PEPs?  It's kind
> > of pathetic to see the website not have the latest versions...
> 
> The way to update the peps is by ssh'ing into shell.sourceforge.net, and
> then scp'ing the files from your work repository to the htdocs/peps

[ blah blah ]

And then they fixed it ! At least, for me, direct scp now works fine. (I
should've tested that before posting my blah blah, sorry.) Anybody else,
like people using F-secure ssh (unix or windows) experience the same ?

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From MarkH@ActiveState.com  Thu Dec 28 23:15:01 2000
From: MarkH@ActiveState.com (Mark Hammond)
Date: Fri, 29 Dec 2000 10:15:01 +1100
Subject: [Python-Dev] chomp()?
In-Reply-To: <14923.31238.65155.496546@buffalo.fnal.gov>
Message-ID: <LCEPIIGDJPKCOIHOBJEPIEILCNAA.MarkH@ActiveState.com>

> -1 from me.  P=NP (Python is not Perl).  "Chomp" is an
> excessively cute name.
> And like you said, this is too much like "rstrip" to merit a separate
> method.

My thoughts exactly.  I can't remember _ever_ wanting to chomp() when
rstrip() wasn't perfectly suitable.  I'm sure it happens, but not often
enough to introduce an ambiguous new function purely for "feature parity"
with Perl.

Mark.



From esr@thyrsus.com  Thu Dec 28 23:25:28 2000
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 18:25:28 -0500
Subject: [Python-Dev] Re: curses in the core?
In-Reply-To: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Thu, Dec 28, 2000 at 11:52:02PM +0100
References: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>
Message-ID: <20001228182528.A10743@thyrsus.com>

Martin v. Loewis <martin@loewis.home.cs.tu-berlin.de>:
> In the case of curses, I believe there is a documentation error in the
> 2.0 documentation. The curses packages is listed under "Generic
> Operating System Services". I believe this is wrong, it should be listed
> as "Unix Specific Services".

I agree that this is an error and should be fixed.
 
> Unless I'm mistaken, the curses module is not available on the Mac and
> on Windows. With that change, the curses module would then fall into
> Eric's category 2 (Not documented as being in the core and not built
> in by default).

Well...that's a definitional question that is part of the larger issue here.

What does being in the Python core mean?  There are two potential definitions:

1. Documentation says it's available on all platforms.

2. Documentation restricts it to one of the three platform groups 
   (Unix/Windows/Mac) but implies that it will be available on any
   OS in that group.  

I think the second one is closer to what application programmers expect
when they think about which batteries are included.  But I could be
persuaded otherwise by a good argument.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The difference between death and taxes is death doesn't get worse
every time Congress meets
	-- Will Rogers


From akuchlin@mems-exchange.org  Fri Dec 29 00:33:36 2000
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Thu, 28 Dec 2000 19:33:36 -0500
Subject: [Python-Dev] Bookstore completed
Message-ID: <200012290033.TAA01295@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com>

OK, I think I'm ready to declare the Python bookstore complete enough
to go public.  Before I set up redirects from www.python.org, please
take another look.  (More book descriptions would be helpful...)

http://www.amk.ca/bookstore/

--amk




From akuchlin@mems-exchange.org  Fri Dec 29 00:46:16 2000
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Thu, 28 Dec 2000 19:46:16 -0500
Subject: [Python-Dev] Help wanted with setup.py script
Message-ID: <200012290046.TAA01346@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com>

Want to help with the laudable goal of automating the Python build
process?  It'll need lots of testing on many different platforms, and
I'd like to start the process now.

First, download the setup.py script from  
       http://www.amk.ca/files/python/setup.py

Next, drop it in the root directory of your Python source tree and run
"python setup.py build".  

If it dies with an exception, let me know.  (Replies to this list are
OK.)

If it runs to completion, look in the Modules/build/lib.<something>
directory to see which modules got built.  (On my system, <something>
is "linux-i686-2.0", but of course this will depend on your platform.)

Is anything missing that should have been built?  (_tkinter.so is the
prime candidate; the autodetection code is far too simple at the
moment and assumes one particular version of Tcl and Tk.)  Did an
attempt at building a module fail?  These indicate problems
autodetecting something, so if you can figure out how to find the
required library or include file, let me know what to do.

--amk


From fdrake@acm.org  Fri Dec 29 04:12:18 2000
From: fdrake@acm.org (Fred L. Drake)
Date: Thu, 28 Dec 2000 23:12:18 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <20001228212115.C1811@xs4all.nl>
Message-ID: <web-404134@digicool.com>

On Thu, 28 Dec 2000 21:21:15 +0100,
 Thomas Wouters <thomas@xs4all.net> wrote:
 > The 'except' clause is documented to execute if a
 > matching exception occurs,
 > and 'else' if no exception occurs. Maybe the intent of
 > the 'else' clause

  This can certainly be clarified in the documentation --
please file a bug report at http://sourceforge.net/projects/python/.
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From tim.one@home.com  Fri Dec 29 04:25:44 2000
From: tim.one@home.com (Tim Peters)
Date: Thu, 28 Dec 2000 23:25:44 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <20001228212115.C1811@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEMMIFAA.tim.one@home.com>

[Fred, suggested doc change near the end]

[Thomas Wouters]
> (MAL and I already discussed this in private mail: Robin did mean
> try/except/else, and 'finally' already executes when returning
> directly from the 'try' block, even in Python 1.5)
>
> This code:
>
> try:
>    return
> except:
>    pass
> else:
>    print "returning"
>
> will indeed not print 'returning', but I believe it's by design.
> I'm against changing it, in any case, and not just because it'd
> break code :) If you want something that always executes, use a
> 'finally'. Or don't return from the 'try', but return in the
> 'else' clause.

Guido's out of town again, so I'll channel him:  Thomas is correct on all
counts.  In try/else, the "else" clause should execute if and only if
control "falls off the end" of the "try" block.

IOW, consider:

    try:
        arbitrary stuff
    x = 1

An "else" clause added to that "try" should execute when and only when the
code as written executes the "x = 1" after the block.  When "arbitrary
stuff" == "return", control does not fall off the end, so "else" shouldn't
trigger.  Same thing if "arbitrary stuff" == "break" and we're inside a
loop, or "continue" after Thomas's patch gets accepted.
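The rule Tim states can be checked directly; here is a minimal runnable sketch (not from the original thread) of the cases he lists:

```python
# Hedged illustration of the semantics described above: 'else' fires
# only when control "falls off the end" of the 'try' block.

def returns_from_try():
    try:
        return "from try"       # control leaves via 'return'...
    except ValueError:
        pass
    else:
        return "from else"      # ...so 'else' never runs

def falls_off_the_end():
    result = []
    try:
        result.append("try")    # no exception, no return/break/continue
    except ValueError:
        pass
    else:
        result.append("else")   # 'else' runs: control fell off the end
    return result

hits = []
for i in [0]:
    try:
        break                   # leaving the loop via 'break'...
    except ValueError:
        pass
    else:
        hits.append("else")     # ...skips 'else' too

assert returns_from_try() == "from try"
assert falls_off_the_end() == ["try", "else"]
assert hits == []
```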

> The 'except' clause is documented to execute if a matching
> exception occurs, and 'else' if no exception occurs.

Yup, and that's imprecise:  the same words are used to describe (part of)
when 'finally' executes, but they weren't intended to be the same.

> Maybe the intent of the 'else' clause would be clearer if it
> was documented to 'execute if the try: clause finishes without
> an exception being raised' ?

Sorry, I don't find that any clearer.  Let's be explicit:

    The optional 'else' clause is executed when the 'try' clause
    terminates by any means other than an exception or executing a
    'return', 'continue' or 'break' statement.  Exceptions in the
    'else' clause are not handled by the preceding 'except' clauses.

> The 'else' clause isn't executed when you 'break' or (after
> applying my continue-in-try patch ;) 'continue' out of the
> 'try', either.

Hey, now you're channeling me <wink>!  Be afraid -- be very afraid.



From moshez@zadka.site.co.il  Fri Dec 29 14:42:44 2000
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Fri, 29 Dec 2000 16:42:44 +0200 (IST)
Subject: [Python-Dev] chomp()?
In-Reply-To: <3A4B7BB1.F09660ED@lemburg.com>
References: <3A4B7BB1.F09660ED@lemburg.com>, <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <20001229144244.D5AD0A84F@darjeeling.zadka.site.co.il>

On Thu, 28 Dec 2000, "M.-A. Lemburg" <mal@lemburg.com> wrote:

[about chomp]
> We already have .splitlines() which does the above (remove
> line breaks) not only for a single line, but for many lines at once.
> 
> Even better: .splitlines() also does the right thing for Unicode.

OK, I retract my earlier +1, and instead I move that this be added
to the FAQ. Where is the FAQ maintained nowadays? The grail link
doesn't work anymore.

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From loewis@informatik.hu-berlin.de  Fri Dec 29 16:52:13 2000
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Fri, 29 Dec 2000 17:52:13 +0100 (MET)
Subject: [Python-Dev] Re: [Patch #103002] Fix for #116285: Properly raise UnicodeErrors
Message-ID: <200012291652.RAA20251@pandora.informatik.hu-berlin.de>

[resent since python.org ran out of disk space]
> My only problem with it is your copyright notice. AFAIK, patches to
> the Python core cannot contain copyright notices without proper
> license information. OTOH, I don't think that these minor changes
> really warrant adding a complete license paragraph.

I'd like to get an "official" clarification on this question. Is it
the case that patches containing copyright notices are only accepted
if they are accompanied with license information?

I agree that the changes are minor, I also believe that I hold the
copyright to the changes whether I attach a notice or not (at least
according to our local copyright law).

What concerns me is that without such a notice, gencodec.py looks as if
CNRI holds the copyright to it. I'm not willing to assign the
copyright of my changes to CNRI, and I'd like to avoid the impression
of doing so.

What is even more concerning is that CNRI also holds the copyright to
the generated files, even though they are derived from information
made available by the Unicode consortium!

Regards,
Martin


From tim.one@home.com  Fri Dec 29 19:56:36 2000
From: tim.one@home.com (Tim Peters)
Date: Fri, 29 Dec 2000 14:56:36 -0500
Subject: [Python-Dev] scp to sourceforge
In-Reply-To: <20001229000528.F1811@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEOBIFAA.tim.one@home.com>

[Thomas Wouters]
> And then they fixed it ! At least, for me, direct scp now works
> fine. (I should've tested that before posting my blah blah, sorry.)

I tried it immediately before posting my blah-blah yesterday, and it was
still hanging.

> Anybody else, like people using F-secure ssh (unix or windows)
> experience the same ?

Same here:  I tried it again just now (under Win98 cmdline ssh/scp) and it
worked fine!  We're in business again.  Thanks for fixing it, Thomas <wink>.

now-if-only-we-could-get-python-dev-email-on-an-approximation-to-the-
    same-day-it's-sent-ly y'rs  - tim



From tim.one@home.com  Fri Dec 29 20:27:40 2000
From: tim.one@home.com (Tim Peters)
Date: Fri, 29 Dec 2000 15:27:40 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <EC$An3AHZGT6EwJP@jessikat.fsnet.co.uk>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEODIFAA.tim.one@home.com>

[Robin Becker]
> The 2.0 docs clearly state 'The optional else clause is executed when no
> exception occurs in the try clause.' This makes it sound as though it
> gets executed on the 'way out'.

Of course.  That's not what the docs meant, though, and Guido is not going
to change the implementation now because that would break code that relies
on how Python has always *worked* in these cases.  The way Python works is
also the way Guido intended it to work (I'm allowed to channel him when he's
on vacation <0.9 wink>).

Indeed, that's why I suggested a specific doc change.  If your friend would
also be confused by that, then we still have a problem; else we don't.



From tim.one@home.com  Fri Dec 29 20:37:08 2000
From: tim.one@home.com (Tim Peters)
Date: Fri, 29 Dec 2000 15:37:08 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <web-404134@digicool.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEOFIFAA.tim.one@home.com>

[Fred]
>   This can certainly be clarified in the documentation --
> please file a bug report at http://sourceforge.net/projects/python/.

Here you go:

https://sourceforge.net/bugs/?func=detailbug&bug_id=127098&group_id=5470



From thomas@xs4all.net  Fri Dec 29 20:59:16 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 29 Dec 2000 21:59:16 +0100
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEODIFAA.tim.one@home.com>; from tim.one@home.com on Fri, Dec 29, 2000 at 03:27:40PM -0500
References: <EC$An3AHZGT6EwJP@jessikat.fsnet.co.uk> <LNBBLJKPBEHFEDALKOLCKEODIFAA.tim.one@home.com>
Message-ID: <20001229215915.L1281@xs4all.nl>

On Fri, Dec 29, 2000 at 03:27:40PM -0500, Tim Peters wrote:

> Indeed, that's why I suggested a specific doc change.  If your friend would
> also be confused by that, then we still have a problem; else we don't.

Note that I already uploaded a patch to fix the docs, assigned to fdrake,
using Tim's wording exactly. (patch #103045)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From moshez@zadka.site.co.il  Sun Dec 31 00:33:30 2000
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Sun, 31 Dec 2000 02:33:30 +0200 (IST)
Subject: [Python-Dev] FAQ Horribly Out Of Date
Message-ID: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>

Hi!
The current FAQ is horribly out of date. I think the FAQ-Wizard method
has proven itself not very efficient (for example, apparently no one
noticed until now that it's not working <0.2 wink>). Is there any 
hope of putting the FAQ in Misc/, having a script which scp's it
to the SF page, and making that the official FAQ?

On a related note, what is the current status of the PSA? Is it officially
dead?
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From tim.one@home.com  Sat Dec 30 20:48:08 2000
From: tim.one@home.com (Tim Peters)
Date: Sat, 30 Dec 2000 15:48:08 -0500
Subject: [Python-Dev] Most everything is busted
Message-ID: <LNBBLJKPBEHFEDALKOLCCEOMIFAA.tim.one@home.com>

Add this error to the pot:

"""
http://www.python.org/cgi-bin/moinmoin

Proxy Error
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /cgi-bin/moinmoin.

Reason: Document contains no data
-------------------------------------------------------------------

Apache/1.3.9 Server at www.python.org Port 80
"""

Also, as far as I can tell:

+ news->mail for c.l.py hasn't delivered anything for well over 24 hours.

+ No mail to Python-Dev has showed up in the archives (let alone been
delivered) since Fri, 29 Dec 2000 16:42:44 +0200 (IST).

+ The other Python mailing lists appear equally dead.

time-for-a-new-year!-ly y'rs  - tim



From barry@wooz.org  Sun Dec 31 01:06:23 2000
From: barry@wooz.org (Barry A. Warsaw)
Date: Sat, 30 Dec 2000 20:06:23 -0500
Subject: [Python-Dev] Re: Most everything is busted
References: <LNBBLJKPBEHFEDALKOLCCEOMIFAA.tim.one@home.com>
Message-ID: <14926.34447.60988.553140@anthem.concentric.net>

>>>>> "TP" == Tim Peters <tim.one@home.com> writes:

    TP> + news->mail for c.l.py hasn't delivered anything for well
    TP> over 24 hours.

    TP> + No mail to Python-Dev has showed up in the archives (let
    TP> alone been delivered) since Fri, 29 Dec 2000 16:42:44 +0200
    TP> (IST).

    TP> + The other Python mailing lists appear equally dead.

There's a stupid, stupid bug in Mailman 2.0, which I've just fixed;
that has (hopefully) unjammed things on the Mailman end[1].  We're still
probably subject to the Postfix delays unfortunately; I think those
are DNS related, and I've gotten a few other reports of DNS oddities,
which I've forwarded off to the DC sysadmins.  I don't think that
particular problem will be fixed until after the New Year.

relax-and-enjoy-the-quiet-ly y'rs,
-Barry

[1] For those who care: there's a resource throttle in qrunner which
limits the number of files any single qrunner process will handle.
qrunner does a listdir() on the qfiles directory and ignores any .msg
file it finds (it only does the bulk of the processing on the
corresponding .db files).  But it performs the throttle check on every
file in listdir() so depending on the order that listdir() returns and
the number of files in the qfiles directory, the throttle check might
get triggered before any .db file is seen.  Wedge city.  This is
serious enough to warrant a Mailman 2.0.1 release, probably mid-next
week.
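The throttle bug Barry describes can be sketched in a few lines (purely illustrative; the constant name and queue layout here are made up, not Mailman's actual code):

```python
MAX_FILES = 3  # resource throttle: max entries one qrunner pass handles

def buggy_qrunner(listing):
    """Process the .db entries in a qfiles listing, with the buggy throttle."""
    processed = []
    for count, name in enumerate(listing, 1):
        # BUG: the throttle counts every directory entry, .msg files
        # included, so it can trip before a single .db file is seen.
        if count > MAX_FILES:
            break
        if name.endswith('.db'):
            processed.append(name)
    return processed

# When the .msg files sort ahead of the .db file, nothing is processed:
queue = ['a.msg', 'b.msg', 'c.msg', 'd.msg', 'e.db']
assert buggy_qrunner(queue) == []                      # wedge city
assert buggy_qrunner(list(reversed(queue))) == ['e.db']
```

The fix, as the footnote implies, is to count only the files that actually get processed.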



From gstein@lyra.org  Sun Dec 31 10:19:50 2000
From: gstein@lyra.org (Greg Stein)
Date: Sun, 31 Dec 2000 02:19:50 -0800
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Sun, Dec 31, 2000 at 02:33:30AM +0200
References: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
Message-ID: <20001231021950.M28628@lyra.org>

On Sun, Dec 31, 2000 at 02:33:30AM +0200, Moshe Zadka wrote:
>...
> On a related note, what is the current status of the PSA? Is it officially
> dead?

The PSA was always kind of a (legal) fiction with the basic intent to help
provide some funding for Python development. Since that isn't occurring at
CNRI any more, the PSA is a bit moot. There was always some idea that maybe
the PSA would be the "sponsor" (and possibly the beneficiary) of the
conferences. That wasn't ever really formalized either.

From the Consortium meeting back in July, when we spoke with Bob Kahn and Al
Vezza, I recall that they agreed the PSA was moot now.

So while I can't say it is "officially dead", it is fair to say that it is
dead for all intents and purposes. There is little motivation or purpose for
it at this point in time.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From akuchlin@mems-exchange.org  Sun Dec 31 15:58:12 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Sun, 31 Dec 2000 10:58:12 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Sun, Dec 31, 2000 at 02:33:30AM +0200
References: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
Message-ID: <20001231105812.A12168@newcnri.cnri.reston.va.us>

On Sun, Dec 31, 2000 at 02:33:30AM +0200, Moshe Zadka wrote:
>The current FAQ is horribly out of date. I think the FAQ-Wizard method
>has proven itself not very efficient (for example, apparently no one
>noticed until now that it's not working <0.2 wink>). Is there any 

It also leads to one section of the FAQ (#3, I think) having something
like 60 questions jumbled together.  IMHO the FAQ should be a text
file, perhaps in the PEP format so it can be converted to HTML, and it
should have an editor who'll arrange it into smaller sections.  Any
volunteers?  (Must ... resist ...  urge to volunteer myself...  help
me, Spock...)

--amk




From skip@mojam.com (Skip Montanaro)  Sun Dec 31 19:25:18 2000
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Sun, 31 Dec 2000 13:25:18 -0600 (CST)
Subject: [Python-Dev] plz test bsddb using shared linkage
Message-ID: <14927.34846.153117.764547@beluga.mojam.com>

A bug was filed on SF contending that the default linkage for bsddb should
be shared instead of static because some Linux systems ship multiple
versions of libdb.

Would those of you who can and do build bsddb (probably only unixoids of
some variety) please give this simple test a try?  Uncomment the *shared*
line in Modules/Setup.config.in, re-run configure, build Python and then
try:

    import bsddb
    db = bsddb.btopen("/tmp/dbtest.db", "c")
    db["1"] = "1"
    print db["1"]
    db.close()
    del db

If this doesn't fail for anyone I'll check the change in and close the bug
report, otherwise I'll add a(nother) comment to the bug report that *shared*
breaks bsddb for others and close the bug report.

Thx,

Skip



From skip@mojam.com (Skip Montanaro)  Sun Dec 31 19:26:16 2000
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Sun, 31 Dec 2000 13:26:16 -0600 (CST)
Subject: [Python-Dev] plz test bsddb using shared linkage
Message-ID: <14927.34904.20832.319647@beluga.mojam.com>

oops, forgot the bug report is at

  http://sourceforge.net/bugs/?func=detailbug&bug_id=126564&group_id=5470

for those of you who do not monitor python-bugs-list.

S


From tim.one@home.com  Sun Dec 31 20:28:47 2000
From: tim.one@home.com (Tim Peters)
Date: Sun, 31 Dec 2000 15:28:47 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEBBIGAA.tim.one@home.com>

[Moshe Zadka]
> The current FAQ is horribly out of date.

The password is Spam.  Fix it <wink>.

> I think the FAQ-Wizard method has proven itself not very
> efficient (for example, apparently no one noticed until now
> that it's not working <0.2 wink>).

I'm afraid almost nothing on python.org with an active component works today
(not searches, not the FAQ Wizard, not the 2.0 Wiki, ...).  If history is
any clue, these will remain broken until Guido gets back from vacation.

> Is there any hope putting the FAQ in Misc/, having a script
> which scp's it to the SF page, and making that the official FAQ?

Would be OK by me.  I'm more concerned that the whole of python.org has
barely been updated since March; huge chunks of the FAQ are still relevant,
but, e.g., the Job Board hasn't been touched in over 3 months; the News got
so out of date Guido deleted the whole section; etc.

> On a related note, what is the current status of the PSA? Is it
> officially dead?

It appears that CNRI can only think about one thing at a time <0.5 wink>.
For the last 6 months, that thing has been the license.  If they ever
resolve the GPL compatibility issue, maybe they can be persuaded to think
about the PSA.  In the meantime, I'd suggest you not renew <ahem>.



From tim.one@home.com  Sun Dec 31 22:12:43 2000
From: tim.one@home.com (Tim Peters)
Date: Sun, 31 Dec 2000 17:12:43 -0500
Subject: [Python-Dev] plz test bsddb using shared linkage
In-Reply-To: <14927.34846.153117.764547@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGECEIGAA.tim.one@home.com>

[Skip Montanaro]
> ...
> Would those of you who can and do build bsddb (probably only
> unixoids of some variety) please give this simple test a try?

Just noting that bsddb already ships with the Windows installer as a
(shared) DLL.  But it's an old (1.85?) Windows port from Sam Rushing.



From gward at mems-exchange.org  Fri Dec  1 00:14:39 2000
From: gward at mems-exchange.org (Greg Ward)
Date: Thu, 30 Nov 2000 18:14:39 -0500
Subject: [Python-Dev] PEP 229 and 222
In-Reply-To: <20001128215748.A22105@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Tue, Nov 28, 2000 at 09:57:48PM -0500
References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us>
Message-ID: <20001130181438.A21596@ludwig.cnri.reston.va.us>

On 28 November 2000, Andrew Kuchling said:
> On Tue, Nov 28, 2000 at 06:01:38PM -0500, Guido van Rossum wrote:
> >- Always shared libs.  What about Unixish systems that don't have
> >  shared libs?  What if you just want something to be hardcoded as
> >  statically linked, e.g. for security reasons?  (On the other hand
> 
> Beats me.  I'm not even sure if the Distutils offers a way to compile
> a static Python binary.  (GPW: well, does it?)

It's in the CCompiler interface, but hasn't been exposed to the outside
world.  (IOW, it's mainly a question of designing the right setup
script/command line interface: the implementation should be fairly
straightforward, assuming the existing CCompiler classes do the right
thing for generating binary executables.)

        Greg



From gward at mems-exchange.org  Fri Dec  1 00:19:38 2000
From: gward at mems-exchange.org (Greg Ward)
Date: Thu, 30 Nov 2000 18:19:38 -0500
Subject: [Python-Dev] A house upon the sand
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEMDIBAA.tim.one@home.com>; from tim.one@home.com on Wed, Nov 29, 2000 at 01:23:10AM -0500
References: <200011281510.KAA03475@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEMDIBAA.tim.one@home.com>
Message-ID: <20001130181937.B21596@ludwig.cnri.reston.va.us>

On 29 November 2000, Tim Peters said:
> [Guido]
> > ...
> > Because of its importance, the deprecation time of the string module
> > will be longer than that of most deprecated modules.  I expect it
> > won't be removed until Python 3000.
> 
> I see nothing in the 2.0 docs, code, or "what's new" web pages saying that
> it's deprecated.  So I don't think you can even start the clock on this one
> before 2.1 (a fuzzy stmt on the web page for the unused 1.6 release doesn't
> count ...).

FWIW, I would argue against *ever* removing (much less "deprecating",
ie. threatening to remove) the string module.  To a rough approximation,
every piece of Python code in existence prior to Python 1.6 depends
on the string module.  I for one do not want to have to change all
occurrences of string.foo(x) to x.foo() -- it just doesn't buy enough to
make it worth changing all that code.  

Not only does the amount of code to change mean the change would be
non-trivial, it's not always the right thing, especially if you happen
to be one of the people who dislikes the "delim.join(list)" idiom.  (I'm
still undecided.)

        Greg




From thomas.heller at ion-tof.com  Fri Dec  1 09:10:21 2000
From: thomas.heller at ion-tof.com (Thomas Heller)
Date: Fri, 1 Dec 2000 09:10:21 +0100
Subject: [Python-Dev] PEP 229 and 222
References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us> <20001130181438.A21596@ludwig.cnri.reston.va.us>
Message-ID: <014301c05b6e$269716a0$e000a8c0@thomasnotebook>

> > Beats me.  I'm not even sure if the Distutils offers a way to compile
> > a static Python binary.  (GPW: well, does it?)
> 
> It's in the CCompiler interface, but hasn't been exposed to the outside
> world.  (IOW, it's mainly a question of designing the right setup
> script/command line interface: the implementation should be fairly
> straightforward, assuming the existing CCompiler classes do the right
> thing for generating binary executables.)
Distutils currently only supports build_*** commands for
C-libraries and Python extensions.

Shouldn't there also be build commands for shared libraries,
executable programs and static Python binaries?

Thomas

BTW: Distutils-sig seems pretty dead these days...





From ping at lfw.org  Fri Dec  1 11:23:56 2000
From: ping at lfw.org (Ka-Ping Yee)
Date: Fri, 1 Dec 2000 02:23:56 -0800 (PST)
Subject: [Python-Dev] Cryptic error messages
Message-ID: <Pine.LNX.4.10.10011181405020.504-100000@skuld.kingmanhall.org>

An attempt to use sockets for the first time yesterday left a
friend of mine bewildered:

    >>> import socket
    >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    >>> s.connect('localhost:234')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: 2-sequence, 13-sequence
    >>> 

"What the heck does '2-sequence, 13-sequence' mean?" he rightfully asked.


I see in getargs.c (line 275) that this type of message is documented:

    /* Convert a tuple argument.
    [...]
       If the argument is invalid:
    [...]
          *msgbuf contains an error message, whose format is:
             "<typename1>, <typename2>", where:
                <typename1> is the name of the expected type, and
                <typename2> is the name of the actual type,
             (so you can surround it by "expected ... found"),
          and msgbuf is returned.
    */

It's clear that the socketmodule is not prepending "expected" and
appending "found", as the author of converttuple intended.

But when i grepped through the source code, i couldn't find anyone
applying this "expected %s found" % msgbuf convention outside of
getargs.c.  Is it really in use?

Could we just change getargs.c so that converttuple() returns a
message like "expected ..., got ..." instead of seterror()?

Additionally it would be nice to say '13-character string' instead
of '13-sequence'...
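The message shape being proposed is trivial to sketch (purely illustrative; the real change would live in C, in getargs.c's converttuple()/seterror()):

```python
# Illustrative only: the proposed "expected ..., got ..." wording,
# combined with the more descriptive type name suggested above.
def format_type_error(expected, got):
    return "expected %s, got %s" % (expected, got)

msg = format_type_error("2-sequence", "13-character string")
assert msg == "expected 2-sequence, got 13-character string"
```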


-- ?!ng

"All models are wrong; some models are useful."
    -- George Box




From mwh21 at cam.ac.uk  Fri Dec  1 12:20:23 2000
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 01 Dec 2000 11:20:23 +0000
Subject: [Python-Dev] Cryptic error messages
In-Reply-To: Ka-Ping Yee's message of "Fri, 1 Dec 2000 02:23:56 -0800 (PST)"
References: <Pine.LNX.4.10.10011181405020.504-100000@skuld.kingmanhall.org>
Message-ID: <m37l5k5qx4.fsf@atrus.jesus.cam.ac.uk>

Ka-Ping Yee <ping at lfw.org> writes:

> An attempt to use sockets for the first time yesterday left a
> friend of mine bewildered:
> 
>     >>> import socket
>     >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>     >>> s.connect('localhost:234')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     TypeError: 2-sequence, 13-sequence
>     >>> 
> 
> "What the heck does '2-sequence, 13-sequence' mean?" he rightfully asked.
> 

I'm not sure about the general case, but in this case you could do
something like:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102599&group_id=5470

Now you get an error message like:

TypeError: getsockaddrarg: AF_INET address must be tuple, not string

Cheers,
M.

-- 
  I have gathered a posie of other men's flowers, and nothing but the
  thread that binds them is my own.                       -- Montaigne




From guido at python.org  Fri Dec  1 14:02:02 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 01 Dec 2000 08:02:02 -0500
Subject: [Python-Dev] TypeError: foo, bar
In-Reply-To: Your message of "Fri, 01 Dec 2000 07:39:57 +0100."
             <008f01c05b61$877263b0$3c6340d5@hagrid> 
References: <008f01c05b61$877263b0$3c6340d5@hagrid> 
Message-ID: <200012011302.IAA31609@cj20424-a.reston1.va.home.com>

> just stumbled upon yet another (high-profile) python newbie
> confused a "TypeError: read-only character buffer, dictionary"
> message.
> 
> how about changing "read-only character buffer" to "string
> or read-only character buffer", and the "foo, bar" format to
> "expected foo, found bar", so we get:
> 
>     "TypeError: expected string or read-only character
>     buffer, found dictionary"

The first was easy, and I've done it.  The second one, for some
reason, is hard.  I forget why.  Sorry.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From cgw at fnal.gov  Fri Dec  1 14:41:04 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Fri, 1 Dec 2000 07:41:04 -0600 (CST)
Subject: [Python-Dev] TypeError: foo, bar
In-Reply-To: <008f01c05b61$877263b0$3c6340d5@hagrid>
References: <008f01c05b61$877263b0$3c6340d5@hagrid>
Message-ID: <14887.43632.812342.414156@buffalo.fnal.gov>

Fredrik Lundh writes:

 > how about changing "read-only character buffer" to "string
 > or read-only character buffer", and the "foo, bar" format to
 > "expected foo, found bar", so we get:
 > 
 >     "TypeError: expected string or read-only character
 >     buffer, found dictionary"

+100.  Recently, I've been teaching Python to some beginners and they
find this message absolutely inscrutable.

Also agree with Tim about "found" vs. "got", but this is of secondary
importance.




From moshez at math.huji.ac.il  Fri Dec  1 15:26:03 2000
From: moshez at math.huji.ac.il (Moshe Zadka)
Date: Fri, 1 Dec 2000 16:26:03 +0200 (IST)
Subject: [Python-Dev] [OT] Change of Address
Message-ID: <Pine.GSO.4.10.10012011624170.1366-100000@sundial>

I'm sorry to bother you all with this, but from time to time you might
need to reach me by e-mail...
30 days from now, this e-mail address will no longer be valid.
Please use anything at zadka.site.co.il to reach me.

Thank you for your time.
--
Moshe Zadka <moshez at zadka.site.co.il> -- 95855124
http://advogato.org/person/moshez




From gward at mems-exchange.org  Fri Dec  1 16:14:53 2000
From: gward at mems-exchange.org (Greg Ward)
Date: Fri, 1 Dec 2000 10:14:53 -0500
Subject: [Python-Dev] PEP 229 and 222
In-Reply-To: <014301c05b6e$269716a0$e000a8c0@thomasnotebook>; from thomas.heller@ion-tof.com on Fri, Dec 01, 2000 at 09:10:21AM +0100
References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us> <20001130181438.A21596@ludwig.cnri.reston.va.us> <014301c05b6e$269716a0$e000a8c0@thomasnotebook>
Message-ID: <20001201101452.A26074@ludwig.cnri.reston.va.us>

On 01 December 2000, Thomas Heller said:
> Distutils currently only supports build_*** commands for
> C-libraries and Python extensions.
> 
> Shouldn't there also be build commands for shared libraries,
> executable programs and static Python binaries?

Andrew and I talked about this a bit yesterday, and the proposed
interface is as follows:

    python setup.py build_ext --static

will compile all extensions in the current module distribution, but
instead of creating a .so (.pyd) file for each one, will create a new
python binary in build/bin.<plat>.

Issue to be resolved: what to call the new python binary, especially
when installing it (presumably we *don't* want to clobber the stock
binary, but supplement it with (eg.) "foopython").

Note that there is no provision for selectively building some extensions
as shared.  This means that Andrew's Distutil-ization of the standard
library will have to override the build_ext command and have some extra
way to select extensions for shared/static.  Neither of us considered
this a problem.
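A sketch of the selection logic such an overridden build_ext command would
need (all names here are hypothetical, not Distutils API): partition the
extension list into statically linked and shared, given an explicit
shared-only list.

```python
def partition_extensions(extensions, shared_only, static_build):
    """Split extension names into those linked into the static binary
    and those still built as shared objects."""
    if not static_build:
        return [], list(extensions)
    static = [ext for ext in extensions if ext not in shared_only]
    shared = [ext for ext in extensions if ext in shared_only]
    return static, shared

static, shared = partition_extensions(
    ['math', 'time', '_tkinter'], shared_only=['_tkinter'], static_build=True)
assert static == ['math', 'time']
assert shared == ['_tkinter']
```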

> BTW: Distutils-sig seems pretty dead these days...

Yeah, that's a combination of me playing on other things and python.net
email being dead for over a week.  I'll cc the sig on this and see if
this interface proposal gets anyone's attention.

        Greg



From jeremy at alum.mit.edu  Fri Dec  1 20:27:14 2000
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 1 Dec 2000 14:27:14 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
Message-ID: <14887.64402.88530.714821@bitdiddle.concentric.net>

There was recently some idle chatter in Guido's living room about
using a unit testing framework (like PyUnit) for the Python regression
test suite.  We're also writing tests for some DC projects, and need
to decide what framework to use.

Does anyone have opinions on test frameworks?  A quick web search
turned up PyUnit (pyunit.sourceforge.net) and a script by Tres Seaver
that implements xUnit-style unit tests.  Are there other tools
we should consider?

Is anyone else interested in migrating the current test suite to a new
framework?  I hope the new framework will allow us to improve the test
suite in a number of ways:

    - run an entire test suite to completion instead of stopping on
      the first failure

    - clearer reporting of what went wrong

    - better support for conditional tests, e.g. write a test for
      httplib that only runs if the network is up.  This is tied into
      better error reporting, since the current test suite could only
      report that httplib succeeded or failed.
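For comparison, the wishlist above sketched in xUnit style (using the modern
unittest module's skip support, which post-dates this thread): the runner
keeps going past failures, reports each result individually, and a test can
skip itself.

```python
import unittest

class HttplibTest(unittest.TestCase):
    """Sketch of the wishlist: every test runs to completion, failures
    are reported one by one, and a test can declare itself skipped."""

    def test_arithmetic(self):
        self.assertEqual(1 + 1, 2)

    def test_needs_network(self):
        network_up = False            # imagine a real probe here
        if not network_up:
            self.skipTest("network is down")

suite = unittest.defaultTestLoader.loadTestsFromTestCase(HttplibTest)
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, len(result.skipped))   # 2 tests ran, 1 skipped
```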

Jeremy



From fdrake at acm.org  Fri Dec  1 20:24:46 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 1 Dec 2000 14:24:46 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net>
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
Message-ID: <14887.64254.399477.935828@cj42289-a.reston1.va.home.com>

Jeremy Hylton writes:
 >     - better support for conditional tests, e.g. write a test for
 >       httplib that only runs if the network is up.  This is tied into
 >       better error reporting, since the current test suite could only
 >       report that httplib succeeded or failed.

  There is a TestSkipped exception that can be raised with an
explanation of why.  It's used in the largefile test (at least).  I
think it is documented in the README.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From akuchlin at mems-exchange.org  Fri Dec  1 20:58:27 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Fri, 1 Dec 2000 14:58:27 -0500
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 02:27:14PM -0500
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
Message-ID: <20001201145827.D16751@kronos.cnri.reston.va.us>

On Fri, Dec 01, 2000 at 02:27:14PM -0500, Jeremy Hylton wrote:
>There was recently some idle chatter in Guido's living room about
>using a unit testing framework (like PyUnit) for the Python regression
>test suite.  We're also writing tests for some DC projects, and need

Someone remembered my post of 23 Nov, I see...  The only other test
framework I know of is the unittest.py inside Quixote, written because
we thought PyUnit was kind of clunky.  Greg Ward, who primarily wrote
it, used more sneaky interpreter tricks to make the interface more
natural, though it still worked with Jython last time we checked (some
time ago, though).  No GUI, but it can optionally show the code coverage of a
test suite, too.

See http://x63.deja.com/=usenet/getdoc.xp?AN=683946404 for some notes
on using it.  Obviously I think the Quixote unittest.py is the best
choice for the stdlib.

--amk



From jeremy at alum.mit.edu  Fri Dec  1 21:55:28 2000
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 1 Dec 2000 15:55:28 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <20001201145827.D16751@kronos.cnri.reston.va.us>
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
	<20001201145827.D16751@kronos.cnri.reston.va.us>
Message-ID: <14888.4160.838336.537708@bitdiddle.concentric.net>

Is there any documentation for the Quixote unittest tool?  The Example
page is helpful, but it feels like there are some details that are not
explained.

Jeremy



From akuchlin at mems-exchange.org  Fri Dec  1 22:12:12 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Fri, 1 Dec 2000 16:12:12 -0500
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14888.4160.838336.537708@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 03:55:28PM -0500
References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net>
Message-ID: <20001201161212.A12372@kronos.cnri.reston.va.us>

On Fri, Dec 01, 2000 at 03:55:28PM -0500, Jeremy Hylton wrote:
>Is there any documentation for the Quixote unittest tool?  The Example
>page is helpful, but it feels like there are some details that are not
>explained.

I don't believe we've written docs at all for internal use.  What
details seem to be missing?

--amk




From jeremy at alum.mit.edu  Fri Dec  1 22:21:27 2000
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 1 Dec 2000 16:21:27 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <20001201161212.A12372@kronos.cnri.reston.va.us>
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
	<20001201145827.D16751@kronos.cnri.reston.va.us>
	<14888.4160.838336.537708@bitdiddle.concentric.net>
	<20001201161212.A12372@kronos.cnri.reston.va.us>
Message-ID: <14888.5719.844387.435471@bitdiddle.concentric.net>

>>>>> "AMK" == Andrew Kuchling <akuchlin at mems-exchange.org> writes:

  AMK> On Fri, Dec 01, 2000 at 03:55:28PM -0500, Jeremy Hylton wrote:
  >> Is there any documentation for the Quixote unittest tool?  The
  >> Example page is helpful, but it feels like there are some details
  >> that are not explained.

  AMK> I don't believe we've written docs at all for internal use.
  AMK> What details seem to be missing?

Details:

   - I assume setup/shutdown are equivalent to setUp/tearDown 
   - Is it possible to override constructor for TestScenario?
   - Is there something equivalent to PyUnit self.assert_
   - What does parse_args() do?
   - What does run_scenarios() do?
   - If I have multiple scenarios, how do I get them to run?

Jeremy




From akuchlin at mems-exchange.org  Fri Dec  1 22:34:30 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Fri, 1 Dec 2000 16:34:30 -0500
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14888.5719.844387.435471@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 04:21:27PM -0500
References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net> <20001201161212.A12372@kronos.cnri.reston.va.us> <14888.5719.844387.435471@bitdiddle.concentric.net>
Message-ID: <20001201163430.A12417@kronos.cnri.reston.va.us>

On Fri, Dec 01, 2000 at 04:21:27PM -0500, Jeremy Hylton wrote:
>   - I assume setup/shutdown are equivalent to setUp/tearDown 

Correct.

>   - Is it possible to override constructor for TestScenario?

Beats me; I see no reason why you couldn't, though.

>   - Is there something equivalent to PyUnit self.assert_

Probably test_bool(), I guess: self.test_bool('self.run.is_draft()')
asserts that self.run.is_draft() will return true.  Or does
self.assert_() do something more?

>   - What does parse_args() do?
>   - What does run_scenarios() do?
>   - If I have multiple scenarios, how do I get them to run?

These 3 questions are all related, really.  At the bottom of our test
scripts, we have the following stereotyped code:

if __name__ == "__main__":
    (scenarios, options) = parse_args()
    run_scenarios (scenarios, options)

parse_args() ensures consistent arguments to test scripts; -c measures
code coverage, -v is verbose, etc.  It also looks in the __main__
module and finds all subclasses of TestScenario, so you can do:  

python test_process_run.py  # Runs all N scenarios
python test_process_run.py ProcessRunTest # Runs all cases for 1 scenario
python test_process_run.py ProcessRunTest:check_access # Runs one test case
                                                       # in one scenario class

--amk




From tim.one at home.com  Fri Dec  1 22:47:54 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 1 Dec 2000 16:47:54 -0500
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECFICAA.tim.one@home.com>

[Jeremy Hylton]
> There was recently some idle chatter in Guido's living room about
> using a unit testing framework (like PyUnit) for the Python regression
> test suite.  We're also writing tests for some DC projects, and need
> to decide what framework to use.
>
> Does anyone have opinions on test frameworks?  A quick web search
> turned up PyUnit (pyunit.sourceforge.net) and a script by Tres Seaver
> that implements xUnit-style unit tests.  Are there other tools
> we should consider?

My own doctest is loved by people other than just me <wink>, but is aimed at
ensuring that examples in docstrings work exactly as shown (which is why it
starts with "doc" instead of "test").

> Is anyone else interested in migrating the current test suite to a new
> framework?

Yes.

> I hope the new framework will allow us to improve the test
> suite in a number of ways:
>
>     - run an entire test suite to completion instead of stopping on
>       the first failure

doctest does that.

>     - clearer reporting of what went wrong

Ditto.

>     - better support for conditional tests, e.g. write a test for
>       httplib that only runs if the network is up.  This is tied into
>       better error reporting, since the current test suite could only
>       report that httplib succeeded or failed.

A doctest test is simply an interactive Python session pasted into a
docstring (or more than one session, and/or interspersed with prose).  If
you can write an example in the interactive shell, doctest will verify it
still works as advertised.  This allows for embedding unit tests into the
docs for each function, method and class.  Nothing about them "looks like"
an artificial test tacked on:  the examples in the docs *are* the test
cases.
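For instance, a docstring like the following doubles as the function's test
suite (a minimal sketch):

```python
def gcd(a, b):
    """Return the greatest common divisor of a and b.

    >>> gcd(12, 8)
    4
    >>> gcd(7, 3)
    1
    """
    while b:
        a, b = b, a % b
    return a

if __name__ == "__main__":
    import doctest
    doctest.testmod()    # re-runs the examples above, reports any mismatch
```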

I need to try the other frameworks.  I dare say doctest is ideal for
computational functions, where the intended input->output relationship can
be clearly explicated via examples.  It's useless for GUIs.  Usefulness
varies accordingly between those extremes (doctest is natural exactly to the
extent that a captured interactive session is helpful for documentation
purposes).

testing-ain't-easy-ly y'rs  - tim




From barry at digicool.com  Sat Dec  2 04:52:29 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Fri, 1 Dec 2000 22:52:29 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
Message-ID: <14888.29181.355023.669030@anthem.concentric.net>

I've just uploaded PEP 231, which describes a new hook in the instance
access mechanism, called __findattr__() after a similar mechanism that
exists in Jython (but is not exposed at the Python layer).

You can do all kinds of interesting things with __findattr__(),
including implement the __of__() protocol of ExtensionClass, and thus
implicit and explicit acquisitions, in pure Python.  You can also do
Java Bean-like interfaces and C++-like access control.  The PEP
contains sample implementations of all of these, although the latter
isn't as clean as I'd like, due to other restrictions in Python.
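For a flavor of the bean-like case as it can be approximated in today's
Python (a sketch with __getattr__/__setattr__, not the PEP's actual
implementation; __findattr__ would let a single hook interpose on both
sides):

```python
class Bean:
    """Sketch: attribute reads and writes are redirected to privately
    stored names, in the spirit of the PEP's Java-bean example."""

    def __init__(self, x):
        self.__dict__['_x'] = x

    def __getattr__(self, name):          # only called on failed lookups
        try:
            return self.__dict__['_' + name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):   # called on every assignment
        self.__dict__['_' + name] = value

b = Bean(3)
assert b.x == 3
b.x = 5
assert b.x == 5
```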

My hope is that __findattr__() would eliminate most, if not all, the
need for ExtensionClass, at least within the Zope and ZODB contexts.
I haven't tried to implement Persistent using it though.

Since it's a long PEP, I won't include it here.  You can read about it
at this URL

    http://python.sourceforge.net/peps/pep-0231.html

It includes a link to the patch implementing this feature on
SourceForge.

Enjoy,
-Barry



From moshez at math.huji.ac.il  Sat Dec  2 10:11:50 2000
From: moshez at math.huji.ac.il (Moshe Zadka)
Date: Sat, 2 Dec 2000 11:11:50 +0200 (IST)
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <14888.29181.355023.669030@anthem.concentric.net>
Message-ID: <Pine.GSO.4.10.10012021109320.1366-100000@sundial>

On Fri, 1 Dec 2000, Barry A. Warsaw wrote:

> I've just uploaded PEP 231, which describes a new hook in the instance
> access mechanism, called __findattr__() after a similar mechanism that
> exists in Jython (but is not exposed at the Python layer).

There's one thing that bothers me about this: what exactly is "the
call stack"? Let me clarify: what happens when you have threads.
Either machine-level threads and stackless threads confuse the issues
here, not to talk about stackless continuations. Can you add a few
words to the PEP about dealing with those?




From mal at lemburg.com  Sat Dec  2 11:03:11 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 02 Dec 2000 11:03:11 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
Message-ID: <3A28C8DF.E430484F@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> I've just uploaded PEP 231, which describes a new hook in the instance
> access mechanism, called __findattr__() after a similar mechanism that
> exists in Jython (but is not exposed at the Python layer).
> 
> You can do all kinds of interesting things with __findattr__(),
> including implement the __of__() protocol of ExtensionClass, and thus
> implicit and explicit acquisitions, in pure Python.  You can also do
> Java Bean-like interfaces and C++-like access control.  The PEP
> contains sample implementations of all of these, although the latter
> isn't as clean as I'd like, due to other restrictions in Python.
> 
> My hope is that __findattr__() would eliminate most, if not all, the
> need for ExtensionClass, at least within the Zope and ZODB contexts.
> I haven't tried to implement Persistent using it though.

The PEP does define when and how __findattr__() is called,
but makes no statement about what it should do or return...

Here's a slightly different idea:

Given the name, I would expect it to go look for an attribute
and then return the attribute and its container (this
doesn't seem to be what you have in mind here, though).

An alternative approach given the semantics above would
then be to first try a __getattr__() lookup and revert
to __findattr__() in case this fails. I don't think there
is any need to overload __setattr__() in such a way, because
you cannot be sure which object actually gets the new attribute.

By exposing the functionality using a new builtin, findattr(),
this could be used for all the examples you give too.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From barry at digicool.com  Sat Dec  2 17:50:02 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 2 Dec 2000 11:50:02 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
	<3A28C8DF.E430484F@lemburg.com>
Message-ID: <14889.10298.621133.961677@anthem.concentric.net>

>>>>> "M" == M  <mal at lemburg.com> writes:

    M> The PEP does define when and how __findattr__() is called,
    M> but makes no statement about what it should do or return...

Good point.  I've clarified that in the PEP.

    M> Here's a slightly different idea:

    M> Given the name, I would expect it to go look for an attribute
    M> and then return the attribute and its container (this doesn't
    M> seem to be what you have in mind here, though).

No, because some applications won't need a wrapped object.  E.g. in
the Java bean example, it just returns the attribute (which is stored
with a slightly different name).

    M> An alternative approach given the semantics above would then be
    M> to first try a __getattr__() lookup and revert to
    M> __findattr__() in case this fails.

I don't think this is as useful.  What would that buy you that you
can't already do today?

The key concept here is that you want to give the class first crack to
interpose on every attribute access.  You want this hook to get called
before anybody else can get at, or set, your attributes.  That gives
you (the class) total control to implement whatever policy is useful.
    
    M> I don't think there is any need to overload __setattr__() in
    M> such a way, because you cannot be sure which object actually
    M> gets the new attribute.

    M> By exposing the functionality using a new builtin, findattr(),
    M> this could be used for all the examples you give too.

No, because then people couldn't use the object in the normal
dot-notational way.

-Barry



From tismer at tismer.com  Sat Dec  2 17:27:33 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sat, 02 Dec 2000 18:27:33 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
Message-ID: <3A2922F5.C2E0D10@tismer.com>

Hi Barry,

"Barry A. Warsaw" wrote:
> 
> I've just uploaded PEP 231, which describes a new hook in the instance
> access mechanism, called __findattr__() after a similar mechanism that
> exists in Jython (but is not exposed at the Python layer).
> 
> You can do all kinds of interesting things with __findattr__(),
> including implement the __of__() protocol of ExtensionClass, and thus
> implicit and explicit acquisitions, in pure Python.  You can also do
> Java Bean-like interfaces and C++-like access control.  The PEP
> contains sample implementations of all of these, although the latter
> isn't as clean as I'd like, due to other restrictions in Python.
> 
> My hope is that __findattr__() would eliminate most, if not all, the
> need for ExtensionClass, at least within the Zope and ZODB contexts.
> I haven't tried to implement Persistent using it though.

I have been using ExtensionClass for quite a long time, and
I have to say that you indeed eliminate most of its need
through this super-elegant idea. Congratulations!

Besides acquisition and persistency interception, my other use of
ExtensionClass was wrapping plain C objects and giving them class-like
behavior: retaining fast access to internal properties while still
being able to override methods with Python methods. I assume this is
the other "20%" part you mention, which is much harder to achieve?
But that part also looks easier to implement now, by the support
of the __findattr__ method.

> Since it's a long PEP, I won't include it here.  You can read about it
> at this URL
> 
>     http://python.sourceforge.net/peps/pep-0231.html

Great. I had to read it twice, but it was fun.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From tismer at tismer.com  Sat Dec  2 17:55:21 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sat, 02 Dec 2000 18:55:21 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <Pine.GSO.4.10.10012021109320.1366-100000@sundial>
Message-ID: <3A292979.60BB1731@tismer.com>


Moshe Zadka wrote:
> 
> On Fri, 1 Dec 2000, Barry A. Warsaw wrote:
> 
> > I've just uploaded PEP 231, which describes a new hook in the instance
> > access mechanism, called __findattr__() after a similar mechanism that
> > exists in Jython (but is not exposed at the Python layer).
> 
> There's one thing that bothers me about this: what exactly is "the
> call stack"? Let me clarify: what happens when you have threads.
> Both machine-level threads and stackless threads confuse the issues
> here, not to mention stackless continuations. Can you add a few
> words to the PEP about dealing with those?

As far as I understood the patch (I just skimmed it), there is no
stack involved directly; instead, the instance increments and
decrements a variable, infindaddr.

+       if (v != NULL && !inst->infindaddr &&
+           (func = inst->in_class->cl_findattr))
+       {
+               PyObject *args, *res;
+               args = Py_BuildValue("(OOO)", inst, name, v);
+               if (args == NULL)
+                       return -1;
+               ++inst->infindaddr;
+               res = PyEval_CallObject(func, args);
+               --inst->infindaddr;

That is: the call modifies the instance's state while the findattr
method is running.
You are right: I see a serious problem with this. It doesn't
even need continuations to get things messed up. Guido's
proposed coroutines, together with uThread-Switching, might
be able to enter the same instance twice with ease.

Barry, on second thought, I feel this can become
a problem in the future. This infindattr attribute
only works correctly if we are guaranteed a
strict stack order of execution.
What you're *intending* to do is to tell PyEval_CallObject
that it should not find the __findattr__ attribute. But
this should be done only for this call and all of its descendants,
not for a *fresh* access from elsewhere.

The hard way to get out of this would be to stop scheduling
in that case. Maybe this is very cheap, but it's quite inelegant.

We have a quite peculiar system state here: A function call
acts like an escape, to make all subsequent calls behave
differently, until this call is finished.

Without blocking microthreads, a clean way to do this would be
to search up the frame chain for a running __findattr__
method of this object. Fairly expensive. Well, the problem
also exists with real threads, if they are allowed to switch
in such a context.

I fear it is necessary to either block this stuff until it is
ready, or to maintain some thread-wise structure for the
state of this object.
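A sketch of such a thread-wise structure using today's tools
(threading.local did not exist at the time; all names here are
hypothetical, not part of the patch):

```python
import threading

class FindAttrGuard:
    """Sketch: a re-entrancy guard keyed per thread and per attribute
    name, instead of a single counter on the instance shared by every
    thread that touches it."""

    def __init__(self):
        self._local = threading.local()

    def enter(self, name):
        active = getattr(self._local, "active", None)
        if active is None:
            active = self._local.active = set()
        if name in active:
            return False          # already inside __findattr__ for name
        active.add(name)
        return True

    def leave(self, name):
        self._local.active.discard(name)

g = FindAttrGuard()
assert g.enter("x")               # first entry in this thread succeeds
assert not g.enter("x")           # recursive entry is blocked
g.leave("x")
assert g.enter("x")               # after leaving, entry works again
```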

Ok, after thinking some more, I'll start an extra message
to Barry on this topic.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From tismer at tismer.com  Sat Dec  2 18:21:18 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sat, 02 Dec 2000 19:21:18 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
Message-ID: <3A292F8D.7C616449@tismer.com>


"Barry A. Warsaw" wrote:
> 
> I've just uploaded PEP 231, which describes a new hook in the instance
> access mechanism, called __findattr__() after a similar mechanism that
> exists in Jython (but is not exposed at the Python layer).

Ok, as I announced already, here some thoughts on __findattr__,
system state, and how it could work.

Looking at your patch, I realize that you are blocking __findattr__
for your whole instance, until this call ends.
This is not what you want to do, I guess: it has the effect of
changing the whole system state when threads are involved.
Also, you cannot use __findattr__ on any other attribute during
this call.

You want most probably do this:
__findattr__ should not be invoked again for this instance,
with this attribute name, for this "thread", until you are done.

The correct way to find out whether __findattr__ is active or
not would be to look upwards the frame chain and inspect it.
Moshe also asked about continuations: I think this resolves
quite naturally. However we jump around, the current chain of frames
dictates the semantics of __findattr__. It even applies to
Guido's tamed coroutines, given that an explicit switch were
allowed in the context of __findattr__.

In a sense, we get some kind of dynamic context here, since
we need to do a lookup for something in the dynamic call chain.
I guess this would be quite messy to implement, and inefficient.

Isn't there a way to accomplish the desired effect without
modifying the instance? In the context of __findattr__, *we*
know that we don't want to get a recursive call.
Let's assume __getattr__ and __setattr__ had yet another
optional parameter: infindattr, defaulting to 0.
We would then have to pass a positive value in this context,
which would tell object.c not to try to invoke __findattr__
again.
With explicit passing of state, no problems with threads
can occur. Readability might improve as well.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From moshez at math.huji.ac.il  Sun Dec  3 14:14:43 2000
From: moshez at math.huji.ac.il (Moshe Zadka)
Date: Sun, 3 Dec 2000 15:14:43 +0200 (IST)
Subject: [Python-Dev] Another Python Developer Missing
Message-ID: <Pine.GSO.4.10.10012031512430.7826-100000@sundial>

Gordon McMillan is not a possible assignee in the assign_to field.


--
Moshe Zadka <moshez at zadka.site.co.il> -- 95855124
http://moshez.org




From tim.one at home.com  Sun Dec  3 18:35:36 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 3 Dec 2000 12:35:36 -0500
Subject: [Python-Dev] Another Python Developer Missing
In-Reply-To: <Pine.GSO.4.10.10012031512430.7826-100000@sundial>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEDOICAA.tim.one@home.com>

[Moshe Zadka]
> Gordon McMillan is not a possible assignee in the assign_to field.

We almost never add people as Python developers unless they ask for that,
since it comes with responsibility as well as riches beyond the dreams of
avarice.  If Gordon would like to apply, we won't charge him any interest
until 2001 <wink>.




From mal at lemburg.com  Sun Dec  3 20:21:11 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 03 Dec 2000 20:21:11 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib urllib.py,1.107,1.108
References: <200012031830.KAA30620@slayer.i.sourceforge.net>
Message-ID: <3A2A9D27.AF43D665@lemburg.com>

"Martin v. L?wis" wrote:
> 
> Update of /cvsroot/python/python/dist/src/Lib
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv30506
> 
> Modified Files:
>         urllib.py
> Log Message:
> Convert Unicode strings to byte strings before passing them into specific
> protocols. Closes bug #119822.
> 
> ...
> +
> + def toBytes(url):
> +     """toBytes(u"URL") --> 'URL'."""
> +     # Most URL schemes require ASCII. If that changes, the conversion
> +     # can be relaxed
> +     if type(url) is types.UnicodeType:
> +         try:
> +             url = url.encode("ASCII")

You should make this: 'ascii' -- encoding names are lower case 
per convention (and the implementation has a short-cut to speed up
conversion to 'ascii' -- not for 'ASCII').

> +         except UnicodeError:
> +             raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters")

Would it be better to use a simple ValueError here ? (UnicodeError
is a subclass of ValueError, but the error doesn't really have something to
do with Unicode conversions...)

> +     return url
> 
>   def unwrap(url):
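Putting both suggestions together, the helper would look roughly like this
(sketched in today's Python, where all strings are Unicode; a hypothetical
adaptation, not the committed code):

```python
def to_bytes(url):
    """Encode a URL to ASCII bytes, rejecting non-ASCII input."""
    if isinstance(url, str):
        try:
            # lower-case 'ascii', per the naming convention noted above
            return url.encode("ascii")
        except UnicodeError:
            raise ValueError("URL %r contains non-ASCII characters" % url)
    return url

assert to_bytes("http://example.com/") == b"http://example.com/"
try:
    to_bytes("http://\u00e9xample.com/")
except ValueError:
    pass
else:
    raise AssertionError("non-ASCII URL was accepted")
```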


-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tismer at tismer.com  Sun Dec  3 21:01:07 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 03 Dec 2000 22:01:07 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib filecmp.py,1.6,1.7
References: <200012032048.MAA10353@slayer.i.sourceforge.net>
Message-ID: <3A2AA683.3840AA8A@tismer.com>


Moshe Zadka wrote:
> 
> Update of /cvsroot/python/python/dist/src/Lib
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv9465
> 
> Modified Files:
>         filecmp.py
> Log Message:
> Call of _cmp had wrong number of parameters.
> Fixed definition of _cmp.

...

> !         return not abs(cmp(a, b, sh, st))
>       except os.error:
>           return 2

Ugh! Wouldn't that be a fine chance to rename the cmp
function in this module? Overriding a built-in
is really not nice to have in a library.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From moshez at math.huji.ac.il  Sun Dec  3 22:01:07 2000
From: moshez at math.huji.ac.il (Moshe Zadka)
Date: Sun, 3 Dec 2000 23:01:07 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 filecmp.py,1.6,1.7
In-Reply-To: <3A2AA683.3840AA8A@tismer.com>
Message-ID: <Pine.GSO.4.10.10012032300100.7608-100000@sundial>

On Sun, 3 Dec 2000, Christian Tismer wrote:

> Ugh! Wouldn't that be a fine chance to rename the cmp
> function in this module? Overriding a built-in
> is really not nice to have in a library.

The fine chance was when we moved cmp.py->filecmp.py. 
Now it would just break backwards compatibility.
--
Moshe Zadka <moshez at zadka.site.co.il> -- 95855124
http://moshez.org




From tismer at tismer.com  Sun Dec  3 21:12:15 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 03 Dec 2000 22:12:15 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: 
 python/dist/src/Libfilecmp.py,1.6,1.7
References: <Pine.GSO.4.10.10012032300100.7608-100000@sundial>
Message-ID: <3A2AA91F.843E2BAE@tismer.com>


Moshe Zadka wrote:
> 
> On Sun, 3 Dec 2000, Christian Tismer wrote:
> 
> > Ugh! Wouldn't that be a fine chance to rename the cmp
> > function in this module? Overriding a built-in
> > is really not nice to have in a library.
> 
> The fine chance was when we moved cmp.py->filecmp.py.
> Now it would just break backwards compatibility.

Yes, I see: cmp belongs to the module's interface.
Maybe it could be renamed anyway and assigned
to cmp at the very end of the file, without using
cmp anywhere in the code. My first reaction on reading
the patch was "yuck!" since I didn't know this module.
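That rename-then-alias idea could be sketched like this, with a hypothetical stand-in body (the real filecmp logic is more involved): the implementation lives under a private name so the builtin cmp is never shadowed inside the module body, and the public name is bound exactly once at the bottom.

```python
def _cmp_files(f1, f2):
    # Hypothetical stand-in for the real comparison logic.
    with open(f1, 'rb') as a, open(f2, 'rb') as b:
        return a.read() == b.read()

# The public, backwards-compatible name is assigned only here, at the
# very end of the module; the builtin cmp stays usable everywhere above.
cmp = _cmp_files
```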

python-dev/null - ly y'rs - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From martin at loewis.home.cs.tu-berlin.de  Sun Dec  3 22:56:44 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 3 Dec 2000 22:56:44 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
Message-ID: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>

> Isn't there a way to accomplish the desired effect without modifying
> the instance? In the context of __findattr__, *we* know that we
> don't want to get a recursive call.  Let's assume __getattr__ and
> __setattr__ had yet another optional parameter: infindattr,
> defaulting to 0.  We would then have to pass a positive value in
> this context, which would tell object.c not to try to invoke
> __findattr__ again.

Who is "we" here? The Python code implementing __findattr__? How would
it pass a value to __setattr__? It doesn't call __setattr__, instead
it has "self.__myfoo = x"...

I agree that the current implementation is not thread-safe. To solve
that, you'd need to associate with each instance not a single
"infindattr" attribute, but a whole set of them - one per "thread of
execution" (which would be a thread-id in most threading systems). Of
course, that would need some cooperation from any thread scheme
(including uthreads), which would need to provide an identification
for a "calling context".

Regards,
Martin



From martin at loewis.home.cs.tu-berlin.de  Sun Dec  3 23:07:17 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 3 Dec 2000 23:07:17 +0100
Subject: [Python-Dev] Re: CVS: python/dist/src/Lib urllib.py,1.107,1.108
Message-ID: <200012032207.XAA03394@loewis.home.cs.tu-berlin.de>

> You should make this: 'ascii' -- encoding names are lower case per
> convention (and the implementation has a short-cut to speed up
> conversion to 'ascii' -- not for 'ASCII').

With conventions, it is a difficult story. I'm pretty certain that
users typically see that particular American standard as ASCII (to the
extent of calling it "a s c two"), not ascii.

As for speed - feel free to change the code if you think it matters.

> +             raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters")

> Would it be better to use a simple ValueError here ? (UnicodeError
> is a subclass of ValueError, but the error doesn't really have
> something to do with Unicode conversions...)

Why does it not have to do with Unicode conversion? A conversion from
Unicode to ASCII was attempted, and failed.

I guess I would be more open to suggested changes if you had put them
into the patch manager at the time you reviewed the patch...

Regards,
Martin



From tismer at tismer.com  Sun Dec  3 22:38:11 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 03 Dec 2000 23:38:11 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
Message-ID: <3A2ABD43.AB56BD60@tismer.com>


"Martin v. Loewis" wrote:
> 
> > Isn't there a way to accomplish the desired effect without modifying
> > the instance? In the context of __findattr__, *we* know that we
> > don't want to get a recursive call.  Let's assume __getattr__ and
> > __setattr__ had yet another optional parameter: infindattr,
> > defaulting to 0.  We would then have to pass a positive value in
> > this context, which would tell object.c not to try to invoke
> > __findattr__ again.
> 
> Who is "we" here? The Python code implementing __findattr__? How would
> it pass a value to __setattr__? It doesn't call __setattr__, instead
> it has "self.__myfoo = x"...

Ouch - right! Sorry :)

> I agree that the current implementation is not thread-safe. To solve
> that, you'd need to associate with each instance not a single
> "infindattr" attribute, but a whole set of them - one per "thread of
> execution" (which would be a thread-id in most threading systems). Of
> course, that would need some cooperation from any thread scheme
> (including uthreads), which would need to provide an identification
> for a "calling context".

Right, that is one possible way to do it. I also thought about
some alternatives, but they all sound too complicated to
justify. Also, I don't think this is only thread-related,
since a mess can happen even with an explicit coroutine jump.
Furthermore, how do we deal with multiple attribute names?
The function works incorrectly if __findattr__ tries to inspect
another attribute.

IMO, the state of the current interpreter changes here
(or should do so), and this changed state needs to be carried
down with all subsequent function calls.

confused - ly chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From mal at lemburg.com  Sun Dec  3 23:51:10 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 03 Dec 2000 23:51:10 +0100
Subject: [Python-Dev] Re: CVS: python/dist/src/Lib urllib.py,1.107,1.108
References: <200012032207.XAA03394@loewis.home.cs.tu-berlin.de>
Message-ID: <3A2ACE5E.A9F860A8@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > You should make this: 'ascii' -- encoding names are lower case per
> > convention (and the implementation has a short-cut to speed up
> > conversion to 'ascii' -- not for 'ASCII').
> 
> With conventions, it is a difficult story. I'm pretty certain that
> users typically see that particular american standard as ASCII (to the
> extent of calling it "a s c two"), not ascii.

It's a convention in the codec registry design and used as such
in the Unicode implementation.
 
> As for speed - feel free to change the code if you think it matters.

Hey... this was just a suggestion. I thought that you didn't
know of the internal short-cut and wanted to hint at it.
 
> > +             raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters")
> 
> > Would it be better to use a simple ValueError here ? (UnicodeError
> > is a subclass of ValueError, but the error doesn't really have
> > something to do with Unicode conversions...)
> 
> Why does it not have to do with Unicode conversion? A conversion from
> Unicode to ASCII was attempted, and failed.

Sure, but the fact that URLs have to be ASCII is not something
that is enforced by the Unicode implementation.
 
> I guess I would be more open to suggested changes if you had put them
> into the patch manager at the time you reviewed the patch...

I didn't review the patch, only the summary...

Don't have much time to look into these things closely right now, so
all I can do is comment.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From barry at scottb.demon.co.uk  Mon Dec  4 01:55:32 2000
From: barry at scottb.demon.co.uk (Barry Scott)
Date: Mon, 4 Dec 2000 00:55:32 -0000
Subject: [Python-Dev] A house upon the sand
In-Reply-To: <20001130181937.B21596@ludwig.cnri.reston.va.us>
Message-ID: <000201c05d8c$e7a15b10$060210ac@private>

I fully support Greg Ward's view. If string were removed I'd not
update the old code but add in my own string module.

Given the effort you guys went to to keep the C extension protocol the
same (in the context of crashing on importing a 1.5 DLL into 2.0) I'm
amazed you think that string could be removed...

Could you split the lib into blessed and backward compatibility sections?
Then by some suitable mechanism I can choose the compatibility I need?

Oh and as for join obviously a method of a list...

	['thats','better'].join(' ')

		Barry




From fredrik at pythonware.com  Mon Dec  4 11:37:18 2000
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 4 Dec 2000 11:37:18 +0100
Subject: [Python-Dev] unit testing and Python regression test
References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us>
Message-ID: <00e701c05dde$2d77c240$0900a8c0@SPIFF>

andrew kuchling wrote:
> Someone remembered my post of 23 Nov, I see...  The only other test
> framework I know of is the unittest.py inside Quixote, written because
> we thought PyUnit was kind of clunky.

the pythonware team agrees -- we've been using an internal
reimplementation of Kent Beck's original Smalltalk work, but
we're switching to unittest.py.

> Obviously I think the Quixote unittest.py is the best choice for the stdlib.

+1 from here.

</F>




From mal at lemburg.com  Mon Dec  4 12:14:20 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 04 Dec 2000 12:14:20 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
		<3A28C8DF.E430484F@lemburg.com> <14889.10298.621133.961677@anthem.concentric.net>
Message-ID: <3A2B7C8C.D6B889EE@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "M" == M  <mal at lemburg.com> writes:
> 
>     M> The PEP does define when and how __findattr__() is called,
>     M> but makes no statement about what it should do or return...
> 
> Good point.  I've clarified that in the PEP.
> 
>     M> Here's a slightly different idea:
> 
>     M> Given the name, I would expect it to go look for an attribute
>     M> and then return the attribute and its container (this doesn't
>     M> seem to be what you have in mind here, though).
> 
> No, because some applications won't need a wrapped object.  E.g. in
> the Java bean example, it just returns the attribute (which is stored
> with a slightly different name).

I was thinking of a standardised helper which could then be
used for all kinds of attribute retrieval techniques. Acquisition
would be easy to do, access control too. In most cases __findattr__
would simply return (self, self.attrname).
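A hypothetical findattr() along those lines might look like the following; the plain-lookup body is only illustrative, the point being the (container, value) return shape that acquisition or access control could hook into:

```python
def findattr(obj, name):
    # Hypothetical helper: return the container that holds the attribute
    # together with the attribute's value.  A plain lookup is shown;
    # an acquisition variant would search parent containers here instead.
    return obj, getattr(obj, name)
```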
 
>     M> An alternative approach given the semantics above would then be
>     M> to first try a __getattr__() lookup and revert to
>     M> __findattr__() in case this fails.
> 
> I don't think this is as useful.  What would that buy you that you
> can't already do today?

Forget that idea... *always* calling __findattr__ is the more
useful way, just like you intended.
 
> The key concept here is that you want to give the class first crack to
> interpose on every attribute access.  You want this hook to get called
> before anybody else can get at, or set, your attributes.  That gives
> you (the class) total control to implement whatever policy is useful.

Right.
 
>     M> I don't think there is any need to overload __setattr__() in
>     M> such a way, because you cannot be sure which object actually
>     M> gets the new attribute.
> 
>     M> By exposing the functionality using a new builtin, findattr(),
>     M> this could be used for all the examples you give too.
> 
> No, because then people couldn't use the object in the normal
> dot-notational way.

Uhm, why not ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From gvwilson at nevex.com  Mon Dec  4 15:40:58 2000
From: gvwilson at nevex.com (Greg Wilson)
Date: Mon, 4 Dec 2000 09:40:58 -0500
Subject: [Python-Dev] Q: Python standard library re-org plans/schedule?
In-Reply-To: <20001201145827.D16751@kronos.cnri.reston.va.us>
Message-ID: <NEBBIACCCGNFLECLCLMHCEADCBAA.gvwilson@nevex.com>

Hi, everyone.  A potential customer has asked whether there are any
plans to re-organize and rationalize the Python standard library.
If there are any firm plans, and a schedule (however tentative),
I'd be grateful for a pointer.

Thanks,
Greg



From barry at digicool.com  Mon Dec  4 16:13:23 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 4 Dec 2000 10:13:23 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
Message-ID: <14891.46227.785856.307437@anthem.concentric.net>

>>>>> "MvL" == Martin v Loewis <martin at loewis.home.cs.tu-berlin.de> writes:

    MvL> I agree that the current implementation is not
    MvL> thread-safe. To solve that, you'd need to associate with each
    MvL> instance not a single "infindattr" attribute, but a whole set
    MvL> of them - one per "thread of execution" (which would be a
    MvL> thread-id in most threading systems). Of course, that would
    MvL> need some cooperation from any thread scheme (including
    MvL> uthreads), which would need to provide an identification for
    MvL> a "calling context".

I'm still catching up on several hundred emails over the weekend.  I
had a sneaking suspicion that infindattr wasn't thread-safe, so I'm
convinced this is a bug in the implementation.  One approach might be
to store the info in the thread state object (isn't that how the
recursive repr stop flag is stored?)  That would also save having to
allocate an extra int for every instance (yuck) but might impose a bit
more of a performance overhead.

I'll work more on this later today.
-Barry



From jeremy at alum.mit.edu  Mon Dec  4 16:23:10 2000
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 4 Dec 2000 10:23:10 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <00e701c05dde$2d77c240$0900a8c0@SPIFF>
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
	<20001201145827.D16751@kronos.cnri.reston.va.us>
	<00e701c05dde$2d77c240$0900a8c0@SPIFF>
Message-ID: <14891.46814.359333.76720@bitdiddle.concentric.net>

>>>>> "FL" == Fredrik Lundh <fredrik at pythonware.com> writes:

  FL> andrew kuchling wrote:
  >> Someone remembered my post of 23 Nov, I see...  The only other
  >> test framework I know of is the unittest.py inside Quixote,
  >> written because we thought PyUnit was kind of clunky.

  FL> the pythonware team agrees -- we've been using an internal
  FL> reimplementation of Kent Beck's original Smalltalk work, but
  FL> we're switching to unittest.py.

Can you provide any specifics about what you like about unittest.py
(perhaps as opposed to PyUnit)?

Jeremy



From guido at python.org  Mon Dec  4 16:20:11 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 04 Dec 2000 10:20:11 -0500
Subject: [Python-Dev] Q: Python standard library re-org plans/schedule?
In-Reply-To: Your message of "Mon, 04 Dec 2000 09:40:58 EST."
             <NEBBIACCCGNFLECLCLMHCEADCBAA.gvwilson@nevex.com> 
References: <NEBBIACCCGNFLECLCLMHCEADCBAA.gvwilson@nevex.com> 
Message-ID: <200012041520.KAA20979@cj20424-a.reston1.va.home.com>

> Hi, everyone.  A potential customer has asked whether there are any
> plans to re-organize and rationalize the Python standard library.
> If there are any firm plans, and a schedule (however tentative),
> I'd be grateful for a pointer.

Alas, none that I know of except the ineffable Python 3000
schedule. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Mon Dec  4 16:46:53 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 4 Dec 2000 10:46:53 -0500
Subject: [Python-Dev] Quixote unit testing docs (Was: unit testing)
In-Reply-To: <14891.46814.359333.76720@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Mon, Dec 04, 2000 at 10:23:10AM -0500
References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <00e701c05dde$2d77c240$0900a8c0@SPIFF> <14891.46814.359333.76720@bitdiddle.concentric.net>
Message-ID: <20001204104653.A19387@kronos.cnri.reston.va.us>

Prodded by Jeremy, I went and actually wrote some documentation for
the Quixote unittest.py; please see 
<URL:http://www.amk.ca/python/unittest.html>.

The HTML is from a manually hacked Library Reference, so ignore the
broken image links and other formatting goofiness.  In case anyone
needs it, the LaTeX is in /files/python/.  The plain text version
comes out to around 290 lines; I can post it to this list if that's
desired.

--amk




From pf at artcom-gmbh.de  Mon Dec  4 18:59:54 2000
From: pf at artcom-gmbh.de (Peter Funk)
Date: Mon, 4 Dec 2000 18:59:54 +0100 (MET)
Subject: Tim Peter's doctest compared to Quixote unit testing (was Re: [Python-Dev] Quixote unit testing docs)
In-Reply-To: <20001204104653.A19387@kronos.cnri.reston.va.us> from Andrew Kuchling at "Dec 4, 2000 10:46:53 am"
Message-ID: <m142zu6-000Dm8C@artcom0.artcom-gmbh.de>

Hi all,

Andrew Kuchling:
> ... I ... actually wrote some documentation for
> the Quixote unittest.py; please see 
> <URL:http://www.amk.ca/python/unittest.html>.
[...]
> comes out to around 290 lines; I can post it to this list if that's
> desired.

After reading Andrew's docs, I think Quixote basically offers
three additional features compared with Tim Peters' 'doctest':
 1. integration of Skip Montanaro's code coverage analysis. 
 2. the idea of Scenario objects useful to share the setup needed to
    test related functions or methods of a class (same start condition).
 3. Some useful functions to check whether the result returned
    by some test fulfills certain properties, without having to be
    as explicit as a cut-and-paste from an interactive interpreter
    session would have been.

As I've pointed out before in private mail to Jeremy, I've used Tim Peters' 
'doctest.py' to accomplish all testing of Python apps in our company.

In doctest, each doc string is an independent unit which starts fresh.
Sometimes this leads to duplicated setup stuff, which is needed
to test each method of a set of related methods from a class.
This is distracting if you intend the test cases to take on their
double role of being, at the same time, useful documentation examples
of the intended use of the provided API.
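The fresh-start behavior can be seen in a tiny (hypothetical) example: the setup line has to live inside the same docstring as the example that uses it, because nothing carries over between docstrings:

```python
import doctest

def square(x):
    """Return x*x.

    >>> value = 4      # setup; it cannot be shared with other docstrings
    >>> square(value)
    16
    """
    return x * x

# Run all docstring examples in this module.
results = doctest.testmod()
```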

Tim_one: Are you reading this?  What do you think about the idea of adding 
something like the following two functions to 'doctest':
use_module_scenario() -- imports all objects created and preserved during
    execution of the module doc string examples.
use_class_scenario() -- imports all objects created and preserved during 
    the execution of doc string examples of a class.  Only allowed in doc
    string examples of methods.  

This would make it easy to provide the same setup scenario to a group
of related test cases.

As far as I understand, doctest handles test shutdown automatically, iff
the doc string test examples leave no persistent resources behind.

Regards, Peter



From moshez at zadka.site.co.il  Tue Dec  5 04:31:18 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue, 05 Dec 2000 05:31:18 +0200
Subject: [Python-Dev] PEP 231, __findattr__() 
In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw) 
   of "Mon, 04 Dec 2000 10:13:23 EST." <14891.46227.785856.307437@anthem.concentric.net> 
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>  <14891.46227.785856.307437@anthem.concentric.net> 
Message-ID: <20001205033118.9135CA817@darjeeling.zadka.site.co.il>

> I'm still catching up on several hundred emails over the weekend.  I
> had a sneaking suspicion that infindattr wasn't thread-safe, so I'm
> convinced this is a bug in the implementation.  One approach might be
> to store the info in the thread state object

I don't think this is a good idea -- continuations and coroutines might
mess it up. Maybe the right thing is to mess with the *compilation* of
__findattr__ so that it would call __setattr__ and __getattr__ with
special flags that stop them from calling __findattr__? This is 
ugly, but I can't think of a better way.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tismer at tismer.com  Mon Dec  4 19:35:19 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 04 Dec 2000 20:35:19 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>  <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il>
Message-ID: <3A2BE3E7.60A8E220@tismer.com>


Moshe Zadka wrote:
> 
> > I'm still catching up on several hundred emails over the weekend.  I
> > had a sneaking suspicion that infindattr wasn't thread-safe, so I'm
> > convinced this is a bug in the implementation.  One approach might be
> > to store the info in the thread state object
> 
> I don't think this is a good idea -- continuations and coroutines might
> mess it up. Maybe the right thing is to mess with the *compilation* of
> __findattr__ so that it would call __setattr__ and __getattr__ with
> special flags that stop them from calling __findattr__? This is
> ugly, but I can't think of a better way.

Yeah, this is what I tried to say by "different machine state";
compiling different behavior in the case of a special method
is an interesting idea. It is somewhat limited, since the
changed system state is not inherited by called functions.
But if __findattr__ performs its one, single task in its
body alone, we are fine.

still-thinking-of-alternatives - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From tismer at tismer.com  Mon Dec  4 19:52:43 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 04 Dec 2000 20:52:43 +0200
Subject: [Python-Dev] A house upon the sand
References: <000201c05d8c$e7a15b10$060210ac@private>
Message-ID: <3A2BE7FB.831F2F93@tismer.com>


Barry Scott wrote:
> 
> I fully support Greg Wards view. If string was removed I'd not
> update the old code but add in my own string module.
> 
> Given the effort you guys went to to keep the C extension protocol the
> same (in the context of crashing on importing a 1.5 dll into 2.0) I
> amazed you think that string could be removed...
> 
> Could you split the lib into blessed and backward compatibility sections?
> Then by some suitable mechanism I can choose the compatibility I need?
> 
> Oh and as for join obviously a method of a list...
> 
>         ['thats','better'].join(' ')

The above is how it is defined for JavaScript. But in
JavaScript, the list join method performs an implicit str()
on the list elements.
As has been discussed some time ago, Python's lists are
too versatile to justify a string-centric method.

Marc André pointed out that one could do a reduction with the
semantics of the "+" operator, but Guido said that he wouldn't
like to see

      [2, 3, 5].join(7)

being reduced to 2+7+3+7+5 == 24.
That could only be avoided if there were a way to distinguish
numeric addition from concatenation.
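A sketch of such a generalized join with "+"-operator semantics, as in the example above; join_with is a hypothetical name, not a proposed API:

```python
def join_with(sep, items):
    # Reduce with the "+" operator: natural for strings, but for
    # numbers it silently turns into addition -- Guido's objection.
    it = iter(items)
    result = next(it)
    for item in it:
        result = result + sep + item
    return result

# join_with(' ', ['thats', 'better']) gives 'thats better', while
# join_with(7, [2, 3, 5]) gives 2+7+3+7+5 == 24.
```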

but-I-could-live-with-it - ly y'rs - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From barry at digicool.com  Mon Dec  4 22:23:00 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 4 Dec 2000 16:23:00 -0500
Subject: [Python-Dev] PEP 231, __findattr__() 
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
	<14891.46227.785856.307437@anthem.concentric.net>
	<20001205033118.9135CA817@darjeeling.zadka.site.co.il>
Message-ID: <14892.2868.982013.313562@anthem.concentric.net>

>>>>> "CT" == Christian Tismer <tismer at tismer.com> writes:

    CT> You want most probably do this: __findattr__ should not be
    CT> invoked again for this instance, with this attribute name, for
    CT> this "thread", until you are done.

First, I think the rule should be "__findattr__ should not be invoked
again for this instance, in this thread, until you are done".
I.e. once in __findattr__, you want all subsequent attribute
references to bypass findattr, because presumably, your instance now
has complete control for all accesses in this thread.  You don't want
to limit it to just the currently named attribute.

Second, if "this thread" is defined as _PyThreadState_Current, then we
have a simple solution, as I mapped out earlier.  We do a
PyThreadState_GetDict() and store the instance in that dict on entry
to __findattr__ and remove it on exit from __findattr__.  If the
instance can be found in the current thread's dict, we bypass
__findattr__.

>>>>> "MZ" == Moshe Zadka <moshez at zadka.site.co.il> writes:

    MZ> I don't think this is a good idea -- continuations and
    MZ> coroutines might mess it up.

You might be right, but I'm not sure.

If we make __findattr__ thread safe according to the definition above,
and if uthread/coroutine/continuation safety can be accomplished by
the __findattr__ programmer's discipline, then I think that is enough.
IOW, if we can tell the __findattr__ author to not relinquish the
uthread explicitly during the __findattr__ call, we're cool.  Oh, and
as long as we're not somehow substantially reducing the utility of
__findattr__ by making that restriction.

What I worry about is re-entrancy that isn't under the programmer's
control, like the Real Thread-safety problem.

-Barry



From barry at digicool.com  Mon Dec  4 23:58:33 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 4 Dec 2000 17:58:33 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
	<14891.46227.785856.307437@anthem.concentric.net>
	<20001205033118.9135CA817@darjeeling.zadka.site.co.il>
	<14892.2868.982013.313562@anthem.concentric.net>
	<3A2C0E0D.E042D026@tismer.com>
Message-ID: <14892.8601.41178.81475@anthem.concentric.net>

>>>>> "CT" == Christian Tismer <tismer at tismer.com> writes:

    CT> Hmm. WHat do you think about Moshe's idea to change compiling
    CT> of the method? It has the nice advantage that there are no
    CT> Thread-safety problems by design. The only drawback is that
    CT> the contract of not-calling-myself only holds for this
    CT> function.

I'm not sure I understand what Moshe was proposing.  Moshe: are you
saying that we should change the way the compiler works, so that it
somehow recognizes this special case?  I'm not sure I like that
approach.  I think I want something more runtime-y, but I'm not sure
why (maybe just because I'm more comfortable mucking about in the
run-time than in the compiler).

-Barry



From guido at python.org  Tue Dec  5 00:16:17 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 04 Dec 2000 18:16:17 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: Your message of "Mon, 04 Dec 2000 16:23:00 EST."
             <14892.2868.982013.313562@anthem.concentric.net> 
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il>  
            <14892.2868.982013.313562@anthem.concentric.net> 
Message-ID: <200012042316.SAA23081@cj20424-a.reston1.va.home.com>

I'm unconvinced by the __findattr__ proposal as it now stands.

- Do you really think that JimF would do away with ExtensionClasses if
  __findattr__ was introduced?  I kinda doubt it.  See [*footnote].
  It seems that *using* __findattr__ is expensive (even if *not* using
  is cheap :-).

- Why is deletion not supported?  What if you want to enforce a policy
  on deletions too?

- It's ugly to use the same call for get and set.  The examples
  indicate that it's not such a great idea: every example has *two*
  tests whether it's get or set.  To share a policy, the proper thing
  to do is to write a method that either get or set can use.

- I think it would be sufficient to *only* use __findattr__ for
  getattr -- __setattr__ and __delattr__ already have full control.
  The "one routine to implement the policy" argument doesn't really
  hold, I think.

- The PEP says that the "in-findattr" flag is set on the instance.
  We've already determined that this is not thread-safe.  This is not
  just a bug in the implementation -- it's a bug in the specification.
  I also find it ugly.  But if we decide to do this, it can go in the
  thread-state -- if we ever add coroutines, we have to decide on what
  stuff to move from the thread state to the coroutine state anyway.

- It's also easy to conceive situations where recursive __findattr__
  calls on the same instance in the same thread/coroutine are
  perfectly desirable -- e.g. when __findattr__ ends up calling a
  method that uses a lot of internal machinery of the class.  You
  don't want all the machinery to have to be aware of the fact that it
  may be called with __findattr__ on the stack and without it.  So
  perhaps it may be better to only treat the body of __findattr__
  itself special, as Moshe suggested.  What does Jython do here?

- The code examples require a *lot* of effort to understand.  These
  are complicated issues!  (I rewrote the Bean example using
  __getattr__ and __setattr__ and found no need for __findattr__; the
  __getattr__ version is simpler and easier to understand.  I'm still
  studying the other __findattr__ examples.)

- The PEP really isn't that long, except for the code examples.  I
  recommend reading the patch first -- the patch is probably shorter
  than any specification of the feature can be.

--Guido van Rossum (home page: http://www.python.org/~guido/)

[*footnote]

  There's an easy way (that few people seem to know) to cause
  __getattr__ to be called for virtually all attribute accesses: put
  *all* (user-visible) attributes in a separate dictionary.  If you want
  to prevent access to this dictionary too (for Zope security
  enforcement), make it a global indexed by id() -- a
  destructor(__del__) can take care of deleting entries here.



From martin at loewis.home.cs.tu-berlin.de  Tue Dec  5 00:10:43 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 5 Dec 2000 00:10:43 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <14891.46227.785856.307437@anthem.concentric.net>
	(barry@digicool.com)
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net>
Message-ID: <200012042310.AAA00786@loewis.home.cs.tu-berlin.de>

> I'm still catching up on several hundred emails over the weekend.  I
> had a sneaking suspicion that infindattr wasn't thread-safe, so I'm
> convinced this is a bug in the implementation.  One approach might be
> to store the info in the thread state object (isn't that how the
> recursive repr stop flag is stored?)

Whether this works depends on how exactly the info is stored. A single
flag won't be sufficient, since multiple objects may have __findattr__
in progress in a given thread. With a set of instances, it would work,
though.

Regards,
Martin



From martin at loewis.home.cs.tu-berlin.de  Tue Dec  5 00:13:15 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 5 Dec 2000 00:13:15 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <20001205033118.9135CA817@darjeeling.zadka.site.co.il> (message
	from Moshe Zadka on Tue, 05 Dec 2000 05:31:18 +0200)
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>  <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il>
Message-ID: <200012042313.AAA00832@loewis.home.cs.tu-berlin.de>

> I don't think this is a good idea -- continuations and coroutines
> might mess it up.

If coroutines and continuations operate preemptively, then
they should present themselves as an implementation of the thread API;
perhaps the thread API needs to be extended to allow for such a feature.

If yielding control is in the hands of the implementation, it would be
easy to rule out a context switch while findattr is in progress.

Regards,
Martin




From martin at loewis.home.cs.tu-berlin.de  Tue Dec  5 00:19:37 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 5 Dec 2000 00:19:37 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <14892.8601.41178.81475@anthem.concentric.net>
	(barry@digicool.com)
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
	<14891.46227.785856.307437@anthem.concentric.net>
	<20001205033118.9135CA817@darjeeling.zadka.site.co.il>
	<14892.2868.982013.313562@anthem.concentric.net>
	<3A2C0E0D.E042D026@tismer.com> <14892.8601.41178.81475@anthem.concentric.net>
Message-ID: <200012042319.AAA00877@loewis.home.cs.tu-berlin.de>

> I'm not sure I understand what Moshe was proposing.  Moshe: are you
> saying that we should change the way the compiler works, so that it
> somehow recognizes this special case?  I'm not sure I like that
> approach.  I think I want something more runtime-y, but I'm not sure
> why (maybe just because I'm more comfortable mucking about in the
> run-time than in the compiler).

I guess you are also uncomfortable with the problem that the
compile-time analysis cannot "see" through levels of indirection.
E.g. if findattr is implemented as

   return self.compute_attribute(real_attribute)

then compile-time analysis could figure out to call compute_attribute
directly. However, that method may be implemented as 

  def compute_attribute(self,name):
    return self.mapping[name]

where the access to mapping could not be detected statically.

Regards,
Martin




From tismer at tismer.com  Mon Dec  4 22:35:09 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 04 Dec 2000 23:35:09 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
		<14891.46227.785856.307437@anthem.concentric.net>
		<20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net>
Message-ID: <3A2C0E0D.E042D026@tismer.com>


"Barry A. Warsaw" wrote:
> 
> >>>>> "CT" == Christian Tismer <tismer at tismer.com> writes:
> 
>     CT> You most probably want to do this: __findattr__ should not be
>     CT> invoked again for this instance, with this attribute name, for
>     CT> this "thread", until you are done.
> 
> First, I think the rule should be "__findattr__ should not be invoked
> again for this instance, in this thread, until you are done".

Maybe this is better. Surely easier. :)

[ThreadState solution - well fine so far]

>     MZ> I don't think this is a good idea -- continuations and
>     MZ> coroutines might mess it up.
> 
> You might be right, but I'm not sure.
> 
> If we make __findattr__ thread safe according to the definition above,
> and if uthread/coroutine/continuation safety can be accomplished by
> the __findattr__ programmer's discipline, then I think that is enough.
> IOW, if we can tell the __findattr__ author to not relinquish the
> uthread explicitly during the __findattr__ call, we're cool.  Oh, and
> as long as we're not somehow substantially reducing the utility of
> __findattr__ by making that restriction.
> 
> What I worry about is re-entrancy that isn't under the programmer's
> control, like the Real Thread-safety problem.

Hmm. What do you think about Moshe's idea of changing the compilation
of the method? It has the nice advantage that there are no
thread-safety problems by design. The only drawback is that
the contract of not-calling-myself only holds for this function.

I don't know how ThreadState scales up when more things
like these are invented. Well, for the moment, the simple solution
with Stackless would just be to let the interpreter recurse
in this call, the same as it happens during __init__ and
anything else that isn't easily turned into tail-recursion.
It just blocks :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From barry at digicool.com  Tue Dec  5 03:54:23 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 4 Dec 2000 21:54:23 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
	<14891.46227.785856.307437@anthem.concentric.net>
	<20001205033118.9135CA817@darjeeling.zadka.site.co.il>
	<14892.2868.982013.313562@anthem.concentric.net>
	<200012042316.SAA23081@cj20424-a.reston1.va.home.com>
Message-ID: <14892.22751.921264.156010@anthem.concentric.net>

>>>>> "GvR" == Guido van Rossum <guido at python.org> writes:

    GvR> - Do you really think that JimF would do away with
    GvR> ExtensionClasses if __findattr__ was introduced?  I kinda
    GvR> doubt it.  See [*footnote].  It seems that *using*
    GvR> __findattr__ is expensive (even if *not* using is cheap :-).

That's not even the real reason why JimF wouldn't stop using
ExtensionClass.  He's already got too much code invested in EC.
However EC can be a big pill to swallow for some applications because
it's a C extension (and because it has some surprising non-Pythonic
side effects).  In those situations, a pure Python approach, even
though slower, is useful.

    GvR> - Why is deletion not supported?  What if you want to enforce
    GvR> a policy on deletions too?

It could be, without much work.

    GvR> - It's ugly to use the same call for get and set.  The
    GvR> examples indicate that it's not such a great idea: every
    GvR> example has *two* tests whether it's get or set.  To share a
    GvR> policy, the proper thing to do is to write a method that
    GvR> either get or set can use.

I don't have strong feelings either way.

    GvR> - I think it would be sufficient to *only* use __findattr__
    GvR> for getattr -- __setattr__ and __delattr__ already have full
    GvR> control.  The "one routine to implement the policy" argument
    GvR> doesn't really hold, I think.

What about the ability to use "normal" x.name attribute access syntax
inside the hook?  Let me guess your answer. :)

    GvR> - The PEP says that the "in-findattr" flag is set on the
    GvR> instance.  We've already determined that this is not
    GvR> thread-safe.  This is not just a bug in the implementation --
    GvR> it's a bug in the specification.  I also find it ugly.  But
    GvR> if we decide to do this, it can go in the thread-state -- if
    GvR> we ever add coroutines, we have to decide on what stuff to
    GvR> move from the thread state to the coroutine state anyway.

Right.  That's where we've ended up in subsequent messages on this thread.

    GvR> - It's also easy to conceive situations where recursive
    GvR> __findattr__ calls on the same instance in the same
    GvR> thread/coroutine are perfectly desirable -- e.g. when
    GvR> __findattr__ ends up calling a method that uses a lot of
    GvR> internal machinery of the class.  You don't want all the
    GvR> machinery to have to be aware of the fact that it may be
    GvR> called with __findattr__ on the stack and without it.

Hmm, okay, I don't really understand your example.  I suppose I'm
envisioning __findattr__ as a way to provide an interface to clients
of the class.  Maybe it's a bean interface, maybe it's an acquisition
interface or an access control interface.  The internal machinery has
to know something about how that interface is implemented, so whether
__findattr__ is recursive or not doesn't seem to enter into it.

And also, allowing __findattr__ to be recursive will just impose
different constraints on the internal machinery methods, just like
__setattr__ currently does.  I.e. you better know that you're in
__setattr__ and not do self.name type things, or you'll recurse
forever. 

    GvR> So perhaps it may be better to only treat the body of
    GvR> __findattr__ itself special, as Moshe suggested.

Maybe I'm being dense, but I'm not sure exactly what this means, or
how you would do this.
    
    GvR> What does Jython do here?

It's not exactly equivalent, because Jython's __findattr__ can't call
back into Python.

    GvR> - The code examples require a *lot* of effort to understand.
    GvR> These are complicated issues!  (I rewrote the Bean example
    GvR> using __getattr__ and __setattr__ and found no need for
    GvR> __findattr__; the __getattr__ version is simpler and easier
    GvR> to understand.  I'm still studying the other __findattr__
    GvR> examples.)

Is it simpler because you separated out the set and get behavior?  If
__findattr__ only did getting, I think it would be a lot simpler too
(but I'd still be interested in seeing your __getattr__ only
example).  The acquisition examples are complicated because I wanted
to support the same interface that EC's acquisition classes support.
All that detail isn't necessary for example code.

    GvR> - The PEP really isn't that long, except for the code
    GvR> examples.  I recommend reading the patch first -- the patch
    GvR> is probably shorter than any specification of the feature can
    GvR> be.

Would it be more helpful to remove the examples?  If so, where would
you put them?  It's certainly useful to have examples someplace I
think.

    GvR>   There's an easy way (that few people seem to know) to cause
    GvR> __getattr__ to be called for virtually all attribute
    GvR> accesses: put *all* (user-visible) attributes in a separate
    GvR> dictionary.  If you want to prevent access to this dictionary
    GvR> too (for Zope security enforcement), make it a global indexed
    GvR> by id() -- a destructor(__del__) can take care of deleting
    GvR> entries here.

Presumably that'd be a module global, right?  Maybe within Zope that
could be protected, but outside of that, that global's always going to
be accessible.  So are methods, even if given private names.  And I
don't think that such code would be any more readable since instead of
self.name you'd see stuff like

    def __getattr__(self, name):
        global instdict
	mydict = instdict[id(self)]
	obj = mydict[name]
	...

    def __setattr__(self, name, val):
	global instdict
	mydict = instdict[id(self)]
	mydict[name] = val
	...

and that /might/ be a problem with Jython currently, because id()'s
may be reused.  And relying on __del__ may have unfortunate side
effects when viewed in conjunction with garbage collection.

You're probably still unconvinced <wink>, but are you dead-set against
it?  I can try implementing __findattr__() as a pre-__getattr__ hook
only.  Then we can live with the current __setattr__() restrictions
and see what the examples look like in that situation.

-Barry



From guido at python.org  Tue Dec  5 13:54:20 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 05 Dec 2000 07:54:20 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: Your message of "Mon, 04 Dec 2000 21:54:23 EST."
             <14892.22751.921264.156010@anthem.concentric.net> 
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com>  
            <14892.22751.921264.156010@anthem.concentric.net> 
Message-ID: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>

> >>>>> "GvR" == Guido van Rossum <guido at python.org> writes:
> 
>     GvR> - Do you really think that JimF would do away with
>     GvR> ExtensionClasses if __findattr__ was introduced?  I kinda
>     GvR> doubt it.  See [*footnote].  It seems that *using*
>     GvR> __findattr__ is expensive (even if *not* using is cheap :-).
> 
> That's not even the real reason why JimF wouldn't stop using
> ExtensionClass.  He's already got too much code invested in EC.
> However EC can be a big pill to swallow for some applications because
> it's a C extension (and because it has some surprising non-Pythonic
> side effects).  In those situations, a pure Python approach, even
> though slower, is useful.

Agreed.  But I'm still hoping to find the silver bullet that lets Jim
(and everybody else) do what ExtensionClass does without needing
another extension.

>     GvR> - Why is deletion not supported?  What if you want to enforce
>     GvR> a policy on deletions too?
> 
> It could be, without much work.

Then it should be -- except I prefer to do only getattr anyway, see
below.

>     GvR> - It's ugly to use the same call for get and set.  The
>     GvR> examples indicate that it's not such a great idea: every
>     GvR> example has *two* tests whether it's get or set.  To share a
>     GvR> policy, the proper thing to do is to write a method that
>     GvR> either get or set can use.
> 
> I don't have strong feelings either way.

What does Jython do?  I thought it only did get (hence the name :-).
I think there's no *need* for findattr to catch the setattr operation,
because __setattr__ *already* gets invoked on each set not just ones
where the attr doesn't yet exist.

>     GvR> - I think it would be sufficient to *only* use __findattr__
>     GvR> for getattr -- __setattr__ and __delattr__ already have full
>     GvR> control.  The "one routine to implement the policy" argument
>     GvR> doesn't really hold, I think.
> 
> What about the ability to use "normal" x.name attribute access syntax
> inside the hook?  Let me guess your answer. :)

Aha!  You got me there.  Clearly the REAL reason for wanting
__findattr__ is the no-recursive-calls rule -- which is also the most
uncooked feature...  Traditional getattr hooks don't need this as much
because they don't get called when the attribute already exists;
traditional setattr hooks deal with it by switching on the attribute
name.  The no-recursive-calls rule certainly SEEMS an attractive way
around this.  But I'm not sure that it really is...

I need to get my head around this more.  (The only reason I'm still
posting this reply is to test the new mailing lists setup via
mail.python.org.)

>     GvR> - The PEP says that the "in-findattr" flag is set on the
>     GvR> instance.  We've already determined that this is not
>     GvR> thread-safe.  This is not just a bug in the implementation --
>     GvR> it's a bug in the specification.  I also find it ugly.  But
>     GvR> if we decide to do this, it can go in the thread-state -- if
>     GvR> we ever add coroutines, we have to decide on what stuff to
>     GvR> move from the thread state to the coroutine state anyway.
> 
> Right.  That's where we've ended up in subsequent messages on this thread.
> 
>     GvR> - It's also easy to conceive situations where recursive
>     GvR> __findattr__ calls on the same instance in the same
>     GvR> thread/coroutine are perfectly desirable -- e.g. when
>     GvR> __findattr__ ends up calling a method that uses a lot of
>     GvR> internal machinery of the class.  You don't want all the
>     GvR> machinery to have to be aware of the fact that it may be
>     GvR> called with __findattr__ on the stack and without it.
> 
> Hmm, okay, I don't really understand your example.  I suppose I'm
> envisioning __findattr__ as a way to provide an interface to clients
> of the class.  Maybe it's a bean interface, maybe it's an acquisition
> interface or an access control interface.  The internal machinery has
> to know something about how that interface is implemented, so whether
> __findattr__ is recursive or not doesn't seem to enter into it.

But the class is also a client of itself, and not all cases where it
is a client of itself are inside a findattr call.  Take your bean
example.  Suppose your bean class also has a spam() method.  The
findattr code needs to account for this, e.g.:

    def __findattr__(self, name, *args):
	if name == "spam" and not args:
	    return self.spam
	...original body here...

Or you have to add a _get_spam() method:

    def _get_spam(self):
	return self.spam

Either solution gets tedious if there are a lot of methods; instead,
findattr could check if the attr is defined on the class, and then
return that:

    def __findattr__(self, name, *args):
        if not args and name[0] != '_' and hasattr(self.__class__, name):
	    return getattr(self, name)
	...original body here...

Anyway, let's go back to the spam method.  Suppose it references
self.foo.  The findattr machinery will access it.  Fine.  But now
consider another attribute (bar) with _set_bar() and _get_bar()
methods that do a little more.  Maybe bar is really calculated from
the value of self.foo.  Then _get_bar cannot use self.foo (because
it's inside findattr so findattr won't resolve it, and self.foo
doesn't actually exist on the instance) so it has to use self.__myfoo.
Fine -- after all this is inside a _get_* handler, which knows it's
being called from findattr.  But what if, instead of needing self.foo,
_get_bar wants to call self.spam()?  Then self.spam() is
being called from inside findattr, so when it accesses self.foo,
findattr isn't used -- and it fails with an AttributeError!

Sorry for the long detour, but *that's* the problem I was referring
to.  I think the scenario is quite realistic.

> And also, allowing __findattr__ to be recursive will just impose
> different constraints on the internal machinery methods, just like
> __setattr__ currently does.  I.e. you better know that you're in
> __setattr__ and not do self.name type things, or you'll recurse
> forever. 

Actually, this is usually solved by having __setattr__ check for
specific names only, and for others do self.__dict__[name] = value;
that way, recursive __setattr__ calls are okay.  Similar for
__getattr__ (which has to raise AttributeError for unrecognized
names).
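The convention Guido describes can be sketched like this (modern Python; the class and attribute names are made up for illustration):

```python
class Policy:
    def __setattr__(self, name, value):
        if name == 'loud':                 # only specific names get policy
            self.__dict__[name] = value * 2
        else:
            self.__dict__[name] = value    # plain store: no recursion

    def __getattr__(self, name):
        # Only invoked when normal lookup fails; unrecognized names must
        # raise AttributeError so hasattr() and friends keep working.
        if name == 'answer':
            return 42
        raise AttributeError(name)

p = Policy()
p.loud = 10
print(p.loud, p.answer)  # 20 42
```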

>     GvR> So perhaps it may be better to only treat the body of
>     GvR> __findattr__ itself special, as Moshe suggested.
> 
> Maybe I'm being dense, but I'm not sure exactly what this means, or
> how you would do this.

Read Moshe's messages (and Martin's replies) again.  I don't care that
much for it so I won't explain it again.

>     GvR> What does Jython do here?
> 
> It's not exactly equivalent, because Jython's __findattr__ can't call
> back into Python.

I'd say that Jython's __findattr__ is an entirely different beast than
what we have here.  Its main purpose in life appears to be a
getattr equivalent that returns NULL instead of raising an exception
when the attribute isn't found -- which is reasonable because from
within Java, testing for null is much cheaper than checking for an
exception, and you often need to check whether a given attribute exists
and do some default action if not.  (In fact, I'd say that CPython could
also use a findattr of this kind...)

This is really too bad.  Based on the name similarity and things I
thought you'd said in private before, I thought that they would be
similar.  Then the experience with Jython would be a good argument for
adding a findattr hook to CPython.  But now that they are totally
different beasts it doesn't help at all.

>     GvR> - The code examples require a *lot* of effort to understand.
>     GvR> These are complicated issues!  (I rewrote the Bean example
>     GvR> using __getattr__ and __setattr__ and found no need for
>     GvR> __findattr__; the __getattr__ version is simpler and easier
>     GvR> to understand.  I'm still studying the other __findattr__
>     GvR> examples.)
> 
> Is it simpler because you separated out the set and get behavior?  If
> __findattr__ only did getting, I think it would be a lot simpler too
> (but I'd still be interested in seeing your __getattr__ only
> example).

Here's my getattr example.  It's more lines of code, but cleaner IMHO:

    class Bean:
	def __init__(self, x):
	    self.__myfoo = x

	def __isprivate(self, name):
	    return name.startswith('_')

	def __getattr__(self, name):
	    if self.__isprivate(name):
		raise AttributeError, name
	    return getattr(self, "_get_" + name)()

	def __setattr__(self, name, value):
	    if self.__isprivate(name):
		self.__dict__[name] = value
	    else:
		return getattr(self, "_set_" + name)(value)

	def _set_foo(self, x):
	    self.__myfoo = x

	def _get_foo(self):
	    return self.__myfoo


    b = Bean(3)
    print b.foo
    b.foo = 9
    print b.foo

> The acquisition examples are complicated because I wanted
> to support the same interface that EC's acquisition classes support.
> All that detail isn't necessary for example code.

I *still* have to study the examples... :-(  Will do next.

>     GvR> - The PEP really isn't that long, except for the code
>     GvR> examples.  I recommend reading the patch first -- the patch
>     GvR> is probably shorter than any specification of the feature can
>     GvR> be.
> 
> Would it be more helpful to remove the examples?  If so, where would
> you put them?  It's certainly useful to have examples someplace I
> think.

No, my point is that the examples need more explanation.  Right now
the EC example is over 200 lines of brain-exploding code! :-)

>     GvR>   There's an easy way (that few people seem to know) to cause
>     GvR> __getattr__ to be called for virtually all attribute
>     GvR> accesses: put *all* (user-visible) attributes in a separate
>     GvR> dictionary.  If you want to prevent access to this dictionary
>     GvR> too (for Zope security enforcement), make it a global indexed
>     GvR> by id() -- a destructor(__del__) can take care of deleting
>     GvR> entries here.
> 
> Presumably that'd be a module global, right?  Maybe within Zope that
> could be protected,

Yes.

> but outside of that, that global's always going to
> be accessible.  So are methods, even if given private names.

Aha!  Another thing that I expect has been on your agenda for a long
time, but which isn't explicit in the PEP (AFAICT): findattr gives
*total* control over attribute access, unlike __getattr__ and
__setattr__ and private name mangling, which can all be defeated.

And this may be one of the things that Jim is after with
ExtensionClasses in Zope.  Although I believe that in DTML, he doesn't
trust this: he uses source-level (or bytecode-level) transformations
to turn all X.Y operations into a call into a security manager.

So I'm not sure that the argument is very strong.

> And I
> don't think that such code would be any more readable since instead of
> self.name you'd see stuff like
> 
>     def __getattr__(self, name):
>         global instdict
> 	mydict = instdict[id(self)]
> 	obj = mydict[name]
> 	...
> 
>     def __setattr__(self, name, val):
> 	global instdict
> 	mydict = instdict[id(self)]
> 	mydict[name] = val
> 	...
> 
> and that /might/ be a problem with Jython currently, because id()'s
> may be reused.  And relying on __del__ may have unfortunate side
> effects when viewed in conjunction with garbage collection.

Fair enough.  I withdraw the suggestion, and propose restricted
execution instead.  There, you can use Bastions -- which have problems
of their own, but you do get total control.

> You're probably still unconvinced <wink>, but are you dead-set against
> it?  I can try implementing __findattr__() as a pre-__getattr__ hook
> only.  Then we can live with the current __setattr__() restrictions
> and see what the examples look like in that situation.

I am dead-set against introducing a feature that I don't fully
understand.  Let's continue this discussion.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From bckfnn at worldonline.dk  Tue Dec  5 16:40:10 2000
From: bckfnn at worldonline.dk (Finn Bock)
Date: Tue, 05 Dec 2000 15:40:10 GMT
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com>   <14892.22751.921264.156010@anthem.concentric.net>  <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
Message-ID: <3a2d0c29.242749@smtp.worldonline.dk>

On Tue, 05 Dec 2000 07:54:20 -0500, you wrote:

>>     GvR> What does Jython do here?
>> 
>> It's not exactly equivalent, because Jython's __findattr__ can't call
>> back into Python.
>
>I'd say that Jython's __findattr__ is an entirely different beast than
>what we have here.  Its main purpose in life appears to be a
>getattr equivalent that returns NULL instead of raising an exception
>when the attribute isn't found -- which is reasonable because from
>within Java, testing for null is much cheaper than checking for an
>exception, and you often need to check whether a given attribute exists
>and do some default action if not. 

Correct. It is also the method to override when making a new builtin
type and it will be called on such a type subclass regardless of the
presence of any __getattr__ hook and __dict__ content. So I think it
has some of the properties which Barry wants.


regards,
finn



From greg at cosc.canterbury.ac.nz  Wed Dec  6 00:07:06 2000
From: greg at cosc.canterbury.ac.nz (greg at cosc.canterbury.ac.nz)
Date: Wed, 06 Dec 2000 12:07:06 +1300 (NZDT)
Subject: Are you all mad? (Re: [Python-Dev] PEP 231, __findattr__())
In-Reply-To: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
Message-ID: <200012052307.MAA01082@s454.cosc.canterbury.ac.nz>

I can't believe you're even considering a magic
dynamically-scoped flag that invisibly changes the
semantics of fundamental operations. To me the
idea is utterly insane!

If I understand correctly, the problem is that if
you do something like

  def __findattr__(self, name):
    if name == 'spam':
      return self.__dict__['spam']

then self.__dict__ is going to trigger a recursive
__findattr__ call. 

It seems to me that if you're going to have some sort
of hook that is always called on any x.y reference,
you need some way of explicitly bypassing it and getting
at the underlying machinery.

I can think of a couple of ways:

1) Make the __dict__ attribute special, so that accessing
it always bypasses __findattr__.

2) Provide some other way of getting direct access to the
attributes of an object, e.g. new builtins called
peekattr() and pokeattr().

This assumes that you always know when you write a particular
access whether you want it to be a "normal" or "special"
one, so that you can use the appropriate mechanism.
Are there any cases where this is not true?
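Greg's option (2) could be approximated in plain Python along these lines (`peekattr` and `pokeattr` are his proposed names; the implementation here is only a sketch of the intended semantics):

```python
# Hypothetical builtins: direct access to an object's attribute storage,
# bypassing any __getattr__/__setattr__ (or proposed __findattr__) hooks.
def peekattr(obj, name):
    try:
        return obj.__dict__[name]
    except KeyError:
        raise AttributeError(name)

def pokeattr(obj, name, value):
    obj.__dict__[name] = value

class Noisy:
    def __setattr__(self, name, value):
        print('hook saw', name)
        self.__dict__[name] = value

n = Noisy()
n.x = 1              # goes through the hook
pokeattr(n, 'y', 2)  # bypasses it: nothing printed
print(peekattr(n, 'x'), peekattr(n, 'y'))  # 1 2
```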

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From barry at digicool.com  Wed Dec  6 03:20:40 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 5 Dec 2000 21:20:40 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
	<14891.46227.785856.307437@anthem.concentric.net>
	<20001205033118.9135CA817@darjeeling.zadka.site.co.il>
	<14892.2868.982013.313562@anthem.concentric.net>
	<200012042316.SAA23081@cj20424-a.reston1.va.home.com>
	<14892.22751.921264.156010@anthem.concentric.net>
	<200012051254.HAA25502@cj20424-a.reston1.va.home.com>
	<3a2d0c29.242749@smtp.worldonline.dk>
Message-ID: <14893.41592.701128.58110@anthem.concentric.net>

>>>>> "FB" == Finn Bock <bckfnn at worldonline.dk> writes:

    FB> Correct. It is also the method to override when making a new
    FB> builtin type and it will be called on such a type subclass
    FB> regardless of the presence of any __getattr__ hook and
    FB> __dict__ content. So I think it has some of the properties
    FB> which Barry wants.

We had a discussion about this PEP at our group meeting today.  Rather
than write it all twice, I'm going to try to update the PEP and patch
tonight.  I think what we came up with will solve most of the problems
raised, and will be implementable in Jython (I'll try to work up a
Jython patch too, if I don't fall asleep first :)

-Barry



From barry at digicool.com  Wed Dec  6 03:54:36 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 5 Dec 2000 21:54:36 -0500
Subject: Are you all mad? (Re: [Python-Dev] PEP 231, __findattr__())
References: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
	<200012052307.MAA01082@s454.cosc.canterbury.ac.nz>
Message-ID: <14893.43628.61063.905227@anthem.concentric.net>

>>>>> "greg" ==   <greg at cosc.canterbury.ac.nz> writes:

    | 1) Make the __dict__ attribute special, so that accessing
    | it always bypasses __findattr__.

You're not far from what I came up with right after our delicious
lunch.  We're going to invent a new protocol which passes __dict__
into the method as an argument.  That way self.__dict__ doesn't need
to be special cased at all because you can get at all the attributes
via a local!  So no recursion stop hack is necessary.
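[Editor's note: PEP 231's __findattr__ was ultimately rejected, but the recursion problem Barry describes survives in today's __getattribute__ hook, and the "get at the attributes via a local" trick can be sketched there. This is an illustration in modern Python, not the protocol Barry actually implemented:]

```python
class Hooked:
    def __getattribute__(self, name):
        # Fetch the instance dict once through object's machinery, then
        # work only with the local 'd': no further self.<attr> lookups,
        # hence no recursion back into this hook.
        d = object.__getattribute__(self, "__dict__")
        if name in d:
            return ("hooked", d[name])
        return object.__getattribute__(self, name)

h = Hooked()
h.__dict__["x"] = 42
assert h.x == ("hooked", 42)
```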

More in the updated PEP and patch.

-Barry



From dgoodger at bigfoot.com  Thu Dec  7 05:33:33 2000
From: dgoodger at bigfoot.com (David Goodger)
Date: Wed, 06 Dec 2000 23:33:33 -0500
Subject: [Python-Dev] unit testing and Python regression test
Message-ID: <B6547D4C.BE96%dgoodger@bigfoot.com>

There is another unit testing implementation out there, OmPyUnit, available
from:

    http://www.objectmentor.com/freeware/downloads.html

-- 
David Goodger    dgoodger at bigfoot.com    Open-source projects:
 - The Go Tools Project: http://gotools.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net (soon!)




From fdrake at users.sourceforge.net  Thu Dec  7 07:26:54 2000
From: fdrake at users.sourceforge.net (Fred L. Drake)
Date: Wed, 6 Dec 2000 22:26:54 -0800
Subject: [Python-Dev] [development doc updates]
Message-ID: <200012070626.WAA22103@orbital.p.sourceforge.net>

The development version of the documentation has been updated:

	http://python.sourceforge.net/devel-docs/


Lots of small changes, but most important, more DOM documentation:

	http://python.sourceforge.net/devel-docs/lib/module-xml.dom.html



From guido at python.org  Thu Dec  7 18:48:53 2000
From: guido at python.org (Guido van Rossum)
Date: Thu, 07 Dec 2000 12:48:53 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
Message-ID: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>

After perusing David Ascher's proposal, several versions of his
patches, and hundreds of emails exchanged on this subject (almost all
of them dated April or May of 1998), I've produced a reasonable
semblance of PEP 207.  Get it from CVS or here on the web:

  http://python.sourceforge.net/peps/pep-0207.html

I'd like to hear your comments, praise, and criticisms!

The PEP still needs work; in particular, the minority point of view
back then (that comparisons should return only Boolean results) is not
adequately represented (but I *did* work in a reference to tabnanny,
to ensure Tim's support :-).
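[Editor's note: for readers who haven't seen the proposal, the core idea, as it later landed in 2.1, is that each comparison operator can be overloaded separately and may return an arbitrary object rather than just a Boolean. A minimal sketch:]

```python
class Vec:
    def __init__(self, data):
        self.data = data

    def __lt__(self, other):
        # A rich comparison may return any object, not just a Boolean:
        # here, an elementwise list of results (the NumPy use case).
        return [a < b for a, b in zip(self.data, other.data)]

assert (Vec([1, 5]) < Vec([2, 3])) == [True, False]
```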

I'd like to work on a patch next, but I think there will be
interference with Neil's coercion patch.  I'm not sure how to resolve
that yet; maybe I'll just wait until Neil's coercion patch is checked
in.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Thu Dec  7 18:54:51 2000
From: guido at python.org (Guido van Rossum)
Date: Thu, 07 Dec 2000 12:54:51 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
Message-ID: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>

I'm maybe about three quarters of the way through writing PEP 230 -- far
enough along to be asking for comments.  Get it from CVS or go to:

  http://python.sourceforge.net/peps/pep-0230.html

A prototype implementation in Python is included in the PEP; I think
this shows that the implementation is not too complex (Paul Prescod's
fear about my proposal).

This is pretty close to what I proposed earlier (Nov 5), except that I
have added warning category classes (inspired by Paul's proposal).
This class also serves as the exception to be raised when warnings are
turned into exceptions.
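[Editor's note: the category-class idea can be sketched with the warnings module as it eventually shipped. The category name below is invented for illustration; the point is that the same class serves both as the warning category and as the exception raised when the warning is promoted to an error:]

```python
import warnings

class APIDeprecationWarning(Warning):
    # The category class itself is the exception raised when this
    # warning is turned into an error.
    pass

with warnings.catch_warnings():
    warnings.simplefilter("error", APIDeprecationWarning)
    try:
        warnings.warn("old API", APIDeprecationWarning)
        caught = None
    except APIDeprecationWarning as exc:
        caught = str(exc)

assert caught == "old API"
```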

Do I need to include a discussion of Paul's counter-proposal and why I
rejected it?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From Barrett at stsci.edu  Thu Dec  7 23:49:02 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Thu,  7 Dec 2000 17:49:02 -0500 (EST)
Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays
Message-ID: <14896.1191.240597.632888@nem-srvr.stsci.edu>

What is the status of PEP 209?  I see David Ascher is the champion of
this PEP, but nothing has been written up.  Is the intention of this
PEP to make the current Numeric a built-in feature of Python or to
re-implement and replace the current Numeric module?

The reason that I ask these questions is because I'm working on a
prototype of a new N-dimensional Array module which I call Numeric 2.
This new module will be much more extensible than the current Numeric.
For example, new array types and universal functions can be loaded or
imported on demand.  We also intend to implement a record (or
C-structure) type, because 1-D arrays or lists of records are a common
data structure for storing photon events in astronomy and related
fields.

The current Numeric does not handle record types efficiently,
particularly when the data type is not aligned and is in non-native
endian format.  To handle such data, temporary arrays must be created
and alignment and byte-swapping done on them.  Numeric 2 does such
pre- and post-processing inside the inner-most loop which is more
efficient in both time and memory.  It also does type conversion at
this level which is consistent with that proposed for PEP 208.
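[Editor's note: the kind of fixed-width, non-native-endian record data Paul describes can be illustrated with the struct module. The field layout here is invented for the sketch, not taken from any real instrument format:]

```python
import struct

# Hypothetical photon-event record: big-endian uint32 timestamp plus
# two int16 detector coordinates (layout made up for illustration).
record_fmt = ">Ihh"
raw = struct.pack(record_fmt, 123456, -7, 42) * 3   # three events on the wire
events = list(struct.iter_unpack(record_fmt, raw))
assert events == [(123456, -7, 42)] * 3
```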

Since many scientific users would like direct access to the array data
via C pointers, we have investigated using the buffer object.  We have
not had much success with it, because of its implementation.  I have
scanned the python-dev mailing list for discussions of this issue and
found that it now appears to be deprecated.

My opinion on this is that a new _fundamental_ built-in type should be
created for memory allocation with features and an interface similar
to the _mmap_ object.  I'll call this a _malloc_ object.  This would
allow Numeric 2 to use either object interchangeably depending on the
circumstance.  The _string_ type could also benefit from this new
object by using a read-only version of it.  Since it's an object, its
memory area should be safe from inadvertent deletion.

Because of these and other new features in Numeric 2, I have a keen
interest in the status of PEPs 207, 208, 211, 225, and 228; and also
in the proposed buffer object.  

I'm willing to implement this new _malloc_ object if members of the
python-dev list are in agreement.  Actually I see no alternative,
given the current design of Numeric 2, since the Array class will
initially be written completely in Python and will need a mutable
memory buffer, while the _string_ type is meant to be a read-only
object.

All comments welcome.

 -- Paul

-- 
Dr. Paul Barrett       Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Group
FAX:   410-338-4767    Baltimore, MD 21218



From DavidA at ActiveState.com  Fri Dec  8 02:13:04 2000
From: DavidA at ActiveState.com (David Ascher)
Date: Thu, 7 Dec 2000 17:13:04 -0800 (Pacific Standard Time)
Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional
 Arrays
In-Reply-To: <14896.1191.240597.632888@nem-srvr.stsci.edu>
Message-ID: <Pine.WNT.4.21.0012071712410.1360-100000@loom>

On Thu, 7 Dec 2000, Paul Barrett wrote:

> What is the status of PEP 209?  I see David Ascher is the champion of
> this PEP, but nothing has been written up.  Is the intention of this

I put my name on the PEP just to make sure it wasn't forgotten.  If
someone wants to champion it, their name should go on it.

--david




From guido at python.org  Fri Dec  8 17:10:50 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 11:10:50 -0500
Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays
In-Reply-To: Your message of "Thu, 07 Dec 2000 17:49:02 EST."
             <14896.1191.240597.632888@nem-srvr.stsci.edu> 
References: <14896.1191.240597.632888@nem-srvr.stsci.edu> 
Message-ID: <200012081610.LAA30679@cj20424-a.reston1.va.home.com>

> What is the status of PEP 209?  I see David Ascher is the champion of
> this PEP, but nothing has been written up.  Is the intention of this
> PEP to make the current Numeric a built-in feature of Python or to
> re-implement and replace the current Numeric module?

David has already explained why his name is on it -- basically,
David's name is on several PEPs but he doesn't currently have any time
to work on these, so other volunteers are most welcome to join.

It is my understanding that the current Numeric is sufficiently messy
in implementation and controversial in semantics that it would not be
a good basis to start from.

However, I do think that a basic multi-dimensional array object would
be a welcome addition to core Python.

> The reason that I ask these questions is because I'm working on a
> prototype of a new N-dimensional Array module which I call Numeric 2.
> This new module will be much more extensible than the current Numeric.
> For example, new array types and universal functions can be loaded or
> imported on demand.  We also intend to implement a record (or
> C-structure) type, because 1-D arrays or lists of records are a common
> data structure for storing photon events in astronomy and related
> fields.

I'm not familiar with the use of computers in astronomy and related
fields, so I'll take your word for that! :-)

> The current Numeric does not handle record types efficiently,
> particularly when the data type is not aligned and is in non-native
> endian format.  To handle such data, temporary arrays must be created
> and alignment and byte-swapping done on them.  Numeric 2 does such
> pre- and post-processing inside the inner-most loop which is more
> efficient in both time and memory.  It also does type conversion at
> this level which is consistent with that proposed for PEP 208.
> 
> Since many scientific users would like direct access to the array data
> via C pointers, we have investigated using the buffer object.  We have
> not had much success with it, because of its implementation.  I have
> scanned the python-dev mailing list for discussions of this issue and
> found that it now appears to be deprecated.

Indeed.  I think it's best to leave the buffer object out of your
implementation plans.  There are several problems with it, and one of
the backburner projects is to redesign it to be much more to the point
(providing less, not more functionality).

> My opinion on this is that a new _fundamental_ built-in type should be
> created for memory allocation with features and an interface similar
> to the _mmap_ object.  I'll call this a _malloc_ object.  This would
> allow Numeric 2 to use either object interchangeably depending on the
> circumstance.  The _string_ type could also benefit from this new
> object by using a read-only version of it.  Since it's an object, its
> memory area should be safe from inadvertent deletion.

Interesting.  I'm actually not sufficiently familiar with mmap to
comment.  But would the existing array module's array object be at all
useful?  You can get to the raw bytes in C (using the C buffer API,
which is not deprecated) and it is extensible.
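[Editor's note: the array object Guido mentions can be illustrated in a few lines; the memoryview used here is the modern Python-level face of the buffer protocol he refers to, not something available in 2000:]

```python
from array import array

a = array('d', [1.0, 2.0, 3.0])   # a contiguous block of C doubles
a.append(4.0)                      # extensible, unlike a bare malloc'd block
view = memoryview(a)               # zero-copy access via the buffer protocol
assert view.format == 'd' and a.tolist() == [1.0, 2.0, 3.0, 4.0]
```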

> Because of these and other new features in Numeric 2, I have a keen
> interest in the status of PEPs 207, 208, 211, 225, and 228; and also
> in the proposed buffer object.  

Here are some quick comments on the mentioned PEPs.

207: Rich Comparisons.  This will go into Python 2.1.  (I just
finished the first draft of the PEP, please read it and comment.)

208: Reworking the Coercion Model.  This will go into Python 2.1.
Neil Schemenauer has mostly finished the patches already.  Please
comment.

211: Adding New Linear Algebra Operators (Greg Wilson).  This is
unlikely to go into Python 2.1.  I don't like the idea much.  If you
disagree, please let me know!  (Also, a choice has to be made between
211 and 225; I don't want to accept both, so until 225 is rejected,
211 is in limbo.)

225: Elementwise/Objectwise Operators (Zhu, Lielens).  This will
definitely not go into Python 2.1.  It adds too many new operators.

228: Reworking Python's Numeric Model.  This is a total pie-in-the-sky
PEP, and this kind of change is not likely to happen before Python
3000.

> I'm willing to implement this new _malloc_ object if members of the
> python-dev list are in agreement.  Actually I see no alternative,
> given the current design of Numeric 2, since the Array class will
> initially be written completely in Python and will need a mutable
> memory buffer, while the _string_ type is meant to be a read-only
> object.

Would you be willing to take over authorship of PEP 209?  David Ascher
and the Numeric Python community will thank you.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Dec  8 19:43:39 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 13:43:39 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: Your message of "Thu, 30 Nov 2000 17:46:52 EST."
             <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> 
Message-ID: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>

After the last round of discussion, I was left with the idea that the
best thing we could do to help destructive iteration is to introduce a
{}.popitem() that returns an arbitrary (key, value) pair and deletes
it.  I wrote about this:

> > One more concern: if you repeatedly remove the *first* item, the hash
> > table will start looking lopsided.  Since we don't resize the hash
> > table on deletes, maybe picking an item at random (but not using an
> > expensive random generator!) would be better.

and Tim replied:

> Which is the reason SETL doesn't specify *which* set item is removed:  if
> you always start looking at "the front" of a dict that's being consumed, the
> dict fills with turds without shrinking, you skip over them again and again,
> and consuming the entire dict is still quadratic time.
> 
> Unfortunately, while using a random start point is almost always quicker
> than that, the expected time for consuming the whole dict remains quadratic.
> 
> The clearest way around that is to save a per-dict search finger, recording
> where the last search left off.  Start from its current value.  Failure if
> it wraps around.  This is linear time in non-pathological cases (a
> pathological case is one in which it isn't linear time <wink>).

I've implemented this, except I use a static variable for the finger
instead of a per-dict finger.  I'm concerned about adding 4-8 extra
bytes to each dict object for a feature that most dictionaries never
need.  So, instead, I use a single shared finger.  This works just as
well as long as this is used for a single dictionary.  For multiple
dictionaries (either used by the same thread or in different threads),
it'll work almost as well, although it's possible to make up a
pathological example that would work quadratically.

An easy example of such a pathological example is to call popitem()
for two identical dictionaries in lock step.
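[Editor's note: the intended use is destructive iteration, consuming a dict without first building a list of its keys:]

```python
d = {"a": 1, "b": 2, "c": 3}
consumed = {}
while d:
    key, value = d.popitem()   # removes and returns an arbitrary pair
    consumed[key] = value
assert d == {} and consumed == {"a": 1, "b": 2, "c": 3}
```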

Comments please!  We could:

- Live with the pathological cases.

- Forget the whole thing; and then also forget about firstkey()
  etc. which has the same problem only worse.

- Fix the algorithm. Maybe jumping criss-cross through the hash table
  like lookdict does would improve that; but I don't understand the
  math used for that ("Cycle through GF(2^n)-{0}" ???).

I've placed a patch on SourceForge:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102733&group_id=5470

The algorithm is:

static PyObject *
dict_popitem(dictobject *mp, PyObject *args)
{
	static int finger = 0;
	int i;
	dictentry *ep;
	PyObject *res;

	if (!PyArg_NoArgs(args))
		return NULL;
	if (mp->ma_used == 0) {
		PyErr_SetString(PyExc_KeyError,
				"popitem(): dictionary is empty");
		return NULL;
	}
	i = finger;
	if (i >= mp->ma_size)
		ir = 0;
	while ((ep = &mp->ma_table[i])->me_value == NULL) {
		i++;
		if (i >= mp->ma_size)
			i = 0;
	}
	finger = i+1;
	res = PyTuple_New(2);
	if (res != NULL) {
		PyTuple_SET_ITEM(res, 0, ep->me_key);
		PyTuple_SET_ITEM(res, 1, ep->me_value);
		Py_INCREF(dummy);
		ep->me_key = dummy;
		ep->me_value = NULL;
		mp->ma_used--;
	}
	return res;
}

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Dec  8 19:51:49 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 13:51:49 -0500
Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use
Message-ID: <200012081851.NAA32254@cj20424-a.reston1.va.home.com>

Moshe proposes to add an overridable function sys.displayhook(obj)
which will be called by the interpreter for the PRINT_EXPR opcode,
instead of hardcoding the behavior.  The default implementation will
of course have the current behavior, but this makes it much simpler to
experiment with alternatives, e.g. using str() instead of repr() (or
to choose between str() and repr() based on the type).
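[Editor's note: a sketch of the hook as it later landed, exercised directly rather than through the interactive loop:]

```python
import builtins
import io
import sys

def str_displayhook(obj):
    # Like the default hook, but echoes str(obj) instead of repr(obj).
    if obj is None:
        return
    sys.stdout.write(str(obj) + "\n")
    builtins._ = obj   # keep the usual binding of _ to the last result

# Install as sys.displayhook in a real session; here, call it directly:
buf = io.StringIO()
old_stdout, sys.stdout = sys.stdout, buf
str_displayhook("a\nb")
sys.stdout = old_stdout
assert buf.getvalue() == "a\nb\n"   # the default hook would show 'a\nb'
```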

Moshe has asked me to pronounce on this PEP.  I've thought about it,
and I'm now all for it.  Moshe (or anyone else), please submit a patch
to SF that shows the complete implementation!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Fri Dec  8 20:06:50 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 8 Dec 2000 14:06:50 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEAJIDAA.tim.one@home.com>

[Guido, on sharing a search finger and getting worse-than-linear
 behavior in a simple test case]

See my reply on SourceForge (crossed in the mails).  I predict that fixing
this in an acceptable way (not bulletproof, but linear-time for all
predictably common cases) is a two-character change.

Surprise, although maybe I'm hallucinating (would someone please confirm?):
when I went to the SF patch manager page to look for your patch (using the
Open Patches view), I couldn't find it.  My guess is that if there are "too
many" patches to fit on one screen, then unlike the SF *bug* manager, you
don't get any indication that more patches exist or any control to go to the
next page.




From barry at digicool.com  Fri Dec  8 20:18:26 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Fri, 8 Dec 2000 14:18:26 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCIEAJIDAA.tim.one@home.com>
Message-ID: <14897.13314.469255.853298@anthem.concentric.net>

>>>>> "TP" == Tim Peters <tim.one at home.com> writes:

    TP> Surprise, although maybe I'm hallucinating (would someone
    TP> please confirm?): when I went to the SF patch manager page to
    TP> look for your patch (using the Open Patches view), I couldn't
    TP> find it.  My guess is that if there are "too many" patches to
    TP> fit on one screen, then unlike the SF *bug* manager, you don't
    TP> get any indication that more patches exist or any control to
    TP> go to the next page.

I haven't checked recently, but this was definitely true a few weeks
ago.  I think I even submitted an admin request on it, but I don't
remember for sure.

-Barry



From Barrett at stsci.edu  Fri Dec  8 22:22:39 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Fri,  8 Dec 2000 16:22:39 -0500 (EST)
Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays
In-Reply-To: <200012081610.LAA30679@cj20424-a.reston1.va.home.com>
References: <14896.1191.240597.632888@nem-srvr.stsci.edu>
	<200012081610.LAA30679@cj20424-a.reston1.va.home.com>
Message-ID: <14897.10309.686024.254701@nem-srvr.stsci.edu>

Guido van Rossum writes:
 > > What is the status of PEP 209?  I see David Ascher is the champion of
 > > this PEP, but nothing has been written up.  Is the intention of this
 > > PEP to make the current Numeric a built-in feature of Python or to
 > > re-implement and replace the current Numeric module?
 > 
 > David has already explained why his name is on it -- basically,
 > David's name is on several PEPs but he doesn't currently have any time
 > to work on these, so other volunteers are most welcome to join.
 > 
 > It is my understanding that the current Numeric is sufficiently messy
 > in implementation and controversial in semantics that it would not be
 > a good basis to start from.

That is what we (Rick, Perry, and I) believe also.

 > However, I do think that a basic multi-dimensional array object would
 > be a welcome addition to core Python.

That's reassuring.

 > Indeed.  I think it's best to leave the buffer object out of your
 > implementation plans.  There are several problems with it, and one of
 > the backburner projects is to redesign it to be much more to the point
 > (providing less, not more functionality).

I agree and have already made the decision to leave it out.

 > > My opinion on this is that a new _fundamental_ built-in type should be
 > > created for memory allocation with features and an interface similar
 > > to the _mmap_ object.  I'll call this a _malloc_ object.  This would
 > > allow Numeric 2 to use either object interchangeably depending on the
 > > circumstance.  The _string_ type could also benefit from this new
 > > object by using a read-only version of it.  Since it's an object, its
 > > memory area should be safe from inadvertent deletion.
 > 
 > Interesting.  I'm actually not sufficiently familiar with mmap to
 > comment.  But would the existing array module's array object be at all
 > useful?  You can get to the raw bytes in C (using the C buffer API,
 > which is not deprecated) and it is extensible.

I tried using this but had problems.  I'll look into it again.

 > > Because of these and other new features in Numeric 2, I have a keen
 > > interest in the status of PEPs 207, 208, 211, 225, and 228; and also
 > > in the proposed buffer object.  
 > 
 > Here are some quick comments on the mentioned PEPs.

I've got these PEPs on my desk and will comment on them when I can.

 > > I'm willing to implement this new _malloc_ object if members of the
 > > python-dev list are in agreement.  Actually I see no alternative,
 > > given the current design of Numeric 2, since the Array class will
 > > initially be written completely in Python and will need a mutable
 > > memory buffer, while the _string_ type is meant to be a read-only
 > > object.
 > 
 > Would you be willing to take over authorship of PEP 209?  David Ascher
 > and the Numeric Python community will thank you.

Yes, I'd gladly wield vast and inconsiderate power over unsuspecting
pythoneers. ;-)

 -- Paul





From guido at python.org  Fri Dec  8 23:58:03 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 17:58:03 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Thu, 07 Dec 2000 12:54:51 EST."
             <200012071754.MAA26557@cj20424-a.reston1.va.home.com> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> 
Message-ID: <200012082258.RAA02389@cj20424-a.reston1.va.home.com>

Nobody seems to care much about the warnings PEP so far.  What's up?
Are you all too busy buying presents for the holidays?  Then get me
some too, please? :-)

>   http://python.sourceforge.net/peps/pep-0230.html

I've now produced a prototype implementation for the C code:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102715&group_id=5470

Issues:

- This defines a C API PyErr_Warn(category, message) instead of
  Py_Warn(message, category) as the PEP proposes.  I actually like
  this better: it's consistent with PyErr_SetString() etc. rather than
  with the Python warn(message[, category]) function.

- This calls the Python module from C.  We'll have to see if this is
  fast enough.  I wish I could postpone the import of warnings.py
  until the first call to PyErr_Warn(), but unfortunately the warning
  category classes must be initialized first (so they can be passed
  into PyErr_Warn()).  The current version of warnings.py imports
  rather a lot of other modules (e.g. re and getopt); this can be
  reduced by placing those imports inside the functions that use them.

- All the issues listed in the PEP.

Please comment!

BTW: somebody overwrote the PEP on SourceForge with an older version.
Please remember to do a "cvs update" before running "make install" in
the peps directory!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gstein at lyra.org  Sat Dec  9 00:26:51 2000
From: gstein at lyra.org (Greg Stein)
Date: Fri, 8 Dec 2000 15:26:51 -0800
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Dec 08, 2000 at 01:43:39PM -0500
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <20001208152651.H30644@lyra.org>

On Fri, Dec 08, 2000 at 01:43:39PM -0500, Guido van Rossum wrote:
>...
> Comments please!  We could:
> 
> - Live with the pathological cases.

I agree: live with it. The typical case will operate just fine.

> - Forget the whole thing; and then also forget about firstkey()
>   etc. which has the same problem only worse.

No opinion.

> - Fix the algorithm. Maybe jumping criss-cross through the hash table
>   like lookdict does would improve that; but I don't understand the
>   math used for that ("Cycle through GF(2^n)-{0}" ???).

No need. The keys were inserted randomly, so sequencing through is
effectively random. :-)

>...
> static PyObject *
> dict_popitem(dictobject *mp, PyObject *args)
> {
> 	static int finger = 0;
> 	int i;
> 	dictentry *ep;
> 	PyObject *res;
> 
> 	if (!PyArg_NoArgs(args))
> 		return NULL;
> 	if (mp->ma_used == 0) {
> 		PyErr_SetString(PyExc_KeyError,
> 				"popitem(): dictionary is empty");
> 		return NULL;
> 	}
> 	i = finger;
> 	if (i >= mp->ma_size)
> 		ir = 0;

Should be "i = 0"

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From tismer at tismer.com  Sat Dec  9 17:44:14 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sat, 09 Dec 2000 18:44:14 +0200
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <3A32615E.D39B68D2@tismer.com>


Guido van Rossum wrote:
> 
> After the last round of discussion, I was left with the idea that the
> best thing we could do to help destructive iteration is to introduce a
> {}.popitem() that returns an arbitrary (key, value) pair and deletes
> it.  I wrote about this:
> 
> > > One more concern: if you repeatedly remove the *first* item, the hash
> > > table will start looking lopsided.  Since we don't resize the hash
> > > table on deletes, maybe picking an item at random (but not using an
> > > expensive random generator!) would be better.
> 
> and Tim replied:
> 
> > Which is the reason SETL doesn't specify *which* set item is removed:  if
> > you always start looking at "the front" of a dict that's being consumed, the
> > dict fills with turds without shrinking, you skip over them again and again,
> > and consuming the entire dict is still quadratic time.
> >
> > Unfortunately, while using a random start point is almost always quicker
> > than that, the expected time for consuming the whole dict remains quadratic.
> >
> > The clearest way around that is to save a per-dict search finger, recording
> > where the last search left off.  Start from its current value.  Failure if
> > it wraps around.  This is linear time in non-pathological cases (a
> > pathological case is one in which it isn't linear time <wink>).
> 
> I've implemented this, except I use a static variable for the finger
> instead of a per-dict finger.  I'm concerned about adding 4-8 extra
> bytes to each dict object for a feature that most dictionaries never
> need.  So, instead, I use a single shared finger.  This works just as
> well as long as this is used for a single dictionary.  For multiple
> dictionaries (either used by the same thread or in different threads),
> it'll work almost as well, although it's possible to make up a
> pathological example that would work quadratically.
> 
> An easy example of such a pathological example is to call popitem()
> for two identical dictionaries in lock step.
> 
> Comments please!  We could:
> 
> - Live with the pathological cases.
> 
> - Forget the whole thing; and then also forget about firstkey()
>   etc. which has the same problem only worse.
> 
> - Fix the algorithm. Maybe jumping criss-cross through the hash table
>   like lookdict does would improve that; but I don't understand the
>   math used for that ("Cycle through GF(2^n)-{0}" ???).

That algorithm is really a gem which you should know,
so let me try to explain it.


Intro: A little story about finite field theory (very basic).
-------------------------------------------------------------

For every prime p and every power p^n, there
exists a Galois Field ( GF(p^n) ), which is
a finite field.
The additive group is called "elementary Abelian",
it is commutative, and it looks a little like a
vector space, since addition works in cycles modulo p
in each of its n cells.
The multiplicative group is cyclic, and it never
touches 0. Cyclic groups are generated by a single
primitive element. The powers of that element make up all the
other elements. For all elements x of the multiplicative
group GF(p^n)*, the equality x^(p^n - 1) == 1 holds. A generator
element is therefore a primitive (p^n - 1)th root of unity.
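[Editor's note: the cycle Christian describes can be made concrete in a few lines. Repeatedly multiplying by the generator x in GF(2^4), reducing modulo a primitive polynomial, visits every nonzero element exactly once per period; the choice of polynomial below is for illustration, not CPython's actual probing constant:]

```python
POLY = 0b10011   # x^4 + x + 1, a primitive polynomial over GF(2)
N = 4

def gf_cycle(start=1):
    """Repeatedly multiply by the generator x in GF(2^N)."""
    seen, x = [], start
    while True:
        seen.append(x)
        x <<= 1                # multiply by x
        if x & (1 << N):       # degree overflowed: reduce mod POLY
            x ^= POLY
        if x == start:
            return seen

cycle = gf_cycle()
# All 2^4 - 1 nonzero elements appear exactly once before repeating.
assert sorted(cycle) == list(range(1, 16))
```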


From nas at arctrix.com  Sat Dec  9 12:30:06 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Sat, 9 Dec 2000 03:30:06 -0800
Subject: [Python-Dev] PEP 208 and __coerce__
Message-ID: <20001209033006.A3737@glacier.fnational.com>

While working on the implementation of PEP 208, I discovered that
__coerce__ has some surprising properties.  Initially I
implemented __coerce__ so that the numeric operation currently
being performed was called on the values returned by __coerce__.
This caused test_class to blow up due to code like this:

    class Test:
        def __coerce__(self, other):
            return (self, other)

The 2.0 "solves" this by not calling __coerce__ again if the
objects returned by __coerce__ are instances.  This has the
effect of making code like:

    class A:
        def __coerce__(self, other):
            return B(), other

    class B:
        def __coerce__(self, other):
            return 1, other

    A() + 1

fail to work in the expected way.  The question is: how should
__coerce__ work?  One option is to let it work the way it does
in 2.0.  Alternatively, I could change it so that if coerce
returns (self, *) then __coerce__ is not called again.
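[Editor's note: the dispatch question can be modeled in pure Python (__coerce__ itself is gone in Python 3, and this is only a model of the interpreter's logic, not the actual C code):]

```python
class A:
    def __coerce__(self, other):
        return B(), other

class B:
    def __coerce__(self, other):
        return 1, other

def add_with_coercion(left, right, recoerce):
    # recoerce=False models the 2.0 rule (stop after one round of
    # coercion); recoerce=True keeps coercing until no __coerce__
    # hook remains, so A() + 1 eventually becomes 1 + 1.
    while hasattr(left, "__coerce__"):
        left, right = left.__coerce__(right)
        if not recoerce:
            break
    return left + right   # raises TypeError if left is still a B()

assert add_with_coercion(A(), 1, recoerce=True) == 2
```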


  Neil



From mal at lemburg.com  Sat Dec  9 19:49:29 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 09 Dec 2000 19:49:29 +0100
Subject: [Python-Dev] PEP 208 and __coerce__
References: <20001209033006.A3737@glacier.fnational.com>
Message-ID: <3A327EB9.BD2CA3CC@lemburg.com>

Neil Schemenauer wrote:
> 
> While working on the implementation of PEP 208, I discovered that
> __coerce__ has some surprising properties.  Initially I
> implemented __coerce__ so that the numeric operation currently
> being performed was called on the values returned by __coerce__.
> This caused test_class to blow up due to code like this:
> 
>     class Test:
>         def __coerce__(self, other):
>             return (self, other)
> 
> The 2.0 "solves" this by not calling __coerce__ again if the
> objects returned by __coerce__ are instances.  This has the
> effect of making code like:
> 
>     class A:
>         def __coerce__(self, other):
>             return B(), other
> 
>     class B:
>         def __coerce__(self, other):
>             return 1, other
> 
>     A() + 1
> 
> fail to work in the expected way.  The question is: how should
> __coerce__ work?  One option is to leave it working the way it does
> in 2.0.  Alternatively, I could change it so that if __coerce__
> returns (self, *) then __coerce__ is not called again.

+0 -- the idea behind PEP 208 is to get rid of the
centralized coercion mechanism, so fixing it to allow yet
more obscure variants should be carefully considered.

I see __coerce__ et al. as old-style mechanisms -- operator methods
have much more information available to do the right thing
than the single bottleneck __coerce__.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Sat Dec  9 21:49:04 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 9 Dec 2000 15:49:04 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECEIDAA.tim.one@home.com>

[Guido]
> ...
> I've implemented this, except I use a static variable for the finger
> instead of a per-dict finger.  I'm concerned about adding 4-8 extra
> bytes to each dict object for a feature that most dictionaries never
> need.

It's a bit ironic that dicts are guaranteed to be at least 1/3 wasted space
<wink>.  Let's pick on Christian's idea to reclaim a few bytes of that.

> So, instead, I use a single shared finger.  This works just as
> well as long as this is used for a single dictionary.  For multiple
> dictionaries (either used by the same thread or in different threads),
> it'll work almost as well, although it's possible to make up a
> pathological example that would work quadratically.
>
> An easy way to construct such a pathological case is to call
> popitem() for two identical dictionaries in lock step.

Please see my later comments attached to the patch:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102733&group_id=5470

In short, for me (truly) identical dicts perform well with or without my
suggestion, while dicts cloned via dict.copy() perform horribly with or
without my suggestion (their internal structures differ); still curious as
to whether that's also true for you (am I looking at a Windows bug?  I don't
see how, but it's possible ...).  In any case, my suggestion turned out to
be worthless on my box.

Playing around via simulations suggests that a shared finger is going to be
disastrous when consuming more than one dict unless they have identical
internal structure (not just compare equal).  As soon as they get a little
out of synch, it just gets worse with each succeeding probe.
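For the curious, a small simulation (a toy model of probing, not the
real dict internals) reproduces the effect: two tables whose occupied
slots don't overlap at all cost far more probes with one shared finger
than with per-table fingers.

```python
def consume_lockstep(tables, shared):
    """Pop one item from each table in turn, counting slot probes."""
    n = len(tables[0])
    fingers = [0] * len(tables)   # per-table fingers
    finger = 0                    # single shared finger
    probes = 0
    while any(any(t) for t in tables):
        for k, t in enumerate(tables):
            if not any(t):
                continue
            i = finger if shared else fingers[k]
            while True:
                i %= n
                probes += 1
                if t[i]:          # occupied slot found: "pop" it
                    t[i] = False
                    break
                i += 1
            if shared:
                finger = i + 1
            else:
                fingers[k] = i + 1
    return probes

# 64 slots each; table a fills the first half, table b the second half.
n = 64
a = [i < n // 2 for i in range(n)]
b = [i >= n // 2 for i in range(n)]
good = consume_lockstep([list(a), list(b)], shared=False)  # roughly O(n)
bad = consume_lockstep([list(a), list(b)], shared=True)    # roughly O(n**2)
```

With per-table fingers almost every pop costs one probe; with the
shared finger each consumer undoes the other's progress and every pop
rescans about half the table.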

> Comments please!  We could:
>
> - Live with the pathological cases.

How boring <wink>.

> - Forget the whole thing; and then also forget about firstkey()
>   etc. which has the same problem only worse.

I don't know that this is an important idea for dicts in general (it is
important for sets) -- it's akin to an xrange for dicts.  But then I've had
more than one real-life program that built giant dicts then ran out of
memory trying to iterate over them!  I'd like to fix that.

> - Fix the algorithm. Maybe jumping criss-cross through the hash table
>   like lookdict does would improve that; but I don't understand the
>   math used for that ("Cycle through GF(2^n)-{0}" ???).

Christian explained that well (thanks!).

However, I still don't see any point to doing that business in .popitem():
when inserting keys, the jitterbug probe sequence has the crucial benefit of
preventing primary clustering when keys collide.  But when we consume a
dict, we just want to visit every slot as quickly as possible.

[Christian]
> Appendix, on the use of finger:
> -------------------------------
>
> Instead of using a global finger variable, you can do the
> following (involving a cast from object to int) :
>
> - if the 0'th slot of the dict is non-empty:
>   return this element and insert the dummy element
>   as key. Set the value field to the Dictionary Algorithm
>   would give for the removed object's hash. This is the
>   next finger.
> - else:
>   treat the value field of the 0'th slot as the last finger.
>   If it is zero, initialize it with 2^n-1.
>   Repetitively use the DA until you find an entry. Save
>   the finger in slot 0 again.
>
> This doesn't cost an extra slot, and even when the dictionary
> is written between removals, the chance of losing the finger
> is just 1:(2^n-1) on every insertion.

I like that, except:

1) As above, I don't believe the GF business buys anything over
   a straightforward search when consuming a dict.

2) Overloading the value field bristles with problems, in part
   because it breaks the invariant that a slot is unused if
   and only if the value field is NULL, in part because C
   doesn't guarantee that you can get away with casting an
   arbitrary int to a pointer and back again.

None of the problems in #2 arise if we abuse the me_hash field instead, so
the attached does that.  Here's a typical run of Guido's test case using
this (on an 866MHz machine w/ 256Mb RAM -- the early values jump all over
the place from run to run):

run = 0
log2size = 10 size = 1024
    7.4 usec per item to build (total 0.008 sec)
    3.4 usec per item to destroy twins (total 0.003 sec)
log2size = 11 size = 2048
    6.7 usec per item to build (total 0.014 sec)
    3.4 usec per item to destroy twins (total 0.007 sec)
log2size = 12 size = 4096
    7.0 usec per item to build (total 0.029 sec)
    3.7 usec per item to destroy twins (total 0.015 sec)
log2size = 13 size = 8192
    7.1 usec per item to build (total 0.058 sec)
    5.9 usec per item to destroy twins (total 0.048 sec)
log2size = 14 size = 16384
    14.7 usec per item to build (total 0.241 sec)
    6.4 usec per item to destroy twins (total 0.105 sec)
log2size = 15 size = 32768
    12.2 usec per item to build (total 0.401 sec)
    3.9 usec per item to destroy twins (total 0.128 sec)
log2size = 16 size = 65536
    7.8 usec per item to build (total 0.509 sec)
    4.0 usec per item to destroy twins (total 0.265 sec)
log2size = 17 size = 131072
    7.9 usec per item to build (total 1.031 sec)
    4.1 usec per item to destroy twins (total 0.543 sec)

The last one is over 100 usec per item using the original patch (with or
without my first suggestion).

if-i-were-a-betting-man-i'd-say-"bingo"-ly y'rs  - tim


Drop-in replacement for the popitem in the patch:

static PyObject *
dict_popitem(dictobject *mp, PyObject *args)
{
	int i = 0;
	dictentry *ep;
	PyObject *res;

	if (!PyArg_NoArgs(args))
		return NULL;
	if (mp->ma_used == 0) {
		PyErr_SetString(PyExc_KeyError,
				"popitem(): dictionary is empty");
		return NULL;
	}
	/* Set ep to "the first" dict entry with a value.  We abuse the hash
	 * field of slot 0 to hold a search finger:
	 * If slot 0 has a value, use slot 0.
	 * Else slot 0 is being used to hold a search finger,
	 * and we use its hash value as the first index to look.
	 */
	ep = &mp->ma_table[0];
	if (ep->me_value == NULL) {
		i = (int)ep->me_hash;
		/* The hash field may be uninitialized trash, or it
		 * may be a real hash value, or it may be a legit
		 * search finger, or it may be a once-legit search
		 * finger that's out of bounds now because it
		 * wrapped around or the table shrunk -- simply
		 * make sure it's in bounds now.
		 */
		if (i >= mp->ma_size || i < 1)
			i = 1;	/* skip slot 0 */
		while ((ep = &mp->ma_table[i])->me_value == NULL) {
			i++;
			if (i >= mp->ma_size)
				i = 1;
		}
	}
	res = PyTuple_New(2);
	if (res != NULL) {
		PyTuple_SET_ITEM(res, 0, ep->me_key);
		PyTuple_SET_ITEM(res, 1, ep->me_value);
		Py_INCREF(dummy);
		ep->me_key = dummy;
		ep->me_value = NULL;
		mp->ma_used--;
	}
	assert(mp->ma_table[0].me_value == NULL);
	mp->ma_table[0].me_hash = i + 1;  /* next place to start */
	return res;
}




From tim.one at home.com  Sat Dec  9 22:09:30 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 9 Dec 2000 16:09:30 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEECEIDAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECEIDAA.tim.one@home.com>

> 	assert(mp->ma_table[0].me_value == NULL);
> 	mp->ma_table[0].me_hash = i + 1;  /* next place to start */

Ack, those two lines should move up into the "if (res != NULL)" block.

errors-are-error-prone-ly y'rs  - tim




From gvwilson at nevex.com  Sun Dec 10 17:11:09 2000
From: gvwilson at nevex.com (Greg Wilson)
Date: Sun, 10 Dec 2000 11:11:09 -0500
Subject: [Python-Dev] re: So You Want to Write About Python?
Message-ID: <NEBBIACCCGNFLECLCLMHCEFLCBAA.gvwilson@nevex.com>

Hi, folks.  Jon Erickson (Doctor Dobb's Journal), Frank Willison
(O'Reilly), and I (professional loose cannon) are doing a workshop at IPC
on writing books and magazine articles about Python.  It would be great to
have a few articles (in various stages of their lives) and/or book
proposals from people on this list to use as examples.  So, if you think
the world oughta know about the things you're doing, and would like to use
this to help get yourself motivated to start writing, please drop me a
line.  I'm particularly interested in:

- the real-world issues involved in moving to Unicode

- non-trivial XML processing using SAX and DOM (where "non-trivial" means
  "including namespaces, entity references, error handling, and all that")

- the theory and practice of stackless, generators, and continuations

- the real-world tradeoffs between the various memory management schemes
  that are now available for Python

- feature comparisons of various Foobars that can be used with Python (where
  "Foobar" could be "GUI toolkit", "IDE", "web scripting toolkit", or just
  about anything else)

- performance analysis and tuning of Python itself (as an example of how you
  speed up real applications --- this is something that matters a lot in the
  real world, but tends to get forgotten in school)

- just about anything else that you wish someone had written for you before
  you started your last big project

Thanks,
Greg




From paul at prescod.net  Sun Dec 10 19:02:27 2000
From: paul at prescod.net (Paul Prescod)
Date: Sun, 10 Dec 2000 10:02:27 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com>
Message-ID: <3A33C533.ABA27C7C@prescod.net>

Guido van Rossum wrote:
> 
> Nobody seems to care much about the warnings PEP so far.  What's up?
> Are you all too busy buying presents for the holidays?  Then get me
> some too, please? :-)

My opinions:

 * it should be a built-in or keyword, not a function in "sys". Warning
is supposed to be as easy as possible so people will do it often.
<irrelevant_aside>sys.argv and sys.stdout annoy me as it is.</>

 * the term "level" applied to warnings typically means "warning level"
as in -W1 -W2 -Wall. Proposal: call it "stacklevel" or something.

 * this level idea gives rise to another question. What if I want to see
the full stack context of a warning? Do I have to implement a whole new
warning output hook? It seems like I should be able to specify this as a
command line option alongside the action.

 * I prefer ":*:*:" to ":::" for leaving parts of the warning spec out.

 * should there be a sys.formatwarning? What if I want to redirect
warnings to a socket -- I'd like to use the standard formatting
machinery. Or vice versa, I might want to change the formatting but not
override the destination.

 * there should be a "RuntimeWarning" -- base category for warnings
about dubious runtime behaviors (e.g. integer division truncated value)

 * it should be possible to strip warnings as an optimization step. That
may require interpreter and syntax support.

 * warnings will usually be tied to tests which the user will want to be
able to optimize out also. (e.g. if __debug__ and type(foo)==StringType:
warn "Should be Unicode!")


I propose:

	>>> warn conditional, message[, category]

to be very parallel with 

	>>> assert conditional, message

I'm not proposing the use of the assert keyword anymore, but I am trying
to reuse the syntax for familiarity. Perhaps -Wstrip would strip
warnings out of the bytecode.
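A function-based toy equivalent (hypothetical names -- `warn` here is a
stand-in, not any real API) shows how an `if __debug__` guard would let
-O strip the test the same way it strips asserts:

```python
issued = []

def warn(message, category=UserWarning):
    # Toy stand-in for the proposed warning machinery: just record it.
    issued.append((category.__name__, message))

def process(foo):
    # The __debug__ guard means the whole test could be optimized
    # away under -O, addressing the overhead-of-the-test concern.
    if __debug__ and not isinstance(foo, str):
        warn("Should be Unicode!")
    return foo

process(42)   # records one warning
```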

 Paul Prescod



From nas at arctrix.com  Sun Dec 10 14:46:46 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 10 Dec 2000 05:46:46 -0800
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
Message-ID: <20001210054646.A5219@glacier.fnational.com>

SourceForge uploads are not working.  The latest version of the
patch for PEP 208 is here:

    http://arctrix.com/nas/python/coerce-6.0.diff

Operations on instances now call __coerce__ if it exists.  I
think the patch is now complete.  Converting other builtin types
to "new style numbers" can be done with a separate patch.

  Neil



From guido at python.org  Sun Dec 10 23:17:08 2000
From: guido at python.org (Guido van Rossum)
Date: Sun, 10 Dec 2000 17:17:08 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Sun, 10 Dec 2000 10:02:27 PST."
             <3A33C533.ABA27C7C@prescod.net> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com>  
            <3A33C533.ABA27C7C@prescod.net> 
Message-ID: <200012102217.RAA12550@cj20424-a.reston1.va.home.com>

> My opinions:
> 
>  * it should be a built-in or keyword, not a function in "sys". Warning
> is supposed to be as easy as possible so people will do it often.

Disagree.  Warnings are there mostly for the Python system to warn the
Python programmer.  The most heavy use will come from the standard
library, not from user code.

> <irrelevant_aside>sys.argv and sys.stdout annoy me as it is.</>

Too bad.

>  * the term "level" applied to warnings typically means "warning level"
> as in -W1 -W2 -Wall. Proposal: call it "stacklevel" or something.

Good point.

>  * this level idea gives rise to another question. What if I want to see
> the full stack context of a warning? Do I have to implement a whole new
> warning output hook? It seems like I should be able to specify this as a
> command line option alongside the action.

Turn warnings into errors and you'll get a full traceback.  If you
really want a full traceback without exiting, some creative use of
sys._getframe() and the traceback module will probably suit you well.

>  * I prefer ":*:*:" to ":::" for leaving parts of the warning spec out.

I don't.

>  * should there be a sys.formatwarning? What if I want to redirect
> warnings to a socket -- I'd like to use the standard formatting
> machinery. Or vice versa, I might want to change the formatting but not
> override the destination.

Good point.  I'm changing this to:

def showwarning(message, category, filename, lineno, file=None):
    """Hook to write a warning to a file; replace if you like."""

and

def formatwarning(message, category, filename, lineno):
    """Hook to format a warning the standard way."""

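To illustrate how the two hooks compose (a sketch against the proposed
signatures, with a stand-in formatter since the module doesn't exist
yet): a replacement showwarning can redirect output anywhere while
reusing the standard formatting.

```python
import io

def formatwarning(message, category, filename, lineno):
    # Stand-in formatter matching the proposed signature.
    return "%s:%s: %s: %s\n" % (filename, lineno,
                                category.__name__, message)

log = io.StringIO()   # could just as well be a socket's makefile()

def showwarning(message, category, filename, lineno, file=None):
    if file is None:
        file = log    # changed destination, unchanged formatting
    file.write(formatwarning(message, category, filename, lineno))

showwarning("integer division truncated value",
            DeprecationWarning, "spam.py", 3)
```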
>  * there should be a "RuntimeWarning" -- base category for warnings
> about dubious runtime behaviors (e.g. integer division truncated value)

OK.

>  * it should be possible to strip warnings as an optimization step. That
> may require interpreter and syntax support.

I don't see the point of this.  I think this comes from our different
views on who should issue warnings.

>  * warnings will usually be tied to tests which the user will want to be
> able to optimize out also. (e.g. if __debug__ and type(foo)==StringType:
> warn "Should be Unicode!")
> 
> I propose:
> 
> 	>>> warn conditional, message[, category]

Sorry, this is not worth a new keyword.

> to be very parallel with 
> 
> 	>>> assert conditional, message
> 
> I'm not proposing the use of the assert keyword anymore, but I am trying
> to reuse the syntax for familiarity. Perhaps -Wstrip would strip
> warnings out of the bytecode.

Why?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Mon Dec 11 01:16:25 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 11 Dec 2000 01:16:25 +0100
Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use
References: <200012081851.NAA32254@cj20424-a.reston1.va.home.com>
Message-ID: <000901c06307$9a814d60$3c6340d5@hagrid>

Guido wrote:
> Moshe proposes to add an overridable function sys.displayhook(obj)
> which will be called by the interpreter for the PRINT_EXPR opcode,
> instead of hardcoding the behavior.  The default implementation will
> of course have the current behavior, but this makes it much simpler to
> experiment with alternatives, e.g. using str() instead of repr() (or
> to choose between str() and repr() based on the type).

hmm.  instead of patching here and there, what's stopping us
from doing it the right way?  I'd prefer something like:

    import code

    class myCLI(code.InteractiveConsole):
        def displayhook(self, data):
            # non-standard display hook
            print str(data)

    sys.setcli(myCLI())

(in other words, why not move the *entire* command line interface
over to Python code)

</F>




From guido at python.org  Mon Dec 11 03:24:20 2000
From: guido at python.org (Guido van Rossum)
Date: Sun, 10 Dec 2000 21:24:20 -0500
Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use
In-Reply-To: Your message of "Mon, 11 Dec 2000 01:16:25 +0100."
             <000901c06307$9a814d60$3c6340d5@hagrid> 
References: <200012081851.NAA32254@cj20424-a.reston1.va.home.com>  
            <000901c06307$9a814d60$3c6340d5@hagrid> 
Message-ID: <200012110224.VAA12844@cj20424-a.reston1.va.home.com>

> Guido wrote:
> > Moshe proposes to add an overridable function sys.displayhook(obj)
> > which will be called by the interpreter for the PRINT_EXPR opcode,
> > instead of hardcoding the behavior.  The default implementation will
> > of course have the current behavior, but this makes it much simpler to
> > experiment with alternatives, e.g. using str() instead of repr() (or
> > to choose between str() and repr() based on the type).

Effbot regurgitates:
> hmm.  instead of patching here and there, what's stopping us
> from doing it the right way?  I'd prefer something like:
> 
>     import code
> 
>     class myCLI(code.InteractiveConsole):
>         def displayhook(self, data):
>             # non-standard display hook
>             print str(data)
> 
>     sys.setcli(myCLI())
> 
> (in other words, why not move the *entire* command line interface
> over to Python code)

Indeed, this is why I've been hesitant to bless Moshe's hack.  I
finally decided to go for it because I don't see this redesign of the
CLI happening anytime soon.  In order to do it right, it would require
a redesign of the parser input handling, which is probably the oldest
code in Python (short of the long integer math, which predates Python
by several years).  The current code module is a hack, alas, and
doesn't always get it right the same way as the *real* CLI does
things.

So, rather than wait forever for the perfect solution, I think it's
okay to settle for less sooner.  "Now is better than never."
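The hook being settled for is small; as a sketch, an alternative
display function along the lines the PEP suggests experimenting with
(str() instead of repr()) could look like this, installed with a single
assignment once sys.displayhook exists:

```python
import sys

def str_displayhook(value):
    # Alternative display hook: show str() instead of repr(),
    # keeping the convention that None prints nothing.
    if value is not None:
        sys.stdout.write(str(value) + "\n")

# sys.displayhook = str_displayhook   # one-line installation
```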

--Guido van Rossum (home page: http://www.python.org/~guido/)



From paulp at ActiveState.com  Mon Dec 11 07:59:29 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 10 Dec 2000 22:59:29 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com>  
	            <3A33C533.ABA27C7C@prescod.net> <200012102217.RAA12550@cj20424-a.reston1.va.home.com>
Message-ID: <3A347B51.ADB3F12C@ActiveState.com>

Guido van Rossum wrote:
> 
>...
> 
> Disagree.  Warnings are there mostly for the Python system to warn the
> Python programmer.  The most heavy use will come from the standard
> library, not from user code.

Most Python code is part of some library or another. It may not be the
standard library but it's still a library. Perl and Java both make
warnings (especially about deprecation) very easy *for user code*.

> >  * it should be possible to strip warnings as an optimization step. That
> > may require interpreter and syntax support.
> 
> I don't see the point of this.  I think this comes from our different
> views on who should issue warnings.

Everyone who creates a reusable library will want to issue warnings.
That is to say, most serious Python programmers.

Anyhow, let's presume that it is only the standard library that issues
warnings (for arguments sake). What if I have a speed-critical module
that triggers warnings in an inner loop. Turning off the warning doesn't
turn off the overhead of the warning infrastructure. I should be able to
turn off the overhead easily -- ideally from the Python command line.
And I still feel that part of that "overhead" is in the code that tests
to determine whether to issue the warnings. There should be a way to
turn off that overhead also.

 Paul



From paulp at ActiveState.com  Mon Dec 11 08:23:17 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 10 Dec 2000 23:23:17 -0800
Subject: [Python-Dev] Online help PEP
Message-ID: <3A3480E5.C2577AE6@ActiveState.com>

PEP: ???
Title: Python Online Help
Version: $Revision: 1.0 $
Author: paul at prescod.net, paulp at activestate.com (Paul Prescod)
Status: Draft
Type: Standards Track
Python-Version: 2.1
Status: Incomplete

Abstract

    This PEP describes a command-line driven online help facility
    for Python. The facility should be able to build on existing
    documentation facilities such as the Python documentation 
    and docstrings. It should also be extensible for new types and
    modules.

Interactive use:

    Simply typing "help" describes the help function (through repr 
    overloading).

    "help" can also be used as a function:

    The function takes the following forms of input:

        help( "string" ) -- built-in topic or global
        help( <ob> ) -- docstring from object or type
        help( "doc:filename" ) -- filename from Python documentation

    If you ask for a global, it can be a fully-qualified name such as 
    help("xml.dom").

    You can also use the facility from a command-line

    python --help if

    In either situation, the output does paging similar to the "more"
    command. 
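The lookup described above can be modeled in a few lines (a toy sketch
with a hypothetical topic table; the real prototype is onlinehelp.py):

```python
def gethelp(obj_or_string):
    # Dispatch on the argument forms listed above.
    topics = {"keywords": "and assert break class continue def ..."}
    if isinstance(obj_or_string, str):
        if obj_or_string in topics:           # built-in topic
            return topics[obj_or_string]
        import importlib                      # global / dotted name
        return importlib.import_module(obj_or_string).__doc__ or ""
    return getattr(obj_or_string, "__doc__", None) or ""
```

Note that importing by name is exactly the caveat raised under
Security Issues.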

Implementation

    The help function is implemented in an onlinehelp module which is
    demand-loaded.

    There should be options for fetching help information from
    environments other than the command line through the onlinehelp
    module:

        onlinehelp.gethelp(object_or_string) -> string

    It should also be possible to override the help display function by
    assigning to onlinehelp.displayhelp(object_or_string).

    The module should be able to extract module information from either 
    the HTML or LaTeX versions of the Python documentation. Links should
    be accommodated in a "lynx-like" manner. 

    Over time, it should also be able to recognize when docstrings are
    in "special" syntaxes like structured text, HTML and LaTeX, and
    decode them appropriately.

    A prototype implementation is available with the Python source 
    distribution as nondist/sandbox/doctools/onlinehelp.py.

Built-in Topics

    help( "intro" )  - What is Python? Read this first!
    help( "keywords" )  - What are the keywords?
    help( "syntax" )  - What is the overall syntax?
    help( "operators" )  - What operators are available?
    help( "builtins" )  - What functions, types, etc. are built-in?
    help( "modules" )  - What modules are in the standard library?
    help( "copyright" )  - Who owns Python?
    help( "moreinfo" )  - Where is there more information?
    help( "changes" )  - What changed in Python 2.0?
    help( "extensions" )  - What extensions are installed?
    help( "faq" )  - What questions are frequently asked?
    help( "ack" )  - Who has done work on Python lately?

Security Issues

    This module will attempt to import modules with the same names as
    requested topics.  Don't use the modules if you are not confident
    that everything in your pythonpath is from a trusted source.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:



From tim.one at home.com  Mon Dec 11 08:36:57 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 11 Dec 2000 02:36:57 -0500
Subject: [Python-Dev] FW: [Python-Help] indentation
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDPIDAA.tim.one@home.com>

While we're talking about pluggable CLIs, I share this fellow's confusion
over IDLE's CLI variant:  block code doesn't "look right" under IDLE because
sys.ps2 doesn't exist under IDLE.  Some days you can't make *anybody* happy
<wink>.

-----Original Message-----
...

Subject: [Python-Help] indentation
Sent: Sunday, December 10, 2000 7:32 AM

...

My Problem has to do with indentation:

I put the following to idle:

>>> if not 1:
	print 'Hallo'
	else:

SyntaxError: invalid syntax

I get the Message above.

I know that else must be 4 spaces to the left, but idle doesn't let me do
this.

I have only the alternative to put to most left point. But than I
disturb the block structure and I get again the error message.

I want to have it like this:

>>> if not 1:
	print 'Hallo'
    else:

Can you help me?

...




From fredrik at pythonware.com  Mon Dec 11 12:36:53 2000
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 11 Dec 2000 12:36:53 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com>
Message-ID: <033701c06366$ab746580$0900a8c0@SPIFF>

christian wrote:
> That algorithm is really a gem which you should know,
> so let me try to explain it.

I think someone just won the "brain exploder 2000" award ;-)

to paraphrase Bertrand Russell,

    "Mathematics may be defined as the subject where I never
    know what you are talking about, nor whether what you are
    saying is true"

cheers /F




From thomas at xs4all.net  Mon Dec 11 13:12:09 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 11 Dec 2000 13:12:09 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <033701c06366$ab746580$0900a8c0@SPIFF>; from fredrik@pythonware.com on Mon, Dec 11, 2000 at 12:36:53PM +0100
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <20001211131208.G4396@xs4all.nl>

On Mon, Dec 11, 2000 at 12:36:53PM +0100, Fredrik Lundh wrote:
> christian wrote:
> > That algorithm is really a gem which you should know,
> > so let me try to explain it.

> I think someone just won the "brain exploder 2000" award ;-)

By acclamation, I'd expect. I know it was the best laugh I had since last
week's Have I Got News For You, even though trying to understand it made me
glad I had boring meetings to recuperate in ;)

Highschool-dropout-ly y'rs,

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mal at lemburg.com  Mon Dec 11 13:33:18 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Dec 2000 13:33:18 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <3A34C98E.7C42FD24@lemburg.com>

Fredrik Lundh wrote:
> 
> christian wrote:
> > That algorithm is really a gem which you should know,
> > so let me try to explain it.
> 
> I think someone just won the "brain exploder 2000" award ;-)
> 
> to paraphrase Bertrand Russell,
> 
>     "Mathematics may be defined as the subject where I never
>     know what you are talking about, nor whether what you are
>     saying is true"

Hmm, I must have missed that one... care to repost ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tismer at tismer.com  Mon Dec 11 14:49:48 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 11 Dec 2000 15:49:48 +0200
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <3A34DB7C.FF7E82CE@tismer.com>


Fredrik Lundh wrote:
> 
> christian wrote:
> > That algorithm is really a gem which you should know,
> > so let me try to explain it.
> 
> I think someone just won the "brain exploder 2000" award ;-)
> 
> to paraphrase Bertrand Russell,
> 
>     "Mathematics may be defined as the subject where I never
>     know what you are talking about, nor whether what you are
>     saying is true"

:-))

Well, I was primarily targeting Guido, who said that he
came from math, and one cannot study math without taking
a basic algebra course, I think.  I tried my best to explain
it for those who know at least how groups, fields, rings
and automorphisms work. Going into more details of the
theory would be off-topic for python-dev, but I will try
it in an upcoming DDJ article.

As you might have guessed, I didn't do this just for fun.
It is the old game of explaining what is there, convincing
everybody that you at least know what you are talking about,
and then three days later coming up with an improved
application of the theory.

Today is Monday, 2 days left. :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From guido at python.org  Mon Dec 11 16:12:24 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 10:12:24 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: Your message of "Mon, 11 Dec 2000 15:49:48 +0200."
             <3A34DB7C.FF7E82CE@tismer.com> 
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>  
            <3A34DB7C.FF7E82CE@tismer.com> 
Message-ID: <200012111512.KAA23622@cj20424-a.reston1.va.home.com>

> Fredrik Lundh wrote:
> > 
> > christian wrote:
> > > That algorithm is really a gem which you should know,
> > > so let me try to explain it.
> > 
> > I think someone just won the "brain exploder 2000" award ;-)
> > 
> > to paraphrase Bertrand Russell,
> > 
> >     "Mathematics may be defined as the subject where I never
> >     know what you are talking about, nor whether what you are
> >     saying is true"
> 
> :-))
> 
> Well, I was primarily targeting Guido, who said that he
> came from math, and one cannot study math without taking
> a basic algebra course, I think.  I tried my best to explain
> it for those who know at least how groups, fields, rings
> and automorphisms work. Going into more details of the
> theory would be off-topic for python-dev, but I will try
> it in an upcoming DDJ article.

I do have a math degree, but it is 18 years old and I had to give up
after the first paragraph of your explanation.  It made me vividly
recall the first and only class on Galois Theory that I ever took --
after one hour I realized that this was not for me and I didn't have a
math brain after all.  I went back to the basement where the software
development lab was (i.e. a row of card punches :-).

> As you might have guessed, I didn't do this just for fun.
> It is the old game of explaining what is there, convincing
> everybody that you at least know what you are talking about,
> and then three days later coming up with an improved
> application of the theory.
> 
> Today is Monday, 2 days left. :-)

I'm very impressed.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Dec 11 16:15:02 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 10:15:02 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Sun, 10 Dec 2000 22:59:29 PST."
             <3A347B51.ADB3F12C@ActiveState.com> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <200012102217.RAA12550@cj20424-a.reston1.va.home.com>  
            <3A347B51.ADB3F12C@ActiveState.com> 
Message-ID: <200012111515.KAA23764@cj20424-a.reston1.va.home.com>

[me]
> > Disagree.  Warnings are there mostly for the Python system to warn the
> > Python programmer.  The most heavy use will come from the standard
> > library, not from user code.

[Paul Prescod]
> Most Python code is part of some library or another. It may not be the
> standard library but its still a library. Perl and Java both make
> warnings (especially about deprecation) very easy *for user code*.

Hey.  I'm not making it impossible to use warnings.  I'm making it
very easy.  All you have to do is put "from warnings import warn" at
the top of your library module.  Requesting a built-in or even a new
statement is simply excessive.
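
For what it's worth, the library-side pattern is tiny. A sketch (the
function names here are invented for illustration; stacklevel is an
argument of warnings.warn as it eventually shipped, and points the
report at the caller's line rather than the library's):

```python
import warnings

def new_api(x):
    return x * 2

def old_api(x):
    # Deprecated entry point: warn the caller, then delegate.
    # stacklevel=2 attributes the warning to the calling line.
    warnings.warn("old_api() is deprecated; use new_api() instead",
                  DeprecationWarning, stacklevel=2)
    return new_api(x)
```

Under the default filters the caller sees the warning once per
location; -W error turns it into an exception, -W ignore silences it.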

> > >  * it should be possible to strip warnings as an optimization step. That
> > > may require interpreter and syntax support.
> > 
> > I don't see the point of this.  I think this comes from our different
> > views on who should issue warnings.
> 
> Everyone who creates a reusable library will want to issue warnings.
> That is to say, most serious Python programmers.
> 
> Anyhow, let's presume that it is only the standard library that issues
> warnings (for argument's sake). What if I have a speed-critical module
> that triggers warnings in an inner loop. Turning off the warning doesn't
> turn off the overhead of the warning infrastructure. I should be able to
> turn off the overhead easily -- ideally from the Python command line.
> And I still feel that part of that "overhead" is in the code that tests
> to determine whether to issue the warnings. There should be a way to
> turn off that overhead also.

So rewrite your code so that it doesn't trigger the warning.  When you
get a warning, you're doing something that could be done in a better
way.  So don't whine about the performance.

It's a quality of implementation issue whether C code that tests for
issues that deserve warnings can do the test without slowing down code
that doesn't deserve a warning.  Ditto for standard library code.

Here's an example.  I expect there will eventually (not in 2.1 yet!)
warnings in the deprecated string module.  If you get such a warning
in a time-critical piece of code, the solution is to use string
methods -- not to whine about the performance of the backwards
compatibility code.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Mon Dec 11 17:02:29 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 11 Dec 2000 11:02:29 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
	<200012082258.RAA02389@cj20424-a.reston1.va.home.com>
	<3A33C533.ABA27C7C@prescod.net>
Message-ID: <14900.64149.910989.998348@anthem.concentric.net>

Some of my thoughts after reading the PEP and Paul/Guido's exchange.

- A function in the warn module is better than one in the sys module.
  "from warnings import warn" is good enough to not warrant a
  built-in.  I get the sense that the PEP description is behind
  Guido's current implementation here.

- When PyErr_Warn() returns 1, does that mean a warning has been
  transmuted into an exception, or some other exception occurred
  during the setting of the warning?  (I think I know, but the PEP
  could be clearer here).

- It would be nice if lineno can be a range specification.  Other
  matches are based on regexps -- think of this as a line number
  regexp.

- Why not do setupwarnings() in site.py?

- Regexp matching on messages should be case insensitive.

- The second argument to sys.warn() or PyErr_Warn() can be any class,
  right?  If so, it's easy for me to have my own warning classes.
  What if I want to set up my own warnings filters?  Maybe if `action'
  could be a callable as well as a string.  Then in my IDE, I could
  set that to "mygui.popupWarningsDialog".

-Barry



From guido at python.org  Mon Dec 11 16:57:33 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 10:57:33 -0500
Subject: [Python-Dev] Online help PEP
In-Reply-To: Your message of "Sun, 10 Dec 2000 23:23:17 PST."
             <3A3480E5.C2577AE6@ActiveState.com> 
References: <3A3480E5.C2577AE6@ActiveState.com> 
Message-ID: <200012111557.KAA24266@cj20424-a.reston1.va.home.com>

I approve of the general idea.  Barry, please assign a PEP number.

> PEP: ???
> Title: Python Online Help
> Version: $Revision: 1.0 $
> Author: paul at prescod.net, paulp at activestate.com (Paul Prescod)
> Status: Draft
> Type: Standards Track
> Python-Version: 2.1
> Status: Incomplete
> 
> Abstract
> 
>     This PEP describes a command-line driven online help facility
>     for Python. The facility should be able to build on existing
>     documentation facilities such as the Python documentation 
>     and docstrings. It should also be extensible for new types and
>     modules.
> 
> Interactive use:
> 
>     Simply typing "help" describes the help function (through repr 
>     overloading).

Cute -- like license, copyright, credits I suppose.

>     "help" can also be used as a function:
> 
>     The function takes the following forms of input:
> 
>         help( "string" ) -- built-in topic or global

Why does a global require string quotes?

>         help( <ob> ) -- docstring from object or type
>         help( "doc:filename" ) -- filename from Python documentation

I'm missing

          help() -- table of contents

I'm not sure if the table of contents should be printed by the repr
output.

>     If you ask for a global, it can be a fully-qualified name such as
>     help("xml.dom").

Why are the string quotes needed?  When are they useful?

>     You can also use the facility from a command-line
> 
>     python --help if

Is this really useful?  Sounds like Perlism to me.

>     In either situation, the output does paging similar to the "more"
>     command. 

Agreed.  But how to implement paging in a platform-dependent manner?
On Unix, os.system("more") or "$PAGER" is likely to work.  On Windows,
I suppose we could use its MORE, although that's pretty braindead.  On
the Mac?  Also, inside IDLE or Pythonwin, invoking the system pager
isn't a good idea.
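
A rough sketch of the platform guess (using the modern subprocess
module for brevity; the tty check is what keeps IDLE, Pythonwin and
pipes from being hijacked by an external pager):

```python
import os
import subprocess
import sys

def page(text):
    # On an interactive Unix terminal, hand the text to $PAGER (or
    # "more"); everywhere else just write it out unpaged.
    if os.name == "posix" and sys.stdout.isatty():
        pager = os.environ.get("PAGER", "more")
        subprocess.run([pager], input=text.encode())
    else:
        sys.stdout.write(text)
```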

> Implementation
> 
>     The help function is implemented in an onlinehelp module which is
>     demand-loaded.

What does "demand-loaded" mean in a Python context?

>     There should be options for fetching help information from
>     environments other than the command line through the onlinehelp
>     module:
> 
>         onlinehelp.gethelp(object_or_string) -> string

Good idea.

>     It should also be possible to override the help display function by
>     assigning to onlinehelp.displayhelp(object_or_string).

Good idea.  Pythonwin and IDLE could use this.  But I'd like it to
work at least "okay" if they don't.

>     The module should be able to extract module information from either 
>     the HTML or LaTeX versions of the Python documentation. Links should
>     be accommodated in a "lynx-like" manner. 

I think this is beyond the scope.  The LaTeX isn't installed anywhere
(and processing would be too much work).  The HTML is installed only
on Windows, where there already is a way to get it to pop up in your
browser (actually two: it's in the Start menu, and also in IDLE's Help
menu).

>     Over time, it should also be able to recognize when docstrings are 
>     in "special" syntaxes like structured text, HTML and LaTeX and
>     decode them appropriately.

A standard syntax for docstrings is under development, PEP 216.  I
don't agree with the proposal there, but in any case the help PEP
should not attempt to legalize a different format than PEP 216.

>     A prototype implementation is available with the Python source 
>     distribution as nondist/sandbox/doctools/onlinehelp.py.

Neat.  I noticed that in a 24-line screen, the pagesize must be set to
21 to avoid stuff scrolling off the screen.  Maybe there's an off-by-3
error somewhere?

I also noticed that it always prints '1' when invoked as a function.
The new license pager in site.py avoids this problem.

help("operators") and several others raise an
AttributeError('handledocrl').

The "lynx-like" links don't work.

> Built-in Topics
> 
>     help( "intro" )  - What is Python? Read this first!
>     help( "keywords" )  - What are the keywords?
>     help( "syntax" )  - What is the overall syntax?
>     help( "operators" )  - What operators are available?
>     help( "builtins" )  - What functions, types, etc. are built-in?
>     help( "modules" )  - What modules are in the standard library?
>     help( "copyright" )  - Who owns Python?
>     help( "moreinfo" )  - Where is there more information?
>     help( "changes" )  - What changed in Python 2.0?
>     help( "extensions" )  - What extensions are installed?
>     help( "faq" )  - What questions are frequently asked?
>     help( "ack" )  - Who has done work on Python lately?

I think it's naive to expect this help facility to replace browsing
the website or the full documentation package.  There should be one
entry that says to point your browser there (giving the local
filesystem URL if available), and that's it.  The rest of the online
help facility should be concerned with exposing doc strings.

> Security Issues
> 
>     This module will attempt to import modules with the same names as
>     requested topics. Don't use the modules if you are not confident
>     that everything in your pythonpath is from a trusted source.

Yikes!  Another reason to avoid the "string" -> global variable
option.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Dec 11 17:53:37 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 11:53:37 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 11:02:29 EST."
             <14900.64149.910989.998348@anthem.concentric.net> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net>  
            <14900.64149.910989.998348@anthem.concentric.net> 
Message-ID: <200012111653.LAA24545@cj20424-a.reston1.va.home.com>

> Some of my thoughts after reading the PEP and Paul/Guido's exchange.
> 
> - A function in the warn module is better than one in the sys module.
>   "from warnings import warn" is good enough to not warrant a
>   built-in.  I get the sense that the PEP description is behind
>   Guido's current implementation here.

Yes.  I've updated the PEP to match the (2nd) implementation.

> - When PyErr_Warn() returns 1, does that mean a warning has been
>   transmuted into an exception, or some other exception occurred
>   during the setting of the warning?  (I think I know, but the PEP
>   could be clearer here).

I've clarified this now: it returns 1 in either case.  You have to do
exception handling in either case.  I'm not telling why -- you don't
need to know.  The caller of PyErr_Warn() should not attempt to catch
the exception -- if that's your intent, you shouldn't be calling
PyErr_Warn().  And PyErr_Warn() is complicated enough that it has to
allow raising an exception.

> - It would be nice if lineno can be a range specification.  Other
>   matches are based on regexps -- think of this as a line number
>   regexp.

Too much complexity already.

> - Why not do setupwarnings() in site.py?

See the PEP and the current implementation.  The delayed-loading of
the warnings module means that we have to save the -W options as
sys.warnoptions.  (This also makes them work when multiple
interpreters are used -- they all get the -W options.)

> - Regexp matching on messages should be case insensitive.

Good point!  Done in my version of the code.

> - The second argument to sys.warn() or PyErr_Warn() can be any class,
>   right?

Almost.  It must be derived from __builtin__.Warning.

>   If so, it's easy for me to have my own warning classes.
>   What if I want to set up my own warnings filters?  Maybe if `action'
>   could be a callable as well as a string.  Then in my IDE, I could
>   set that to "mygui.popupWarningsDialog".

No, for that purpose you would override warnings.showwarning().
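
A sketch of that override against the warnings module as it shipped
(the popup call is a hypothetical stand-in for a real GUI hook such as
Barry's mygui.popupWarningsDialog):

```python
import warnings

def popup_showwarning(message, category, filename, lineno,
                      file=None, line=None):
    # Reuse the standard formatting, but route the text away from
    # stderr; print() here stands in for the hypothetical GUI dialog.
    text = warnings.formatwarning(message, category, filename, lineno)
    print("POPUP:", text.strip())

warnings.simplefilter("always")   # make every occurrence visible
warnings.showwarning = popup_showwarning
warnings.warn("spam() is deprecated", DeprecationWarning)
```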

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Mon Dec 11 17:58:39 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 11 Dec 2000 17:58:39 +0100
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <14900.64149.910989.998348@anthem.concentric.net>; from barry@digicool.com on Mon, Dec 11, 2000 at 11:02:29AM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net>
Message-ID: <20001211175839.H4396@xs4all.nl>

On Mon, Dec 11, 2000 at 11:02:29AM -0500, Barry A. Warsaw wrote:

> - A function in the warn module is better than one in the sys module.
>   "from warnings import warn" is good enough to not warrant a
>   built-in.  I get the sense that the PEP description is behind
>   Guido's current implementation here.

+1 on this. I have a response to Guido's first posted PEP on my laptop, but
due to a weekend in Germany wasn't able to post it before he updated the
PEP. I guess I can delete the arguments for this, now ;) but let's just say I
think 'sys' is being a bit overused, and the case of a function in sys and
its data in another module is just plain silly.

> - When PyErr_Warn() returns 1, does that mean a warning has been
>   transmuted into an exception, or some other exception occurred
>   during the setting of the warning?  (I think I know, but the PEP
>   could be clearer here).

How about returning 1 for 'warning turned into exception' and -1 for 'normal
exception' ? It would be slightly more similar to other functions if '-1'
meant 'exception', and it would be easy to put in an if statement -- and
still allow C code to ignore the produced error, if it wanted to.

> - It would be nice if lineno can be a range specification.  Other
>   matches are based on regexps -- think of this as a line number
>   regexp.

+0 on this... I'm not sure if such fine-grained control is really necessary.
I liked the hint at 'per function' granularity, but I realise it's tricky to
do right, what with naming issues and all that. 

> - Regexp matching on messages should be case insensitive.

How about being able to pass in compiled regexp objects as well as strings ?
I haven't looked at the implementation at all, so I'm not sure how expensive
it would be, but it might also be nice to have users (= programmers) pass in
an object with its own 'match' method, so you can 'interactively' decide
whether or not to raise an exception, popup a window, and what not. Sort of
like letting 'action' be a callable, which I think is a good idea as well.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Mon Dec 11 18:11:02 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 12:11:02 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 17:58:39 +0100."
             <20001211175839.H4396@xs4all.nl> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net>  
            <20001211175839.H4396@xs4all.nl> 
Message-ID: <200012111711.MAA24818@cj20424-a.reston1.va.home.com>

> > - When PyErr_Warn() returns 1, does that mean a warning has been
> >   transmuted into an exception, or some other exception occurred
> >   during the setting of the warning?  (I think I know, but the PEP
> >   could be clearer here).
> 
> How about returning 1 for 'warning turned into exception' and -1 for 'normal
> exception' ? It would be slightly more similar to other functions if '-1'
> meant 'exception', and it would be easy to put in an if statement -- and
> still allow C code to ignore the produced error, if it wanted to.

Why would you want this?  The user clearly said that they wanted the
exception!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Mon Dec 11 18:13:10 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 11 Dec 2000 18:13:10 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34C98E.7C42FD24@lemburg.com>
Message-ID: <009a01c06395$a9da3220$3c6340d5@hagrid>

> Hmm, I must have missed that one... care to repost ?

doesn't everyone here read the daily URL?

here's a link:
http://mail.python.org/pipermail/python-dev/2000-December/010913.html

</F>




From barry at digicool.com  Mon Dec 11 18:18:04 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 11 Dec 2000 12:18:04 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
	<200012082258.RAA02389@cj20424-a.reston1.va.home.com>
	<3A33C533.ABA27C7C@prescod.net>
	<14900.64149.910989.998348@anthem.concentric.net>
	<200012111653.LAA24545@cj20424-a.reston1.va.home.com>
Message-ID: <14901.3149.109401.151742@anthem.concentric.net>

>>>>> "GvR" == Guido van Rossum <guido at python.org> writes:

    GvR> I've clarified this now: it returns 1 in either case.  You
    GvR> have to do exception handling in either case.  I'm not
    GvR> telling why -- you don't need to know.  The caller of
    GvR> PyErr_Warn() should not attempt to catch the exception -- if
    GvR> that's your intent, you shouldn't be calling PyErr_Warn().
    GvR> And PyErr_Warn() is complicated enough that it has to allow
    GvR> raising an exception.

Makes sense.

    >> - It would be nice if lineno can be a range specification.
    >> Other matches are based on regexps -- think of this as a line
    >> number regexp.

    GvR> Too much complexity already.

Okay, no biggie I think.

    >> - Why not do setupwarnings() in site.py?

    GvR> See the PEP and the current implementation.  The
    GvR> delayed-loading of the warnings module means that we have to
    GvR> save the -W options as sys.warnoptions.  (This also makes
    GvR> them work when multiple interpreters are used -- they all get
    GvR> the -W options.)

Cool.

    >> - Regexp matching on messages should be case insensitive.

    GvR> Good point!  Done in my version of the code.

Cool.

    >> - The second argument to sys.warn() or PyErr_Warn() can be any
    >> class, right?

    GvR> Almost.  It must be derived from __builtin__.Warning.

__builtin__.Warning == exceptions.Warning, right?

    >> If so, it's easy for me to have my own warning classes.  What
    >> if I want to set up my own warnings filters?  Maybe if `action'
    >> could be a callable as well as a string.  Then in my IDE, I
    >> could set that to "mygui.popupWarningsDialog".

    GvR> No, for that purpose you would override
    GvR> warnings.showwarning().

Cool.

Looks good.
-Barry



From thomas at xs4all.net  Mon Dec 11 19:04:56 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 11 Dec 2000 19:04:56 +0100
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012111711.MAA24818@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 12:11:02PM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> <200012111711.MAA24818@cj20424-a.reston1.va.home.com>
Message-ID: <20001211190455.I4396@xs4all.nl>

On Mon, Dec 11, 2000 at 12:11:02PM -0500, Guido van Rossum wrote:

> > How about returning 1 for 'warning turned into exception' and -1 for 'normal
> > exception' ? It would be slightly more similar to other functions if '-1'
> > meant 'exception', and it would be easy to put in an if statement -- and
> > still allow C code to ignore the produced error, if it wanted to.

> Why would you want this?  The user clearly said that they wanted the
> exception!

The difference is that in one case, the user will see the original
warning-turned-exception, and in the other she won't -- the warning will be
lost. At best she'll see (by looking at the traceback) the code intended to
give a warning (that might or might not have been turned into an exception)
and failed. The warning code might decide to do something additional to
notify the user of the thing it intended to warn about, which ended up as a
'real' exception because of something else.

It's no biggy, obviously, except that if you change your mind it will be
hard to add it without breaking code. Even if you explicitly state the
return value should be tested for boolean value, not greater-than-zero
value.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Mon Dec 11 19:16:58 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 13:16:58 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 19:04:56 +0100."
             <20001211190455.I4396@xs4all.nl> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> <200012111711.MAA24818@cj20424-a.reston1.va.home.com>  
            <20001211190455.I4396@xs4all.nl> 
Message-ID: <200012111816.NAA25214@cj20424-a.reston1.va.home.com>

> > > How about returning 1 for 'warning turned into exception' and -1 for 'normal
> > > exception' ? It would be slightly more similar to other functions if '-1'
> > > meant 'exception', and it would be easy to put in an if statement -- and
> > > still allow C code to ignore the produced error, if it wanted to.
> 
> > Why would you want this?  The user clearly said that they wanted the
> > exception!
> 
> The difference is that in one case, the user will see the original
> warning-turned-exception, and in the other she won't -- the warning will be
> lost. At best she'll see (by looking at the traceback) the code intended to
> give a warning (that might or might not have been turned into an exception)
> and failed.

Yes -- this is a standard convention in Python.  If there's a bug in
code that is used to raise or handle an exception, you get a traceback
from that bug.

> The warning code might decide to do something aditional to
> notify the user of the thing it intended to warn about, which ended up as a
> 'real' exception because of something else.

Nah.  The warning code shouldn't worry about that.  If there's a bug
in PyErr_Warn(), that should get top priority until it's fixed.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Mon Dec 11 19:12:56 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Dec 2000 19:12:56 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34C98E.7C42FD24@lemburg.com> <009a01c06395$a9da3220$3c6340d5@hagrid>
Message-ID: <3A351928.3A41C970@lemburg.com>

Fredrik Lundh wrote:
> 
> > Hmm, I must have missed that one... care to repost ?
> 
> doesn't everyone here read the daily URL?

No time for pull logic... only push logic ;-)

> here's a link:
> http://mail.python.org/pipermail/python-dev/2000-December/010913.html

Thanks.

A very nice introduction indeed. The only thing which
didn't come through in the first reading: why do we need
GF(p^n)'s in the first place ? The second reading then made this
clear: we need to assure that by iterating through the set of
possible coefficients we can actually reach all slots in the
dictionary... a gem indeed.

Now if we could only figure out an equally simple way of
producing perfect hash functions on-the-fly we could eliminate
the need for the PyObject_Compare()s... ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Mon Dec 11 21:22:55 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 11 Dec 2000 15:22:55 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFJIDAA.tim.one@home.com>

[/F, on Christian's GF tutorial]
> I think someone just won the "brain exploder 2000" award ;-)

Well, anyone can play.  When keys collide, what we need is a function f(i)
such that repeating
    i = f(i)
visits every int in (0, 2**N) exactly once before setting i back to its
initial value, for a fixed N and where the first i is in (0, 2**N).  This is
the quickest:

def f(i):
    i -= 1
    if i == 0:
        i = 2**N-1
    return i

Unfortunately, this leads to performance-destroying "primary collisions"
(see Knuth, or any other text w/ a section on hashing).

Other *good* possibilities include a pseudo-random number generator of
maximal period, or viewing the ints in (0, 2**N) as bit vectors indicating
set membership and generating all subsets of an N-element set in a Gray code
order.

The *form* of the function dictobject.c actually uses is:

def f(i):
    i <<= 1
    if i >= 2**N:
       i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

which is suitably non-linear and as fast as the naive method.  Given the
form of the function, you don't need any theory at all to find a value for
MAGIC_CONSTANT_DEPENDING_ON_N that simply works.  In fact, I verified all
the magic values in dictobject.c via brute force, because the person who
contributed the original code botched the theory slightly and gave us some
values that didn't work.  I'll rely on the theory if and only if we have to
extend this to 64-bit machines someday:  I'm too old to wait for a brute
search of a space with 2**64 elements <wink>.
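
The brute-force check Tim describes fits in a few lines; a sketch for
small N (a candidate constant must have bit N set so the xor clears
it, and it "works" iff the probe's cycle covers all of (0, 2**N)):

```python
def cycle_length(N, magic, start=1):
    # Apply the shift-xor probe until it returns to its start, or
    # give up: bad constants get stuck in shorter cycles or at 0.
    i = start
    for steps in range(1, 2 ** N + 1):
        i <<= 1
        if i >= 2 ** N:
            i ^= magic
        if i == start:
            return steps
    return 0

def find_magics(N):
    # Constants whose probe visits every int in (0, 2**N) once.
    return [m for m in range(2 ** N, 2 ** (N + 1))
            if cycle_length(N, m) == 2 ** N - 1]

print(find_magics(3))  # -> [11, 13]
print(find_magics(4))  # -> [19, 25]
```

The survivors correspond exactly to the primitive polynomials of
degree N over GF(2), which is where the theory comes back in.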

mathematics-is-a-battle-against-mortality-ly y'rs  - tim




From greg at cosc.canterbury.ac.nz  Mon Dec 11 22:46:11 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Dec 2000 10:46:11 +1300 (NZDT)
Subject: [Python-Dev] Online help PEP
In-Reply-To: <200012111557.KAA24266@cj20424-a.reston1.va.home.com>
Message-ID: <200012112146.KAA01771@s454.cosc.canterbury.ac.nz>

Guido:
> Paul Prescod:
> > In either situation, the output does paging similar to the "more"
> > command. 
> Agreed.

Only if it can be turned off! I usually prefer to use the
scrolling capabilities of whatever shell window I'm using
rather than having some program's own idea of how to do
paging forced upon me when I don't want it.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From moshez at zadka.site.co.il  Tue Dec 12 07:33:02 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue, 12 Dec 2000 08:33:02 +0200 (IST)
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
Message-ID: <20001212063302.05E0BA82E@darjeeling.zadka.site.co.il>

On Mon, 11 Dec 2000 15:22:55 -0500, "Tim Peters" <tim.one at home.com> wrote:

> Well, anyone can play.  When keys collide, what we need is a function f(i)
> such that repeating
>     i = f(i)
> visits every int in (0, 2**N) exactly once before setting i back to its
> initial value, for a fixed N and where the first i is in (0, 2**N).  

OK, maybe this is me being *real* stupid, but why? Why not [0, 2**n)?
Did 0 harm you in your childhood, and you're trying to get back? <0 wink>.

If we had an affine operation, instead of a linear one, we could have 
[0, 2**n). I won't repeat the proof here but changing

> def f(i):
>     i <<= 1
      i^=1 # This is the line I added
>     if i >= 2**N:
>        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i

Makes you waltz all over [0, 2**n) if the original made you cover
(0, 2**n).

if-i'm-wrong-then-someone-should-shoot-me-to-save-me-the-embarrasment-ly y'rs,
Z.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tim.one at home.com  Mon Dec 11 23:38:56 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 11 Dec 2000 17:38:56 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <20001212063302.05E0BA82E@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEGBIDAA.tim.one@home.com>

[Tim]
> Well, anyone can play.  When keys collide, what we need is a
> function f(i) such that repeating
>     i = f(i)
> visits every int in (0, 2**N) exactly once before setting i back to its
> initial value, for a fixed N and where the first i is in (0, 2**N).

[Moshe Zadka]
> OK, maybe this is me being *real* stupid, but why? Why not [0, 2**n)?
> Did 0 harm you in your childhood, and you're trying to get
> back? <0 wink>.

We don't need f at all unless we've already determined there's a collision
at some index h.  The i sequence is used to offset h (mod 2**N).  An
increment of 0 would waste time (h+0 == h, but we've already done a full
compare on the h'th table entry and already determined it wasn't equal to
what we're looking for).

IOW, there are only 2**N-1 slots still of interest by the time f is needed.

> If we had an affine operation, instead of a linear one, we could have
> [0, 2**n). I won't repeat the proof here but changing
>
> def f(i):
>     i <<= 1
>     i^=1 # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i
>
> Makes you waltz all over [0, 2**n) if the original made you complete
> (0, 2**n).

But, Moshe!  The proof would have been the most interesting part <wink>.




From gstein at lyra.org  Tue Dec 12 01:15:50 2000
From: gstein at lyra.org (Greg Stein)
Date: Mon, 11 Dec 2000 16:15:50 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012111653.LAA24545@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 11:53:37AM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com>
Message-ID: <20001211161550.Y7732@lyra.org>

On Mon, Dec 11, 2000 at 11:53:37AM -0500, Guido van Rossum wrote:
>...
> > - The second argument to sys.warn() or PyErr_Warn() can be any class,
> >   right?
> 
> Almost.  It must be derived from __builtin__.Warning.

Since you must do "from warnings import warn" before using warnings,
I think it makes sense to put the Warning classes into the warnings
module. (e.g. why increase the size of the builtins?)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From guido at python.org  Tue Dec 12 01:39:31 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 19:39:31 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 16:15:50 PST."
             <20001211161550.Y7732@lyra.org> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com>  
            <20001211161550.Y7732@lyra.org> 
Message-ID: <200012120039.TAA02983@cj20424-a.reston1.va.home.com>

> Since you must do "from warnings import warn" before using warnings,
> I think it makes sense to put the Warning classes into the warnings
> module. (e.g. why increase the size of the builtins?)

I don't particularly care whether the Warning category classes are
builtins, but I can't declare them in the warnings module.  Typical
use from C is:

    if (PyErr_Warn(PyExc_DeprecationWarning,
		   "the strop module is deprecated"))
            return NULL;

PyErr_Warn() imports the warnings module on its first call.  But the
value of PyExc_DeprecationWarning c.s. must be available *before* the
first call, so they can't be imported from the warnings module!

My first version imported warnings at the start of the program, but
this almost doubled the start-up time, hence the design where the
module is imported only when needed.

The most convenient place to create the Warning category classes is in
the _exceptions module; doing it the easiest way there means that they
are automatically exported to __builtin__.  This doesn't bother me
enough to try and hide them.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gstein at lyra.org  Tue Dec 12 02:11:02 2000
From: gstein at lyra.org (Greg Stein)
Date: Mon, 11 Dec 2000 17:11:02 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012120039.TAA02983@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 07:39:31PM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com>
Message-ID: <20001211171102.C7732@lyra.org>

On Mon, Dec 11, 2000 at 07:39:31PM -0500, Guido van Rossum wrote:
> > Since you must do "from warnings import warn" before using warnings,
> > I think it makes sense to put the Warning classes into the warnings
> > module. (e.g. why increase the size of the builtins?)
> 
> I don't particularly care whether the Warning category classes are
> builtins, but I can't declare them in the warnings module.  Typical
> use from C is:
> 
>     if (PyErr_Warn(PyExc_DeprecationWarning,
> 		   "the strop module is deprecated"))
>             return NULL;
> 
> PyErr_Warn() imports the warnings module on its first call.  But the
> value of PyExc_DeprecationWarning c.s. must be available *before* the
> first call, so they can't be imported from the warnings module!

Do the following:

pywarn.h or pyerrors.h:

#define PyWARN_DEPRECATION "DeprecationWarning"

     ...
     if (PyErr_Warn(PyWARN_DEPRECATION,
 		   "the strop module is deprecated"))
             return NULL;

The PyErr_Warn would then use the string to dynamically look up / bind to
the correct value from the warnings module. By using the symbolic constant,
you will catch typos in the C code (e.g. if people passed raw strings, then
a typo won't be found until runtime; using symbols will catch the problem at
compile time).

The above strategy will allow for fully-delayed loading, and for all the
warnings to be located in the "warnings" module.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From guido at python.org  Tue Dec 12 02:21:41 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 20:21:41 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 17:11:02 PST."
             <20001211171102.C7732@lyra.org> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com>  
            <20001211171102.C7732@lyra.org> 
Message-ID: <200012120121.UAA04576@cj20424-a.reston1.va.home.com>

> > PyErr_Warn() imports the warnings module on its first call.  But the
> > value of PyExc_DeprecationWarning c.s. must be available *before* the
> > first call, so they can't be imported from the warnings module!
> 
> Do the following:
> 
> pywarn.h or pyerrors.h:
> 
> #define PyWARN_DEPRECATION "DeprecationWarning"
> 
>      ...
>      if (PyErr_Warn(PyWARN_DEPRECATION,
>  		   "the strop module is deprecated"))
>              return NULL;
> 
> The PyErr_Warn would then use the string to dynamically look up / bind to
> the correct value from the warnings module. By using the symbolic constant,
> you will catch typos in the C code (e.g. if people passed raw strings, then
> a typo won't be found until runtime; using symbols will catch the problem at
> compile time).
> 
> The above strategy will allow for fully-delayed loading, and for all the
> warnings to be located in the "warnings" module.

Yeah, that would be a possibility, if it was deemed evil that the
warnings appear in __builtin__.  I don't see what's so evil about
that.

(There's also the problem that the C code must be able to create new
warning categories, as long as they are derived from the Warning base
class.  Your approach above doesn't support this.  I'm sure you can
figure a way around that too.  But I prefer to hear why you think it's
needed first.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gstein at lyra.org  Tue Dec 12 02:26:00 2000
From: gstein at lyra.org (Greg Stein)
Date: Mon, 11 Dec 2000 17:26:00 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012120121.UAA04576@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 08:21:41PM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com> <20001211171102.C7732@lyra.org> <200012120121.UAA04576@cj20424-a.reston1.va.home.com>
Message-ID: <20001211172600.E7732@lyra.org>

On Mon, Dec 11, 2000 at 08:21:41PM -0500, Guido van Rossum wrote:
>...
> > The above strategy will allow for fully-delayed loading, and for all the
> > warnings to be located in the "warnings" module.
> 
> Yeah, that would be a possibility, if it was deemed evil that the
> warnings appear in __builtin__.  I don't see what's so evil about
> that.
> 
> (There's also the problem that the C code must be able to create new
> warning categories, as long as they are derived from the Warning base
> class.  Your approach above doesn't support this.  I'm sure you can
> figure a way around that too.  But I prefer to hear why you think it's
> needed first.)

I'm just attempting to avoid dumping more names into __builtins__ is all. I
don't believe there is anything intrinsically bad about putting more names
in there, but avoiding the kitchen-sink metaphor for __builtins__ has got to
be a Good Thing :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From guido at python.org  Tue Dec 12 14:43:59 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 12 Dec 2000 08:43:59 -0500
Subject: [Python-Dev] Request review of gdbm patch
Message-ID: <200012121343.IAA06713@cj20424-a.reston1.va.home.com>

I'm asking for a review of the patch to gdbm at 

http://sourceforge.net/patch/?func=detailpatch&patch_id=102638&group_id=5470

I asked the author for clarification and this is what I got.

Can anybody suggest what to do?  His mail doesn't give me much
confidence in the patch. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Tue, 12 Dec 2000 13:24:13 +0100
From:    Damjan <arhiv at freemail.org.mk>
To:      Guido van Rossum <guido at python.org>
Subject: Re: your gdbm patch for Python

On Mon, Dec 11, 2000 at 03:51:03PM -0500, Guido van Rossum wrote:
> I'm looking at your patch at SourceForge:

First, I'm sorry it was such a mess of a patch, but I couldn't figure out how
to send a more elaborate comment. (But then again, I wouldn't have an email from
Guido van Rossum in my mail-box to show off to my friends :)

> and wondering two things:
> 
> (1) what does the patch do?
> 
> (2) why does the patch remove the 'f' / GDBM_FAST option?

 From the gdbm info page:
     ...The following may also be
     logically or'd into the database flags: GDBM_SYNC, which causes
     all database operations to be synchronized to the disk, and
     GDBM_NOLOCK, which prevents the library from performing any
     locking on the database file.  The option GDBM_FAST is now
     obsolete, since `gdbm' defaults to no-sync mode...
     ^^^^^^^^
(1) My patch adds two options to the gdbm.open(..) function. These are 'u' for
GDBM_NOLOCK, and 's' for GDBM_SYNC.

(2) GDBM_FAST is obsolete because gdbm defaults to GDBM_FAST, so it's removed.

I'm also thinking about adding a lock and unlock methods to the gdbm object,
but it seems that a gdbm database can only be locked and not unlocked.


- -- 
Damjan Georgievski
Skopje, Macedonia

------- End of Forwarded Message




From mal at lemburg.com  Tue Dec 12 14:49:40 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Dec 2000 14:49:40 +0100
Subject: [Python-Dev] Codec aliasing and naming
Message-ID: <3A362CF4.2082A606@lemburg.com>

I just wanted to inform you of a change I plan for the standard
encodings search function to enable better support for aliasing
of encoding names.

The current implementation caches the aliases returned from the
codecs .getaliases() function in the encodings lookup cache
rather than in the alias cache. As a consequence, the hyphen to
underscore mapping is not applied to the aliases. A codec would
have to return a list of all combinations of names with hyphens
and underscores in order to emulate the standard lookup 
behaviour.

I have a patch which fixes this and also ensures that aliases
cannot be overwritten by codecs which register at some later
point in time. This ensures that we won't run into situations
where a codec import suddenly overrides the behaviour of
previously active codecs.

I would also like to propose the use of a new naming scheme
for codecs which enables drop-in installation. As discussed
on the i18n-sig list, people would like to install codecs
without requiring users to call a codec registration function
or to touch site.py.

The standard search function in the encodings package has a
nice property (which I only noticed after the fact ;) which
allows using Python package names in the encoding names,
e.g. you can install a package 'japanese' and then access the
codecs in that package using 'japanese.shiftjis' without
having to bother registering a new codec search function
for the package -- the encodings package search function
will redirect the lookup to the 'japanese' package.

Using package names in the encoding name has several
advantages:
* you know where the codec comes from
* you can have multiple codecs for the same encoding
* drop-in installation without registration is possible
* the need for a non-default encoding package is visible in the
  source code
* you no longer need to drop new codecs into the Python
  standard lib
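A drop-in codec of this kind can be sketched with the codecs API; the 'demo.utf8' name and the delegation to the built-in UTF-8 codec below are purely illustrative, not a real installed package:

```python
import codecs

def _demo_search(name):
    # A package-style search function: handle only names under our "package".
    if name == "demo.utf8":
        base = codecs.lookup("utf-8")  # delegate to an existing codec
        return codecs.CodecInfo(base.encode, base.decode, name="demo.utf8")
    return None  # unknown to us; let other search functions try

codecs.register(_demo_search)

# The package-qualified name now works anywhere an encoding name is accepted.
assert "abc".encode("demo.utf8") == b"abc"
assert b"abc".decode("demo.utf8") == "abc"
```

The dotted name survives the standard lookup normalization (lowercasing, space-to-underscore mapping), which is what makes the package-redirection trick possible.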

Perhaps someone could add a note about this possibility
to the codec docs ?!

If no one objects, I'll apply the patch for the enhanced alias
support later today.

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at python.org  Tue Dec 12 14:57:01 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 12 Dec 2000 08:57:01 -0500
Subject: [Python-Dev] Codec aliasing and naming
In-Reply-To: Your message of "Tue, 12 Dec 2000 14:49:40 +0100."
             <3A362CF4.2082A606@lemburg.com> 
References: <3A362CF4.2082A606@lemburg.com> 
Message-ID: <200012121357.IAA06846@cj20424-a.reston1.va.home.com>

> Perhaps someone could add a note about this possibility
> to the codec docs ?!

You can check it in yourself or mail it to Fred or submit it to SF...
I don't expect anyone else will jump in and document this properly.

> If no one objects, I'll apply the patch for the enhanced alias
> support later today.

Fine with me (but I don't use codecs -- where's the Dutch language
support? :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Tue Dec 12 15:38:20 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Dec 2000 15:38:20 +0100
Subject: [Python-Dev] Codec aliasing and naming
References: <3A362CF4.2082A606@lemburg.com> <200012121357.IAA06846@cj20424-a.reston1.va.home.com>
Message-ID: <3A36385C.60C7F2B@lemburg.com>

Guido van Rossum wrote:
> 
> > Perhaps someone could add a note about this possibility
> > to the codec docs ?!
> 
> You can check it in yourself or mail it to Fred or submit it to SF...
> I don't expect anyone else will jump in and document this properly.

I'll submit a bug report so that this doesn't get lost in
the archives. Don't have time for it myself... alas, no one
really seems to have time these days ;-)
 
> > If no one objects, I'll apply the patch for the enhanced alias
> > support later today.
> 
> Fine with me (but I don't use codecs -- where's the Dutch language
> support? :-).

OK. 

About the Dutch language support: this would make a nice
Christmas fun-project... a new standard module which interfaces
to babel.altavista.com (hmm, they don't list Dutch as a supported
language yet, but maybe if we bug them enough... ;).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From paulp at ActiveState.com  Tue Dec 12 19:11:13 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 12 Dec 2000 10:11:13 -0800
Subject: [Python-Dev] Online help PEP
References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com>
Message-ID: <3A366A41.1A14EFD4@ActiveState.com>

Guido van Rossum wrote:
> 
>...
> >         help( "string" ) -- built-in topic or global
> 
> Why does a global require string quotes?

It doesn't, but if you happen to say help( "dir" ) instead of
help( dir ), I think it should do the right thing.

> I'm missing
> 
>           help() -- table of contents
> 
> I'm not sure if the table of contents should be printed by the repr
> output.

I don't see any benefit in having different behaviors for help and
help().

> >     If you ask for a global, it can be a fully-qualfied name such as
> >     help("xml.dom").
> 
> Why are the string quotes needed?  When are they useful?

When you haven't imported the thing you are asking about. Or when the
string comes from another UI like an editor window, command line or web
form.

> >     You can also use the facility from a command-line
> >
> >     python --help if
> 
> Is this really useful?  Sounds like Perlism to me.

I'm just trying to make it easy to quickly get answers to Python
questions. I could totally see someone writing code in VIM switching to
a bash window to type:

python --help os.path.dirname

That's a lot easier than:

$ python
Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import os
>>> help(os.path.dirname)

And what does it hurt?

> >     In either situation, the output does paging similar to the "more"
> >     command.
> 
> Agreed.  But how to implement paging in a platform-dependent manner?
> On Unix, os.system("more") or "$PAGER" is likely to work.  On Windows,
> I suppose we could use its MORE, although that's pretty braindead.  On
> the Mac?  Also, inside IDLE or Pythonwin, invoking the system pager
> isn't a good idea.

The current implementation does paging internally. You could override it
to use the system pager (or no pager).

> What does "demand-loaded" mean in a Python context?

When you "touch" the help object, it loads the onlinehelp module which
has the real implementation. The thing in __builtins__ is just a
lightweight proxy.
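The lightweight-proxy idea can be sketched as a tiny delegation class. 'onlinehelp' is the hypothetical heavy module here, so the demo stands in with a real stdlib module to show the mechanism:

```python
class LazyProxy:
    # Defers importing the named module until an attribute is first touched.
    def __init__(self, module_name):
        self._name = module_name
        self._module = None

    def __getattr__(self, attr):
        # Only called for attributes not found on the proxy itself.
        if self._module is None:
            self._module = __import__(self._name)  # first touch: import now
        return getattr(self._module, attr)

# In the PEP's scheme the builtin would be something like
# LazyProxy("onlinehelp"); 'json' below just demonstrates the mechanism.
json_proxy = LazyProxy("json")                 # nothing imported yet
assert json_proxy.dumps([1, 2]) == "[1, 2]"    # import happens here
```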

> >     It should also be possible to override the help display function by
> >     assigning to onlinehelp.displayhelp(object_or_string).
> 
> Good idea.  Pythonwin and IDLE could use this.  But I'd like it to
> work at least "okay" if they don't.

Agreed. 

> >     The module should be able to extract module information from either
> >     the HTML or LaTeX versions of the Python documentation. Links should
> >     be accommodated in a "lynx-like" manner.
> 
> I think this is beyond the scope.  

Well, we have to do one of:

 * re-write a subset of the docs in a form that can be accessed from the
command line
 * access the existing docs in a form that's installed
 * auto-convert the docs into a form that's compatible

I've already implemented HTML parsing and LaTeX parsing is actually not
that far off. I just need impetus to finish a LaTeX-parsing project I
started on my last vacation.

The reason that LaTeX is interesting is because it would be nice to be
able to move documentation from existing LaTeX files into docstrings.

> The LaTeX isn't installed anywhere
> (and processing would be too much work).  
> The HTML is installed only
> on Windows, where there already is a way to get it to pop up in your
> browser (actually two: it's in the Start menu, and also in IDLE's Help
> menu).

If the documentation becomes an integral part of the Python code, then
it will be installed. It's ridiculous that it isn't already.
ActivePython does install the docs on all platforms.

> A standard syntax for docstrings is under development, PEP 216.  I
> don't agree with the proposal there, but in any case the help PEP
> should not attempt to legalize a different format than PEP 216.

I won't hold my breath for a standard Python docstring format. I've gone
out of my way to make the code format-independent.

> Neat.  I noticed that in a 24-line screen, the pagesize must be set to
> 21 to avoid stuff scrolling off the screen.  Maybe there's an off-by-3
> error somewhere?

Yes.

> I also noticed that it always prints '1' when invoked as a function.
> The new license pager in site.py avoids this problem.

Okay.

> help("operators") and several others raise an
> AttributeError('handledocrl').

Fixed.

> The "lynx-line links" don't work.

I don't think that's implemented yet.

> I think it's naive to expect this help facility to replace browsing
> the website or the full documentation package.  There should be one
> entry that says to point your browser there (giving the local
> filesystem URL if available), and that's it.  The rest of the online
> help facility should be concerned with exposing doc strings.

I don't want to replace the documentation. But there is no reason we
should set out to make it incomplete. If it's integrated with the HTML,
then people can choose whatever access mechanism is easiest for them
right now.

I'm trying hard not to be "naive". Realistically, nobody is going to
write a million docstrings between now and Python 2.1. It is much more
feasible to leverage the existing documentation that Fred and others
have spent months on.

> > Security Issues
> > 
> >     This module will attempt to import modules with the same names as
> >     requested topics. Don't use the modules if you are not confident
> >     that everything in your pythonpath is from a trusted source.
> Yikes!  Another reason to avoid the "string" -> global variable
> option.

I don't think we should lose that option. People will want to look up
information from non-executable environments like command lines, GUIs
and web pages. Perhaps you can point me to techniques for extracting
information from Python modules and packages without executing them.

 Paul



From guido at python.org  Tue Dec 12 21:46:09 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 12 Dec 2000 15:46:09 -0500
Subject: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE
Message-ID: <200012122046.PAA16915@cj20424-a.reston1.va.home.com>

------- Forwarded Message

Date:    Tue, 12 Dec 2000 12:38:20 -0800
From:    noreply at sourceforge.net
To:      noreply at sourceforge.net
Subject: SourceForge: PROJECT DOWNTIME NOTICE

ATTENTION SOURCEFORGE PROJECT ADMINISTRATORS

This update is being sent to project administrators only and contains
important information regarding your project. Please read it in its
entirety.


INFRASTRUCTURE UPGRADE, EXPANSION AND RELOCATION

As noted in the sitewide email sent this week, the SourceForge.net
infrastructure is being upgraded (and relocated). As part of this
project, plans are underway to further increase capacity and
responsiveness.

We are scheduling the relocation of the systems serving project
subdomain web pages.


IMPORTANT:

This move will affect you in the following ways:

1. Service and availability of SourceForge.net and the development
tools provided will continue uninterrupted.

2. Project page webservers hosting subdomains
(yourprojectname.sourceforge.net) will be down Friday December 15 from
9PM PST (12AM EST) until 3AM PST.

3. CVS will be unavailable (read only part of the time) from 7PM
until 3AM PST

4. Mailing lists and mail aliases will be unavailable until 3AM PST


- ---------------------
This email was sent from sourceforge.net. To change your email receipt
preferences, please visit the site and edit your account via the
"Account Maintenance" link.

Direct any questions to admin at sourceforge.net, or reply to this email.

------- End of Forwarded Message




From greg at cosc.canterbury.ac.nz  Tue Dec 12 23:42:01 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 13 Dec 2000 11:42:01 +1300 (NZDT)
Subject: [Python-Dev] Online help PEP
In-Reply-To: <3A366A41.1A14EFD4@ActiveState.com>
Message-ID: <200012122242.LAA01902@s454.cosc.canterbury.ac.nz>

Paul Prescod:
> Guido:
> > Why are the string quotes needed?  When are they useful?
> When you haven't imported the thing you are asking about.

It would be interesting if the quoted form allowed you to
extract doc info from a module *without* having the side
effect of importing it.

This could no doubt be done for pure Python modules.
Would be rather tricky for extension modules, though,
I expect.
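For pure Python modules this is indeed doable by parsing rather than importing. The `ast` module used below postdates this thread (the `parser` module was the 2000-era tool), so this is a modern sketch of the idea:

```python
import ast

def docstring_without_import(source):
    # Compile only to a syntax tree; no code in the module is executed,
    # so there are no import side effects.
    return ast.get_docstring(ast.parse(source))

src = '"""A demo module."""\n\nimport os  # never actually imported\n'
assert docstring_without_import(src) == "A demo module."
```

Extension modules remain the hard case, as Greg notes: their doc info only exists after the shared object is loaded.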

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+




From barry at digicool.com  Wed Dec 13 03:21:36 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 12 Dec 2000 21:21:36 -0500
Subject: [Python-Dev] Two new PEPs, 232 & 233
Message-ID: <14902.56624.20961.768525@anthem.concentric.net>

I've just uploaded two new PEPs.  232 is a revision of my pre-PEP era
function attribute proposal.  233 is Paul Prescod's proposal for an
on-line help facility.

http://python.sourceforge.net/peps/pep-0232.html
http://python.sourceforge.net/peps/pep-0233.html

Let the games begin,
-Barry



From tim.one at home.com  Wed Dec 13 04:34:35 2000
From: tim.one at home.com (Tim Peters)
Date: Tue, 12 Dec 2000 22:34:35 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEGBIDAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEJHIDAA.tim.one@home.com>

[Moshe Zadka]
> If we had an affine operation, instead of a linear one, we could have
> [0, 2**n). I won't repeat the proof here but changing
>
> def f(i):
>     i <<= 1
>     i^=1 # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i
>
> Makes you waltz all over [0, 2**n) if the original made you complete
> (0, 2**n).

[Tim]
> But, Moshe!  The proof would have been the most interesting part <wink>.

Turns out the proof would have been intensely interesting, as you can see by
running the attached with and without the new line commented out.

don't-ever-trust-a-theoretician<wink>-ly y'rs  - tim


N = 2
MAGIC_CONSTANT_DEPENDING_ON_N = 7

def f(i):
    i <<= 1
    # i^=1 # This is the line I added
    if i >= 2**N:
        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

i = 1
for nothing in range(4):
    print i,
    i = f(i)
print i
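Spelling out both variants makes the failure Tim's script reveals concrete: with the extra xor, the orbit of 1 never reaches 2, so the affine map does not cover [0, 2**N) after all.

```python
N = 2
MAGIC_CONSTANT_DEPENDING_ON_N = 7

def orbit(start, step):
    # Follow i = step(i) until a value repeats; return the values seen.
    seen, i = [], start
    while i not in seen:
        seen.append(i)
        i = step(i)
    return seen

def f_original(i):                      # Tim's f
    i <<= 1
    if i >= 2 ** N:
        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

def f_affine(i):                        # Moshe's f, with the added line
    i <<= 1
    i ^= 1
    if i >= 2 ** N:
        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

assert sorted(orbit(1, f_original)) == [1, 2, 3]  # all of (0, 2**N)
assert sorted(orbit(1, f_affine)) == [0, 1, 3]    # 2 is never reached
```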




From amk at mira.erols.com  Wed Dec 13 04:55:33 2000
From: amk at mira.erols.com (A.M. Kuchling)
Date: Tue, 12 Dec 2000 22:55:33 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
Message-ID: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>

At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
received a patch from Thomas Gellekum that adds support for the panel
library that will add another 500 lines.  I'd like to split the C file
into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
get #included from the master _cursesmodule.c file.  

Do the powers that be approve of this idea?

--amk



From tim.one at home.com  Wed Dec 13 04:54:20 2000
From: tim.one at home.com (Tim Peters)
Date: Tue, 12 Dec 2000 22:54:20 -0500
Subject: FW: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE
Message-ID: <LNBBLJKPBEHFEDALKOLCKEJKIDAA.tim.one@home.com>

FYI, looks like SourceForge is scheduled to be unusable in a span covering
late Friday thru early Saturday (OTT -- One True Time, defined by the clocks
in Guido's house).

-----Original Message-----
From: python-dev-admin at python.org [mailto:python-dev-admin at python.org]On
Behalf Of Guido van Rossum
Sent: Tuesday, December 12, 2000 3:46 PM
To: python-dev at python.org
Subject: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE



------- Forwarded Message

Date:    Tue, 12 Dec 2000 12:38:20 -0800
From:    noreply at sourceforge.net
To:      noreply at sourceforge.net
Subject: SourceForge: PROJECT DOWNTIME NOTICE

ATTENTION SOURCEFORGE PROJECT ADMINISTRATORS

This update is being sent to project administrators only and contains
important information regarding your project. Please read it in its
entirety.


INFRASTRUCTURE UPGRADE, EXPANSION AND RELOCATION

As noted in the sitewide email sent this week, the SourceForge.net
infrastructure is being upgraded (and relocated). As part of this
project, plans are underway to further increase capacity and
responsiveness.

We are scheduling the relocation of the systems serving project
subdomain web pages.


IMPORTANT:

This move will affect you in the following ways:

1. Service and availability of SourceForge.net and the development
tools provided will continue uninterrupted.

2. Project page webservers hosting subdomains
(yourprojectname.sourceforge.net) will be down Friday December 15 from
9PM PST (12AM EST) until 3AM PST.

3. CVS will be unavailable (read only part of the time) from 7PM
until 3AM PST

4. Mailing lists and mail aliases will be unavailable until 3AM PST


---------------------
This email was sent from sourceforge.net. To change your email receipt
preferences, please visit the site and edit your account via the
"Account Maintenance" link.

Direct any questions to admin at sourceforge.net, or reply to this email.

------- End of Forwarded Message


_______________________________________________
Python-Dev mailing list
Python-Dev at python.org
http://www.python.org/mailman/listinfo/python-dev




From esr at thyrsus.com  Wed Dec 13 05:29:17 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 12 Dec 2000 23:29:17 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>; from amk@mira.erols.com on Tue, Dec 12, 2000 at 10:55:33PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
Message-ID: <20001212232917.A22839@thyrsus.com>

A.M. Kuchling <amk at mira.erols.com>:
> At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
> received a patch from Thomas Gellekum that adds support for the panel
> library that will add another 500 lines.  I'd like to split the C file
> into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
> get #included from the master _cursesmodule.c file.  
> 
> Do the powers that be approve of this idea?

I doubt I qualify as a power that be, but I'm certainly +1 on panel support.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The biggest hypocrites on gun control are those who live in upscale
developments with armed security guards -- and who want to keep other
people from having guns to defend themselves.  But what about
lower-income people living in high-crime, inner city neighborhoods?
Should such people be kept unarmed and helpless, so that limousine
liberals can 'make a statement' by adding to the thousands of gun laws
already on the books?
	--Thomas Sowell



From fdrake at acm.org  Wed Dec 13 07:24:01 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 01:24:01 -0500 (EST)
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
Message-ID: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>

A.M. Kuchling writes:
 > At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
 > received a patch from Thomas Gellekum that adds support for the panel
 > library that will add another 500 lines.  I'd like to split the C file
 > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
 > get #included from the master _cursesmodule.c file.  

  Would it be reasonable to add panel support as a second extension
module?  Is there really a need for them to be in the same module,
since the panel library is a separate library?


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From gstein at lyra.org  Wed Dec 13 08:58:38 2000
From: gstein at lyra.org (Greg Stein)
Date: Tue, 12 Dec 2000 23:58:38 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>; from amk@mira.erols.com on Tue, Dec 12, 2000 at 10:55:33PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
Message-ID: <20001212235838.T8951@lyra.org>

On Tue, Dec 12, 2000 at 10:55:33PM -0500, A.M. Kuchling wrote:
> At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
> received a patch from Thomas Gellekum that adds support for the panel
> library that will add another 500 lines.  I'd like to split the C file
> into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
> get #included from the master _cursesmodule.c file.  

Why should they be #included? I thought that we can build multiple .c files
into a module...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From gstein at lyra.org  Wed Dec 13 09:05:05 2000
From: gstein at lyra.org (Greg Stein)
Date: Wed, 13 Dec 2000 00:05:05 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects dictobject.c,2.68,2.69
In-Reply-To: <200012130102.RAA31828@slayer.i.sourceforge.net>; from tim_one@users.sourceforge.net on Tue, Dec 12, 2000 at 05:02:49PM -0800
References: <200012130102.RAA31828@slayer.i.sourceforge.net>
Message-ID: <20001213000505.U8951@lyra.org>

On Tue, Dec 12, 2000 at 05:02:49PM -0800, Tim Peters wrote:
> Update of /cvsroot/python/python/dist/src/Objects
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv31776/python/dist/src/objects
> 
> Modified Files:
> 	dictobject.c 
> Log Message:
> Bring comments up to date (e.g., they still said the table had to be
> a prime size, which is in fact never true anymore ...).
>...
> --- 55,78 ----
>   
>   /*
> ! There are three kinds of slots in the table:
> ! 
> ! 1. Unused.  me_key == me_value == NULL
> !    Does not hold an active (key, value) pair now and never did.  Unused can
> !    transition to Active upon key insertion.  This is the only case in which
> !    me_key is NULL, and is each slot's initial state.
> ! 
> ! 2. Active.  me_key != NULL and me_key != dummy and me_value != NULL
> !    Holds an active (key, value) pair.  Active can transition to Dummy upon
> !    key deletion.  This is the only case in which me_value != NULL.
> ! 
> ! 3. Dummy.  me_key == dummy && me_value == NULL
> !    Previously held an active (key, value) pair, but that was deleted and an
> !    active pair has not yet overwritten the slot.  Dummy can transition to
> !    Active upon key insertion.  Dummy slots cannot be made Unused again
> !    (cannot have me_key set to NULL), else the probe sequence in case of
> !    collision would have no way to know they were once active.

4. The popitem finger.


:-)
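The three slot states in the comment can be simulated in pure Python. The toy open-addressing table below is only a sketch of the probing logic (linear probing, no resizing), not CPython's actual dictobject.c, but it shows why a deleted slot must become Dummy instead of reverting to Unused:

```python
# Toy open-addressing table illustrating Unused/Active/Dummy slots.
# Hypothetical sketch -- CPython's real probe sequence is smarter.

_UNUSED = None      # never held a (key, value) pair
_DUMMY = object()   # held a pair once; keeps probe chains intact

class ToyDict:
    def __init__(self, size=8):
        self.keys = [_UNUSED] * size
        self.values = [None] * size

    def _probe(self, key):
        i = hash(key) % len(self.keys)
        freeslot = None
        while True:
            k = self.keys[i]
            if k is _UNUSED:
                # End of the probe chain: reuse an earlier dummy if seen.
                return freeslot if freeslot is not None else i
            if k is _DUMMY:
                if freeslot is None:
                    freeslot = i   # reusable, but keep probing for the key
            elif k == key:
                return i
            i = (i + 1) % len(self.keys)

    def __setitem__(self, key, value):
        i = self._probe(key)
        self.keys[i] = key
        self.values[i] = value

    def __getitem__(self, key):
        i = self._probe(key)
        if self.keys[i] is _UNUSED or self.keys[i] is _DUMMY:
            raise KeyError(key)
        return self.values[i]

    def __delitem__(self, key):
        i = self._probe(key)
        if self.keys[i] is _UNUSED or self.keys[i] is _DUMMY:
            raise KeyError(key)
        # Active -> Dummy: setting the slot back to _UNUSED would cut
        # the probe chain for keys that collided past this slot.
        self.keys[i] = _DUMMY
        self.values[i] = None
```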

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From moshez at zadka.site.co.il  Wed Dec 13 20:19:53 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Wed, 13 Dec 2000 21:19:53 +0200 (IST)
Subject: [Python-Dev] Splitting up _cursesmodule
Message-ID: <20001213191953.7208DA82E@darjeeling.zadka.site.co.il>

On Tue, 12 Dec 2000 23:29:17 -0500, "Eric S. Raymond" <esr at thyrsus.com> wrote:
> A.M. Kuchling <amk at mira.erols.com>:
> > At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
> > received a patch from Thomas Gellekum that adds support for the panel
> > library that will add another 500 lines.  I'd like to split the C file
> > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
> > get #included from the master _cursesmodule.c file.  
> > 
> > Do the powers that be approve of this idea?
> 
> I doubt I qualify as a power that be, but I'm certainly +1 on panel support.
 
I'm +1 on panel support, but that seems the wrong solution. Why not
have several C modules (_curses_panel, ...) and manage a more unified
namespace with the Python wrapper modules?

/curses/panel.py -- from _curses_panel import *
etc.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From akuchlin at mems-exchange.org  Wed Dec 13 13:44:23 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 13 Dec 2000 07:44:23 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 01:24:01AM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
Message-ID: <20001213074423.A30348@kronos.cnri.reston.va.us>

[CC'ing Thomas Gellekum <tg at melaten.rwth-aachen.de>]

On Wed, Dec 13, 2000 at 01:24:01AM -0500, Fred L. Drake, Jr. wrote:
>  Would it be reasonable to add panel support as a second extension
>module?  Is there really a need for them to be in the same module,
>since the panel library is a separate library?

Quite possibly, though the patch isn't structured that way.  The panel
module will need access to the type object for the curses window
object, so it'll have to ensure that _curses is already imported, but
that's no problem.

Thomas, do you feel capable of implementing it as a separate module,
or should I work on it?  Probably a _cursesmodule.h header will have
to be created to make various definitions available to external users
of the basic objects in _curses.  (Bonus: this means that the menu and
form libraries, if they ever get wrapped, can be separate modules, too.)

--amk




From tg at melaten.rwth-aachen.de  Wed Dec 13 15:00:46 2000
From: tg at melaten.rwth-aachen.de (Thomas Gellekum)
Date: 13 Dec 2000 15:00:46 +0100
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: Andrew Kuchling's message of "Wed, 13 Dec 2000 07:44:23 -0500"
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
	<14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
	<20001213074423.A30348@kronos.cnri.reston.va.us>
Message-ID: <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>

Andrew Kuchling <akuchlin at mems-exchange.org> writes:

> [CC'ing Thomas Gellekum <tg at melaten.rwth-aachen.de>]
> 
> On Wed, Dec 13, 2000 at 01:24:01AM -0500, Fred L. Drake, Jr. wrote:
> >  Would it be reasonable to add panel support as a second extension
> >module?  Is there really a need for them to be in the same module,
> >since the panel library is a separate library?
> 
> Quite possibly, though the patch isn't structured that way.  The panel
> module will need access to the type object for the curses window
> object, so it'll have to ensure that _curses is already imported, but
> that's no problem.

You mean as separate modules like

import curses
import panel

? Hm. A panel object is associated with a window object, so it's
created from a window method. This means you'd need to add
window.new_panel() to PyCursesWindow_Methods[] and
curses.update_panels(), curses.panel_above() and curses.panel_below()
(or whatever they're called after we're through discussing this ;-))
to PyCurses_Methods[].

Also, the curses.panel_{above,below}() wrappers need access to the
list_of_panels via find_po().

> Thomas, do you feel capable of implementing it as a separate module,
> or should I work on it?

It's probably finished a lot sooner when you do it; OTOH, it would be
fun to try it. Let's carry this discussion a bit further.

>  Probably a _cursesmodule.h header will have
> to be created to make various definitions available to external
> users of the basic objects in _curses.

That's easy. The problem is that we want to extend those basic objects
in _curses.

>  (Bonus: this means that the
> menu and form libraries, if they ever get wrapped, can be separate
> modules, too.)

Sure, if we solve this for panel, the others are a SMOP. :-)

tg



From guido at python.org  Wed Dec 13 15:31:52 2000
From: guido at python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 09:31:52 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src README,1.106,1.107
In-Reply-To: Your message of "Wed, 13 Dec 2000 06:14:35 PST."
             <200012131414.GAA20849@slayer.i.sourceforge.net> 
References: <200012131414.GAA20849@slayer.i.sourceforge.net> 
Message-ID: <200012131431.JAA21243@cj20424-a.reston1.va.home.com>

> + --with-cxx=<compiler>: Some C++ compilers require that main() is
> +         compiled with the C++ compiler if there is any C++ code in the application.
> +         Specifically, g++ on a.out systems may require that to support
> +         construction of global objects. With this option, the main() function
> +         of Python will be compiled with <compiler>; use that only if you
> +         plan to use C++ extension modules, and if your compiler requires
> +         compilation of main() as a C++ program.

Thanks for documenting this; see my continued reservation in the
(reopened) bug report.

Another question remains regarding the docs though: why is it bad to
always compile main.c with a C++ compiler?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Wed Dec 13 16:19:01 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 10:19:01 -0500 (EST)
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
	<14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
	<20001213074423.A30348@kronos.cnri.reston.va.us>
	<kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>
Message-ID: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>

Thomas Gellekum writes:
 > You mean as separate modules like
 > 
 > import curses
 > import panel

  Or better yet:

	import curses
	import curses.panel

 > ? Hm. A panel object is associated with a window object, so it's
 > created from a window method. This means you'd need to add
 > window.new_panel() to PyCursesWindow_Methods[] and
 > curses.update_panels(), curses.panel_above() and curses.panel_below()
 > (or whatever they're called after we're through discussing this ;-))
 > to PyCurses_Methods[].

  Do these new functions have to be methods on the window objects, or
can they be functions in the new module that take a window as a
parameter?  The underlying window object can certainly provide slots
for the use of the panel (forms, ..., etc.) bindings, and simply
initialize them to NULL (or whatever) for newly created windows.

 > Also, the curses.panel_{above,below}() wrappers need access to the
 > list_of_panels via find_po().

  There's no reason that underlying utilities can't be provided by
_curses using a CObject.  The Extending & Embedding manual has a
section on using CObjects to provide a C API to a module without
having to link to it directly.

 > That's easy. The problem is that we want to extend those basic objects
 > in _curses.

  Again, I'm curious about the necessity of this.  I suspect it can be
avoided.  I think the approach I've hinted at above will allow you to
avoid this, and will allow the panel (forms, ...) support to be added
simply by adding additional modules as they are written and the
underlying libraries are installed on the host.
  I know the question of including these modules in the core
distribution has come up before, but the resurgence in interest in
these makes me want to bring it up again:  Does the curses package
(and the associated C extension(s)) belong in the standard library, or
does it make sense to spin out a distutils-based package?  I've no
objection to them being in the core, but it seems that the release
cycle may want to diverge from Python's.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From guido at python.org  Wed Dec 13 16:48:50 2000
From: guido at python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 10:48:50 -0500
Subject: [Python-Dev] Online help PEP
In-Reply-To: Your message of "Tue, 12 Dec 2000 10:11:13 PST."
             <3A366A41.1A14EFD4@ActiveState.com> 
References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com>  
            <3A366A41.1A14EFD4@ActiveState.com> 
Message-ID: <200012131548.KAA21344@cj20424-a.reston1.va.home.com>

[Paul's PEP]
> > >         help( "string" ) -- built-in topic or global

[me]
> > Why does a global require string quotes?

[Paul]
> It doesn't, but if you happen to say 
> 
> help( "dir" ) instead of help( dir ), I think it should do the right
> thing.

Fair enough.

> > I'm missing
> > 
> >           help() -- table of contents
> > 
> > I'm not sure if the table of contents should be printed by the repr
> > output.
> 
> I don't see any benefit in having different behaviors for help and
> help().

Having the repr() overloading invoke the pager is dangerous.  The beta
version of the license command did this, and it caused some strange
side effects, e.g. vars(__builtins__) would start reading from input
and confuse the users.  The new version's repr() returns the desired
string if it's less than a page, and 'Type license() to see the full
license text' if the pager would need to be invoked.
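The safe pattern described here -- repr() stays passive, and only an explicit call may page -- can be sketched like this (hypothetical names; the real site.py license object differs in detail):

```python
class _Helper:
    """Interactive-prompt helper whose repr() never blocks.

    repr() returns a short hint only; an explicit call invokes the
    (potentially paging) display.  This avoids the trap where e.g.
    vars(__builtins__) triggers the pager as a side effect.
    """

    def __repr__(self):
        return "Type help() for interactive help."

    def __call__(self, topic=None):
        # A real implementation would page the text; the sketch prints.
        print("help for %r" % (topic,))

help_obj = _Helper()
```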

> > >     If you ask for a global, it can be a fully-qualfied name such as
> > >     help("xml.dom").
> > 
> > Why are the string quotes needed?  When are they useful?
> 
> When you haven't imported the thing you are asking about. Or when the
> string comes from another UI like an editor window, command line or web
> form.

The implied import is a major liability.  If you can do this without
importing (e.g. by source code inspection), fine.  Otherwise, you
might issue some kind of message like "you must first import XXX.YYY".

> > >     You can also use the facility from a command-line
> > >
> > >     python --help if
> > 
> > Is this really useful?  Sounds like Perlism to me.
> 
> I'm just trying to make it easy to quickly get answers to Python
> questions. I could totally see someone writing code in VIM switching to
> a bash window to type:
> 
> python --help os.path.dirname
> 
> That's a lot easier than:
> 
> $ python
> Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
> Type "copyright", "credits" or "license" for more information.
> >>> import os
> >>> help(os.path.dirname)
> 
> And what does it hurt?

The hurt is code bloat in the interpreter and creeping featurism.  If
you need command line access to the docs (which may be a reasonable
thing to ask for, although to me it sounds backwards :-), it's better
to provide a separate command, e.g. pythondoc.  (Analogous to perldoc.)
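A separate command of the kind suggested here only needs to resolve a dotted name and print the docstring. A minimal sketch (the name pythondoc is the suggestion above; this code is not any shipped implementation):

```python
import importlib

def doc_for(dotted):
    """Resolve a dotted name like 'os.path.dirname' and return its
    docstring.  Sketch of a hypothetical pythondoc command."""
    parts = dotted.split(".")
    obj = None
    # Import the longest importable prefix, then walk attributes.
    for i in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:i]))
        except ImportError:
            continue
        for attr in parts[i:]:
            obj = getattr(obj, attr)
        break
    if obj is None:
        raise ImportError("no module found for %r" % dotted)
    return obj.__doc__ or "(no docstring)"

# e.g. a command-line wrapper would do: print(doc_for(sys.argv[1]))
```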

> > >     In either situation, the output does paging similar to the "more"
> > >     command.
> > 
> > Agreed.  But how to implement paging in a platform-dependent manner?
> > On Unix, os.system("more") or "$PAGER" is likely to work.  On Windows,
> > I suppose we could use its MORE, although that's pretty braindead.  On
> > the Mac?  Also, inside IDLE or Pythonwin, invoking the system pager
> > isn't a good idea.
> 
> The current implementation does paging internally. You could override it
> to use the system pager (or no pager).

Yes.  Please add that option to the PEP.

> > What does "demand-loaded" mean in a Python context?
> 
> When you "touch" the help object, it loads the onlinehelp module which
> has the real implementation. The thing in __builtins__ is just a
> lightweight proxy.

Please suggest an implementation.

> > >     It should also be possible to override the help display function by
> > >     assigning to onlinehelp.displayhelp(object_or_string).
> > 
> > Good idea.  Pythonwin and IDLE could use this.  But I'd like it to
> > work at least "okay" if they don't.
> 
> Agreed. 

Glad you're so agreeable. :)

> > >     The module should be able to extract module information from either
> > >     the HTML or LaTeX versions of the Python documentation. Links should
> > >     be accommodated in a "lynx-like" manner.
> > 
> > I think this is beyond the scope.  
> 
> Well, we have to do one of:
> 
>  * re-write a subset of the docs in a form that can be accessed from the
> command line
>  * access the existing docs in a form that's installed
>  * auto-convert the docs into a form that's compatible

I really don't think that this tool should attempt to do everything.

If someone *really* wants to browse the existing (large) doc set in a
terminal emulation window, let them use lynx and point it to the
documentation set.  (I agree that the HTML docs should be installed,
by the way.)

> I've already implemented HTML parsing and LaTeX parsing is actually not
> that far off. I just need impetus to finish a LaTeX-parsing project I
> started on my last vacation.

A LaTeX parser would be most welcome -- if it could replace
latex2html!  That old Perl program is really ready for retirement.
(Ask Fred.)

> The reason that LaTeX is interesting is because it would be nice to be
> able to move documentation from existing LaTeX files into docstrings.

That's what some people think.  I disagree that it would be either
feasible or a good idea to put all documentation for a typical module
in its doc strings.

> > The LaTeX isn't installed anywhere
> > (and processing would be too much work).  
> > The HTML is installed only
> > on Windows, where there already is a way to get it to pop up in your
> > browser (actually two: it's in the Start menu, and also in IDLE's Help
> > menu).
> 
> If the documentation becomes an integral part of the Python code, then
> it will be installed. It's ridiculous that it isn't already.

Why is that ridiculous?  It's just as easy to access them through the
web for most people.  If it's not, they are available in easily
downloadable tarballs supporting a variety of formats.  That's just
too much to be included in the standard RPMs.  (Also, latex2html
requires so much hand-holding, and is so slow, that it's really not a
good idea to let "make install" install the HTML by default.)

> ActivePython does install the docs on all platforms.

Great.  More power to you.

> > A standard syntax for docstrings is under development, PEP 216.  I
> > don't agree with the proposal there, but in any case the help PEP
> > should not attempt to legalize a different format than PEP 216.
> 
> I won't hold my breath for a standard Python docstring format. I've gone
> out of my way to make the code format independent..

To tell you the truth, I'm not holding my breath either. :-)  So your
code should just dump the doc string on stdout without interpreting it
in any way (except for paging).

> > Neat.  I noticed that in a 24-line screen, the pagesize must be set to
> > 21 to avoid stuff scrolling off the screen.  Maybe there's an off-by-3
> > error somewhere?
> 
> Yes.

It's buggier than just that.  The output of the pager prints an extra
"| " at the start of each page except for the first, and the first
page is a line longer than subsequent pages.

BTW, another bug: try help(cgi).  It's nice that it gives the default
value for arguments, but the defaults for FieldStorage.__init__ happen
to include os.environ.  Its entire value is dumped -- which causes the
pager to be off (it wraps over about 20 lines for me).  I think you
may have to truncate long values a bit, e.g. by using the repr module.
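The repr module mentioned here (called reprlib in today's Python) does exactly this kind of truncation; a sketch of abbreviating a huge default value before display:

```python
import reprlib   # Python 2.x called this module "repr"

def short_repr(value, maxstring=30, maxdict=4):
    """Abbreviate huge values (like os.environ used as a default
    argument) before showing them in help output.  A sketch of the
    truncation suggested above."""
    r = reprlib.Repr()
    r.maxstring = maxstring   # cap each string, with a "..." ellipsis
    r.maxdict = maxdict       # show at most this many dict items
    return r.repr(value)
```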

> > I also noticed that it always prints '1' when invoked as a function.
> > The new license pager in site.py avoids this problem.
> 
> Okay.

Where's the check-in? :-)

> > help("operators") and several others raise an
> > AttributeError('handledocrl').
> 
> Fixed.
> 
> > The "lynx-line links" don't work.
> 
> I don't think that's implemented yet.

I'm not sure what you intended to implement there.  I prefer to see
the raw URLs, then I can do whatever I normally do to paste them into
my preferred web browser (which is *not* lynx :-).

> > I think it's naive to expect this help facility to replace browsing
> > the website or the full documentation package.  There should be one
> > entry that says to point your browser there (giving the local
> > filesystem URL if available), and that's it.  The rest of the online
> > help facility should be concerned with exposing doc strings.
> 
> I don't want to replace the documentation. But there is no reason we
> should set out to make it incomplete. If its integrated with the HTML
> then people can choose whatever access mechanism is easiest for them
> right now
> 
> I'm trying hard not to be "naive". Realistically, nobody is going to
> write a million docstrings between now and Python 2.1. It is much more
> feasible to leverage the existing documentation that Fred and others
> have spent months on.

I said above, and I'll say it again: I think the majority of people
would prefer to use their standard web browser to read the standard
docs.  It's not worth the effort to try to make those accessible
through help().  In fact, I'd encourage the development of a
command-line-invoked help facility that shows doc strings in the
user's preferred web browser -- the webbrowser module makes this
trivial.
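The webbrowser route really is a few lines. A sketch (the URL layout for the standard docs is an assumption here, and the opener hook is only for illustration):

```python
import webbrowser

def browse_docs(module_name, opener=webbrowser.open,
                base="https://docs.python.org/3/library/"):
    """Open a standard-library module's docs in the user's preferred
    browser.  The base URL scheme is assumed, not taken from the PEP."""
    url = "%s%s.html" % (base, module_name)
    opener(url)
    return url
```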

> > > Security Issues
> > > 
> > >     This module will attempt to import modules with the same names as
> > >     requested topics. Don't use the modules if you are not confident
> > >     that everything in your pythonpath is from a trusted source.
> > Yikes!  Another reason to avoid the "string" -> global variable
> > option.
> 
> I don't think we should lose that option. People will want to look up
> information from non-executable environments like command lines, GUIs
> and web pages. Perhaps you can point me to techniques for extracting
> information from Python modules and packages without executing them.

I don't know specific tools, but any serious docstring processing tool
ends up parsing the source code for this very reason, so there's
probably plenty of prior art.
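In today's Python this source-parsing approach is straightforward with the compiler's own parser; a sketch using the ast module (which postdates this thread -- tools of the era used the parser module to the same end):

```python
import ast

def docstrings_from_source(source):
    """Collect docstrings by parsing the source, never executing it."""
    tree = ast.parse(source)
    docs = {"<module>": ast.get_docstring(tree)}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                             ast.ClassDef)):
            docs[node.name] = ast.get_docstring(node)
    return docs
```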

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Wed Dec 13 17:07:22 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 11:07:22 -0500 (EST)
Subject: [Python-Dev] Online help PEP
In-Reply-To: <200012131548.KAA21344@cj20424-a.reston1.va.home.com>
References: <3A3480E5.C2577AE6@ActiveState.com>
	<200012111557.KAA24266@cj20424-a.reston1.va.home.com>
	<3A366A41.1A14EFD4@ActiveState.com>
	<200012131548.KAA21344@cj20424-a.reston1.va.home.com>
Message-ID: <14903.40634.569192.704368@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > A LaTeX parser would be most welcome -- if it could replace
 > latex2html!  That old Perl program is really ready for retirement.
 > (Ask Fred.)

  Note that Doc/tools/sgmlconv/latex2esis.py already includes a
moderate start at a LaTeX parser.  Paragraph marking is done as a
separate step in Doc/tools/sgmlconv/docfixer.py, but I'd like to push
that down into the LaTeX handler.
  (Note that these tools are mostly broken at the moment, except for
latex2esis.py, which does most of what I need other than paragraph
marking.)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From Barrett at stsci.edu  Wed Dec 13 17:34:40 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Wed, 13 Dec 2000 11:34:40 -0500 (EST)
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
In-Reply-To: <20001210054646.A5219@glacier.fnational.com>
References: <20001210054646.A5219@glacier.fnational.com>
Message-ID: <14903.41669.883591.420446@nem-srvr.stsci.edu>

Neil Schemenauer writes:
 > Sourceforge uploads are not working.  The latest version of the
 > patch for PEP 208 is here:
 > 
 >     http://arctrix.com/nas/python/coerce-6.0.diff
 > 
 > Operations on instances now call __coerce__ if it exists.  I
 > think the patch is now complete.  Converting other builtin types
 > to "new style numbers" can be done with a separate patch.

My one concern about this patch is whether the non-commutativity of
operators is preserved.  This issue being important for matrix
operations (not to be confused with element-wise array operations).

 -- Paul





From guido at python.org  Wed Dec 13 17:45:12 2000
From: guido at python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 11:45:12 -0500
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
In-Reply-To: Your message of "Wed, 13 Dec 2000 11:34:40 EST."
             <14903.41669.883591.420446@nem-srvr.stsci.edu> 
References: <20001210054646.A5219@glacier.fnational.com>  
            <14903.41669.883591.420446@nem-srvr.stsci.edu> 
Message-ID: <200012131645.LAA21719@cj20424-a.reston1.va.home.com>

> Neil Schemenauer writes:
>  > Sourceforge uploads are not working.  The latest version of the
>  > patch for PEP 208 is here:
>  > 
>  >     http://arctrix.com/nas/python/coerce-6.0.diff
>  > 
>  > Operations on instances now call __coerce__ if it exists.  I
>  > think the patch is now complete.  Converting other builtin types
>  > to "new style numbers" can be done with a separate patch.
> 
> My one concern about this patch is whether the non-commutativity of
> operators is preserved.  This issue being important for matrix
> operations (not to be confused with element-wise array operations).

Yes, this is preserved.  (I'm spending most of my waking hours
understanding this patch -- it is a true piece of wizardry.)
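What "preserved" means in practice: the dispatch always tells the type which side of the operator its instance was on. In today's terms the same guarantee shows up as the __mul__/__rmul__ pair; a toy non-commutative type (a sketch, unrelated to the actual patch):

```python
class Mat:
    """Toy non-commutative operand that records operand order."""

    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        # self is the LEFT operand.
        return "%s*%s" % (self.name, getattr(other, "name", other))

    def __rmul__(self, other):
        # Called when the left operand doesn't handle the product;
        # self is the RIGHT operand, so the order is still known.
        return "%s*%s" % (getattr(other, "name", other), self.name)
```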

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Wed Dec 13 18:38:00 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 13 Dec 2000 18:38:00 +0100
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
References: <20001210054646.A5219@glacier.fnational.com>  
	            <14903.41669.883591.420446@nem-srvr.stsci.edu> <200012131645.LAA21719@cj20424-a.reston1.va.home.com>
Message-ID: <3A37B3F7.5640FAFC@lemburg.com>

Guido van Rossum wrote:
> 
> > Neil Schemenauer writes:
> >  > Sourceforge uploads are not working.  The latest version of the
> >  > patch for PEP 208 is here:
> >  >
> >  >     http://arctrix.com/nas/python/coerce-6.0.diff
> >  >
> >  > Operations on instances now call __coerce__ if it exists.  I
> >  > think the patch is now complete.  Converting other builtin types
> >  > to "new style numbers" can be done with a separate patch.
> >
> > My one concern about this patch is whether the non-commutativity of
> > operators is preserved.  This issue being important for matrix
> > operations (not to be confused with element-wise array operations).
> 
> Yes, this is preserved.  (I'm spending most of my waking hours
> understanding this patch -- it is a true piece of wizardry.)

The fact that coercion didn't allow detection of parameter
order was the initial cause for my try at fixing it back then.
I was confronted with the fact that at C level there was no way
to tell whether the operands were in the order left, right or
right, left -- as a result I used a gross hack in mxDateTime
to still make this work...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Wed Dec 13 22:01:46 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Dec 2000 16:01:46 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 10:19:01AM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
Message-ID: <20001213160146.A24753@thyrsus.com>

Fred L. Drake, Jr. <fdrake at acm.org>:
>   I know the question of including these modules in the core
> distribution has come up before, but the resurgence in interest in
> these makes me want to bring it up again:  Does the curses package
> (and the associated C extension(s)) belong in the standard library, or
> does it make sense to spin out a distutils-based package?  I've no
> objection to them being in the core, but it seems that the release
> cycle may want to diverge from Python's.

Curses needs to be in the core for political reasons.  Specifically, 
to support CML2 without requiring any extra packages or downloads
beyond the stock Python interpreter.

And what makes CML2 so constrained and so important?  It's my bid to
replace the Linux kernel's configuration machinery.  It has many
advantages over the existing config system, but the linux developers
are *very* resistant to adding things to the kernel's minimum build
kit.  Python alone may prove too much for them to swallow (though
there are hopeful signs they will); Python plus a separately
downloadable curses module would definitely be too much.

Guido attaches sufficient importance to getting Python into the kernel
build machinery that he approved adding ncurses to the standard modules
on that basis.  This would be a huge design win for us, raising Python's
visibility considerably.

So curses must stay in the core.  I don't have a requirement for
panels; my present curses front end simulates them. But if panels were
integrated into the core I could simplify the front-end code
significantly.  Every line I can remove from my stuff (even if it, in
effect, is just migrating into the Python core) makes it easier to
sell CML2 into the kernel.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Experience should teach us to be most on our guard to protect liberty
when the government's purposes are beneficient...  The greatest dangers
to liberty lurk in insidious encroachment by men of zeal, well meaning
but without understanding."
	-- Supreme Court Justice Louis Brandeis



From jheintz at isogen.com  Wed Dec 13 22:10:32 2000
From: jheintz at isogen.com (John D. Heintz)
Date: Wed, 13 Dec 2000 15:10:32 -0600
Subject: [Python-Dev] Announcing ZODB-Corba code release
Message-ID: <3A37E5C8.7000800@isogen.com>

Here is the first release of code that exposes a ZODB database through 
CORBA (omniORB).

The code is functioning, the docs are sparse, and it should work on your 
machines.  ;-)

I am only going to be in town for the next two days, then I will be 
unavailable until Jan 1.

See http://www.zope.org/Members/jheintz/ZODB_CORBA_Connection to 
download the code.

It's not perfect, but it works for me.

Enjoy,
John


-- 
. . . . . . . . . . . . . . . . . . . . . . . .

John D. Heintz | Senior Engineer

1016 La Posada Dr. | Suite 240 | Austin TX 78752
T 512.633.1198 | jheintz at isogen.com

w w w . d a t a c h a n n e l . c o m




From guido at python.org  Wed Dec 13 22:19:01 2000
From: guido at python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 16:19:01 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: Your message of "Wed, 13 Dec 2000 16:01:46 EST."
             <20001213160146.A24753@thyrsus.com> 
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>  
            <20001213160146.A24753@thyrsus.com> 
Message-ID: <200012132119.QAA11060@cj20424-a.reston1.va.home.com>

> So curses must stay in the core.  I don't have a requirement for
> panels; my present curses front end simulates them. But if panels were
> integrated into the core I could simplify the front-end code
> significantly.  Every line I can remove from my stuff (even if it, in
> effect, is just migrating into the Python core) makes it easier to
> sell CML2 into the kernel.

On the other hand you may want to be conservative.  You already have
to require Python 2.0 (I presume).  The panel stuff will be available
in 2.1 at the earliest.  You probably shouldn't throw out your panel
emulation until your code has already been accepted...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin at loewis.home.cs.tu-berlin.de  Wed Dec 13 22:56:27 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 13 Dec 2000 22:56:27 +0100
Subject: [Python-Dev] CVS: python/dist/src README,1.106,1.107
Message-ID: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de>

> Another question remains regarding the docs though: why is it bad to
> always compile main.c with a C++ compiler?

For the whole thing to work, it may also be necessary to link the
entire application with a C++ compiler; that in turn may bind to the
C++ library. Linking with the system's C++ library means that the
Python executable cannot be as easily exchanged between installations
of the operating system - you'd also need to have the right version of
the C++ library to run it. If the C++ library is static, that may also
increase the size of the executable.

I can't really point to a specific problem that would occur on a
specific system I use if main() was compiled with a C++
compiler. However, on the systems I use (Windows, Solaris, Linux), you
can build C++ extension modules even if Python was not compiled as a
C++ application.

On Solaris and Windows, you'd also have to choose the C++ compiler you
want to use (MSVC++, SunPro CC, or g++); in turn, different C++
runtime systems would be linked into the application.

Regards,
Martin



From esr at thyrsus.com  Wed Dec 13 23:03:59 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Dec 2000 17:03:59 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012132119.QAA11060@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 04:19:01PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com>
Message-ID: <20001213170359.A24915@thyrsus.com>

Guido van Rossum <guido at python.org>:
> > So curses must stay in the core.  I don't have a requirement for
> > panels; my present curses front end simulates them. But if panels were
> > integrated into the core I could simplify the front-end code
> > significantly.  Every line I can remove from my stuff (even if it, in
> > effect, is just migrating into the Python core) makes it easier to
> > sell CML2 into the kernel.
> 
> On the other hand you may want to be conservative.  You already have
> to require Python 2.0 (I presume).  The panel stuff will be available
> in 2.1 at the earliest.  You probably shouldn't throw out your panel
> emulation until your code has already been accepted...

Yes, that's how I am currently expecting it to play out -- but if the 2.4.0
kernel is delayed another six months, I'd change my mind.  I'll explain this,
because python-dev people should grok what the surrounding politics and timing 
are.

I actually debated staying with 1.5.2 as a base version.  What changed
my mind was two things.  One: by going to 2.0 I could drop close to 600
lines and three entire support modules from CML2, slimming down its 
footprint in the kernel tree significantly (by more than 10% of the 
entire code volume, actually).

Second: CML2 is not going to be seriously evaluated until 2.4.0 final is out.
Linus made this clear when I demoed it for him at LWE.  My best guess about 
when that will happen is late January into February.  By the time Red Hat
issues its next distro after that (probably May or thereabouts) it's a safe
bet 2.0 will be on it, and everywhere else.

But if the 2.4.0 kernel slips another six months yet again, and our
2.1 comes out relatively quickly (like, just before the 9th Python
Conference :-)) then we *might* have time to get 2.1 into the distros
before CML2 gets the imprimatur.

So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel
will be delayed yet again :-).
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Ideology, politics and journalism, which luxuriate in failure, are
impotent in the face of hope and joy.
	-- P. J. O'Rourke



From nas at arctrix.com  Wed Dec 13 16:37:45 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 07:37:45 -0800
Subject: [Python-Dev] CVS: python/dist/src README,1.106,1.107
In-Reply-To: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Wed, Dec 13, 2000 at 10:56:27PM +0100
References: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de>
Message-ID: <20001213073745.C17148@glacier.fnational.com>

These are issues to consider for Python 3000 as well.  AFAIK, C++
ABIs are a nightmare.

  Neil



From fdrake at acm.org  Wed Dec 13 23:29:25 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 17:29:25 -0500 (EST)
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001213170359.A24915@thyrsus.com>
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
	<14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
	<20001213074423.A30348@kronos.cnri.reston.va.us>
	<kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>
	<14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
	<20001213160146.A24753@thyrsus.com>
	<200012132119.QAA11060@cj20424-a.reston1.va.home.com>
	<20001213170359.A24915@thyrsus.com>
Message-ID: <14903.63557.282592.796169@cj42289-a.reston1.va.home.com>

Eric S. Raymond writes:
 > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel
 > will be delayed yet again :-).

  Politics aside, I think development of curses-related extensions
like panels and forms doesn't need to be delayed.  I've posted what I
think are relevant technical comments already, and leave it up to the
developers of any new modules to get them written -- I don't know
enough curses to offer any help there.
  Regardless of how the curses package is distributed and deployed, I
don't see any reason to delay development in its existing location in
the Python CVS repository.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From nas at arctrix.com  Wed Dec 13 16:41:54 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 07:41:54 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001213170359.A24915@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 13, 2000 at 05:03:59PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com>
Message-ID: <20001213074154.D17148@glacier.fnational.com>

On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote:
> CML2 is not going to be seriously evaluated until 2.4.0 final
> is out.  Linus made this clear when I demoed it for him at LWE.
> My best guess about when that will happen is late January into
> February.  By the time Red Hat issues its next distro after
> that (probably May or thereabouts) it's a safe bet 2.0 will be
> on it, and everywhere else.

I don't think that is a very safe bet.  Python 2.0 missed the
Debian Potato boat.  I have no idea when Woody is expected to be
released but I expect it may take longer than that if history is
any indication.

  Neil



From guido at python.org  Thu Dec 14 00:03:31 2000
From: guido at python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 18:03:31 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: Your message of "Wed, 13 Dec 2000 07:41:54 PST."
             <20001213074154.D17148@glacier.fnational.com> 
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com>  
            <20001213074154.D17148@glacier.fnational.com> 
Message-ID: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>

> I don't think that is a very safe bet.  Python 2.0 missed the
> Debian Potato boat.

This may have had to do more with the unresolved GPL issues.  I
recently received a mail from Stallman indicating that an agreement
with CNRI has been reached; they have agreed (in principle, at least)
to specific changes to the CNRI license that will defuse the
choice-of-law clause when it is combined with GPL-licensed code "in a
non-separable way".  A glitch here is that the BeOpen license probably
has to be changed too, but I believe that that's all doable.

> I have no idea when Woody is expected to be
> released but I expect it may take longer than that if history is
> any indication.

And who or what is Woody?

Feeling-left-out,

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gstein at lyra.org  Thu Dec 14 00:16:09 2000
From: gstein at lyra.org (Greg Stein)
Date: Wed, 13 Dec 2000 15:16:09 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com>
Message-ID: <20001213151609.E8951@lyra.org>

On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote:
>...
> > I have no idea when Woody is expected to be
> > released but I expect it may take longer than that if history is
> > any indication.
> 
> And who or what is Woody?

One of the Debian releases. Dunno if it is the "next" release, but there ya
go.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From gstein at lyra.org  Thu Dec 14 00:18:34 2000
From: gstein at lyra.org (Greg Stein)
Date: Wed, 13 Dec 2000 15:18:34 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001213170359.A24915@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 13, 2000 at 05:03:59PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com>
Message-ID: <20001213151834.F8951@lyra.org>

On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote:
>...
> So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel
> will be delayed yet again :-).

The kernel is not going to be delayed that much. Linus wants it to go out
this month. Worst case, I could see January. But no way on six months.

But as Fred said: that should not change panels going into the curses
support at all. You can always have a "compat.py" module in CML2 that
provides functionality for prior-to-2.1 releases of Python.

I'd also be up for a separate _curses_panels module, loaded into the curses
package.
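The "compat.py" idea Greg sketches could be as simple as a guarded import.  This is a hedged illustration, not real CML2 code; the fallback class is purely a stand-in for whatever pre-2.1 emulation CML2 would ship:

```python
# Sketch of a compat shim: prefer the native curses.panel module
# (expected to exist from Python 2.1 on), fall back to a stand-in
# when it is unavailable.  The fallback here is illustrative only.
try:
    from curses import panel
except ImportError:
    class panel:
        """Minimal stand-in so callers can still import the name."""
        @staticmethod
        def new_panel(win):
            raise NotImplementedError("panel support unavailable")
```

Either way, the rest of the code can call `panel.new_panel(win)` without caring which Python it is running on.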

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From esr at thyrsus.com  Thu Dec 14 00:33:02 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Dec 2000 18:33:02 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001213151834.F8951@lyra.org>; from gstein@lyra.org on Wed, Dec 13, 2000 at 03:18:34PM -0800
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213151834.F8951@lyra.org>
Message-ID: <20001213183302.A25160@thyrsus.com>

Greg Stein <gstein at lyra.org>:
> On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote:
> >...
> > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel
> > will be delayed yet again :-).
> 
> The kernel is not going to be delayed that much. Linus wants it to go out
> this month. Worst case, I could see January. But no way on six months.

I know what Linus wants.  That's why I'm estimating end of January or 
early February -- the man's error curve on these estimates has a 
certain, er, *consistency* about it.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Alcohol still kills more people every year than all `illegal' drugs put
together, and Prohibition only made it worse.  Oppose the War On Some Drugs!



From nas at arctrix.com  Wed Dec 13 18:18:48 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 09:18:48 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com>
Message-ID: <20001213091848.A17326@glacier.fnational.com>

On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote:
> > I don't think that is a very safe bet.  Python 2.0 missed the
> > Debian Potato boat.
> 
> This may have had to do more with the unresolved GPL issues.

I can't remember the exact dates but I think Debian Potato was
frozen before Python 2.0 was released.  Once a Debian release is
frozen packages are not upgraded except under unusual
circumstances.

> I recently received a mail from Stallman indicating that an
> agreement with CNRI has been reached; they have agreed (in
> principle, at least) to specific changes to the CNRI license
> that will defuse the choice-of-law clause when it is combined
> with GPL-licensed code "in a non-separable way".  A glitch here
> is that the BeOpen license probably has to be changed too, but
> I believe that that's all doable.

This is great news.

> > I have no idea when Woody is expected to be
> > released but I expect it may take longer than that if history is
> > any indication.
> 
> And who or what is Woody?

Woody would be another character from the Pixar movie "Toy Story"
(just like Rex, Bo, Potato, Slink, and Hamm).  I believe Bruce
Perens used to work at Pixar.  Debian uses a code name for the
development release until a release number is assigned.  This
avoids some problems but has the disadvantage of confusing people
who are not familiar with Debian.  I should have said "the next
stable release of Debian".


  Neil (aka nas at debian.org)



From akuchlin at mems-exchange.org  Thu Dec 14 01:26:32 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 13 Dec 2000 19:26:32 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 10:19:01AM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
Message-ID: <20001213192632.A30585@kronos.cnri.reston.va.us>

On Wed, Dec 13, 2000 at 10:19:01AM -0500, Fred L. Drake, Jr. wrote:
>  Do these new functions have to be methods on the window objects, or
>can they be functions in the new module that take a window as a
>parameter?  The underlying window object can certainly provide slots

Panels and windows have a 1-1 association, but they're separate
objects.  The window.new_panel function could become just a method
which takes a window as its first argument; it would only need the
TypeObject for PyCursesWindow, in order to do typechecking.
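At the Python level, the proposed shape is a module function that typechecks its window argument.  A rough sketch (the real implementation is C; `Window` and `Panel` are stand-ins here, not actual curses types):

```python
# Illustrative sketch of new_panel() as a module-level function
# that validates its argument, rather than a window method.
class Window:
    """Stand-in for the C-level PyCursesWindow type."""
    pass

class Panel:
    """Stand-in for the C-level panel object; keeps the 1-1 link."""
    def __init__(self, window):
        self.window = window

def new_panel(window):
    # Only the window type object is needed for the typecheck.
    if not isinstance(window, Window):
        raise TypeError("new_panel() argument must be a curses window")
    return Panel(window)
```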

> > Also, the curses.panel_{above,below}() wrappers need access to the
> > list_of_panels via find_po().

The list_of_panels is used only in the curses.panel module, so it
could be private to that module, since only panel-related functions
care about it.  

I'm ambivalent about the list_of_panels.  It's a linked list storing
(PyWindow, PyPanel) pairs.  Probably it should use a dictionary
instead of implementing a little list, just to reduce the amount of
code.
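The dictionary-based bookkeeping suggested above might look like this in Python (the real list_of_panels is a C linked list of (PyWindow, PyPanel) pairs; these names are illustrative):

```python
# Window -> panel association kept in a dict instead of a
# hand-rolled linked list; far less code to maintain.
_panels = {}

def remember_panel(window, panel):
    _panels[window] = panel

def find_panel(window):
    # Replaces the linked-list walk done by find_po().
    return _panels.get(window)

def forget_panel(window):
    _panels.pop(window, None)
```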

>does it make sense to spin out a distutils-based package?  I've no
>objection to them being in the core, but it seems that the release
>cycle may want to diverge from Python's.

Consensus seemed to be to leave it in; I'd have no objection to
removing it, but either course is fine with me.

So, I suggest we create _curses_panel.c, which would be available as
curses.panel.  (A panel.py module could then add any convenience
functions that are required.)

Thomas, do you want to work on this, or should I?

--amk



From nas at arctrix.com  Wed Dec 13 18:43:06 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 09:43:06 -0800
Subject: [Python-Dev] OT: Debian and Python
In-Reply-To: <20001214010534.M4396@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 14, 2000 at 01:05:34AM +0100
References: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl>
Message-ID: <20001213094306.C17326@glacier.fnational.com>

On Thu, Dec 14, 2000 at 01:05:34AM +0100, Thomas Wouters wrote:
> Note to the debian-pythoneers: woody still carries Python 1.5.2, not 2.0.
> Someone created a separate set of 2.0-packages, but they didn't include
> readline and gdbm support because of the licencing issues. (Posted on c.l.py
> sometime this week.)

I've had Python packages for Debian stable for a while.  I guess
I should have posted a link:

    http://arctrix.com/nas/python/debian/

Most useful modules are enabled.

> I'm *almost* tempted enough to learn enough about
> dpkg/.deb files to build my own licence-be-damned set

It's quite easy.  Debian source packages are basically a diff.
Applying the diff will create a "debian" directory and in that
directory will be a makefile called "rules".  Use the target
"binary" to create new binary packages.  Good things to know are
that you must be in the source directory when you run the
makefile (i.e. ./debian/rules binary).  You should be running a
shell under fakeroot to get the install permissions right
(running "fakeroot" will do).  You need to have the Debian
developer tools installed.  There is a list somewhere on
debian.org.  "apt-get source <packagename>" will get, extract and
patch a package ready for tweaking and building (handy for
getting stuff from unstable to run on stable).

This is too off topic for python-dev.  If anyone needs more info
they can email me directly.

  Neil



From thomas at xs4all.net  Thu Dec 14 01:05:34 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 14 Dec 2000 01:05:34 +0100
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com>
Message-ID: <20001214010534.M4396@xs4all.nl>

On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote:
> > I don't think that is a very safe bet.  Python 2.0 missed the Debian
> > Potato boat.
> 
> This may have had to do more with the unresolved GPL issues.

This is very likely. Debian is very licence -- or at least GPL -- aware.
Which is a pity, really, because I already prefer it over RedHat in all
other cases (and RedHat is also pretty licence aware, just less piously,
devoutly, beyond-practicality-IMHO dedicated to the GPL.)

> > I have no idea when Woody is expected to be released but I expect it may
> > take longer than that if history is any indication.

BTW, I believe Debian uses a fairly steady release schedule, something like
an unstable->stable switch every year or 6 months or so ? I seem to recall
seeing something like that on the debian website, but can't check right now.

> And who or what is Woody?

Woody is Debian's current development branch, the current bearer of the
alias 'unstable'. It'll become Debian 2.3 (I believe, I don't pay attention
to version numbers, I just run unstable :) once it's stabilized. 'potato' is
the previous development branch, and currently the 'stable' branch. You can
compare them with 'rawhide' and 'redhat-7.0', respectively :)

(With the enormous difference that you can upgrade your debian install to a
new version (even the devel version, or update your machine to the latest
devel snapshot) while you are using it, without having to reboot ;)

Note to the debian-pythoneers: woody still carries Python 1.5.2, not 2.0.
Someone created a separate set of 2.0-packages, but they didn't include
readline and gdbm support because of the licencing issues. (Posted on c.l.py
sometime this week.) I'm *almost* tempted enough to learn enough about
dpkg/.deb files to build my own licence-be-damned set, but it'd be a lot of
work to mirror the current debian 1.5.2 set of packages (which include
numeric, imaging, mxTools, GTK/GNOME, and a shitload of 3rd party modules)
in 2.0. Ponder, maybe it could be done semi-automatically, from the
src-deb's of those packages.

By the way, in woody, there are 52 packages with 'python' in the name, and
32 with 'perl' in the name... Pity all of my perl-hugging hippy-friends are
still blindly using RedHat, and refuse to listen to my calls from the
Debian/Python-dark-side :-)

Oh, and the names 'woody' and 'potato' came from the movie Toy Story, in
case you wondered ;)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From esr at snark.thyrsus.com  Thu Dec 14 01:46:37 2000
From: esr at snark.thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Dec 2000 19:46:37 -0500
Subject: [Python-Dev] Business related to the upcoming Python conference
Message-ID: <200012140046.TAA25289@snark.thyrsus.com>

I'm sending this to python-dev because I believe most or all of the
reviewers for my PC9 paper are on this list.  Paul, would you please
forward to any who were not?

First, my humble apologies for not having got my PC9 reviews in on time.
I diligently read my assigned papers early, but I couldn't do the
reviews early because of technical problems with my Foretec account --
and then I couldn't do them late because the pre-deadline crunch
happened while I was on a ten-day speaking and business trip in Japan
and California, with mostly poor or nonexistent Internet access.

Matters were not helped by a nasty four-month-old problem in my
personal life coming to a head right in the middle of the trip.  Nor
by the fact that the trip included the VA Linux Systems annual
stockholders' meeting and the toughest Board of Directors' meeting in
my tenure.  We had to hammer out a strategic theory of what to do now
that the dot-com companies who used to be our best customers aren't
getting funded any more.  Unfortunately, it's at times like this that
Board members earn their stock options.  Management oversight.
Fiduciary responsibility.  Mumble...

Second, the feedback I received on the paper was *excellent*, and I
will be making many of the recommended changes.  I've already extended
the discussion of "Why Python?" including addressing the weaknesses of
Scheme and Prolog for this application.  I have said more about uses
of CML2 beyond the Linux kernel.  I am working on a discussion of the
politics of CML2 adoption, but may save that for the stand-up talk
rather than the written paper.  I will try to trim the CML2 language
reference for the final version.

(The reviewer who complained about the lack of references on the SAT 
problem should be pleased to hear that URLs to relevant papers are in
fact included in the masters.  I hope they show in the final version
as rendered for publication.)
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The Constitution is not neutral. It was designed to take the
government off the backs of the people.
	-- Justice William O. Douglas 



From moshez at zadka.site.co.il  Thu Dec 14 13:22:24 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Thu, 14 Dec 2000 14:22:24 +0200 (IST)
Subject: [Python-Dev] Splitting up _cursesmodule
Message-ID: <20001214122224.739EEA82E@darjeeling.zadka.site.co.il>

On Wed, 13 Dec 2000 07:41:54 -0800, Neil Schemenauer <nas at arctrix.com> wrote:

> I don't think that is a very safe bet.  Python 2.0 missed the
> Debian Potato boat.

By a long time -- potato was frozen for a few months when 2.0 came out.

>  I have no idea when Woody is expected to be
> released but I expect it may take longer than that if history is
> any indication.

My bet is that woody starts freezing as soon as 2.4.0 is out. 
Note that once it starts freezing, 2.1 doesn't have a shot
of getting in, regardless of how long it takes to freeze.
OTOH, since in woody time there's a good chance for the "testing"
distribution, a lot more people would be running something
that *can* and *will* upgrade to 2.1 almost as soon as it is
out.
(For the record, most of the Debian users I know run woody on 
their server)
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From jeremy at alum.mit.edu  Thu Dec 14 06:04:43 2000
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 14 Dec 2000 00:04:43 -0500 (EST)
Subject: [Python-Dev] new draft of PEP 227
Message-ID: <14904.21739.804346.650062@bitdiddle.concentric.net>

I've got a new draft of PEP 227.  The terminology and wording are more
convoluted than they need to be.  I'll do at least one revision just
to say things more clearly, but I'd appreciate comments on the
proposed spec if you can read the current draft.

Jeremy




From cgw at fnal.gov  Thu Dec 14 07:03:01 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Thu, 14 Dec 2000 00:03:01 -0600 (CST)
Subject: [Python-Dev] Memory leaks in tupleobject.c
Message-ID: <14904.25237.654143.861733@buffalo.fnal.gov>

I've been running a set of memory-leak tests against the latest Python
and have found that running "test_extcall" leaks memory.  This gave me
a strange sense of deja vu, having fixed this once before...


From nas at arctrix.com  Thu Dec 14 00:43:43 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 15:43:43 -0800
Subject: [Python-Dev] Memory leaks in tupleobject.c
In-Reply-To: <14904.25237.654143.861733@buffalo.fnal.gov>; from cgw@fnal.gov on Thu, Dec 14, 2000 at 12:03:01AM -0600
References: <14904.25237.654143.861733@buffalo.fnal.gov>
Message-ID: <20001213154343.A18303@glacier.fnational.com>

On Thu, Dec 14, 2000 at 12:03:01AM -0600, Charles G Waldman wrote:
>  date: 2000/10/05 19:36:49;  author: nascheme;  state: Exp;  lines: +24 -86
>  Simplify _PyTuple_Resize by not using the tuple free list and dropping
>  support for the last_is_sticky flag.  A few hard to find bugs may be
>  fixed by this patch since the old code was buggy.
> 
> The 2.47 patch seems to have re-introduced the memory leak which was
> fixed in 2.31.  Maybe the old code was buggy, but the "right thing"
> would have been to fix it, not to throw it away.... if _PyTuple_Resize
> simply ignores the tuple free list, memory will be leaked.

Guilty as charged.  Can you explain how the current code is
leaking memory?  I can see one problem with deallocating size=0
tuples.  Are there any more leaks?

  Neil



From cgw at fnal.gov  Thu Dec 14 07:57:05 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Thu, 14 Dec 2000 00:57:05 -0600 (CST)
Subject: [Python-Dev] Memory leaks in tupleobject.c
In-Reply-To: <20001213154343.A18303@glacier.fnational.com>
References: <14904.25237.654143.861733@buffalo.fnal.gov>
	<20001213154343.A18303@glacier.fnational.com>
Message-ID: <14904.28481.292539.354303@buffalo.fnal.gov>

Neil Schemenauer writes:

 > Guilty as charged.  Can you explain how the current code is
 > leaking memory?  I can see one problem with deallocating size=0
 > tuples.  Are there any more leaks?

Actually, I think I may have spoken too hastily - it's late and I'm
tired and I should be sleeping rather than staring at the screen 
(like I've been doing since 8:30 this morning) - I jumped to
conclusions - I'm not really sure that it was your patch that caused
the leak; all I can say with 100% certainty is that if you run
"test_extcall" in a loop, memory usage goes through the ceiling....
It's not just the cyclic garbage caused by the "saboteur" function
because even with this commented out, the memory leak persists.

I'm actually trying to track down a different memory leak, something
which is currently causing trouble in one of our production servers
(more about this some other time) and just as a sanity check I ran my
little "leaktest.py" script over all the test_*.py modules in the
distribution, and found that test_extcall triggers leaks... having
analyzed and fixed this once before (see the CVS logs for
tupleobject.c), I jumped to conclusions about the reason for its
return.  I'll take a more clear-headed and careful look tomorrow and
post something (hopefully) a little more conclusive.  It may have been
some other change that caused this memory leak to re-appear.  If you
feel inclined to investigate, just do "reload(test.test_extcall)" in a
loop and watch the memory usage with ps or top or what-have-you...
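A rough version of that kind of leak check, with a harmless stand-in workload in place of `reload(test.test_extcall)` (names and the workload are illustrative, not the actual test):

```python
# Run a workload repeatedly and compare the process's peak memory
# before and after; a steadily growing ru_maxrss across many
# iterations suggests a leak.  Units of ru_maxrss vary by platform.
import resource

def workload():
    # Stand-in for reload(test.test_extcall); allocates and
    # discards many tuples.
    return [tuple(range(100)) for _ in range(1000)]

before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
for _ in range(50):
    workload()
after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
# If 'after' keeps climbing run after run, something is leaking.
```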

	 -C




From paulp at ActiveState.com  Thu Dec 14 08:00:21 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Wed, 13 Dec 2000 23:00:21 -0800
Subject: [Python-Dev] new draft of PEP 227
References: <14904.21739.804346.650062@bitdiddle.concentric.net>
Message-ID: <3A387005.6725DAAE@ActiveState.com>

Jeremy Hylton wrote:
> 
> I've got a new draft of PEP 227.  The terminology and wording are more
> convoluted than they need to be.  I'll do at least one revision just
> to say things more clearly, but I'd appreciate comments on the
> proposed spec if you can read the current draft.

It set me to thinking:

Python should never require declarations. But would it necessarily be a
problem for Python to have a variable declaration syntax? Might not the
existence of declarations simplify some aspects of the proposal and of
backwards compatibility?

Along the same lines, might a new rule make Python code more robust? We
could say that a local can only shadow a global if the local is formally
declared. It's pretty rare that there is a good reason to shadow a
global and Python makes it too easy to do accidentally.

 Paul Prescod



From paulp at ActiveState.com  Thu Dec 14 08:29:35 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Wed, 13 Dec 2000 23:29:35 -0800
Subject: [Python-Dev] Online help scope
Message-ID: <3A3876DF.5554080C@ActiveState.com>

I think Guido and I are pretty far apart on the scope and requirements
of this online help thing so I'd like some clarification and opinions
from the peanut gallery.

Consider these scenarios

a) Signature

>>> help( dir )
dir([object]) -> list of strings

b) Usage hint

>>> help( dir )
dir([object]) -> list of strings

Return an alphabetized list of names comprising (some of) the attributes
of the given object.  Without an argument, the names in the current scope
are listed.  With an instance argument, only the instance attributes are
returned.  With a class argument, attributes of the base class are not
returned.  For other types or arguments, this may list members or
methods.

c) Complete documentation, paged(man-style)

>>> help( dir )
dir([object]) -> list of strings

Without arguments, return the list of names in the current local symbol
table. With an argument, attempts to return a list of valid attributes
for that object. This information is gleaned from the object's __dict__,
__methods__ and __members__ attributes, if defined. The list is not
necessarily complete; e.g., for classes, attributes defined in base
classes are not included, and for class instances, methods are not
included. The resulting list is sorted alphabetically. For example: 

  >>> import sys
  >>> dir()
  ['sys']
  >>> dir(sys)
  ['argv', 'exit', 'modules', 'path', 'stderr', 'stdin', 'stdout']

d) Complete documentation in a user-chosen hypertext window

>>> help( dir )
(Netscape or lynx pops up)

I'm thinking that maybe we need two functions:

 * help
 * pythondoc

pythondoc("dir") would launch the Python documentation for the "dir"
command.

> That'S What Some People Think.  I Disagree That It Would Be Either
> Feasible Or A Good Idea To Put All Documentation For A Typical Module
> In Its Doc Strings.

Java and Perl people do it regularly. I think that in the greater world
of software development, the inline model has won (or is winning) and I
don't see a compelling reason to fight the tide. There will always be
out-of-line tutorials, discussions, books etc. 

The canonical module documentation could be inline. That improves the
likelihood of it being maintained. The LaTeX documentation is a major
bottleneck and moving to XML or SGML will not help. Programmers do not
want to learn documentation systems or syntaxes. They want to write code
and comments.

> I said above, and I'll say it again: I think the majority of people
> would prefer to use their standard web browser to read the standard
> docs.  It's not worth the effort to try to make those accessible
> through help().  

No matter what we decide on the issue above, reusing the standard
documentation is the only practical way of populating the help system in
the short-term. Right now, today, there is a ton of documentation that
exists only in LaTeX and HTML. Tons of modules have no docstrings.
Keywords have no docstrings. Compare the docstring for
urllib.urlretrieve to the HTML documentation.

In fact, you've given me a good idea: if the HTML is not available
locally, I can access it over the web.

 Paul Prescod



From paulp at ActiveState.com  Thu Dec 14 08:29:53 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Wed, 13 Dec 2000 23:29:53 -0800
Subject: [Python-Dev] Online help PEP
References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com>  
		            <3A366A41.1A14EFD4@ActiveState.com> <200012131548.KAA21344@cj20424-a.reston1.va.home.com>
Message-ID: <3A3876F1.D3E65E90@ActiveState.com>

Guido van Rossum wrote:
> 
> Having the repr() overloading invoke the pager is dangerous.  The beta
> version of the license command did this, and it caused some strange
> side effects, e.g. vars(__builtins__) would start reading from input
> and confuse the users.  The new version's repr() returns the desired
> string if it's less than a page, and 'Type license() to see the full
> license text' if the pager would need to be invoked.

I'll add this to the PEP.

> The implied import is a major liability.  If you can do this without
> importing (e.g. by source code inspection), fine.  Otherwise, you
> might issue some kind of message like "you must first import XXX.YYY".

Okay, I'll add to the PEP that an open issue is what strategy to use,
but that we want to avoid implicit import.

> The hurt is code bloat in the interpreter and creeping featurism.  If
> you need command line access to the docs (which may be a reasonable
> thing to ask for, although to me it sounds backwards :-), it's better
> to provide a separate command, e.g. pythondoc.  (Analog to perldoc.)

Okay, I'll add a pythondoc proposal to the PEP.

> Yes.  Please add that option to the PEP.

Done.

> > > What does "demand-loaded" mean in a Python context?
> >
> > When you "touch" the help object, it loads the onlinehelp module which
> > has the real implementation. The thing in __builtins__ is just a
> > lightweight proxy.
> 
> Please suggest an implementation.

In the PEP.

> Glad You'Re So Agreeable. :)

What happened to your capitalization? elisp gone awry? 

> ...
> To Tell You The Truth, I'M Not Holding My Breath Either. :-)  So your
> code should just dump the doc string on stdout without interpreting it
> in any way (except for paging).

I'll do this for the first version.

> It's buggier than just that.  The output of the pager prints an extra
> "| " at the start of each page except for the first, and the first
> page is a line longer than subsequent pages.

For some reason that I now forget, that code is pretty hairy.

> BTW, another bug: try help(cgi).  It's nice that it gives the default
> value for arguments, but the defaults for FieldStorage.__init__ happen
> to include os.environ.  Its entire value is dumped -- which causes the
> pager to be off (it wraps over about 20 lines for me).  I think you
> may have to truncate long values a bit, e.g. by using the repr module.

Okay. There are a lot of little things we need to figure out. Such as
whether we should print out docstrings for private methods etc.

>...
> I don't know specific tools, but any serious docstring processing tool
> ends up parsing the source code for this very reason, so there's
> probably plenty of prior art.

Okay, I'll look into it.

 Paul



From tim.one at home.com  Thu Dec 14 08:35:00 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 02:35:00 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <3A387005.6725DAAE@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENPIDAA.tim.one@home.com>

[Paul Prescod]
> ...
> Along the same lines, might a new rule make Python code more robust?
> We could say that a local can only shadow a global if the local is
> formally declared. It's pretty rare that there is a good reason to
> shadow a global and Python makes it too easy to do accidentally.

I've rarely seen problems due to shadowing a global, but have often seen
problems due to shadowing a builtin.  Alas, if this rule were extended to
builtins too-- where it would do the most good --then the names of builtins
would effectively become reserved words (any code shadowing them today would
be broken until declarations were added, and any code working today may
break tomorrow if a new builtin were introduced that happened to have the
same name as a local).




From pf at artcom-gmbh.de  Thu Dec 14 08:42:59 2000
From: pf at artcom-gmbh.de (Peter Funk)
Date: Thu, 14 Dec 2000 08:42:59 +0100 (MET)
Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py)
In-Reply-To: <200012132039.MAA07496@slayer.i.sourceforge.net> from Moshe Zadka at "Dec 13, 2000 12:39:24 pm"
Message-ID: <m146T2Z-000DmFC@artcom0.artcom-gmbh.de>

Hi,

I think the following change is incompatible and will break applications.

At least I have some server-type applications that rely on 
'allow_reuse_address' defaulting to 0, because they use
the 'address already in use' exception to make sure that exactly one
server process is running on this port.  One of these applications, 
which is BTW built on top of Fredrik Lundh's 'xmlrpclib', fails to work
if I change this default in SocketServer.py.  

Would you please explain the reasoning behind this change?

Moshe Zadka:
> *** SocketServer.py	2000/09/01 03:25:14	1.19
> --- SocketServer.py	2000/12/13 20:39:17	1.20
> ***************
> *** 158,162 ****
>       request_queue_size = 5
>   
> !     allow_reuse_address = 0
>   
>       def __init__(self, server_address, RequestHandlerClass):
> --- 158,162 ----
>       request_queue_size = 5
>   
> !     allow_reuse_address = 1
>   
>       def __init__(self, server_address, RequestHandlerClass):

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen)



From paul at prescod.net  Thu Dec 14 08:57:30 2000
From: paul at prescod.net (Paul Prescod)
Date: Wed, 13 Dec 2000 23:57:30 -0800
Subject: [Python-Dev] new draft of PEP 227
References: <LNBBLJKPBEHFEDALKOLCEENPIDAA.tim.one@home.com>
Message-ID: <3A387D6A.782E6A3B@prescod.net>

Tim Peters wrote:
> 
> ...
> 
> I've rarely seen problems due to shadowing a global, but have often seen
> problems due to shadowing a builtin.  

Really?

I think that there are two different issues here. One is consciously
choosing to create a new variable but not understanding that there
already exists a variable by that name (e.g. str, list).

Another is trying to assign to a global but actually shadowing it. There
is no way that anyone coming from another language is going to consider
this transcript reasonable:

>>> a=5
>>> def show():
...    print a
...
>>> def set(val):
...     a=val
...
>>> a
5
>>> show()
5
>>> set(10)
>>> show()
5

It doesn't seem to make any sense. My solution is to make the assignment
in "set" illegal unless you add a declaration that says: "No, really. I
mean it. Override that sucker." As the PEP points out, overriding is
seldom a good idea so the requirement to declare would be rarely
invoked.

Actually, one could argue that there is no good reason to even *allow*
the shadowing of globals. You can always add an underscore to the end of
the variable name to disambiguate.

> Alas, if this rule were extended to
> builtins too-- where it would do the most good --then the names of builtins
> would effectively become reserved words (any code shadowing them today would
> be broken until declarations were added, and any code working today may
> break tomorrow if a new builtin were introduced that happened to have the
> same name as a local).

I have no good solutions to the shadowing-builtins-accidentally problem.
But I will say that those sorts of problems are typically less subtle:

str = "abcdef"
...
str(5) # You'll get a pretty good error message here!
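
[A runnable version of that pitfall -- and inside a function the failure
is subtler still, because the assignment makes the name local for the
entire body:]

```python
def sneaky():
    # 'str' is local throughout this function because it is assigned
    # below, so even this first use fails -- with UnboundLocalError,
    # not the friendly "'str' object is not callable" TypeError you
    # get when the shadowing happens at module level.
    label = str(5)
    str = "abcdef"
    return label

try:
    sneaky()
    error_seen = None
except UnboundLocalError:
    error_seen = "UnboundLocalError"
```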

The "right answer" in terms of namespace theory is to consistently refer
to builtins with a prefix (whether "__builtins__" or "$") but that's
pretty unpalatable from an aesthetic point of view.

 Paul Prescod



From tim.one at home.com  Thu Dec 14 09:41:19 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 03:41:19 -0500
Subject: [Python-Dev] Online help scope
In-Reply-To: <3A3876DF.5554080C@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEOBIDAA.tim.one@home.com>

[Paul Prescod]
> I think Guido and I are pretty far apart on the scope and requirements
> of this online help thing so I'd like some clarification and opinions
> from the peanut gallery.
>
> Consider these scenarios
>
> a) Signature
> ...
> b) Usage hint
> ...
> c) Complete documentation, paged(man-style)
> ...
> d) Complete documentation in a user-chosen hypertext window
> ...

Guido's style guide has a lot to say about docstrings, suggesting that they
were intended to support two scenarios:  #a+#b together (the first line of a
multi-line docstring), and #c+#d together (the entire docstring).  In this
respect I think Guido was (consciously or not) aping elisp's conventions, up
to but not including the elisp convention for naming the arguments in the
first line of a docstring.  The elisp conventions were very successful
(simple, and useful in practice), so aping them is a good thing.

We've had stalemate ever since:  there isn't a single style of writing
docstrings in practice because no single docstring processor has been
blessed, while no docstring processor can gain momentum before being
blessed.  Every attempt to date has erred by trying to do too much, thus
attracting so much complaint that it can't ever become blessed.  The current
argument over PEP 233 appears to be more of the same.

The way to break the stalemate is to err on the side of simplicity:  just
cater to the two obvious (first-line vs whole-string) cases, and for
existing docstrings only.  HTML vs plain text is fluff.  Paging vs
non-paging is fluff.  Dumping to stdout vs displaying in a browser is fluff.
Jumping through hoops for functions and modules whose authors didn't bother
to write docstrings is fluff.  Etc.  People fight over fluff until it fills
the air and everyone chokes to death on it <0.9 wink>.

Something dirt simple can get blessed, and once *anything* is blessed, a
million docstrings will bloom.
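
[The dirt-simple version really is only a few lines -- a hypothetical
sketch of the two blessed cases, first-line versus whole-string; the
example() function below is made up for illustration:]

```python
def usage_hint(obj):
    # Scenarios a/b: the first line of the docstring.
    doc = getattr(obj, "__doc__", None) or ""
    return doc.strip().split("\n", 1)[0]

def full_help(obj):
    # Scenarios c/d: the whole docstring, dumped un-interpreted.
    return getattr(obj, "__doc__", None) or "(no docstring)"

def example(x):
    """example(x) -> int

    Longer discussion of what example() does goes here.
    """
    return x
```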

[Guido]
> That'S What Some People Think.  I Disagree That It Would Be Either
> Feasible Or A Good Idea To Put All Documentation For A Typical Module
> In Its Doc Strings.

I'm with Paul on this one:  that's what module.__doc__ is for, IMO (Javadoc
is great, Eiffel's embedded doc tools are great, Perl POD is great, even
REBOL's interactive help is great).  What Java, Eiffel, Perl and REBOL all
have in common, and Python lacks, is *a* blessed system, no matter how crude.

[back to Paul]
> ...
> No matter what we decide on the issue above, reusing the standard
> documentation is the only practical way of populating the help system
> in the short-term. Right now, today, there is a ton of documentation
> that exists only in LaTeX and HTML. Tons of modules have no docstrings.

Then write tools to automatically create docstrings from the LaTeX and HTML,
but *check in* the results (i.e., add the docstrings so created to the
codebase), and keep the help system simple.

> Keywords have no docstrings.

Neither do integers, but they're obvious too <wink>.




From thomas at xs4all.net  Thu Dec 14 10:13:49 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 14 Dec 2000 10:13:49 +0100
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001214010534.M4396@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 14, 2000 at 01:05:34AM +0100
References: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl>
Message-ID: <20001214101348.N4396@xs4all.nl>

On Thu, Dec 14, 2000 at 01:05:34AM +0100, Thomas Wouters wrote:

> By the way, in woody, there are 52 packages with 'python' in the name, and
> 32 with 'perl' in the name...

Ah, not true, sorry. I shouldn't have posted off-topic stuff after being
awoken by machine-down-alarms ;) That was just what my reasonably-default
install had installed. Debian has what looks like most CPAN modules as
packages, too, so it's closer to a 110/410 spread (python/perl). Still, not
a bad number :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mal at lemburg.com  Thu Dec 14 11:32:58 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 11:32:58 +0100
Subject: [Python-Dev] new draft of PEP 227
References: <14904.21739.804346.650062@bitdiddle.concentric.net>
Message-ID: <3A38A1DA.7EC49149@lemburg.com>

Jeremy Hylton wrote:
> 
> I've got a new draft of PEP 227.  The terminology and wording are more
> convoluted than they need to be.  I'll do at least one revision just
> to say things more clearly, but I'd appreciate comments on the
> proposed spec if you can read the current draft.

The PEP doesn't mention the problems I pointed out about 
breaking the lookup schemes w/r to symbols in methods, classes
and globals.

Please add a comment about this to the PEP + maybe the example
I gave in one of the posts to python-dev about it. I consider
the problem serious enough that, if possible, nested scoping
should be limited to lambdas (or to functions in general).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Thu Dec 14 11:55:38 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 11:55:38 +0100
Subject: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule)
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl>
Message-ID: <3A38A72A.4011B5BD@lemburg.com>

Thomas Wouters wrote:
> 
> On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote:
> > > I don't think that is a very safe bet.  Python 2.0 missed the Debian
> > > Potato boat.
> >
> > This may have had to do more with the unresolved GPL issues.
> 
> This is very likely. Debian is very licence -- or at least GPL -- aware.
> Which is a pity, really, because I already prefer it over RedHat in all
> other cases (and RedHat is also pretty licence aware, just less piously,
> devoutly, beyond-practicality-IMHO dedicated to the GPL.)
 
About the GPL issue: as I understood Guido's post, RMS still regards
the choice of law clause as being incompatible to the GPL (heck,
doesn't this guy ever think about international trade terms,
the United Nations Convention on International Sale of Goods
or local law in one of the 200+ countries where you could deploy
GPLed software... is the GPL only meant for US programmers ?).

I am currently rewriting my open source licenses as well, and among
other things I chose to add a choice of law clause.  Seeing RMS'
view of things, I guess that my license will be regarded as
incompatible with the GPL, which is sad even though I'm in good
company... e.g. the Apache license, the Zope license, etc.  Dual
licensing is not possible, as it would reopen the loopholes in the
GPL I tried to fix in my license.  Any idea on how to proceed ?

Another issue: since Python doesn't link Python scripts, is it
still true that if one (pure) Python package is covered by the GPL, 
then all other packages needed by that application will also fall
under GPL ?

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From gstein at lyra.org  Thu Dec 14 12:57:43 2000
From: gstein at lyra.org (Greg Stein)
Date: Thu, 14 Dec 2000 03:57:43 -0800
Subject: (offtopic) Re: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A38A72A.4011B5BD@lemburg.com>; from mal@lemburg.com on Thu, Dec 14, 2000 at 11:55:38AM +0100
References: <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> <3A38A72A.4011B5BD@lemburg.com>
Message-ID: <20001214035742.Z8951@lyra.org>

On Thu, Dec 14, 2000 at 11:55:38AM +0100, M.-A. Lemburg wrote:
>...
> I am currently rewriting my open source licenses as well, and among
> other things I chose to add a choice of law clause.  Seeing RMS'
> view of things, I guess that my license will be regarded as
> incompatible with the GPL, which is sad even though I'm in good
> company... e.g. the Apache license, the Zope license, etc.  Dual
> licensing is not possible, as it would reopen the loopholes in the
> GPL I tried to fix in my license.  Any idea on how to proceed ?

Only RMS is under the belief that the Apache license is incompatible. It is
either clause 4 or 5 (I forget which) where we state that certain names
(e.g. "Apache") cannot be used in derived products' names and promo
materials. RMS views this as an "additional restriction on redistribution",
which is apparently not allowed by the GPL.

We (the ASF) generally feel he is being a royal pain in the ass with this.
We've sent him a big, long email asking for clarification / resolution, but
haven't heard back (we sent it a month or so ago). Basically, his FUD
creates views such as yours ("the Apache license is incompatible with the
GPL") because people just take his word for it. We plan to put together a
web page to outline our own thoughts and licensing beliefs/philosophy.

We're also planning to rev our license to rephrase/alter the particular
clause, but for logistic purposes (putting the project name in there ties it
to the particular project; we want a generic ASF license that can be applied
to all of the projects without a search/replace).

At this point, the ASF is taking the position of ignoring him and his
controlling attitude(*) and beliefs. There is the outstanding letter to him,
but that doesn't really change our point of view.

Cheers,
-g

(*) for a person espousing freedom, it is rather ironic just how much of a
control freak he is (stemming from a no-compromise position to guarantee
peoples' freedoms, he always wants things done his way)

-- 
Greg Stein, http://www.lyra.org/



From tg at melaten.rwth-aachen.de  Thu Dec 14 14:07:12 2000
From: tg at melaten.rwth-aachen.de (Thomas Gellekum)
Date: 14 Dec 2000 14:07:12 +0100
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: Andrew Kuchling's message of "Wed, 13 Dec 2000 19:26:32 -0500"
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
	<14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
	<20001213074423.A30348@kronos.cnri.reston.va.us>
	<kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>
	<14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
	<20001213192632.A30585@kronos.cnri.reston.va.us>
Message-ID: <kq3dfrkv7j.fsf@cip12.melaten.rwth-aachen.de>

Andrew Kuchling <akuchlin at mems-exchange.org> writes:

> I'm ambivalent about the list_of_panels.  It's a linked list storing
> (PyWindow, PyPanel) pairs.  Probably it should use a dictionary
> instead of implementing a little list, just to reduce the amount of
> code.

I don't like it either, so feel free to shred it. As I said, this is
the first (piece of an) extension module I've written and I thought it
would be easier to implement a little list than to manage a Python
list or such in C.

> So, I suggest we create _curses_panel.c, which would be available as
> curses.panel.  (A panel.py module could then add any convenience
> functions that are required.)
> 
> Thomas, do you want to work on this, or should I?

Just do it. I'll try to add more examples in the meantime.

tg



From fredrik at pythonware.com  Thu Dec 14 14:19:08 2000
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 14 Dec 2000 14:19:08 +0100
Subject: [Python-Dev] fuzzy logic?
Message-ID: <015101c065d0$717d1680$0900a8c0@SPIFF>

here's a simple (but somewhat strange) test program:

def spam():
    a = 1
    if (0):
        global a
        print "global a"
    a = 2

def egg():
    b = 1
    if 0:
        global b
        print "global b"
    b = 2

egg()
spam()

print a
print b

if I run this under 1.5.2, I get:

    2
    Traceback (innermost last):
        File "<stdin>", line 19, in ?
    NameError: b

</F>




From gstein at lyra.org  Thu Dec 14 14:42:11 2000
From: gstein at lyra.org (Greg Stein)
Date: Thu, 14 Dec 2000 05:42:11 -0800
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: <015101c065d0$717d1680$0900a8c0@SPIFF>; from fredrik@pythonware.com on Thu, Dec 14, 2000 at 02:19:08PM +0100
References: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID: <20001214054210.G8951@lyra.org>

I would take a guess that the "if 0:" is optimized away *before* the
inspection for a "global" statement. But the compiler doesn't know how to
optimize away "if (0):", so the global statement remains.

Ah. Just checked. Look at compile.c::com_if_stmt(). There is a call to
"is_constant_false()" in there.

Heh. Looks like is_constant_false() could be made a bit smarter. But the
point is valid: you can make is_constant_false() as smart as you want, and
you'll still end up with "funny" global behavior.
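
[For what it's worth, compile() lets you probe this directly.  In modern
CPython the ambiguity is gone, because assigning to a name before its
global declaration is rejected outright, dead code or not -- current
behavior, not 1.5.2's:]

```python
src = (
    "def spam():\n"
    "    a = 1\n"
    "    if (0):\n"
    "        global a\n"
    "    a = 2\n"
)

try:
    compile(src, "<test>", "exec")
    verdict = "compiled"
except SyntaxError as err:
    # e.g. "name 'a' is assigned to before global declaration"
    verdict = "rejected (%s)" % err.msg
```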

Cheers,
-g

On Thu, Dec 14, 2000 at 02:19:08PM +0100, Fredrik Lundh wrote:
> here's a simple (but somewhat strange) test program:
> 
> def spam():
>     a = 1
>     if (0):
>         global a
>         print "global a"
>     a = 2
> 
> def egg():
>     b = 1
>     if 0:
>         global b
>         print "global b"
>     b = 2
> 
> egg()
> spam()
> 
> print a
> print b
> 
> if I run this under 1.5.2, I get:
> 
>     2
>     Traceback (innermost last):
>         File "<stdin>", line 19, in ?
>     NameError: b
> 
> </F>
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://www.python.org/mailman/listinfo/python-dev

-- 
Greg Stein, http://www.lyra.org/



From mwh21 at cam.ac.uk  Thu Dec 14 14:58:24 2000
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 14 Dec 2000 13:58:24 +0000
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: "Fredrik Lundh"'s message of "Thu, 14 Dec 2000 14:19:08 +0100"
References: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID: <m3snnr14vz.fsf@atrus.jesus.cam.ac.uk>

1) Is there anything in the standard library that does the equivalent
   of

import symbol,token

def decode_ast(ast):
    if token.ISTERMINAL(ast[0]):
        return (token.tok_name[ast[0]], ast[1])
    else:
        return (symbol.sym_name[ast[0]],)+tuple(map(decode_ast,ast[1:]))

  so that, eg:

>>> pprint.pprint(decode.decode_ast(parser.expr("0").totuple()))
('eval_input',
 ('testlist',
  ('test',
   ('and_test',
    ('not_test',
     ('comparison',
      ('expr',
       ('xor_expr',
        ('and_expr',
         ('shift_expr',
          ('arith_expr',
           ('term',
            ('factor', ('power', ('atom', ('NUMBER', '0'))))))))))))))),
 ('NEWLINE', ''),
 ('ENDMARKER', ''))

  ?  Should there be?  (Especially if it was a bit better written).

... and Greg's just said everything else I wanted to!

Cheers,
M.

-- 
  please realize that the Common  Lisp community is more than 40 
  years old.  collectively, the community has already been where 
  every clueless newbie  will be going for the next three years.  
  so relax, please.                     -- Erik Naggum, comp.lang.lisp




From guido at python.org  Thu Dec 14 15:51:26 2000
From: guido at python.org (Guido van Rossum)
Date: Thu, 14 Dec 2000 09:51:26 -0500
Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py)
In-Reply-To: Your message of "Thu, 14 Dec 2000 08:42:59 +0100."
             <m146T2Z-000DmFC@artcom0.artcom-gmbh.de> 
References: <m146T2Z-000DmFC@artcom0.artcom-gmbh.de> 
Message-ID: <200012141451.JAA15637@cj20424-a.reston1.va.home.com>

> I think the following change is incompatible and will break applications.
> 
> At least I have some server-type applications that rely on 
> 'allow_reuse_address' defaulting to 0, because they use
> the 'address already in use' exception to make sure that exactly one
> server process is running on this port.  One of these applications, 
> which is BTW built on top of Fredrik Lundh's 'xmlrpclib', fails to work
> if I change this default in SocketServer.py.  
> 
> Would you please explain the reasoning behind this change?

The reason for the patch is that without this, if you kill a TCP server
and restart it right away, you'll get a "port in use" error -- a closed
TCP connection lingers in a wait state (TIME_WAIT) for a while before
the port can be reused.  The patch avoids this error.

As far as I know, with TCP, code using SO_REUSEADDR still cannot bind
to the port when another process is already using it, but for UDP, the
semantics may be different.

Is your server using UDP?

Try this patch if your problem is indeed related to UDP:

*** SocketServer.py	2000/12/13 20:39:17	1.20
--- SocketServer.py	2000/12/14 14:48:16
***************
*** 268,273 ****
--- 268,275 ----
  
      """UDP server class."""
  
+     allow_reuse_address = 0
+ 
      socket_type = socket.SOCK_DGRAM
  
      max_packet_size = 8192

If this works for you, I'll check it in, of course.
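
[For reference, the mechanism behind the flag: server_bind() amounts to
setting SO_REUSEADDR before bind(), roughly like this -- a sketch, not
SocketServer's exact code:]

```python
import socket

def make_listener(address, allow_reuse_address=True):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if allow_reuse_address:
        # Lets a restarted server rebind while the old connection is
        # still in TIME_WAIT; on classic TCP it does NOT let two live
        # servers share the port at the same time.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(address)
    sock.listen(5)
    return sock
```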

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Thu Dec 14 15:52:37 2000
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 14 Dec 2000 09:52:37 -0500 (EST)
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <3A38A1DA.7EC49149@lemburg.com>
References: <14904.21739.804346.650062@bitdiddle.concentric.net>
	<3A38A1DA.7EC49149@lemburg.com>
Message-ID: <14904.57013.371474.691948@bitdiddle.concentric.net>

>>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:

  MAL> Jeremy Hylton wrote:
  >>
  >> I've got a new draft of PEP 227.  The terminology and wording are
  >> more convoluted than they need to be.  I'll do at least one
  >> revision just to say things more clearly, but I'd appreciate
  >> comments on the proposed spec if you can read the current draft.

  MAL> The PEP doesn't mention the problems I pointed out about
  MAL> breaking the lookup schemes w/r to symbols in methods, classes
  MAL> and globals.

I believe it does.  There was some discussion on python-dev and
with others in private email about how classes should be handled.

The relevant section of the specification is:

    If a name is used within a code block, but it is not bound there
    and is not declared global, the use is treated as a reference to
    the nearest enclosing function region.  (Note: If a region is
    contained within a class definition, the name bindings that occur
    in the class block are not visible to enclosed functions.)

  MAL> Please add a comment about this to the PEP + maybe the example
  MAL> I gave in one the posts to python-dev about it. I consider the
  MAL> problem serious enough to limit the nested scoping to lambda
  MAL> functions (or functions in general) only if that's possible.

If there was some other concern you had, then I don't know what it
was.  I recall that you had a longish example that raised a NameError
immediately :-).

Jeremy



From mal at lemburg.com  Thu Dec 14 16:02:33 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 16:02:33 +0100
Subject: [Python-Dev] new draft of PEP 227
References: <14904.21739.804346.650062@bitdiddle.concentric.net>
		<3A38A1DA.7EC49149@lemburg.com> <14904.57013.371474.691948@bitdiddle.concentric.net>
Message-ID: <3A38E109.54C07565@lemburg.com>

Jeremy Hylton wrote:
> 
> >>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:
> 
>   MAL> Jeremy Hylton wrote:
>   >>
>   >> I've got a new draft of PEP 227.  The terminology and wording are
>   >> more convoluted than they need to be.  I'll do at least one
>   >> revision just to say things more clearly, but I'd appreciate
>   >> comments on the proposed spec if you can read the current draft.
> 
>   MAL> The PEP doesn't mention the problems I pointed out about
>   MAL> breaking the lookup schemes w/r to symbols in methods, classes
>   MAL> and globals.
> 
> I believe it does.  There was some discussion on python-dev and
> with others in private email about how classes should be handled.
> 
> The relevant section of the specification is:
> 
>     If a name is used within a code block, but it is not bound there
>     and is not declared global, the use is treated as a reference to
>     the nearest enclosing function region.  (Note: If a region is
>     contained within a class definition, the name bindings that occur
>     in the class block are not visible to enclosed functions.)

Well hidden ;-)

Honestly, I think you should make this specific case more visible
to readers of the PEP, since this single detail would produce most
of the problems with nested scopes.

BTW, what about nested classes ? AFAIR, the PEP only talks about
nested functions.

>   MAL> Please add a comment about this to the PEP + maybe the example
>   MAL> I gave in one the posts to python-dev about it. I consider the
>   MAL> problem serious enough to limit the nested scoping to lambda
>   MAL> functions (or functions in general) only if that's possible.
> 
> If there was some other concern you had, then I don't know what it
> was.  I recall that you had a longish example that raised a NameError
> immediately :-).

The idea behind the example should have been clear, though.

x = 1
class C:
   x = 2
   def test(self):
       print x
  
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fdrake at acm.org  Thu Dec 14 16:09:57 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 14 Dec 2000 10:09:57 -0500 (EST)
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: <m3snnr14vz.fsf@atrus.jesus.cam.ac.uk>
References: <015101c065d0$717d1680$0900a8c0@SPIFF>
	<m3snnr14vz.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <14904.58053.282537.260186@cj42289-a.reston1.va.home.com>

Michael Hudson writes:
 > 1) Is there anything in the standard library that does the equivalent
 >    of

  No, but I have a chunk of code that does that in a different way.  Where
in the library do you think it belongs?  The compiler package sounds
like the best place, but that's not installed by default.  (Jeremy, is
that likely to change soon?)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From mwh21 at cam.ac.uk  Thu Dec 14 16:47:33 2000
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 14 Dec 2000 15:47:33 +0000
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: "Fred L. Drake, Jr."'s message of "Thu, 14 Dec 2000 10:09:57 -0500 (EST)"
References: <015101c065d0$717d1680$0900a8c0@SPIFF> <m3snnr14vz.fsf@atrus.jesus.cam.ac.uk> <14904.58053.282537.260186@cj42289-a.reston1.va.home.com>
Message-ID: <m3vgsnovhm.fsf@atrus.jesus.cam.ac.uk>

"Fred L. Drake, Jr." <fdrake at acm.org> writes:

> Michael Hudson writes:
>  > 1) Is there anything in the standard library that does the equivalent
>  >    of
> 
>   No, but I have a chunk of code that does that in a different way.  

I'm guessing everyone who's played with the parser much does, hence
the suggestion.  I agree my implementation is probably not optimal - I
just threw it together as quickly as I could!

> Where in the library do you think it belongs?  The compiler package
> sounds like the best place, but that's not installed by default.
> (Jeremy, is that likely to change soon?)

Actually, I'd have thought the parser module would be most natural,
but that would probably mean doing the _module.c trick, and it's
probably not worth the bother.  OTOH, it seems that wrapping any given
extension module in a python module is becoming if anything the norm,
so maybe it is.

Cheers,
M.

-- 
  I don't remember any dirty green trousers.
                                             -- Ian Jackson, ucam.chat




From nowonder at nowonder.de  Thu Dec 14 16:50:10 2000
From: nowonder at nowonder.de (Peter Schneider-Kamp)
Date: Thu, 14 Dec 2000 16:50:10 +0100
Subject: [Python-Dev] [PEP-212] new draft
Message-ID: <3A38EC32.210BD1A2@nowonder.de>

In an attempt to revive PEP 212 - Loop counter iteration I have
updated the draft. The HTML version can be found at:

http://python.sourceforge.net/peps/pep-0212.html

I will appreciate any form of comments and/or criticisms.

Peter

P.S.: Now I have posted it - should I update the Post-History?
      Or is that for posts to c.l.py?



From pf at artcom-gmbh.de  Thu Dec 14 16:56:08 2000
From: pf at artcom-gmbh.de (Peter Funk)
Date: Thu, 14 Dec 2000 16:56:08 +0100 (MET)
Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py)
In-Reply-To: <200012141451.JAA15637@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 14, 2000  9:51:26 am"
Message-ID: <m146ajo-000DmFC@artcom0.artcom-gmbh.de>

Hi,

Moshe's checkin indeed makes a lot of sense.  Sorry for the irritation.

Guido van Rossum:
> The reason for the patch is that without this, if you kill a TCP server
> and restart it right away, you'll get a 'port in use" error -- TCP has
> some kind of strange wait period after a connection is closed before
> it can be reused.  The patch avoids this error.
> 
> As far as I know, with TCP, code using SO_REUSEADDR still cannot bind
> to the port when another process is already using it, but for UDP, the
> semantics may be different.
> 
> Is your server using UDP?

No, and I must admit that I didn't test carefully enough:  From
a quick look at my process listing I assumed there were indeed
two server processes running concurrently, which would have broken
the needed mutual exclusion.  But the second process went into
a sleep-and-retry-to-connect loop which I simply forgot about.
This loop was initially built into my server to wait until the
"strange wait period" you mentioned above was over or a certain
number of retries had been exceeded.

I guess I can take this ugly work-around out with Python 2.0 and newer,
since the BaseHTTPServer.py shipped with Python 2.0 already contains
an allow_reuse_address = 1 default in the HTTPServer class.

BTW: I took my old W. Richard Stevens "Unix Network Programming"
off the shelf.  After rereading the rather terse paragraph about
SO_REUSEADDR I guess the wait period is necessary to make sure that
there is no connection pending from an outside client on this TCP
port.  I can't find anything about UDP and REUSE.

Regards, Peter



From guido at python.org  Thu Dec 14 17:17:27 2000
From: guido at python.org (Guido van Rossum)
Date: Thu, 14 Dec 2000 11:17:27 -0500
Subject: [Python-Dev] Online help scope
In-Reply-To: Your message of "Wed, 13 Dec 2000 23:29:35 PST."
             <3A3876DF.5554080C@ActiveState.com> 
References: <3A3876DF.5554080C@ActiveState.com> 
Message-ID: <200012141617.LAA16179@cj20424-a.reston1.va.home.com>

> I think Guido and I are pretty far apart on the scope and requirements
> of this online help thing so I'd like some clarification and opinions
> from the peanut gallery.

I started replying but I think Tim's said it all.  Let's do something
dead simple.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Thu Dec 14 18:14:01 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 12:14:01 -0500
Subject: [Python-Dev] [PEP-212] new draft
References: <3A38EC32.210BD1A2@nowonder.de>
Message-ID: <14904.65497.940293.975775@anthem.concentric.net>

>>>>> "PS" == Peter Schneider-Kamp <nowonder at nowonder.de> writes:

    PS> P.S.: Now I have posted it - should I update the Post-History?
    PS> Or is that for posts to c.l.py?

Originally, I'd thought of it as tracking the posting history to
c.l.py.  I'm not sure how useful that header is after all -- maybe
just as a starting point into the python-list archives...

-Barry



From tim.one at home.com  Thu Dec 14 18:33:41 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 12:33:41 -0500
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEOPIDAA.tim.one@home.com>

Note that the behavior of both functions is undefined ("Names listed in a
global statement must not be used in the same code block textually preceding
that global statement", from the Lang Ref, and "if" does not introduce a new
code block in Python's terminology).

But you'll get the same outcome via these trivial variants, which sidestep
that problem:

def spam():
    if (0):
        global a
        print "global a"
    a = 2

def egg():
    if 0:
        global b
        print "global b"
    b = 2

*Now* you can complain <wink>.


> -----Original Message-----
> From: python-dev-admin at python.org [mailto:python-dev-admin at python.org]On
> Behalf Of Fredrik Lundh
> Sent: Thursday, December 14, 2000 8:19 AM
> To: python-dev at python.org
> Subject: [Python-Dev] fuzzy logic?
>
>
> here's a simple (but somewhat strange) test program:
>
> def spam():
>     a = 1
>     if (0):
>         global a
>         print "global a"
>     a = 2
>
> def egg():
>     b = 1
>     if 0:
>         global b
>         print "global b"
>     b = 2
>
> egg()
> spam()
>
> print a
> print b
>
> if I run this under 1.5.2, I get:
>
>     2
>     Traceback (innermost last):
>         File "<stdin>", line 19, in ?
>     NameError: b
>
> </F>
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://www.python.org/mailman/listinfo/python-dev




From tim.one at home.com  Thu Dec 14 19:46:09 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 13:46:09 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule)
In-Reply-To: <3A38A72A.4011B5BD@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPFIDAA.tim.one@home.com>

[MAL]
> About the GPL issue: as I understood Guido's post, RMS still regards
> the choice of law clause as being incompatible to the GPL

Yes.  Actually, I don't know what RMS really thinks -- his public opinions
on legal issues appear to be echoes of what Eben Moglen tells him.  Like his
views or not, Moglen is a tenured law professor.

> (heck, doesn't this guy ever think about international trade terms,
> the United Nations Convention on International Sale of Goods
> or local law in one of the 200+ countries where you could deploy
> GPLed software...

Yes.

> is the GPL only meant for US programmers ?).

No.  Indeed, that's why the GPL is grounded in copyright law, because
copyright law is the most uniform (across countries) body of law we've got.
Most commentary I've seen suggests that the GPL has its *weakest* legal legs
in the US!

> I am currently rewriting my open source licenses as well and among
> other things I chose to integrate a choice of law clause as well.
> Seeing RMS' view of things, I guess that my license will be regarded
> as incompatible to the GPL

Yes.

> which is sad even though I'm in good company... e.g. the Apache
> license, the Zope license, etc. Dual licensing is not possible as
> it would reopen the loopholes in the GPL I tried to fix in my
> license. Any idea on how to proceed ?

You can wait to see how the CNRI license turns out, then copy it if it's
successful; you can approach the FSF directly; you can stop trying to do it
yourself and reuse some license that's already been blessed by the FSF; or
you can give up on GPL compatibility (according to the FSF).  I don't see
any other choices.

> Another issue: since Python doesn't link Python scripts, is it
> still true that if one (pure) Python package is covered by the GPL,
> then all other packages needed by that application will also fall
> under GPL ?

Sorry, couldn't make sense of the question.  Just as well, since you should
ask about it on a GNU forum anyway <wink>.




From mal at lemburg.com  Thu Dec 14 21:02:05 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 21:02:05 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <LNBBLJKPBEHFEDALKOLCIEPFIDAA.tim.one@home.com>
Message-ID: <3A39273D.4AE24920@lemburg.com>

Tim Peters wrote:
> 
> [MAL]
> > About the GPL issue: as I understood Guido's post, RMS still regards
> > the choice of law clause as being incompatible to the GPL
> 
> Yes.  Actually, I don't know what RMS really thinks -- his public opinions
> on legal issues appear to be echoes of what Eben Moglen tells him.  Like his
> views or not, Moglen is a tenured law professor.

But it's his piece of work, isn't it ? He's the one who can change it.
 
> > (heck, doesn't this guy ever think about international trade terms,
> > the United Nations Convention on International Sale of Goods
> > or local law in one of the 200+ countries where you could deploy
> > GPLed software...
> 
> Yes.

Strange, then how come he sees the choice of law clause as a problem:
without explicitly ruling out the applicability of the UN CISG,
this clause is waived by it anyway... at least according to a 
specialist on software law here in Germany.

> > is the GPL only meant for US programmers ?).
> 
> No.  Indeed, that's why the GPL is grounded in copyright law, because
> copyright law is the most uniform (across countries) body of law we've got.
> Most commentary I've seen suggests that the GPL has its *weakest* legal legs
> in the US!

Huh ? Just an example: in Germany customer rights assure a 6 month
warranty on everything you buy or obtain in some other way. Liability
is another issue: there are some very unpleasant laws which render
most of the "no liability" paragraphs in licenses useless in Germany.

Even better: since the license itself is written in English, a
German party could simply consider the license non-binding, since
he or she hasn't agreed to accept contracts in foreign languages.
France has similar interpretations.

> > I am currently rewriting my open source licenses as well and among
> > other things I chose to integrate a choice of law clause as well.
> > Seeing RMS' view of things, I guess that my license will be regarded
> > as incompatible to the GPL
> 
> Yes.
> 
> > which is sad even though I'm in good company... e.g. the Apache
> > license, the Zope license, etc. Dual licensing is not possible as
> it would reopen the loopholes in the GPL I tried to fix in my
> > license. Any idea on how to proceed ?
> 
> You can wait to see how the CNRI license turns out, then copy it if it's
> successful; you can approach the FSF directly; you can stop trying to do it
> yourself and reuse some license that's already been blessed by the FSF; or
> you can give up on GPL compatibility (according to the FSF).  I don't see
> any other choices.

I guess I'll go with the latter.
 
> > Another issue: since Python doesn't link Python scripts, is it
> > still true that if one (pure) Python package is covered by the GPL,
> > then all other packages needed by that application will also fall
> > under GPL ?
> 
> Sorry, couldn't make sense of the question.  Just as well, since you should
> ask about it on a GNU forum anyway <wink>.

Isn't this question (whether the GPL virus applies to byte-code
as well) important to Python programmers as well ?

Oh well, nevermind... it's still nice to hear that CNRI and RMS
have finally made up their minds to render Python GPL-compatible --
whatever this means ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From cgw at fnal.gov  Thu Dec 14 22:06:43 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Thu, 14 Dec 2000 15:06:43 -0600 (CST)
Subject: [Python-Dev] memory leaks
Message-ID: <14905.13923.659879.100243@buffalo.fnal.gov>

The following code (extracted from test_extcall.py) leaks memory:

class Foo:
    def method(self, arg1, arg2):
        return arg1 + arg2

def f():
    err = None
    try:
        Foo.method(*(1, 2, 3))
    except TypeError, err:
        pass
    del err



One-line fix (also posted to Sourceforge):

--- Python/ceval.c	2000/10/30 17:15:19	2.213
+++ Python/ceval.c	2000/12/14 20:54:02
@@ -1905,8 +1905,7 @@
 							 class))) {
 				    PyErr_SetString(PyExc_TypeError,
 	    "unbound method must be called with instance as first argument");
-				    x = NULL;
-				    break;
+				    goto extcall_fail;
 				}
 			    }
 			}



I think that there are a bunch more memory leaks lurking around...
this only fixes one of them.  I'll send more info as I find out what's
going on.




From tim.one at home.com  Thu Dec 14 22:28:09 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 16:28:09 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A39273D.4AE24920@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEPLIDAA.tim.one@home.com>

I'm not going to argue about the GPL.  Take it up with the FSF!  I will say
that if you do get the FSF's attention, Moglen will have an instant counter
to any objection you're likely to raise -- he's been thinking about this for
10 years, and he's heard it all.  And in our experience, RMS won't commit to
anything before running it past Moglen.

[MAL]
> But it's his [RMS's] piece of work, isn't it ? He's the one who can
> change it.

Akin to saying Python is Guido's piece of work.  Yes, no, kinda, more true
at some times than others, ditto respects.  RMS has consistently said that
any changes for the next version of the GPL will take at least a year, due
to extensive legal review required first.  Would be more clearly true to say
that the first version of the GPL was RMS's alone -- but version 2 came out
in 1991.

> ...
> Strange, then how come he sees the choice of law clause as a problem:
> without explicitly ruling out the applicability of the UN CISG,
> this clause is waived by it anyway... at least according to a
> specialist on software law here in Germany.
> ... [and other "who knows?" objections] ...

Guido quoted the text of your Wed, 06 Sep 2000 14:19:09 +0200 "Re:
[License-py20] Re: GPL incompability as seen from Europe" msg to Moglen, who
dismissed it almost offhandedly as "layman's commentary".  You'll have to
ask him why:  MAL, we're not lawyers.  We're incompetent to have this
discussion -- or at least I am, and Moglen thinks you are too <wink>.

>>> Another issue: since Python doesn't link Python scripts, is it
>>> still true that if one (pure) Python package is covered by the GPL,
>>> then all other packages needed by that application will also fall
>>> under GPL ?

[Tim]
>> Sorry, couldn't make sense of the question.  Just as well,
>> since you should ask about it on a GNU forum anyway <wink>.

[MAL]
> Isn't this question (whether the GPL virus applies to byte-code
> as well) important to Python programmers as well ?

I don't know -- like I said, I couldn't make sense of the question, i.e. I
couldn't figure out what it is you're asking.  I *suspect* it's based on a
misunderstanding of the GPL; for example, gcc is a GPL'ed application that
requires stuff from the OS in order to do its job of compiling, but that
doesn't mean that every OS it runs on falls under the GPL.  The GPL contains
no restrictions on *use*, it restricts only copying, modifying and
distributing (the specific rights granted by copyright law).  I don't see
any way to read the GPL as restricting your ability to distribute a GPL'ed
program P on its own, no matter what the status of the packages that P may
rely upon for operation.

The GPL is also not viral in the sense that it cannot infect an unwitting
victim.  Nothing whatsoever you do or don't do can make *any* other program
Q "fall under" the GPL -- only Q's owner can set the license for Q.  The GPL
purportedly can prevent you from distributing (but not from using) a program
that links with a GPL'ed program, but that doesn't appear to be what you're
asking about.  Or is it?

If you were to put, say, mxDateTime, under the GPL, then yes, I believe the
FSF would claim I could not distribute my program T that uses mxDateTime
unless T were also under the GPL or a GPL-compatible license.  But if
mxDateTime is not under the GPL, then nothing I do with T can magically
change the mxDateTime license to the GPL (although if your mxDateTime
license allows me to redistribute mxDateTime under a different license, then
it allows me to ship a copy of mxDateTime under the GPL).

That said, the whole theory of GPL linking is muddy to me, especially since
the word "link" (and its variants) doesn't appear in the GPL.

> Oh well, nevermind... it's still nice to hear that CNRI and RMS
> have finally made up their minds to render Python GPL-compatible --
> whatever this means ;-)

I'm not sure it means anything yet.  CNRI and the FSF believed they reached
agreement before, but that didn't last after Moglen and Kahn each figured
out what the other was really suggesting.




From mal at lemburg.com  Thu Dec 14 23:25:31 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 23:25:31 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <LNBBLJKPBEHFEDALKOLCEEPLIDAA.tim.one@home.com>
Message-ID: <3A3948DB.9165E404@lemburg.com>

Tim Peters wrote:
> 
> I'm not going to argue about the GPL.  Take it up with the FSF! 

Sorry, I got a bit carried away -- I don't want to take it up
with the FSF, simply because I couldn't care less. What's bugging
me is that this one guy is splitting the OSS world in two even 
though both halves actually want the same thing: software which
you can use for free with full source code. I find that a very 
poor situation.

> I will say
> that if you do get the FSF's attention, Moglen will have an instant counter
> to any objection you're likely to raise -- he's been thinking about this for
> 10 years, and he's heard it all.  And in our experience, RMS won't commit to
> anything before running it past Moglen.
> 
> [MAL]
> > But it's his [RMS's] piece of work, isn't it ? He's the one who can
> > change it.
> 
> Akin to saying Python is Guido's piece of work.  Yes, no, kinda, more true
> at some times than others, ditto respects.  RMS has consistently said that
> any changes for the next version of the GPL will take at least a year, due
> to extensive legal review required first.  Would be more clearly true to say
> that the first version of the GPL was RMS's alone -- but version 2 came out
> in 1991.

Point taken.
 
> > ...
> > Strange, then how come he sees the choice of law clause as a problem:
> > without explicitly ruling out the applicability of the UN CISG,
> > this clause is waived by it anyway... at least according to a
> > specialist on software law here in Germany.
> > ... [and other "who knows?" objections] ...
> 
> Guido quoted the text of your Wed, 06 Sep 2000 14:19:09 +0200 "Re:
> [License-py20] Re: GPL incompability as seen from Europe" msg to Moglen, who
> dismissed it almost offhandedly as "layman's commentary".  You'll have to
> ask him why:  MAL, we're not lawyers.  We're incompetent to have this
> discussion -- or at least I am, and Moglen thinks you are too <wink>.

I'm not a lawyer either, but I am able to apply common sense and 
know about German trade laws. Anyway, here a reference which
covers all the controversial subjects. It's in German, but these
guys qualify as lawyers ;-) ...

	http://www.ifross.de/ifross_html/index.html

There's also a book on the subject in German which covers
all aspects of software licensing. Here's the reference in
case anyone cares:

	Jochen Marly, Softwareüberlassungsverträge
	C.H. Beck, München, 2000
 
> >>> Another issue: since Python doesn't link Python scripts, is it
> >>> still true that if one (pure) Python package is covered by the GPL,
> >>> then all other packages needed by that application will also fall
> >>> under GPL ?
> 
> [Tim]
> >> Sorry, couldn't make sense of the question.  Just as well,
> >> since you should ask about it on a GNU forum anyway <wink>.
> 
> [MAL]
> > Isn't this question (whether the GPL virus applies to byte-code
> > as well) important to Python programmers as well ?
> 
> I don't know -- like I said, I couldn't make sense of the question, i.e. I
> couldn't figure out what it is you're asking.  I *suspect* it's based on a
> misunderstanding of the GPL; for example, gcc is a GPL'ed application that
> requires stuff from the OS in order to do its job of compiling, but that
> doesn't mean that every OS it runs on falls under the GPL.  The GPL contains
> no restrictions on *use*, it restricts only copying, modifying and
> distributing (the specific rights granted by copyright law).  I don't see
> any way to read the GPL as restricting your ability to distribute a GPL'ed
> program P on its own, no matter what the status of the packages that P may
> rely upon for operation.

This is very controversial: if an application Q needs a GPLed 
library P to work, then P and Q form a new whole in the sense of
the GPL. And this even though P wasn't even distributed together
with Q. Don't ask me why, but that's how RMS and folks look at it.

It can be argued that the dynamic linker actually integrates
P into Q, but is the same argument valid for a Python program Q
which relies on a GPLed package P ? (The relationship between
Q and P is one of providing interfaces -- there is no call address
patching required for the setup to work.)

> The GPL is also not viral in the sense that it cannot infect an unwitting
> victim.  Nothing whatsoever you do or don't do can make *any* other program
> Q "fall under" the GPL -- only Q's owner can set the license for Q.  The GPL
> purportedly can prevent you from distributing (but not from using) a program
> that links with a GPL'ed program, but that doesn't appear to be what you're
> asking about.  Or is it?

No. What's viral about the GPL is that you can turn an application
into a GPLed one by merely linking the two together -- that's why
e.g. the libc is distributed under the LGPL which doesn't have this
viral property.
 
> If you were to put, say, mxDateTime, under the GPL, then yes, I believe the
> FSF would claim I could not distribute my program T that uses mxDateTime
> unless T were also under the GPL or a GPL-compatible license.  But if
> mxDateTime is not under the GPL, then nothing I do with T can magically
> change the mxDateTime license to the GPL (although if your mxDateTime
> license allows me to redistribute mxDateTime under a different license, then
> it allows me to ship a copy of mxDateTime under the GPL).
>
> That said, the whole theory of GPL linking is muddy to me, especially since
> the word "link" (and its variants) doesn't appear in the GPL.

True.
 
> > Oh well, nevermind... it's still nice to hear that CNRI and RMS
> > have finally made up their minds to render Python GPL-compatible --
> > whatever this means ;-)
> 
> I'm not sure it means anything yet.  CNRI and the FSF believed they reached
> agreement before, but that didn't last after Moglen and Kahn each figured
> out what the other was really suggesting.

Oh boy...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From greg at cosc.canterbury.ac.nz  Fri Dec 15 00:19:09 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Dec 2000 12:19:09 +1300 (NZDT)
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A3948DB.9165E404@lemburg.com>
Message-ID: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz>

"M.-A. Lemburg" <mal at lemburg.com>:
> if an application Q needs a GPLed 
> library P to work, then P and Q form a new whole in the sense of
> the GPL.

I don't see how Q can *need* any particular library P
to work. The most it can need is some library with
an API which is compatible with P's. So I don't
buy that argument.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From greg at cosc.canterbury.ac.nz  Fri Dec 15 00:58:24 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Dec 2000 12:58:24 +1300 (NZDT)
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <3A387005.6725DAAE@ActiveState.com>
Message-ID: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz>

Paul Prescod <paulp at ActiveState.com>:

> We could say that a local can only shadow a global 
> if the local is formally declared.

How do you intend to enforce that? Seems like it would
require a test on every assignment to a local, to make
sure nobody has snuck in a new global since the function
was compiled.

> Actually, one could argue that there is no good reason to 
> even *allow* the shadowing of globals.

If shadowing were completely disallowed, it would make it
impossible to write a completely self-contained function
whose source could be moved from one environment to another
without danger of it breaking. I wouldn't like the language
to have a characteristic like that.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From greg at cosc.canterbury.ac.nz  Fri Dec 15 01:06:12 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Dec 2000 13:06:12 +1300 (NZDT)
Subject: [Python-Dev] Online help scope
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEOBIDAA.tim.one@home.com>
Message-ID: <200012150006.NAA02154@s454.cosc.canterbury.ac.nz>

Tim Peters <tim.one at home.com>:

> [Paul Prescod]

> > Keywords have no docstrings.

> Neither do integers, but they're obvious too <wink>.

Oh, I don't know, it could be useful.

>>> help(2)
The first prime number.

>>> help(2147483647)
sys.maxint, the largest Python small integer.

>>> help(42)
The answer to the ultimate question of life, the universe
and everything. See also: ultimate_question.

>>> help("ultimate_question")
[Importing research.mice.earth]
[Calling earth.find_ultimate_question]
This may take about 10 million years, please be patient...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From barry at digicool.com  Fri Dec 15 01:33:16 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 19:33:16 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <3A3948DB.9165E404@lemburg.com>
	<200012142319.MAA02140@s454.cosc.canterbury.ac.nz>
Message-ID: <14905.26316.407495.981198@anthem.concentric.net>

>>>>> "GE" == Greg Ewing <greg at cosc.canterbury.ac.nz> writes:

    GE> I don't see how Q can *need* any particular library P to
    GE> work. The most it can need is some library with an API which
    GE> is compatible with P's. So I don't buy that argument.

It's been my understanding that the FSF's position on this is as
follows.  If the only functional implementation of the API is GPL'd
software then simply writing your code against that API is tantamount
to linking with that software.  Their reasoning is that the clear
intent of the programmer (shut up, Chad) is to combine the program
with GPL code.  As soon as there is a second, non-GPL implementation
of the API, you're fine because while you may not distribute your
program with the GPL'd software linked in, those who receive your
software wouldn't be forced to combine GPL and non-GPL code.

-Barry



From tim.one at home.com  Fri Dec 15 04:01:36 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 22:01:36 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A3948DB.9165E404@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEAEIEAA.tim.one@home.com>

[MAL]
> Sorry, I got a bit carried away -- I don't want to take it up
> with the FSF, simply because I couldn't care less.

Well, nobody else is able to Pronounce on what the FSF believes or will do.
Which tells me that you're not really interested in playing along with the
FSF here after all -- which we both knew from the start anyway <wink>.

> What's bugging me is that this one guy is splitting the OSS world

There are many people on the FSF bandwagon.  I'm not one of them, but I can
count.

> in two even though both halves actually want the same thing: software
> which you can use for free with full source code. I find that a very
> poor situation.

RMS would not agree that both halves want the same thing; to the contrary,
he's openly contemptuous of the Open Source movement -- which you also knew
from the start.

> [stuff about German law I won't touch with 12-foot schnitzel]

OTOH, a German FSF advocate assured me:

    I also tend to forget that the system of the law works different
    in the US as in Germany. In Germany something that most people
    will believe (called "common grounds") play a role in the court.
    So if you knew, because it is widely known what the GPL means,
    than it is harder to attack that in court.

In the US, when something gets to court it doesn't matter at all what people
believed about it.  Heck, we'll let mass murderers go free if a comma was in
the wrong place in a 1592 statute, or send a kid to jail for life for using
crack cocaine instead of the flavor favored by stockbrokers <wink>.  I hope
the US is unique in that respect, but it does make the GPL weaker here
because even if *everyone* in our country believed the GPL means what RMS
says it means, a US court would give that no weight in its logic-chopping.

>>> Another issue: since Python doesn't link Python scripts, is it
>>> still true that if one (pure) Python package is covered by the GPL,
>>> then all other packages needed by that application will also fall
>>> under GPL ?

> This is very controversial: if an application Q needs a GPLed
> library P to work, then P and Q form a new whole in the sense of
> the GPL. And this even though P wasn't even distributed together
> with Q. Don't ask me why, but that's how RMS and folks look at it.

Understood, but have you reread your question above, which I've said twice I
can't make sense of?  That's not what you were asking about.  Your question
above asks, if anything, the opposite:  the *application* Q is GPL'ed, and
the question above asks whether that means the *Ps* it depends on must also
be GPL'ed.  To the best of my ability, I've answered "NO" to that one, and
"YES" to the question it appears you meant to ask.

> It can be argued that the dynamic linker actually integrates
> P into Q, but is the same argument valid for a Python program Q
> which relies on a GPLed package P ? (The relationship between
> Q and P is one of providing interfaces -- there is no call address
> patching required for the setup to work.)

As before, I believe the FSF will say YES.  Unless there's also a non-GPL'ed
implementation of the same interface that people could use just as well.
See my extended mxDateTime example too.

> ...
> No. What's viral about the GPL is that you can turn an application
> into a GPLed one by merely linking the two together

No, you cannot.  You can link them together all day without any hassle.
What you cannot do is *distribute* it unless the aggregate is first placed
under the GPL (or a GPL-compatible license) too.  If you distribute it
without taking that step, that doesn't turn it into a GPL'ed application
either -- in that case you've simply (& supposedly) violated the license on
P, so your distribution was simply (& supposedly) illegal.  And that is in
fact the end result that people who knowingly use the GPL want (granting
that it appears most people who use the GPL do so unknowing of its
consequences).

> -- that's why e.g. the libc is distributed under the LGPL which
> doesn't have this viral property.

You should read RMS on why glibc is under the LGPL:

    http://www.fsf.org/philosophy/why-not-lgpl.html

It will at least disabuse you of the notion that RMS and you are after the
same thing <wink>.




From paulp at ActiveState.com  Fri Dec 15 05:02:08 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Thu, 14 Dec 2000 20:02:08 -0800
Subject: [Python-Dev] new draft of PEP 227
References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz>
Message-ID: <3A3997C0.F977AF51@ActiveState.com>

Greg Ewing wrote:
> 
> Paul Prescod <paulp at ActiveState.com>:
> 
> > We could say that a local can only shadow a global
> > if the local is formally declared.
> 
> How do you intend to enforce that? Seems like it would
> require a test on every assignment to a local, to make
> sure nobody has snuck in a new global since the function
> was compiled.

I would expect that all of the checks would be at compile-time. Except
for __dict__ hackery, I think it is doable. Python already keeps track
of all assignments to locals and all assignments to globals in a
function scope. The only addition is keeping track of assignments at a
global scope.
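
As a modern illustration (the `symtable` module postdates this thread and is used here only as a sketch), the per-scope bookkeeping Paul mentions is already visible: the compiler classifies a name as local as soon as it sees an assignment to it in that scope.

```python
import symtable

src = """
x = 1
def f():
    x = 2        # assignment makes x local, shadowing the module-level x
    return x
"""
top = symtable.symtable(src, "<example>", "exec")
f_scope = top.get_children()[0]   # the symbol table for f

# The compile-time check Paul proposes could key off exactly this data.
assert f_scope.lookup("x").is_local()
assert f_scope.lookup("x").is_assigned()
```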

> > Actually, one could argue that there is no good reason to
> > even *allow* the shadowing of globals.
> 
> If shadowing were completely disallowed, it would make it
> impossible to write a completely self-contained function
> whose source could be moved from one environment to another
> without danger of it breaking. I wouldn't like the language
> to have a characteristic like that.

That seems like a very esoteric requirement. How often do you have
functions that do not rely *at all* on their environment (other
functions, import statements, global variables)?

When you move code you have to do some rewriting or customizing of the
environment in 94% of the cases. How much effort do you want to spend on
the other 6%? Also, there are tools that are designed to help you move
code without breaking programs (refactoring editors). They can just as
easily handle renaming local variables as adding import statements and
fixing up function calls.

 Paul Prescod



From mal at lemburg.com  Fri Dec 15 11:05:59 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 11:05:59 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz>
Message-ID: <3A39ED07.6B3EE68E@lemburg.com>

Greg Ewing wrote:
> 
> "M.-A. Lemburg" <mal at lemburg.com>:
> > if an application Q needs a GPLed
> > library P to work, then P and Q form a new whole in the sense of
> > the GPL.
> 
> I don't see how Q can *need* any particular library P
> to work. The most it can need is some library with
> an API which is compatible with P's. So I don't
> buy that argument.

It's the view of the FSF, AFAIK. You can't distribute an application
in binary which dynamically links against libreadline (which is GPLed)
on the user's machine, since even though you don't distribute
libreadline, the application running on the user's machine is
considered the "whole" in terms of the GPL.

FWIW, I don't agree with that view either, but that's probably
because I'm a programmer and not a lawyer :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Fri Dec 15 11:25:12 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 11:25:12 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <LNBBLJKPBEHFEDALKOLCKEAEIEAA.tim.one@home.com>
Message-ID: <3A39F188.E366B481@lemburg.com>

Tim Peters wrote:
> 
> [Tim and MAL talking about the FSF and their views]
> 
> [Tim and MAL showing off as hobby advocates ;-)]
> 
> >>> Another issue: since Python doesn't link Python scripts, is it
> >>> still true that if one (pure) Python package is covered by the GPL,
> >>> then all other packages needed by that application will also fall
> >>> under GPL ?
> 
> > This is very controversial: if an application Q needs a GPLed
> > library P to work, then P and Q form a new whole in the sense of
> > the GPL. And this even though P wasn't even distributed together
> > with Q. Don't ask me why, but that's how RMS and folks look at it.
> 
> Understood, but have you reread your question above, which I've said twice I
> can't make sense of? 

I know, it was backwards. 

Take an example: I have a program which
wants to process MP3 files in some way. Now by some stroke
of luck, all Python MP3 modules out there are covered by the GPL.

Now I could write an application which uses a certain interface
and then tell the user to install the MP3 module separately.
As Barry mentioned, this setup will cause distribution of my 
application to be illegal because I could have only done so
by putting the application under the GPL.

> You should read RMS on why glibc is under the LGPL:
> 
>     http://www.fsf.org/philosophy/why-not-lgpl.html
> 
> It will at least disabuse you of the notion that RMS and you are after the
> same thing <wink>.

:-) 

Let's stop this discussion and get back to those cheerful things
like Christmas Bells and Santa Claus... :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From amk at mira.erols.com  Fri Dec 15 14:27:24 2000
From: amk at mira.erols.com (A.M. Kuchling)
Date: Fri, 15 Dec 2000 08:27:24 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
Message-ID: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com>

unicodeobject.c contains this code:

                PyErr_Format(PyExc_ValueError,
                            "unsupported format character '%c' (0x%x) "
                            "at index %i",
                            c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat));

c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits,
so '%\u3000' % 1 results in an error message containing "'\000'
(0x3000)".  Is this worth fixing?  I'd say no, since the hex value is
more useful for Unicode strings anyway.  (I still wanted to mention
this little buglet, since I just touched this bit of code.)
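
The truncation amk describes can be sketched from Python: C's %c keeps only the lowest 8 bits of the Py_UNICODE value, so U+3000 shows up as '\000' in the message.

```python
c = 0x3000            # IDEOGRAPHIC SPACE, the character from the example
low_byte = c & 0xFF   # what the C-level %c actually receives

# The hex part of the message stays useful; the %c part degenerates.
assert low_byte == 0x00
assert chr(low_byte) == "\x00"
```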

--amk




From jack at oratrix.nl  Fri Dec 15 15:26:15 2000
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 15 Dec 2000 15:26:15 +0100
Subject: [Python-Dev] reuse of address default value (was Re: 
 [Python-checkins] CVS: python/dist/src/Lib SocketServer.py)
In-Reply-To: Message by Guido van Rossum <guido@python.org> ,
	     Thu, 14 Dec 2000 09:51:26 -0500 , <200012141451.JAA15637@cj20424-a.reston1.va.home.com> 
Message-ID: <20001215142616.705993B9B44@snelboot.oratrix.nl>

> The reason for the patch is that without this, if you kill a TCP server
> and restart it right away, you'll get a 'port in use" error -- TCP has
> some kind of strange wait period after a connection is closed before
> it can be reused.  The patch avoids this error.

Well, actually there's a pretty good reason for the "port in use" behaviour: 
the TCP standard more-or-less requires it. A srchost/srcport/dsthost/dstport 
combination should not be reused until the maximum TTL has passed, because 
there may still be "old" retransmissions around. Especially the "open" packets 
are potentially dangerous.

Setting the reuse bit while you're debugging is fine, but setting it in 
general is not a very good idea...
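
The patch under discussion boils down to setting SO_REUSEADDR on the listening socket, roughly as below (a sketch, not the SocketServer patch itself): it lets a server rebind its port immediately after a restart instead of waiting out TCP's TIME_WAIT, with the caveats Jack describes.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Allow immediate rebinding after a restart.  Handy while debugging;
# risky as a blanket default because stale retransmissions for the old
# srchost/srcport/dsthost/dstport combination may still be in flight.
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
server.listen(1)
addr = server.getsockname()
server.close()
```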
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 





From guido at python.org  Fri Dec 15 15:31:19 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 09:31:19 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: Your message of "Thu, 14 Dec 2000 20:02:08 PST."
             <3A3997C0.F977AF51@ActiveState.com> 
References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz>  
            <3A3997C0.F977AF51@ActiveState.com> 
Message-ID: <200012151431.JAA19799@cj20424-a.reston1.va.home.com>

> Greg Ewing wrote:
> > 
> > Paul Prescod <paulp at ActiveState.com>:
> > 
> > > We could say that a local can only shadow a global
> > > if the local is formally declared.
> > 
> > How do you intend to enforce that? Seems like it would
> > require a test on every assignment to a local, to make
> > sure nobody has snuck in a new global since the function
> > was compiled.
> 
> I would expect that all of the checks would be at compile-time. Except
> for __dict__ hackery, I think it is doable. Python already keeps track
> of all assignments to locals and all assignments to globals in a
> function scope. The only addition is keeping track of assignments at a
> global scope.
> 
> > > Actually, one could argue that there is no good reason to
> > > even *allow* the shadowing of globals.
> > 
> > If shadowing were completely disallowed, it would make it
> > impossible to write a completely self-contained function
> > whose source could be moved from one environment to another
> > without danger of it breaking. I wouldn't like the language
> > to have a characteristic like that.
> 
> That seems like a very esoteric requirement. How often do you have
> functions that do not rely *at all* on their environment (other
> functions, import statements, global variables).
> 
> When you move code you have to do some rewriting or customizing of the
> environment in 94% of the cases. How much effort do you want to spend on
> the other 6%? Also, there are tools that are designed to help you move
> code without breaking programs (refactoring editors). They can just as
> easily handle renaming local variables as adding import statements and
> fixing up function calls.

Can we cut this out please?  Paul is misguided.  There's no reason to
forbid a local shadowing a global.  All languages with nested scopes
allow this.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Fri Dec 15 17:17:08 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Fri, 15 Dec 2000 11:17:08 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz>
	<3A39ED07.6B3EE68E@lemburg.com>
Message-ID: <14906.17412.221040.895357@anthem.concentric.net>

>>>>> "M" == M  <mal at lemburg.com> writes:

    M> It's the view of the FSF, AFAIK. You can't distribute an
    M> application in binary which dynamically links against
    M> libreadline (which is GPLed) on the user's machine, since even
    M> though you don't distribute libreadline the application running
    M> on the user's machine is considered the "whole" in terms of the
    M> GPL.

    M> FWIW, I don't agree with that view either, but that's probably
    M> because I'm a programmer and not a lawyer :)

I'm not sure I agree with that view either, but mostly because there
is a non-GPL replacement for parts of the readline API:

    http://www.cstr.ed.ac.uk/downloads/editline.html

Don't know anything about it, so it may not be featureful enough for
Python's needs, but if licensing is really a problem, it might be
worth looking into.

-Barry



From paulp at ActiveState.com  Fri Dec 15 17:16:37 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Fri, 15 Dec 2000 08:16:37 -0800
Subject: [Python-Dev] new draft of PEP 227
References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz>  
	            <3A3997C0.F977AF51@ActiveState.com> <200012151431.JAA19799@cj20424-a.reston1.va.home.com>
Message-ID: <3A3A43E5.347AAF6C@ActiveState.com>

Guido van Rossum wrote:
> 
> ...
> 
> Can we cut this out please?  Paul is misguided.  There's no reason to
> forbid a local shadowing a global.  All languages with nested scopes
> allow this.

Python is the only one I know of that implicitly shadows without
requiring some form of declaration. JavaScript has it right: reading and
writing of globals are symmetrical. In the rare case that you explicitly
want to shadow, you need a declaration. Python's rule is confusing,
implicit, and error-prone. In my opinion, of course. If you are
dead-set against explicit declarations then I would say that disallowing
the ambiguous construct is better than silently treating it as a
declaration.
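
The asymmetry is easy to demonstrate: reading a global needs no declaration, but assigning to the same name silently creates a local unless you declare it.

```python
count = 0

def read_it():
    return count          # reading the global: no declaration needed

def bump_wrong():
    count = count + 1     # assignment makes count local -> UnboundLocalError

def bump_right():
    global count          # the explicit declaration
    count = count + 1

assert read_it() == 0
bump_right()
assert count == 1
try:
    bump_wrong()
except UnboundLocalError:
    caught = True
assert caught
```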

 Paul Prescod



From guido at python.org  Fri Dec 15 17:23:07 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 11:23:07 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: Your message of "Fri, 15 Dec 2000 08:16:37 PST."
             <3A3A43E5.347AAF6C@ActiveState.com> 
References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> <3A3997C0.F977AF51@ActiveState.com> <200012151431.JAA19799@cj20424-a.reston1.va.home.com>  
            <3A3A43E5.347AAF6C@ActiveState.com> 
Message-ID: <200012151623.LAA27630@cj20424-a.reston1.va.home.com>

> Python is the only one I know of that implicitly shadows without
> requiring some form of declaration. JavaScript has it right: reading and
> writing of globals are symmetrical. In the rare case that you explicitly
> want to shadow, you need a declaration. Python's rule is confusing,
> implicit and error causing. In my opinion, of course. If you are
> dead-set against explicit declarations then I would say that disallowing
> the ambiguous construct is better than silently treating it as a
> declaration.

Let's agree to differ.  This will never change.  In Python, assignment
is declaration.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Dec 15 18:01:33 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 12:01:33 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: Your message of "Fri, 15 Dec 2000 08:27:24 EST."
             <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> 
References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> 
Message-ID: <200012151701.MAA28058@cj20424-a.reston1.va.home.com>

> unicodeobject.c contains this code:
> 
>                 PyErr_Format(PyExc_ValueError,
>                             "unsupported format character '%c' (0x%x) "
>                             "at index %i",
>                             c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat));
> 
> c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits,
> so '%\u3000' % 1 results in an error message containing "'\000'
> (0x3000)".  Is this worth fixing?  I'd say no, since the hex value is
> more useful for Unicode strings anyway.  (I still wanted to mention
> this little buglet, since I just touched this bit of code.)

Sounds like the '%c' should just be deleted.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From bckfnn at worldonline.dk  Fri Dec 15 18:05:42 2000
From: bckfnn at worldonline.dk (Finn Bock)
Date: Fri, 15 Dec 2000 17:05:42 GMT
Subject: [Python-Dev] CWD in sys.path.
Message-ID: <3a3a480b.28490597@smtp.worldonline.dk>

Hi,

I'm trying to understand the initialization of sys.path and especially
if CWD is supposed to be included in sys.path by default. (I understand
the purpose of sys.path[0], that is not the focus of my question).

My setup is Python2.0 on Win2000, no PYTHONHOME or PYTHONPATH envvars.

In this setup, an empty string exists as sys.path[1], but I'm unsure if
this is by careful design or some freak accident. The empty entry is
added because

  HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.0\PythonPath 

does *not* have any subkey. There is a default value, but that value
appears to be ignored. If I add a subkey "foo":

  HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.0\PythonPath\foo 

with a default value of "d:\foo", the CWD is no longer in sys.path.

i:\java\jython.cvs\org\python\util>d:\Python20\python.exe  -S
Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', 'd:\\foo', 'D:\\PYTHON20\\DLLs', 'D:\\PYTHON20\\lib',
'D:\\PYTHON20\\lib\\plat-win', 'D:\\PYTHON20\\lib\\lib-tk',
'D:\\PYTHON20']
>>>

I noticed that some of the PYTHONPATH macros in PC/config.h include
'.', while others do not.

So, to put it as a question (for jython): Should CWD be included in
sys.path? Are there situations (like embedding) where CWD shouldn't
be in sys.path?
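
A quick, portable probe of the question (a sketch): an empty string entry in sys.path means "search the current working directory at import time", so checking for it beyond position 0 answers whether CWD is implicitly on the path.

```python
import os
import sys

# Entries that make imports search the process's CWD: the empty string
# (and, defensively, an explicit '.').  sys.path[0] is excluded since
# its purpose -- the script's directory -- is not in question here.
cwd_entries = [p for p in sys.path[1:] if p in ("", os.curdir)]
print("CWD searched beyond sys.path[0]:", bool(cwd_entries))
```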

regards,
finn



From guido at python.org  Fri Dec 15 18:12:03 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 12:12:03 -0500
Subject: [Python-Dev] CWD in sys.path.
In-Reply-To: Your message of "Fri, 15 Dec 2000 17:05:42 GMT."
             <3a3a480b.28490597@smtp.worldonline.dk> 
References: <3a3a480b.28490597@smtp.worldonline.dk> 
Message-ID: <200012151712.MAA02544@cj20424-a.reston1.va.home.com>

On Unix, CWD is not in sys.path except as sys.path[0].

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Sat Dec 16 02:43:41 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Sat, 16 Dec 2000 03:43:41 +0200 (IST)
Subject: [Python-Dev] new draft of PEP 227
Message-ID: <20001216014341.5BA97A82E@darjeeling.zadka.site.co.il>

On Fri, 15 Dec 2000 08:16:37 -0800, Paul Prescod <paulp at ActiveState.com> wrote:

> Python is the only one I know of that implicitly shadows without
> requiring some form of declaration.

Perl and Scheme permit implicit shadowing too.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tismer at tismer.com  Fri Dec 15 17:42:18 2000
From: tismer at tismer.com (Christian Tismer)
Date: Fri, 15 Dec 2000 18:42:18 +0200
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL (Splitting up 
 _cursesmodule)
References: <LNBBLJKPBEHFEDALKOLCIEPFIDAA.tim.one@home.com>
Message-ID: <3A3A49EA.5D9418E@tismer.com>


Tim Peters wrote:
...
> > Another issue: since Python doesn't link Python scripts, is it
> > still true that if one (pure) Python package is covered by the GPL,
> > then all other packages needed by that application will also fall
> > under GPL ?
> 
> Sorry, couldn't make sense of the question.  Just as well, since you should
> ask about it on a GNU forum anyway <wink>.

The GNU license is transitive. It automatically extends to other
parts of a project, unless they are identifiable, independent
developments. As soon as a couple of modules are published together,
based upon one GPL-ed module, this propagates. I think this is
what MAL meant?
Anyway, I'd be interested to hear what the GNU forum says.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From amk at mira.erols.com  Fri Dec 15 19:10:34 2000
From: amk at mira.erols.com (A.M. Kuchling)
Date: Fri, 15 Dec 2000 13:10:34 -0500
Subject: [Python-Dev] What to do about PEP 229?
Message-ID: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com>

I began writing the fabled fancy setup script described in PEP 229,
and then realized there was duplication going on here.  The code in
setup.py would need to know what libraries, #defines, &c., are needed
by each module in order to check if they're needed and set them.  But
if Modules/Setup can be used to override setup.py's behaviour, then
much of this information would need to be in that file, too; the
details of compiling a module are in two places. 

Possibilities:

1) Setup contains fully-loaded module descriptions, and the setup
   script drops unneeded bits.  For example, the socket module
   requires -lnsl on some platforms.  The Setup file would contain
   "socket socketmodule.c -lnsl" on all platforms, and setup.py would 
   check for an nsl library and only use it if it's there.

   This seems dodgy to me; what if -ldbm is needed on one platform and
   -lndbm on another?

2) Drop setup completely and just maintain setup.py, with some
   different overriding mechanism.  This is more radical.  Adding a
   new module is then not just a matter of editing a simple text file;
   you'd have to modify setup.py, making it more like maintaining an
   autoconf script.
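
A minimal sketch of option 1's "fully loaded" Setup lines (the function name is hypothetical): setup.py could parse each line into a module description and then drop the libraries that don't exist on the build platform.

```python
def parse_setup_line(line):
    """Toy parser for a Modules/Setup line such as
    'socket socketmodule.c -lnsl' -> (name, sources, libraries).
    A real setup.py would then probe for each library and keep
    only the ones actually present on this platform."""
    parts = line.split()
    name = parts[0]
    sources = [p for p in parts[1:] if p.endswith(".c")]
    libraries = [p[2:] for p in parts[1:] if p.startswith("-l")]
    return name, sources, libraries

name, sources, libs = parse_setup_line("socket socketmodule.c -lnsl")
```

The -ldbm vs. -lndbm objection then becomes: a flat line can list alternatives, but can't say "use exactly one of these".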
  
Remember, the underlying goal of PEP 229 is to have the out-of-the-box
Python installation you get from "./configure;make" contain many more
useful modules; right now you wouldn't get zlib, syslog, resource, any
of the DBM modules, PyExpat, &c.  I'm not wedded to using Distutils to
get that, but think that's the only practical way; witness the hackery
required to get the DB module automatically compiled.  

You can also wave your hands in the direction of packagers such as
ActiveState or Red Hat, and say "let them worry about compiling everything".
But this problem actually inconveniences *me*, since I always build
Python myself and have to extensively edit Setup, so I'd like to fix
the problem.

Thoughts? 

--amk




From nas at arctrix.com  Fri Dec 15 13:03:04 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 15 Dec 2000 04:03:04 -0800
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <14906.17412.221040.895357@anthem.concentric.net>; from barry@digicool.com on Fri, Dec 15, 2000 at 11:17:08AM -0500
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net>
Message-ID: <20001215040304.A22056@glacier.fnational.com>

On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote:
> I'm not sure I agree with that view either, but mostly because there
> is a non-GPL replacement for parts of the readline API:
> 
>     http://www.cstr.ed.ac.uk/downloads/editline.html

It doesn't work with the current readline module.  It is much
smaller than readline and works just as well in my experience.
Would there be any interest in including a copy with the standard
distribution?  The license is quite nice (X11 type).

  Neil



From nas at arctrix.com  Fri Dec 15 13:14:50 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 15 Dec 2000 04:14:50 -0800
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012151509.HAA18093@slayer.i.sourceforge.net>; from gvanrossum@users.sourceforge.net on Fri, Dec 15, 2000 at 07:09:46AM -0800
References: <200012151509.HAA18093@slayer.i.sourceforge.net>
Message-ID: <20001215041450.B22056@glacier.fnational.com>

On Fri, Dec 15, 2000 at 07:09:46AM -0800, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Lib
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv18082
> 
> Modified Files:
> 	httplib.py 
> Log Message:
> Get rid of string functions.

Can you explain the logic behind this recent interest in removing
string functions from the standard library?  Is it performance?
Some unicode issue?  I don't have a great attachment to string.py
but I also don't see the justification for the amount of work it
requires.

  Neil



From guido at python.org  Fri Dec 15 20:29:37 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 14:29:37 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Fri, 15 Dec 2000 04:14:50 PST."
             <20001215041450.B22056@glacier.fnational.com> 
References: <200012151509.HAA18093@slayer.i.sourceforge.net>  
            <20001215041450.B22056@glacier.fnational.com> 
Message-ID: <200012151929.OAA03073@cj20424-a.reston1.va.home.com>

> Can you explain the logic behind this recent interest in removing
> string functions from the standard library?  It it performance?
> Some unicode issue?  I don't have a great attachment to string.py
> but I also don't see the justification for the amount of work it
> requires.

I figure that at *some* point we should start putting our money where
our mouth is, deprecate most uses of the string module, and start
warning about it.  Not in 2.1 probably, given my experience below.

As a realistic test of the warnings module I played with some warnings
about the string module, and then found that most of the std
library modules use it, triggering an extraordinary number of
warnings.  I then decided to experiment with the conversion.  I
quickly found out it's too much work to do manually, so I'll hold off
until someone comes up with a tool that does 99% of the work.
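
A toy sketch of what such a tool might start from (purely illustrative, not an actual converter): rewrite the simplest "string.func(s)" calls into "s.func()" method calls. A real tool needs a parser; regexes will miss or mangle anything fancier, which is exactly why the manual conversion is so much work.

```python
import re

# Matches only the trivial one-argument form: string.<func>(<name>)
_SIMPLE_CALL = re.compile(r"\bstring\.(\w+)\(([A-Za-z_]\w*)\)")

def convert_line(line):
    # "string.strip(s)" -> "s.strip()"
    return _SIMPLE_CALL.sub(r"\2.\1()", line)

converted = convert_line("stripped = string.strip(s)")
```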

(The selection of std library modules to convert manually was
triggered by something pretty random -- I decided to silence a
particular cron job I was running. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From Barrett at stsci.edu  Fri Dec 15 20:32:10 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Fri, 15 Dec 2000 14:32:10 -0500 (EST)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>
References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>
Message-ID: <14906.17712.830224.481130@nem-srvr.stsci.edu>

Guido,

Here are my comments on PEP 207.  (I've also gone back and read most
of the 1998 discussion.  What a tedious, in terms of time, but
enlightening, in terms of content, discussion that was.)

| - New function:
|
|      PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op)
|
|      This performs the requested rich comparison, returning a Python
|      object or raising an exception.  The 3rd argument must be one of
|      LT, LE, EQ, NE, GT or GE.

I'd much prefer '<', '<=', '=', etc. to LT, LE, EQ, etc.


|    Classes
|
|    - Classes can define new special methods __lt__, __le__, __gt__,
|      __ge__, __eq__, __ne__ to override the corresponding operators.
|      (You gotta love the Fortran heritage.)  If a class overrides
|      __cmp__ as well, it is only used by PyObject_Compare().

Likewise, I'd prefer __less__, __lessequal__, __equal__, etc. to
__lt__, __le__, __eq__, etc.  I'm not keen on the FORTRAN derived
symbolism.  I also find it contrary to Python's heritage of being
clear and concise.  I don't mind typing __lessequal__ (or
__less_equal__) once per class for the additional clarity.


|    - Should we even bother upgrading the existing types?

Isn't this question partly related to the coercion issue and which
type of comparison takes precedence?  And if so, then I would think
the answer would be 'yes'.  Or better still see below my suggestion of
adding poor and rich comparison operators along with matrix-type
operators. 


    - If so, how should comparisons on container types be defined?
      Suppose we have a list whose items define rich comparisons.  How
      should the itemwise comparisons be done?  For example:

        def __lt__(a, b): # a<b for lists
            for i in range(min(len(a), len(b))):
                ai, bi = a[i], b[i]
                if ai < bi: return 1
                if ai == bi: continue
                if ai > bi: return 0
                raise TypeError, "incomparable item types"
            return len(a) < len(b)

      This uses the same sequence of comparisons as cmp(), so it may
      as well use cmp() instead:

        def __lt__(a, b): # a<b for lists
            for i in range(min(len(a), len(b))):
                c = cmp(a[i], b[i])
                if c < 0: return 1
                if c == 0: continue
                if c > 0: return 0
                assert 0 # unreachable
            return len(a) < len(b)

      And now there's not really a reason to change lists to rich
      comparisons.

I don't understand this example.  If a[i] and b[i] define rich
comparisons, then 'a[i] < b[i]' is likely to return a non-boolean
value.  Yet the 'if' statement expects a boolean value.  I don't see
how the above example will work.

This example also makes me think that the proposals for new operators
(ie.  PEP 211 and 225) are a good idea.  The discussion of rich
comparisons in 1998 also lends some support to this.  I can see many
uses for two types of comparison operators (as well as the proposed
matrix-type operators), one set for poor or boolean comparisons and
one for rich or non-boolean comparisons.  For example, numeric arrays
can define both.  Rich comparison operators would return an array of
boolean values, while poor comparison operators return a boolean value
by performing an implied 'and.reduce' operation.  These operators
provide clarity and conciseness, without much change to current Python 
behavior.
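The two flavors can be sketched in plain Python even with a single set of
operators -- a hypothetical Vec class (not NumPy) whose __lt__ plays the
"rich" role and a separate method plays the "poor" role with its implied
'and.reduce':

```python
class Vec:
    """Toy numeric array for illustration (not NumPy)."""
    def __init__(self, items):
        self.items = list(items)

    def __lt__(self, other):
        # "rich" comparison: an array (list) of boolean flags
        return [a < b for a, b in zip(self.items, other.items)]

    def all_lt(self, other):
        # "poor" comparison: reduce the elementwise result with 'and'
        return all(a < b for a, b in zip(self.items, other.items))

a = Vec([1, 2, 3])
b = Vec([2, 2, 4])
# a < b        -> [True, False, True]
# a.all_lt(b)  -> False
```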

 -- Paul



From guido at python.org  Fri Dec 15 20:51:04 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 14:51:04 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: Your message of "Fri, 15 Dec 2000 14:32:10 EST."
             <14906.17712.830224.481130@nem-srvr.stsci.edu> 
References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>  
            <14906.17712.830224.481130@nem-srvr.stsci.edu> 
Message-ID: <200012151951.OAA03219@cj20424-a.reston1.va.home.com>

> Here are my comments on PEP 207.  (I've also gone back and read most
> of the 1998 discussion.  What a tedious, in terms of time, but
> enlightening, in terms of content, discussion that was.)
> 
> | - New function:
> |
> |      PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op)
> |
> |      This performs the requested rich comparison, returning a Python
> |      object or raising an exception.  The 3rd argument must be one of
> |      LT, LE, EQ, NE, GT or GE.
> 
> I'd much prefer '<', '<=', '=', etc. to LT, LE, EQ, etc.

This is only at the C level.  Having to do a string compare is too
slow.  Since some of these are multi-character symbols, a character
constant doesn't suffice (multi-character character constants are not
portable).

> |    Classes
> |
> |    - Classes can define new special methods __lt__, __le__, __gt__,
> |      __ge__, __eq__, __ne__ to override the corresponding operators.
> |      (You gotta love the Fortran heritage.)  If a class overrides
> |      __cmp__ as well, it is only used by PyObject_Compare().
> 
> Likewise, I'd prefer __less__, __lessequal__, __equal__, etc. to
> __lt__, __le__, __eq__, etc.  I'm not keen on the FORTRAN derived
> symbolism.  I also find it contrary to Python's heritage of being
> clear and concise.  I don't mind typing __lessequal__ (or
> __less_equal__) once per class for the additional clarity.

I don't care about Fortran, but you just showed why I think the short
operator names are better: there's less guessing or disagreement about
how they are to be spelled.  E.g. should it be __lessthan__ or
__less_than__ or __less__?

> |    - Should we even bother upgrading the existing types?
> 
> Isn't this question partly related to the coercion issue and which
> type of comparison takes precedence?  And if so, then I would think
> the answer would be 'yes'.

It wouldn't make much of a difference -- comparisons between numbers
of different types would get the same outcome either way.

> Or better still see below my suggestion of
> adding poor and rich comparison operators along with matrix-type
> operators. 
> 
> 
>     - If so, how should comparisons on container types be defined?
>       Suppose we have a list whose items define rich comparisons.  How
>       should the itemwise comparisons be done?  For example:
> 
>         def __lt__(a, b): # a<b for lists
>             for i in range(min(len(a), len(b))):
>                 ai, bi = a[i], b[i]
>                 if ai < bi: return 1
>                 if ai == bi: continue
>                 if ai > bi: return 0
>                 raise TypeError, "incomparable item types"
>             return len(a) < len(b)
> 
>       This uses the same sequence of comparisons as cmp(), so it may
>       as well use cmp() instead:
> 
>         def __lt__(a, b): # a<b for lists
>             for i in range(min(len(a), len(b))):
>                 c = cmp(a[i], b[i])
>                 if c < 0: return 1
>                 if c == 0: continue
>                 if c > 0: return 0
>                 assert 0 # unreachable
>             return len(a) < len(b)
> 
>       And now there's not really a reason to change lists to rich
>       comparisons.
> 
> I don't understand this example.  If a[i] and b[i] define rich
> comparisons, then 'a[i] < b[i]' is likely to return a non-boolean
> value.  Yet the 'if' statement expects a boolean value.  I don't see
> how the above example will work.

Sorry.  I was thinking of list items that contain objects that respond
to the new overloading protocol, but still return Boolean outcomes.
My conclusion is that __cmp__ serves just as well.

> This example also makes me think that the proposals for new operators
> (ie.  PEP 211 and 225) are a good idea.  The discussion of rich
> comparisons in 1998 also lends some support to this.  I can see many
> uses for two types of comparison operators (as well as the proposed
> matrix-type operators), one set for poor or boolean comparisons and
> one for rich or non-boolean comparisons.  For example, numeric arrays
> can define both.  Rich comparison operators would return an array of
> boolean values, while poor comparison operators return a boolean value
> by performing an implied 'and.reduce' operation.  These operators
> provide clarity and conciseness, without much change to current Python 
> behavior.

Maybe.  That can still be decided later.  Right now, adding operators
is not on the table for 2.1 (if only because there are two conflicting
PEPs); adding rich comparisons *is* on the table because it doesn't
change the parser (and because the rich comparisons idea was already
pretty much worked out two years ago).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Fri Dec 15 22:08:02 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 15 Dec 2000 16:08:02 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012151929.OAA03073@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com>

[Neil Schemenauer]
> Can you explain the logic behind this recent interest in removing
> string functions from the standard library?  Is it performance?
> Some unicode issue?  I don't have a great attachment to string.py
> but I also don't see the justification for the amount of work it
> requires.

[Guido]
> I figure that at *some* point we should start putting our money where
> our mouth is, deprecate most uses of the string module, and start
> warning about it.  Not in 2.1 probably, given my experience below.

I think this begs Neil's questions:  *is* our mouth there <ahem>, and if so,
why?  The only public notice of impending string module deprecation anyone
came up with was a vague note on the 1.6 web page, and one not repeated in
any of the 2.0 release material.

"string" is right up there with "os" and "sys" as a FIM (Frequently Imported
Module), so the required code changes will be massive.  As a user, I don't
see what's in it for me to endure that pain:  the string module functions
work fine!  Nor are they warts in the language, any more than it's a
wart that we say sin(pi) instead of pi.sin().  Keeping the functions
around doesn't hurt
anybody that I can see.

> As a realistic test of the warnings module I played with some warnings
> about the string module, and then found that most of the std
> library modules use it, triggering an extraordinary number of
> warnings.  I then decided to experiment with the conversion.  I
> quickly found out it's too much work to do manually, so I'll hold off
> until someone comes up with a tool that does 99% of the work.

Ah, so that's the *easy* way to kill this crusade -- forget I said anything
<wink>.




From Barrett at stsci.edu  Fri Dec 15 22:20:20 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Fri, 15 Dec 2000 16:20:20 -0500 (EST)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <200012151951.OAA03219@cj20424-a.reston1.va.home.com>
References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>
	<14906.17712.830224.481130@nem-srvr.stsci.edu>
	<200012151951.OAA03219@cj20424-a.reston1.va.home.com>
Message-ID: <14906.33325.5784.118110@nem-srvr.stsci.edu>

>> This example also makes me think that the proposals for new operators
>> (ie.  PEP 211 and 225) are a good idea.  The discussion of rich
>> comparisons in 1998 also lends some support to this.  I can see many
>> uses for two types of comparison operators (as well as the proposed
>> matrix-type operators), one set for poor or boolean comparisons and
>> one for rich or non-boolean comparisons.  For example, numeric arrays
>> can define both.  Rich comparison operators would return an array of
>> boolean values, while poor comparison operators return a boolean value
>> by performing an implied 'and.reduce' operation.  These operators
>> provide clarity and conciseness, without much change to current Python 
>> behavior.
>
> Maybe.  That can still be decided later.  Right now, adding operators
> is not on the table for 2.1 (if only because there are two conflicting
> PEPs); adding rich comparisons *is* on the table because it doesn't
> change the parser (and because the rich comparisons idea was already
> pretty much worked out two years ago).

Yes, it was worked out previously _assuming_ rich comparisons do not
use any new operators.  

But let's stop for a moment and contemplate adding rich comparisons 
along with new comparison operators.  What do we gain?

1. The current boolean operator behavior does not have to change, and
   hence will be backward compatible.

2. It eliminates the need to decide whether or not rich comparisons
   take precedence over boolean comparisons.

3. The new operators add additional behavior without directly impacting 
   current behavior and the use of them is unambiguous, at least in
   relation to current Python behavior.  You know by the operator what 
   type of comparison will be returned.  This should appease Jim
   Fulton, based on his arguments in 1998 about comparison operators
   always returning a boolean value.

4. Compound objects, such as lists, could implement both rich
   and boolean comparisons.  The boolean comparison would remain as
   is, while the rich comparison would return a list of boolean
   values.  Current behavior doesn't change; just a new feature, which
   you may or may not choose to use, is added.

If we go one step further and add the matrix-style operators along
with the comparison operators, we can provide a consistent user
interface to array/complex operations without changing current Python
behavior.  If a user has no need for these new operators, he doesn't
have to use them or even know about them.  All we've done is made
Python richer without, I believe, making it more complex.  For
example, all element-wise operations could have a ':' appended to
them, e.g. '+:', '<:', etc.; and will define element-wise addition,
element-wise less-than, etc.  The traditional '*', '/', etc. operators
can then be used for matrix operations, which will appease the Matlab
people.

Therefore, I don't think rich comparisons and matrix-type operators
should be considered separable.  I really think you should consider
this suggestion.  It appeases many groups while providing a consistent 
and clear user interface, without greatly impacting current Python
behavior. 

Always-causing-havoc-at-the-last-moment-ly Yours,
Paul

-- 
Dr. Paul Barrett       Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Group
FAX:   410-338-4767    Baltimore, MD 21218



From guido at python.org  Fri Dec 15 22:23:46 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 16:23:46 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Fri, 15 Dec 2000 16:08:02 EST."
             <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com> 
Message-ID: <200012152123.QAA03566@cj20424-a.reston1.va.home.com>

> "string" is right up there with "os" and "sys" as a FIM (Frequently
> Imported Module), so the required code changes will be massive.  As
> a user, I don't see what's in it for me to endure that pain: the
> string module functions work fine!  Nor are they warts in the
> language, any more than it's a wart that we say sin(pi) instead of pi.sin().
> Keeping the functions around doesn't hurt anybody that I can see.

Hm.  I'm not saying that this one will be easy.  But I don't like
having "two ways to do it".  It means more learning, etc. (you know
the drill).  We could have chosen to make the strop module support
Unicode; instead, we chose to give string objects methods and promote
the use of those methods instead of the string module.  (And in a
generous mood, we also supported Unicode in the string module -- by
providing wrappers that invoke string methods.)

If you're saying that we should give users ample time for the
transition, I'm with you.

If you're saying that you think the string module is too prominent to
ever start deprecating its use, I'm afraid we have a problem.

I'd also like to note that using the string module's wrappers incurs
the overhead of a Python function call -- using string methods is
faster.

Finally, I like the look of fields[i].strip().lower() much better than
that of string.lower(string.strip(fields[i])) -- an actual example
from mimetools.py.
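For illustration, the two spellings side by side -- the module form shown
only in a comment, since it is exactly the form being deprecated:

```python
field = "  Text/HTML  "

# module style, read inside-out:  string.lower(string.strip(field))
# method style, read left to right:
cleaned = field.strip().lower()
# cleaned == "text/html"
```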

Ideally, I would like to deprecate the entire string module, so that I
can place a single warning at its top.  This will cause a single
warning to be issued for programs that still use it (no matter how
many times it is imported).  Unfortunately, there are a couple of
things that still need it: string.letters etc., and
string.maketrans().

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gvwilson at nevex.com  Fri Dec 15 22:43:47 2000
From: gvwilson at nevex.com (Greg Wilson)
Date: Fri, 15 Dec 2000 16:43:47 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <14906.33325.5784.118110@nem-srvr.stsci.edu>
Message-ID: <002901c066e0$1b3f13c0$770a0a0a@nevex.com>

Hi, Paul; thanks for your mail.  W.r.t. adding matrix operators to Python,
you may want to take a look at the counter-arguments in PEP 0211 (attached).
Basically, I spoke with the authors of GNU Octave (a GPL'd clone of MATLAB)
about what users really used.  They felt that the only matrix operator that
really mattered was matrix-matrix multiply; other operators (including the
left and right division operators that even experienced MATLAB users often
mix up) were second order at best, and were better handled with methods or
functions.

Thanks,
Greg

p.s. PEP 0225 (also attached) is an alternative to PEP 0211 which would add
most of the MATLAB-ish operators to Python.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pep-0211.txt
URL: <http://mail.python.org/pipermail/python-dev/attachments/20001215/3bf6c282/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pep-0225.txt
URL: <http://mail.python.org/pipermail/python-dev/attachments/20001215/3bf6c282/attachment-0001.txt>

From guido at python.org  Fri Dec 15 22:55:46 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 16:55:46 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: Your message of "Fri, 15 Dec 2000 16:20:20 EST."
             <14906.33325.5784.118110@nem-srvr.stsci.edu> 
References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> <14906.17712.830224.481130@nem-srvr.stsci.edu> <200012151951.OAA03219@cj20424-a.reston1.va.home.com>  
            <14906.33325.5784.118110@nem-srvr.stsci.edu> 
Message-ID: <200012152155.QAA03879@cj20424-a.reston1.va.home.com>

> > Maybe.  That can still be decided later.  Right now, adding operators
> > is not on the table for 2.1 (if only because there are two conflicting
> > PEPs); adding rich comparisons *is* on the table because it doesn't
> > change the parser (and because the rich comparisons idea was already
> > pretty much worked out two years ago).
> 
> Yes, it was worked out previously _assuming_ rich comparisons do not
> use any new operators.  
> 
> But let's stop for a moment and contemplate adding rich comparisons 
> along with new comparison operators.  What do we gain?
> 
> 1. The current boolean operator behavior does not have to change, and
>    hence will be backward compatible.

What incompatibility do you see in the current proposal?

> 2. It eliminates the need to decide whether or not rich comparisons
>    take precedence over boolean comparisons.

Only if you want different semantics -- that's only an issue for NumPy.

> 3. The new operators add additional behavior without directly impacting 
>    current behavior and the use of them is unambiguous, at least in
>    relation to current Python behavior.  You know by the operator what 
>    type of comparison will be returned.  This should appease Jim
>    Fulton, based on his arguments in 1998 about comparison operators
>    always returning a boolean value.

As you know, I'm now pretty close to Jim. :-)  He seemed pretty mellow
about this now.

> 4. Compound objects, such as lists, could implement both rich
>    and boolean comparisons.  The boolean comparison would remain as
>    is, while the rich comparison would return a list of boolean
>    values.  Current behavior doesn't change; just a new feature, which
>    you may or may not choose to use, is added.
> 
> If we go one step further and add the matrix-style operators along
> with the comparison operators, we can provide a consistent user
> interface to array/complex operations without changing current Python
> behavior.  If a user has no need for these new operators, he doesn't
> have to use them or even know about them.  All we've done is made
> Python richer without, I believe, making it more complex.  For
> example, all element-wise operations could have a ':' appended to
> them, e.g. '+:', '<:', etc.; and will define element-wise addition,
> element-wise less-than, etc.  The traditional '*', '/', etc. operators
> can then be used for matrix operations, which will appease the Matlab
> people.
> 
> Therefore, I don't think rich comparisons and matrix-type operators
> should be considered separable.  I really think you should consider
> this suggestion.  It appeases many groups while providing a consistent 
> and clear user interface, without greatly impacting current Python
> behavior. 
> 
> Always-causing-havoc-at-the-last-moment-ly Yours,

I think you misunderstand.  Rich comparisons are mostly about allowing
the separate overloading of <, <=, ==, !=, >, and >=.  This is useful
in its own light.

If you don't want to use this overloading facility for elementwise
comparisons in NumPy, that's fine with me.  Nobody says you have to --
it's just that you *could*.

Read my lips: there won't be *any* new operators in 2.1.

There will be a better way to overload the existing Boolean operators,
and they will be able to return non-Boolean results.  That's useful in
other situations besides NumPy.
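One such situation can be sketched with a toy query builder -- a
hypothetical example (the Field class is invented for illustration) of why
letting == return something other than true/false is useful outside NumPy:

```python
class Field:
    """Toy column object: == builds a condition string instead of testing."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # a non-Boolean result: a description of the comparison itself
        return "%s = %r" % (self.name, other)

cond = Field("user") == "guido"
# cond == "user = 'guido'"
```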

Feel free to lobby for elementwise operators -- but based on the
discussion about this subject so far, I don't give it much of a chance
even past Python 2.1.  They would add a lot of baggage to the language
(e.g. the table of operators in all Python books would be about twice
as long) and by far the most users don't care about them.  (Read the
intro to 211 for some of the concerns -- this PEP tries to make the
addition palatable by adding exactly *one* new operator.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Dec 15 23:16:34 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 17:16:34 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Fri, 08 Dec 2000 17:58:03 EST."
             <200012082258.RAA02389@cj20424-a.reston1.va.home.com> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>  
            <200012082258.RAA02389@cj20424-a.reston1.va.home.com> 
Message-ID: <200012152216.RAA11098@cj20424-a.reston1.va.home.com>

I've checked in the essential parts of the warnings PEP, and closed
the SF patch.  I haven't checked in the examples in the patch -- it's
too early for that.  But I figured that it's easier to revise the code
once it's checked in.  I'm pretty confident that it works as
advertised.

Still missing is documentation: the warnings module, the new API
functions, and the new command line option should all be documented.
I'll work on that over the holidays.

I consider the PEP done.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Fri Dec 15 23:21:24 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:21:24 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com>
Message-ID: <3A3A9964.A6B3DD11@lemburg.com>

Neil Schemenauer wrote:
> 
> On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote:
> > I'm not sure I agree with that view either, but mostly because there
> > is a non-GPL replacement for parts of the readline API:
> >
> >     http://www.cstr.ed.ac.uk/downloads/editline.html
> 
> It doesn't work with the current readline module.  It is much
> smaller than readline and works just as well in my experience.
> Would there be any interest in including a copy with the standard
> distribution?  The license is quite nice (X11 type).

+1 from here -- line editing is simply a very important part of
an interactive prompt and readline is not only big, slow and
full of strange surprises, but also GPLed ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Fri Dec 15 23:24:34 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:24:34 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com>
Message-ID: <3A3A9A22.E9BA9551@lemburg.com>

"A.M. Kuchling" wrote:
> 
> unicodeobject.c contains this code:
> 
>                 PyErr_Format(PyExc_ValueError,
>                             "unsupported format character '%c' (0x%x) "
>                             "at index %i",
>                             c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat));
> 
> c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits,
> so '%\u3000' % 1 results in an error message containing "'\000'
> (0x3000)".  Is this worth fixing?  I'd say no, since the hex value is
> more useful for Unicode strings anyway.  (I still wanted to mention
> this little buglet, since I just touched this bit of code.)

Why would you want to fix it ? Format characters will always
be ASCII and thus 7-bit -- there's really no need to expand the
set of possibilities beyond 8 bits ;-)
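The truncation is easy to check arithmetically: C's %c keeps only the low
8 bits of its int argument, and for U+3000 those bits are all zero, hence
the '\000' in the message:

```python
c = 0x3000           # IDEOGRAPHIC SPACE, the Py_UNICODE value in question
low_byte = c & 0xFF  # what C's %c actually ends up printing
# low_byte == 0, so the message reads "'\000' (0x3000)"
```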

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fdrake at acm.org  Fri Dec 15 23:22:34 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 15 Dec 2000 17:22:34 -0500 (EST)
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012152216.RAA11098@cj20424-a.reston1.va.home.com>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
	<200012082258.RAA02389@cj20424-a.reston1.va.home.com>
	<200012152216.RAA11098@cj20424-a.reston1.va.home.com>
Message-ID: <14906.39338.795843.947683@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > Still missing is documentation: the warnings module, the new API
 > functions, and the new command line option should all be documented.
 > I'll work on that over the holidays.

  I've assigned a bug to you in case you forget.  I've given it a
"show-stopper" priority level, so I'll feel good ripping the code out
if you don't get docs written in time.  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From mal at lemburg.com  Fri Dec 15 23:39:18 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:39:18 +0100
Subject: [Python-Dev] What to do about PEP 229?
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com>
Message-ID: <3A3A9D96.80781D61@lemburg.com>

"A.M. Kuchling" wrote:
> 
> I began writing the fabled fancy setup script described in PEP 229,
> and then realized there was duplication going on here.  The code in
> setup.py would need to know what libraries, #defines, &c., are needed
> by each module in order to check if they're needed and set them.  But
> if Modules/Setup can be used to override setup.py's behaviour, then
> much of this information would need to be in that file, too; the
> details of compiling a module are in two places.
> 
> Possibilities:
> 
> 1) Setup contains fully-loaded module descriptions, and the setup
>    script drops unneeded bits.  For example, the socket module
>    requires -lnsl on some platforms.  The Setup file would contain
>    "socket socketmodule.c -lnsl" on all platforms, and setup.py would
>    check for an nsl library and only use it if it's there.
> 
>    This seems dodgy to me; what if -ldbm is needed on one platform and
>    -lndbm on another?

Can't distutils try both and then settle for the working combination ?

[distutils isn't really ready for auto-configure yet, but Greg
has already provided most of the needed functionality -- it's just
not well integrated into the rest of the build process in version 1.0.1
... BTW, where is Greg? I haven't heard from him in quite a while.]
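The "try both and settle for what works" idea reduces to a small loop.  A
sketch, with a hypothetical link_ok probe standing in for whatever
test-compile machinery distutils would actually use:

```python
def pick_library(candidates, link_ok):
    """Return the first library name that passes the probe, else None.

    `link_ok` is a hypothetical callback that tries to build a tiny
    test program against the named library and reports success.
    """
    for lib in candidates:
        if link_ok(lib):
            return lib
    return None

# e.g. pick_library(["dbm", "ndbm"], link_ok) settles -ldbm vs. -lndbm
```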
 
> 2) Drop setup completely and just maintain setup.py, with some
>    different overriding mechanism.  This is more radical.  Adding a
>    new module is then not just a matter of editing a simple text file;
>    you'd have to modify setup.py, making it more like maintaining an
>    autoconf script.

Why not parse Setup and use it as input to distutils setup.py ?
 
> Remember, the underlying goal of PEP 229 is to have the out-of-the-box
> Python installation you get from "./configure;make" contain many more
> useful modules; right now you wouldn't get zlib, syslog, resource, any
> of the DBM modules, PyExpat, &c.  I'm not wedded to using Distutils to
> get that, but think that's the only practical way; witness the hackery
> required to get the DB module automatically compiled.
> 
> You can also wave your hands in the direction of packagers such as
> ActiveState or Red Hat, and say "let them figure out how to compile
> everything".
> But this problem actually inconveniences *me*, since I always build
> Python myself and have to extensively edit Setup, so I'd like to fix
> the problem.
> 
> Thoughts?

Nice idea :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Fri Dec 15 23:44:15 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:44:15 +0100
Subject: [Python-Dev] Death to string functions!
References: <200012151509.HAA18093@slayer.i.sourceforge.net>  
	            <20001215041450.B22056@glacier.fnational.com> <200012151929.OAA03073@cj20424-a.reston1.va.home.com>
Message-ID: <3A3A9EBF.3F9306B6@lemburg.com>

Guido van Rossum wrote:
> 
> > Can you explain the logic behind this recent interest in removing
> > string functions from the standard library?  Is it performance?
> > Some unicode issue?  I don't have a great attachment to string.py
> > but I also don't see the justification for the amount of work it
> > requires.
> 
> I figure that at *some* point we should start putting our money where
> our mouth is, deprecate most uses of the string module, and start
> warning about it.  Not in 2.1 probably, given my experience below.
> 
> As a realistic test of the warnings module I played with some warnings
> about the string module, and then found that most of the std
> library modules use it, triggering an extraordinary number of
> warnings.  I then decided to experiment with the conversion.  I
> quickly found out it's too much work to do manually, so I'll hold off
> until someone comes up with a tool that does 99% of the work.

This would also help a lot of programmers out there who are
stuck with 100k LOCs of Python code using string.py ;)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Fri Dec 15 23:49:01 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:49:01 +0100
Subject: [Python-Dev] Death to string functions!
References: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com> <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
Message-ID: <3A3A9FDD.E6F021AF@lemburg.com>

Guido van Rossum wrote:
> 
> Ideally, I would like to deprecate the entire string module, so that I
> can place a single warning at its top.  This will cause a single
> warning to be issued for programs that still use it (no matter how
> many times it is imported).  Unfortunately, there are a couple of
> things that still need it: string.letters etc., and
> string.maketrans().

Can't we come up with a module similar to unicodedata[.py] ? 

string.py could then still provide the interfaces, but the
implementation would live in stringdata.py

[Perhaps we won't need stringdata by then... Unicode will have
 taken over and the discussion will be moot ;-)]
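A minimal sketch of what such a module-top warning looks like in practice
(the module body is simulated by a function here so the snippet is
self-contained; the wording of the message is invented):

```python
import warnings

def deprecated_module_top():
    # This line would sit at the top of string.py: it fires on import,
    # and since imports are cached, at most once per interpreter run.
    warnings.warn("the string module is deprecated; use string methods",
                  DeprecationWarning)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    deprecated_module_top()

print(caught[0].category.__name__)  # -> DeprecationWarning
```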

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Fri Dec 15 23:54:25 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 15 Dec 2000 23:54:25 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <20001215040304.A22056@glacier.fnational.com>; from nas@arctrix.com on Fri, Dec 15, 2000 at 04:03:04AM -0800
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com>
Message-ID: <20001215235425.A29681@xs4all.nl>

On Fri, Dec 15, 2000 at 04:03:04AM -0800, Neil Schemenauer wrote:
> On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote:
> > I'm not sure I agree with that view either, but mostly because there
> > is a non-GPL replacement for parts of the readline API:
> > 
> >     http://www.cstr.ed.ac.uk/downloads/editline.html
> 
> It doesn't work with the current readline module.  It is much
> smaller than readline and works just as well in my experience.
> Would there be any interest in including a copy with the standard
> distribution?  The license is quite nice (X11 type).

Definitely +1 from here. Readline reminds me of the cold war, for some
reason. (Actually, multiple reasons ;) I don't have time to do it myself,
unfortunately, or I would. (Looking at editline has been on my TODO list for
a while... :P)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From martin at loewis.home.cs.tu-berlin.de  Sat Dec 16 13:32:30 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 16 Dec 2000 13:32:30 +0100
Subject: [Python-Dev] PEP 226
Message-ID: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de>

I remember earlier discussion on the Python 2.1 release schedule, and
never managed to comment on those.

I believe that Python contributors and maintainers did an enormous
job in releasing Python 2, which took quite some time from everybody's
life. I think it is unrealistic to expect the same amount of
commitment for the next release, especially if that release appears
just a few months after the previous release (that is, one month from
now).

So I'd like to ask the release manager to take that into
account. I'm not quite sure what kind of action I expect; possible
alternatives are:
- declare 2.1 a pure bug fix release only; with a minimal set of new
  features. In particular, don't push for completion of PEPs; everybody
  should then accept that most features that are currently discussed
  will appear in Python 2.2.

- move the schedule for Python 2.1 back (or is it forward?) by, say, a
  few months. This will give people some time to do the things that did
  not get the right amount of attention during the 2.0 release, and will
  still allow work on new and interesting features.

Just my 0.02EUR,

Martin



From guido at python.org  Sat Dec 16 17:38:28 2000
From: guido at python.org (Guido van Rossum)
Date: Sat, 16 Dec 2000 11:38:28 -0500
Subject: [Python-Dev] PEP 226
In-Reply-To: Your message of "Sat, 16 Dec 2000 13:32:30 +0100."
             <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> 
References: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> 
Message-ID: <200012161638.LAA13888@cj20424-a.reston1.va.home.com>

> I remember earlier discussion on the Python 2.1 release schedule, and
> never managed to comment on those.
> 
> I believe that Python contributors and maintainers did an enourmous
> job in releasing Python 2, which took quite some time from everybody's
> life. I think it is unrealistic to expect the same amount of
> commitment for the next release, especially if that release appears
> just a few months after the previous release (that is, one month from
> now).
> 
> So I'd like to ask the release manager to take that into
> account. I'm not quite sure what kind of action I expect; possible
> alternatives are:
> - declare 2.1 a pure bug fix release only; with a minimal set of new
>   features. In particular, don't push for completion of PEPs; everybody
>   should then accept that most features that are currently discussed
>   will appear in Python 2.2.
> 
> - move the schedule for Python 2.1 back (or is it forward?) by, say, a
>   few month. This will people give some time to do the things that did
>   not get the right amount of attention during 2.0 release, and will
>   still allow to work on new and interesting features.
> 
> Just my 0.02EUR,

You're right -- 2.0 (including 1.6) was a monumental effort, and I'm
grateful to all who contributed.

I don't expect that 2.1 will be anywhere near the same amount of work!

Let's look at what's on the table.

0042  Small Feature Requests                 Hylton
 SD  205  pep-0205.txt  Weak References                        Drake
 S   207  pep-0207.txt  Rich Comparisons                       Lemburg, van Rossum
 S   208  pep-0208.txt  Reworking the Coercion Model           Schemenauer
 S   217  pep-0217.txt  Display Hook for Interactive Use       Zadka
 S   222  pep-0222.txt  Web Library Enhancements               Kuchling
 I   226  pep-0226.txt  Python 2.1 Release Schedule            Hylton
 S   227  pep-0227.txt  Statically Nested Scopes               Hylton
 S   230  pep-0230.txt  Warning Framework                      van Rossum
 S   232  pep-0232.txt  Function Attributes                    Warsaw
 S   233  pep-0233.txt  Python Online Help                     Prescod



From guido at python.org  Sat Dec 16 17:46:32 2000
From: guido at python.org (Guido van Rossum)
Date: Sat, 16 Dec 2000 11:46:32 -0500
Subject: [Python-Dev] PEP 226
In-Reply-To: Your message of "Sat, 16 Dec 2000 13:32:30 +0100."
             <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> 
References: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> 
Message-ID: <200012161646.LAA13947@cj20424-a.reston1.va.home.com>

[Oops, I posted a partial edit of this message by mistake before.]

> I remember earlier discussion on the Python 2.1 release schedule, and
> never managed to comment on those.
> 
> I believe that Python contributors and maintainers did an enourmous
> job in releasing Python 2, which took quite some time from everybody's
> life. I think it is unrealistic to expect the same amount of
> commitment for the next release, especially if that release appears
> just a few months after the previous release (that is, one month from
> now).
> 
> So I'd like to ask the release manager to take that into
> account. I'm not quite sure what kind of action I expect; possible
> alternatives are:
> - declare 2.1 a pure bug fix release only; with a minimal set of new
>   features. In particular, don't push for completion of PEPs; everybody
>   should then accept that most features that are currently discussed
>   will appear in Python 2.2.
> 
> - move the schedule for Python 2.1 back (or is it forward?) by, say, a
>   few month. This will people give some time to do the things that did
>   not get the right amount of attention during 2.0 release, and will
>   still allow to work on new and interesting features.
> 
> Just my 0.02EUR,

You're right -- 2.0 (including 1.6) was a monumental effort, and I'm
grateful to all who contributed.

I don't expect that 2.1 will be anywhere near the same amount of work!

Let's look at what's on the table.  These are listed as Active PEPs --
under serious consideration for Python 2.1:

> 0042  Small Feature Requests                 Hylton

We can do some of these or leave them.

> 0205  Weak References                        Drake

This one's open.

> 0207  Rich Comparisons                       Lemburg, van Rossum

This is really not that much work -- I would've done it already if I
weren't distracted by the next one.

> 0208  Reworking the Coercion Model           Schemenauer

Neil has most of this under control.  I don't doubt for a second that
it will be finished.

> 0217  Display Hook for Interactive Use       Zadka

Probably a 20-line fix.
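(For reference, PEP 217 did land in 2.1 as sys.displayhook. A minimal
sketch of the hook from the user's side -- the "result:" format here is
invented for illustration:)

```python
import sys

def show_result(value):
    # Replacement display hook: tag interactive results instead of the
    # bare repr the interpreter prints by default; skip None like the
    # default hook does.
    if value is not None:
        sys.stdout.write("result: %r\n" % (value,))

sys.displayhook = show_result
sys.displayhook(6 * 7)   # the interactive loop calls this for you
```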

> 0222  Web Library Enhancements               Kuchling

Up to Andrew.  If he doesn't get to it, no big deal.

> 0226  Python 2.1 Release Schedule            Hylton

I still think this is realistic -- a release before the conference
seems doable!

> 0227  Statically Nested Scopes               Hylton

This one's got a 50% chance at least.  Jeremy seems motivated to do
it.

> 0230  Warning Framework                      van Rossum

Done except for documentation.

> 0232  Function Attributes                    Warsaw

We need to discuss this more, but it's not much work to implement.

> 0233  Python Online Help                     Prescod

If Paul can control his urge to want to solve everything at once, I
see no reason why this one couldn't find its way into 2.1.

Now, officially the PEP deadline is closed today: the schedule says
"16-Dec-2000: 2.1 PEPs ready for review".  That means that no new PEPs
will be considered for inclusion in 2.1, and PEPs not in the active
list won't be considered either.  But the PEPs in the list above are
all ready for review, even if we don't agree with all of them.

I'm actually more worried about the ever-growing number of bug reports
and submitted patches.  But that's for another time.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Sun Dec 17 01:09:28 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Sat, 16 Dec 2000 19:09:28 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: <3A3A9A22.E9BA9551@lemburg.com>; from mal@lemburg.com on Fri, Dec 15, 2000 at 11:24:34PM +0100
References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> <3A3A9A22.E9BA9551@lemburg.com>
Message-ID: <20001216190928.A6703@kronos.cnri.reston.va.us>

On Fri, Dec 15, 2000 at 11:24:34PM +0100, M.-A. Lemburg wrote:
>Why would you want to fix it ? Format characters will always
>be ASCII and thus 7-bit -- theres really no need to expand the
>set of possibilities beyond 8 bits ;-)

This message is for characters that aren't format characters, which
therefore includes all characters >127.  

--amk




From akuchlin at mems-exchange.org  Sun Dec 17 01:17:39 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Sat, 16 Dec 2000 19:17:39 -0500
Subject: [Python-Dev] What to do about PEP 229?
In-Reply-To: <3A3A9D96.80781D61@lemburg.com>; from mal@lemburg.com on Fri, Dec 15, 2000 at 11:39:18PM +0100
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com>
Message-ID: <20001216191739.B6703@kronos.cnri.reston.va.us>

On Fri, Dec 15, 2000 at 11:39:18PM +0100, M.-A. Lemburg wrote:
>Can't distutils try both and then settle for the working combination ?

I'm worried about subtle problems; what if an unneeded -lfoo drags in
a customized malloc, or has symbols which conflict with some other
library?

>... BTW, where is Greg ? I haven't heard from him in quite a while.]

Still around; he just hasn't been posting much these days.

>Why not parse Setup and use it as input to distutils setup.py ?

That was option 1.  The existing Setup format doesn't really contain
enough intelligence, though; the intelligence is usually in comments
such as "Uncomment the following line for Solaris".  So either the
Setup format is modified (bad, since we'd break existing 3rd-party
packages that still use a Makefile.pre.in), or I give up and just do
everything in a setup.py.

--amk



From guido at python.org  Sun Dec 17 03:38:01 2000
From: guido at python.org (Guido van Rossum)
Date: Sat, 16 Dec 2000 21:38:01 -0500
Subject: [Python-Dev] What to do about PEP 229?
In-Reply-To: Your message of "Sat, 16 Dec 2000 19:17:39 EST."
             <20001216191739.B6703@kronos.cnri.reston.va.us> 
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com>  
            <20001216191739.B6703@kronos.cnri.reston.va.us> 
Message-ID: <200012170238.VAA14466@cj20424-a.reston1.va.home.com>

> >Why not parse Setup and use it as input to distutils setup.py ?
> 
> That was option 1.  The existing Setup format doesn't really contain
> enough intelligence, though; the intelligence is usually in comments
> such as "Uncomment the following line for Solaris".  So either the
> Setup format is modified (bad, since we'd break existing 3rd-party
> packages that still use a Makefile.pre.in), or I give up and just do
> everything in a setup.py.

Forget Setup.  Convert it and be done with it.  There really isn't
enough there to hang on to.  We'll support Setup format (through the
makesetup script and the Misc/Makefile.pre.in file) for 3rd party b/w
compatibility, but we won't need to use it ourselves.  (Too bad for
3rd party documentation that describes the Setup format. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Sun Dec 17 08:34:27 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 17 Dec 2000 02:34:27 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: <20001216190928.A6703@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEEHIEAA.tim.one@home.com>

[MAL]
> Why would you want to fix it ? Format characters will always
> be ASCII and thus 7-bit -- theres really no need to expand the
> set of possibilities beyond 8 bits ;-)

[AMK]
> This message is for characters that aren't format characters, which
> therefore includes all characters >127.

I'm with the wise man who suggested to drop the %c in this case and just
display the hex value.  Although it would be more readable to drop the %c if
and only if the bogus format character isn't printable 7-bit ASCII.  Which
is obvious, yes?  A new if/else isn't going to hurt anything.
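Tim's if/else, sketched in Python terms (the message wording is
illustrative, not the interpreter's actual text):

```python
def bad_format_char_message(ch):
    # Show the offending character itself only when it is printable
    # 7-bit ASCII; otherwise fall back to its hex value, as suggested.
    if 32 <= ch < 127:
        return "unsupported format character '%c' (0x%x)" % (ch, ch)
    return "unsupported format character 0x%x" % ch

print(bad_format_char_message(ord("q")))   # -> unsupported format character 'q' (0x71)
print(bad_format_char_message(0x3042))     # -> unsupported format character 0x3042
```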




From tim.one at home.com  Sun Dec 17 08:57:01 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 17 Dec 2000 02:57:01 -0500
Subject: [Python-Dev] PEP 226
In-Reply-To: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEHIEAA.tim.one@home.com>

[Martin v. Loewis]
> ...
> - move the schedule for Python 2.1 back (or is it forward?) by, say, a
>   few month. This will people give some time to do the things that did
>   not get the right amount of attention during 2.0 release, and will
>   still allow to work on new and interesting features.

Just a stab in the dark, but is one of your real concerns the spotty state
of Unicode support in the std libraries?  If so, nobody working on the PEPs
Guido identified would be likely to work on improving Unicode support even
if the PEPs vanished.  I don't know how Unicode support is going to improve,
but in the absence of visible work in that direction-- or even A Plan to get
some --I doubt we're going to hold up 2.1 waiting for magic.

no-feature-is-ever-done-ly y'rs  - tim




From tim.one at home.com  Sun Dec 17 09:30:24 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 17 Dec 2000 03:30:24 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <3A387D6A.782E6A3B@prescod.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEIIEAA.tim.one@home.com>

[Tim]
>> I've rarely seen problems due to shadowing a global, but have often
>> seen problems due to shadowing a builtin.

[Paul Prescod]
> Really?

Yes.

> I think that there are two different issues here. One is consciously
> choosing to create a new variable but not understanding that there
> already exists a variable by that name. (i.e. str, list).

Yes, and that's what I've often seen, typically long after the original code
is written:  someone sticks in some debugging output, or makes a small
change to the implementation, and introduces e.g.

    str = some_preexisting_var + ":"
    yadda(str)

"Suddenly" the program misbehaves in baffling ways.  They're "baffling"
because the errors do not occur on the lines where the changes were made,
and are almost never related to the programmer's intent when making the
changes.
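A runnable caricature of that failure mode (all names invented): the
shadowing line itself is legal, and the error surfaces later, on an
unrelated line:

```python
def report(items):
    str = ", ".join(items) + ":"          # debugging aid; shadows the builtin
    # ...much later, someone adds a line that needs the real str():
    return str(len(items)) + " " + str    # TypeError here, not above

try:
    report(["a", "b"])
except TypeError as exc:
    print("blows up far from the edit:", exc)
```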

> Another is trying to assign to a global but actually shadowing it.

I've rarely seen that.

> There is no way that anyone coming from another language is going
> to consider this transcript reasonable:

True, but I don't really care:  everyone gets burned once, the better ones
eventually learn to use classes instead of mutating globals, and even the
dull get over it.  It is not, in my experience, an on-going problem for
anyone.  But I still get burned regularly by shadowing builtins.  The burns
are not fatal, however, and I can't think of an ointment less painful than
the blisters.

> >>> a=5
> >>> def show():
> ...    print a
> ...
> >>> def set(val):
> ...     a=val
> ...
> >>> a
> 5
> >>> show()
> 5
> >>> set(10)
> >>> show()
> 5
>
> It doesn't seem to make any sense. My solution is to make the assignment
> in "set" illegal unless you add a declaration that says: "No, really. I
> mean it. Override that sucker." As the PEP points out, overriding is
> seldom a good idea so the requirement to declare would be rarely
> invoked.

I expect it would do less harm to introduce a compile-time warning for
locals that are never referenced (such as the "a" in "set").

> ...
> The "right answer" in terms of namespace theory is to consistently refer
> to builtins with a prefix (whether "__builtins__" or "$") but that's
> pretty unpalatable from an aesthetic point of view.

Right, that's one of the ointments I won't apply to my own code, so wouldn't
think of asking others to either.

WRT mutable globals, people who feel they have to use them would be well
served to adopt a naming convention.  For example, begin each name with "g"
and capitalize the second letter.  This can make global-rich code much
easier to follow (I've done-- and very happily --similar things in
Javascript and C++).




From pf at artcom-gmbh.de  Sun Dec 17 10:59:11 2000
From: pf at artcom-gmbh.de (Peter Funk)
Date: Sun, 17 Dec 2000 10:59:11 +0100 (MET)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 15, 2000  4:23:46 pm"
Message-ID: <m147ab2-000CxUC@artcom0.artcom-gmbh.de>

Hi,

Guido van Rossum:
> If you're saying that you think the string module is too prominent to
> ever start deprecating its use, I'm afraid we have a problem.
 
I strongly believe the string module is too prominent.

> I'd also like to note that using the string module's wrappers incurs
> the overhead of a Python function call -- using string methods is
> faster.

I think most people care more about readability than about run-time
performance.  For people without much OOP experience, the method
syntax hurts readability.

> Finally, I like the look of fields[i].strip().lower() much better than
> that of string.lower(string.strip(fields[i])) -- an actual example
> from mimetools.py.

Hmmmm.... Maybe this is just a matter of taste?  Like my preference
for '<>' instead of '!='?  Personally I still like the old-fashioned
form more, especially if string.join() or string.split() are
involved.

Since Python 1.5.2 will stay around for several years, keeping backward
compatibility in our Python coding is still a major issue for us.
So we won't change our Python coding style soon, if ever.

> Ideally, I would like to deprecate the entire string module, so that I
[...]
I share Mark Lutz's and Tim Peters' opinion that this crusade will do
more harm than good to the Python community.  IMO this is a really bad idea.

Just my $0.02, Peter



From martin at loewis.home.cs.tu-berlin.de  Sun Dec 17 12:13:09 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 17 Dec 2000 12:13:09 +0100
Subject: [Python-Dev] PEP 226
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEEHIEAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCOEEHIEAA.tim.one@home.com>
Message-ID: <200012171113.MAA00733@loewis.home.cs.tu-berlin.de>

> Just a stab in the dark, but is one of your real concerns the spotty state
> of Unicode support in the std libraries?  

Not at all. I really responded to amk's message

# All the PEPs for 2.1 are supposed to be complete for Dec. 16, and
# some of those PEPs are pretty complicated.  I'm a bit worried that
# it's been so quiet on python-dev lately, especially after the
# previous two weeks of lively discussion.

I just thought that something was wrong here - contributing to a free
software project ought to be fun for contributors, not a cause for
worries.

There-are-other-things-but-i18n-although-they-are-not-that-interesting y'rs,
Martin



From guido at python.org  Sun Dec 17 15:38:07 2000
From: guido at python.org (Guido van Rossum)
Date: Sun, 17 Dec 2000 09:38:07 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: Your message of "Sun, 17 Dec 2000 03:30:24 EST."
             <LNBBLJKPBEHFEDALKOLCKEEIIEAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCKEEIIEAA.tim.one@home.com> 
Message-ID: <200012171438.JAA21603@cj20424-a.reston1.va.home.com>

> I expect it would do less harm to introduce a compile-time warning for
> locals that are never referenced (such as the "a" in "set").

Another warning that would be quite useful (and trap similar cases)
would be "local variable used before set".
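A minimal sketch of the case such a warning would flag; today the
runtime only catches it dynamically, as an UnboundLocalError:

```python
def set_and_show():
    print(total)      # 'total' is local (assigned below) but not yet bound
    total = 1

try:
    set_and_show()
except UnboundLocalError as exc:   # the dynamic diagnosis of "used before set"
    print("caught:", exc)
```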

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Sun Dec 17 15:40:40 2000
From: guido at python.org (Guido van Rossum)
Date: Sun, 17 Dec 2000 09:40:40 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Sun, 17 Dec 2000 10:59:11 +0100."
             <m147ab2-000CxUC@artcom0.artcom-gmbh.de> 
References: <m147ab2-000CxUC@artcom0.artcom-gmbh.de> 
Message-ID: <200012171440.JAA21620@cj20424-a.reston1.va.home.com>

> I think most care more about readbility than about run time performance.
> For people without much OOP experience, the method syntax hurts
> readability.

I don't believe one bit of this.  By that standard, we would do better
to define a new module "list" and start writing list.append(L, x) for
L.append(x).
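(The analogy is quite literal: the method call and the hypothetical
module-style call name the same underlying function, as a quick sketch
shows:)

```python
L = []
list.append(L, "x")   # the "list.append(L, x)" spelling actually works
L.append("y")         # ...and is exactly the ordinary method call
print(L)              # -> ['x', 'y']
```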

> I share Mark Lutz and Tim Peters oppinion, that this crusade will do 
> more harm than good to Python community.  IMO this is a really bad
> idea.

You are entitled to your opinion, but given that your arguments seem
very weak I will continue to ignore it (except to argue with you :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Sun Dec 17 17:17:12 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Sun, 17 Dec 2000 11:17:12 -0500
Subject: [Python-Dev] Death to string functions!
References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
	<m147ab2-000CxUC@artcom0.artcom-gmbh.de>
Message-ID: <14908.59144.321167.419762@anthem.concentric.net>

>>>>> "PF" == Peter Funk <pf at artcom-gmbh.de> writes:

    PF> Hmmmm.... May be this is just a matter of taste?  Like my
    PF> preference for '<>' instead of '!='?  Personally I still like
    PF> the old fashinoned form more.  Especially, if string.join() or
    PF> string.split() are involved.

Hey cool!  I prefer <> over != too, but I also (not surprisingly)
strongly prefer string methods over string module functions.

TOOWTDI-MA-ly y'rs,
-Barry



From gvwilson at nevex.com  Sun Dec 17 17:25:17 2000
From: gvwilson at nevex.com (Greg Wilson)
Date: Sun, 17 Dec 2000 11:25:17 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <14908.59144.321167.419762@anthem.concentric.net>
Message-ID: <000201c06845$f1afdb40$770a0a0a@nevex.com>

+1 on deprecating string functions.  Every Python book and tutorial
(including mine) emphasizes Python's simplicity and lack of Perl-ish
redundancy; the more we practice what we preach, the more persuasive
this argument is.

Greg (who admittedly only has a few thousand lines of Python to maintain)



From pf at artcom-gmbh.de  Sun Dec 17 18:40:06 2000
From: pf at artcom-gmbh.de (Peter Funk)
Date: Sun, 17 Dec 2000 18:40:06 +0100 (MET)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012171440.JAA21620@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 17, 2000  9:40:40 am"
Message-ID: <m147hn4-000CxUC@artcom0.artcom-gmbh.de>

[string.function(S, ...) vs. S.method(...)]

Guido van Rossum:
> I don't believe one bit of this.  By that standard, we would do better
> to define a new module "list" and start writing list.append(L, x) for
> L.append(x).

List objects have only a few methods; strings have many, and
some of them have names that clash easily with the method names
of other kinds of objects.  Since there are no type declarations
in Python, looking at the code in isolation and seeing a line
	i = string.index(some_parameter, substr)
tells at first glance that some_parameter should be a string
object, even if the doc string of this function is too terse.
However, in
	i = some_parameter.index(substr)
it could be a list, a database or whatever.

> You are entitled to your opinion, but given that your arguments seem
> very weak I will continue to ignore it (except to argue with you :-).

I see.  But given that the string module won't go away any time
soon, I guess I have a lot of time either to think of some
stronger arguments or to finally get accustomed to that new
style of coding.  But since we have to keep compatibility with
Python 1.5.2 for at least the next two years, chances for the
latter are bad.

Regards and have a nice vacation, Peter



From mwh21 at cam.ac.uk  Sun Dec 17 19:18:24 2000
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 17 Dec 2000 18:18:24 +0000
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: Thomas Wouters's message of "Fri, 15 Dec 2000 23:54:25 +0100"
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> <20001215235425.A29681@xs4all.nl>
Message-ID: <m3hf42q5cf.fsf@atrus.jesus.cam.ac.uk>

Thomas Wouters <thomas at xs4all.net> writes:

> On Fri, Dec 15, 2000 at 04:03:04AM -0800, Neil Schemenauer wrote:
> > On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote:
> > > I'm not sure I agree with that view either, but mostly because there
> > > is a non-GPL replacement for parts of the readline API:
> > > 
> > >     http://www.cstr.ed.ac.uk/downloads/editline.html
> > 
> > It doesn't work with the current readline module.  It is much
> > smaller than readline and works just as well in my experience.
> > Would there be any interest in including a copy with the standard
> > distribution?  The license is quite nice (X11 type).
> 
> Definately +1 from here. Readline reminds me of the cold war, for
> some reason. (Actually, multiple reasons ;) I don't have time to do
> it myself, unfortunately, or I would. (Looking at editline has been
> on my TODO list for a while... :P)

It wouldn't be particularly hard to rewrite editline in Python (we
have termios & the terminal handling functions in curses - and even
ioctl if we get really keen).

I've been hacking on my own Python line reader on and off for a while;
it's still pretty buggy, but if you're feeling brave you could look at:

http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.0.0.tar.gz

To try it out, unpack it, cd into the ./pyrl directory and try:

>>> import foo # sorry
>>> foo.test_loop()

It sort of imitates the Python command prompt, except that it doesn't
actually execute the code you type.

You need a recent _cursesmodule.c for it to work.

Cheers,
M.

-- 
41. Some programming languages manage to absorb change, but 
    withstand progress.
  -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html




From thomas at xs4all.net  Sun Dec 17 19:30:38 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 17 Dec 2000 19:30:38 +0100
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <000201c06845$f1afdb40$770a0a0a@nevex.com>; from gvwilson@nevex.com on Sun, Dec 17, 2000 at 11:25:17AM -0500
References: <14908.59144.321167.419762@anthem.concentric.net> <000201c06845$f1afdb40$770a0a0a@nevex.com>
Message-ID: <20001217193038.C29681@xs4all.nl>

On Sun, Dec 17, 2000 at 11:25:17AM -0500, Greg Wilson wrote:

> +1 on deprecating string functions.

How wonderfully ambiguous ! Do you mean string methods, or the string module?
:)

FWIW, I agree that in time, the string module should be deprecated. But I
also think that 'in time' should be a considerable timespan. Don't deprecate
it before everything it provides is available through some other means. Wait
a bit longer than that, even, before calling it deprecated -- that scares
people off. And then keep it for practically forever (until Py3K) just to
support old code. And don't forget to document it as 'deprecated' everywhere,
not just in one minor release note.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tismer at tismer.com  Sun Dec 17 18:38:31 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 17 Dec 2000 19:38:31 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com>
Message-ID: <3A3CFA17.ED26F51A@tismer.com>

Old topic: {}.popitem() (was Re: {}.first[key,value,item] ...)

Christian Tismer wrote:
> 
> Fredrik Lundh wrote:
> >
> > christian wrote:
> > > That algorithm is really a gem which you should know,
> > > so let me try to explain it.
> >
> > I think someone just won the "brain exploder 2000" award ;-)

<snip>

> As you might have guessed, I didn't do this just for fun.
> It is the old game of explaining what is there, convincing
> everybody that you at least know what you are talking about,
> and then three days later coming up with an improved
> application of the theory.
> 
> Today is Monday, 2 days left. :-)

Ok, today is Sunday, I had no time to finish this.
But now it is here.

                  ===========================
                  =====    Claim:       =====
                  ===========================

-  Dictionary access time can be improved with a minimal change -

On the hash() function:
All objects are supposed to provide a hash function which is
as good as possible.
Good means that it spreads different keys over a wide range of
different hash values.

Problem: There are hash functions which are "good" in this sense,
but they do not spread their randomness uniformly over the
32 bits.

Example: Integers use their own value as hash.
This is ok, as far as the integers are uniformly distributed.
But if they all contain a high power of two, for instance,
the low bits give a very bad hash function.

Take a dictionary with the integers range(1000) as keys and access
all entries. Then use a dictionary with the same integers shifted
left by 16.
Access time is slowed down by a factor of 100, since every
access is now a linear search.
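A tiny sketch (modern Python, with a hypothetical 8-slot table) shows the effect: integers hash to themselves, so keys sharing a high power-of-two factor are identical in the low bits the mask keeps, and every key lands in the same slot.

```python
# Hypothetical 8-slot hash table: the initial index is hash(key) & mask.
mask = 8 - 1

plain_slots = [i & mask for i in range(8)]            # one key per slot
shifted_slots = [(i << 16) & mask for i in range(8)]  # all keys in slot 0

print(plain_slots)    # -> [0, 1, 2, 3, 4, 5, 6, 7]
print(shifted_slots)  # -> [0, 0, 0, 0, 0, 0, 0, 0]
```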

This is not an urgent problem, although applications exist
where this can play a role (memory addresses for instance
can have high factors of two when people do statistics on
page accesses...)

While this is not a big problem, it is ugly enough to
think of a solution.

Solution 1:
-------------
Try to involve more bits of the hash value by doing extra
shuffling, either 
a) in the dictlook function, or
b) in the hash generation itself.

I believe neither can really be justified for a rare problem.
But how about changing the existing solution in a way that
an improvement is gained without extra cost?

Solution 2: (*the* solution)
----------------------------
Some people may remember what I wrote about re-hashing
functions through the multiplicative group GF(2^n)*,
and I don't want to repeat this here.
The simple idea can be summarized quickly:

The original algorithm uses multiplication by polynomials,
and it is guaranteed that these re-hash values are jittering
through all possible nonzero patterns of the n bits.

Observation: We are using an operation in a finite field.
This means that the inverse of multiplication also exists!

Old algorithm (multiplication):
      shift the index left by 1
      if index > mask:
          xor the index with the generator polynomial

New algorithm (division):
      if low bit of index set:
          xor the index with the generator polynomial
      shift the index right by 1

What does this mean? Not so much, we are just cycling through
our bit patterns in reverse order.
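A small sketch (modern Python, using the GF(2^5) polynomial 32 + 5 from the table in the attached script) confirms this: both steps cycle through all 31 nonzero 5-bit patterns, the division walking the same cycle backwards.

```python
POLY, MASK = 32 + 5, 31  # generator polynomial and mask for a size-32 table

def mul_step(x):
    # old algorithm: multiply by x in GF(2^5)
    x <<= 1
    if x > MASK:
        x ^= POLY
    return x

def div_step(x):
    # new algorithm: divide by x in GF(2^5)
    if x & 1:
        x ^= POLY
    return x >> 1

def cycle(step):
    x, seen = 1, []
    for _ in range(31):
        seen.append(x)
        x = step(x)
    return seen

m, d = cycle(mul_step), cycle(div_step)
assert sorted(m) == sorted(d) == list(range(1, 32))  # every nonzero pattern once
assert d == [m[0]] + m[:0:-1]                        # same cycle, reverse order
```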

But now for the big difference.

First change:    We change from multiplication to division.
Second change:   We do not mask the hash value before!

The second change is what I was after: By not masking the
hash value when computing the initial index, all the
existing bits in the hash come into play.

This can be seen like a polynomial division, but the initial
remainder (the hash value) was not normalized. After a number
of steps, all the extra bits are wheeled into our index,
but not wasted by masking them off. That gives our re-hash
some more randomness. When all the extra bits are sucked
in, the guaranteed single-visit cycle begins. There cannot
be more than 27 extra cycles in the worst case (dict size
= 32, so there are 27 bits to consume).

I do not expect any bad effect from this modification.

Here are some results; dictionaries have 1000 entries:

timing for strings              old=  5.097 new= 5.088
timing for bad integers (<<10)  old=101.540 new=12.610
timing for bad integers (<<16)  old=571.210 new=19.220

On strings, both algorithms behave the same.
On numbers, they differ dramatically.
While the current algorithm is 110 times slower on a worst-case
dict (quadratic behavior), the new algorithm pays a little for the
extra cycles, but is only 4 times slower.

Alternative implementation:
The above approach is conservative in the sense that it
tries not to slow down the current implementation in any
way. An alternative would be to consume all of the extra
bits at once. But this would add an extra "warmup" loop
like this to the algorithm:

    while index > mask:
        if low bit of index set:
            xor the index with the generator polynomial
        shift the index right by 1

This is of course a very good digest of the higher bits,
since it is a polynomial division and not just some
bit xor-ing, which might give quite predictable cancellations;
it is therefore "the right way" in my sense.
It might be cheap, but it would add over 20 cycles to every
small dict. I therefore don't think it is worth doing.

Personally, I prefer the solution to merge the bits during
the actual lookup, since it suffices to get access time
from quadratic down to logarithmic.
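For reference, the proposed warmup pass is easy to express as a function (a sketch in modern Python; `poly` and `mask` as in the attached script):

```python
def warmup(hash_value, mask, poly):
    """Fold all high bits of hash_value into an index <= mask
    by repeated polynomial division steps over GF(2)."""
    index = hash_value
    while index > mask:
        if index & 1:
            index ^= poly
        index >>= 1
    return index

# e.g. with dict size 32 (mask 31, polynomial 32 + 5):
assert 0 <= warmup(12345 << 16, 31, 37) <= 31
assert warmup(7, 31, 37) == 7   # small hashes pass through unchanged
```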

Attached is a direct translation of the relevant parts
of dictobject.c into Python, with both algorithms
implemented.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]


class NULL: pass

class Dictionary:
    dummy = "<dummy key>"
    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # even the shifting may not be worth it.
            incr = _hash ^ (_hash >> 3)
###### TO HERE
            
        if (not incr):
            incr = mask
        while 1:
            ep = ep0[(i+incr)&mask]
            if (ep[me_key] is NULL) :
                if (freeslot != NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                 (ep[me_hash] == _hash and
                  cmp(ep[me_key], key) == 0)) :
                return ep

            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots

        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))      

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots

        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
##      /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def timing(func, args=None, n=1, **keywords) :
	import time
	time=time.time
	appl=apply
	if args is None: args = ()
	if type(args) != type(()) : args=(args,)
	rep=range(n)
	dummyarg = ("",)
	dummykw = {}
	dummyfunc = len
	if keywords:
		before=time()
		for i in rep: res=appl(dummyfunc, dummyarg, dummykw)
		empty = time()-before
		before=time()
		for i in rep: res=appl(func, args, keywords)
	else:
		before=time()
		for i in rep: res=appl(dummyfunc, dummyarg)
		empty = time()-before
		before=time()
		for i in rep: res=appl(func, args)
	after = time()
	return round(after-before-empty,4), res

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts():
	d1 = Dictionary()   # original
	d2 = Dictionary(1)  # other rehash
	for i in range(1000):
		s = str(i) * 5
		d1[s] = d2[s] = i
	return d1, d2

def badnum_dicts():
	d1 = Dictionary()   # original
	d2 = Dictionary(1)  # other rehash
	for i in range(1000):
		bad = i << 16
		d1[bad] = d2[bad] = i
	return d1, d2

def do_test(dict, keys, n):
	t0 = timing(nulltest, (keys, dict), n)[0]
	t1 = timing(test, (keys, dict), n)[0]
	return t1-t0

if __name__ == "__main__":
	sdold, sdnew = string_dicts()
	bdold, bdnew = badnum_dicts()
	print "timing for strings old=%.3f new=%.3f" % (
		  do_test(sdold, sdold.keys(), 100),
		  do_test(sdnew, sdnew.keys(), 100) )
	print "timing for bad integers old=%.3f new=%.3f" % (
		  do_test(bdold, bdold.keys(), 10) *10,
		  do_test(bdnew, bdnew.keys(), 10) *10)

		  
"""
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610
"""

From fdrake at acm.org  Sun Dec 17 19:49:58 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sun, 17 Dec 2000 13:49:58 -0500 (EST)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <20001217193038.C29681@xs4all.nl>
References: <14908.59144.321167.419762@anthem.concentric.net>
	<000201c06845$f1afdb40$770a0a0a@nevex.com>
	<20001217193038.C29681@xs4all.nl>
Message-ID: <14909.2774.158973.760077@cj42289-a.reston1.va.home.com>

Thomas Wouters writes:
 > FWIW, I agree that in time, the string module should be deprecated. But I
 > also think that 'in time' should be a considerable timespan. Don't deprecate

  *If* most functions in the string module are going to be deprecated,
that should be done *now*, so that the documentation will include the
appropriate warning to users.  When they should actually be removed is
another matter, and I think Guido is sufficiently aware of their
widespread use and won't remove them too quickly -- his creation of
Python isn't the reason he's *accepted* as BDFL, it just made it a
possibility.  He's had to actually *earn* the BDFL position, I think.
  With regard to converting the standard library to string methods:
that needs to be done as part of the deprecation.  The code in the
library is commonly used as example code, and should be good example
code wherever possible.

 > support old code. And don't forget to document it 'deprecated' everywhere,
 > not just one minor release note.

  When Guido tells me exactly what is deprecated, the documentation
will be updated with proper deprecation notices in the appropriate
places.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tismer at tismer.com  Sun Dec 17 19:10:07 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 17 Dec 2000 20:10:07 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com>
Message-ID: <3A3D017F.62AD599F@tismer.com>



Christian Tismer wrote:

...
(my timings)
Attached is the updated script with the timings mentioned
in the last posting. Sorry, I posted an older version before.

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]


class NULL: pass

class Dictionary:
    dummy = "<dummy key>"
    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # even the shifting may not be worth it.
            incr = _hash ^ (_hash >> 3)
###### TO HERE
            
        if (not incr):
            incr = mask
        while 1:
            ep = ep0[(i+incr)&mask]
            if (ep[me_key] is NULL) :
                if (freeslot != NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                 (ep[me_hash] == _hash and
                  cmp(ep[me_key], key) == 0)) :
                return ep

            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots

        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))      

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots

        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
##      /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def timing(func, args=None, n=1, **keywords) :
	import time
	time=time.time
	appl=apply
	if args is None: args = ()
	if type(args) != type(()) : args=(args,)
	rep=range(n)
	dummyarg = ("",)
	dummykw = {}
	dummyfunc = len
	if keywords:
		before=time()
		for i in rep: res=appl(dummyfunc, dummyarg, dummykw)
		empty = time()-before
		before=time()
		for i in rep: res=appl(func, args, keywords)
	else:
		before=time()
		for i in rep: res=appl(dummyfunc, dummyarg)
		empty = time()-before
		before=time()
		for i in rep: res=appl(func, args)
	after = time()
	return round(after-before-empty,4), res

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts():
	d1 = Dictionary()   # original
	d2 = Dictionary(1)  # other rehash
	for i in range(1000):
		s = str(i) * 5
		d1[s] = d2[s] = i
	return d1, d2

def badnum_dicts():
	d1 = Dictionary()   # original
	d2 = Dictionary(1)  # other rehash
	shift = 10
	if EXTREME:
		shift = 16
	for i in range(1000):
		bad = i << shift
		d1[bad] = d2[bad] = i
	return d1, d2

def do_test(dict, keys, n):
	t0 = timing(nulltest, (keys, dict), n)[0]
	t1 = timing(test, (keys, dict), n)[0]
	return t1-t0

EXTREME=1

if __name__ == "__main__":
	sdold, sdnew = string_dicts()
	bdold, bdnew = badnum_dicts()
	print "timing for strings old=%.3f new=%.3f" % (
		  do_test(sdold, sdold.keys(), 100),
		  do_test(sdnew, sdnew.keys(), 100) )
	print "timing for bad integers old=%.3f new=%.3f" % (
		  do_test(bdold, bdold.keys(), 10) *10,
		  do_test(bdnew, bdnew.keys(), 10) *10)
  
"""
Results with a shift of 10 (EXTREME=0):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""

From lutz at rmi.net  Sun Dec 17 20:09:47 2000
From: lutz at rmi.net (Mark Lutz)
Date: Sun, 17 Dec 2000 12:09:47 -0700
Subject: [Python-Dev] Death to string functions!
References: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com>  <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
Message-ID: <001f01c0685c$ef555200$7bdb5da6@vaio>

As a longstanding Python advocate and user, I find this
thread disturbing, and feel compelled to add a few words:

> > [Tim wrote:]
> > "string" is right up there with "os" and "sys" as a FIM (Frequently
> > Imported Module), so the required code changes will be massive.  As
> > a user, I don't see what's in it for me to endure that pain: the
> > string module functions work fine!  Neither are they warts in the
> > language, any more than that we say sin(pi) instead of pi.sin().
> > Keeping the functions around doesn't hurt anybody that I can see.
> 
> [Guido wrote:]
> Hm.  I'm not saying that this one will be easy.  But I don't like
> having "two ways to do it".  It means more learning, etc. (you know
> the drill).

But with all due respect, there are already _lots_ of places 
in Python that provide at least two ways to do something.
Why be so strict on this one alone?

Consider lambda and def; tuples and lists; map and for 
loops; the loop else and boolean exit flags; and so on.  
The notion of Python forcing a single solution is largely a 
myth. And as someone who makes a living teaching this 
stuff, I can tell you that none of the existing redundancies 
prevent anyone from learning Python.

More to the point, many of those shiny new features added 
to 2.0 fall squarely into this category too, and are completely 
redundant with other tools.  Consider list comprehensions
and simple loops; extended print statements and sys.std* 
assignments; augmented assignment statements and simpler
ones.  Eliminating redundancy at a time when we're also busy
introducing it seems a tough goal to sell.

I understand the virtues of aesthetics too, but removing the
string module seems an incredibly arbitrary application of it.  


> If you're saying that you think the string module is too prominent to
> ever start deprecating its use, I'm afraid we have a problem.
>
> [...]
> Ideally, I'd like to deprecate the entire string module, so that I
> can place a single warning at its top.  This will cause a single
> warning to be issued for programs that still use it (no matter how
> many times it is imported).

And to me, this seems the real crux of the matter. For a 
decade now, the string module _has_ been the right way to
do it.  And today, half a million Python developers absolutely
rely on it as an essential staple in their toolbox.  What could 
possibly be wrong with keeping it around for backward 
compatibility, albeit as a less recommended option?

If almost every Python program ever written suddenly starts 
issuing warning messages, then I think we do have a problem
indeed.  Frankly, a Python that changes without regard to its
user base seems an ominous thing to me.  And keep in mind 
that I like Python; others will look much less generously upon
a tool that seems inclined to rip the rug out from under its users.
Trust me on this; I've already heard the rumblings out there.

So please: can we keep string around?  Like it or not, we're 
way past the point of removing such core modules.
Such a radical change might pass in a future non-backward-
compatible Python mutation; I'm not sure such a different
system will still be "Python", but that's a topic for another day.

All IMHO, of course,
--Mark Lutz  (http://www.rmi.net/~lutz)





From tim.one at home.com  Sun Dec 17 20:50:55 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 17 Dec 2000 14:50:55 -0500
Subject: [Python-Dev] SourceForge SSH silliness
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFCIEAA.tim.one@home.com>

Starting last night, I get this msg whenever I update Python code w/
CVSROOT=:ext:tim_one at cvs.python.sourceforge.net:/cvsroot/python:

"""
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the host key has just been changed.
Please contact your system administrator.
Add correct host key in C:\Code/.ssh/known_hosts to get rid of this message.
Password authentication is disabled to avoid trojan horses.
"""

This is SourceForge's doing, and is permanent (they've changed keys on their
end).  Here's a link to a thread that may or may not make sense to you:

http://sourceforge.net/forum/forum.php?forum_id=52867

Deleting the sourceforge entries from my .ssh/known_hosts file worked for
me.  But everyone in the thread above who tried it says that they haven't
been able to get scp working again (I haven't tried it yet ...).




From paulp at ActiveState.com  Sun Dec 17 21:04:27 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 17 Dec 2000 12:04:27 -0800
Subject: [Python-Dev] Pragmas and warnings
Message-ID: <3A3D1C4B.8F08A744@ActiveState.com>

A couple of other threads started me to thinking that there are a couple
of things missing from our warnings framework. 

Many languages have pragmas that allow you turn warnings on and off in
code. For instance, I should be able to put a pragma at the top of a
module that uses string functions to say: "I know that this module
doesn't adhere to the latest Python conventions. Please don't warn me
about it." I should also be able to put a declaration that says: "I'm
really paranoid about shadowing globals and builtins. Please warn me
when I do that."

Batch and visual linters could also use the declarations to customize
their behaviors.

And of course we have a stack of other features that could use pragmas:

 * type signatures
 * Unicode syntax declarations
 * external object model language binding hints
 * ...

A case could be made that warning pragmas could use a totally different
syntax from "user-defined" pragmas. I don't care much.

 Paul



From thomas at xs4all.net  Sun Dec 17 22:00:08 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 17 Dec 2000 22:00:08 +0100
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEFCIEAA.tim.one@home.com>; from tim.one@home.com on Sun, Dec 17, 2000 at 02:50:55PM -0500
References: <LNBBLJKPBEHFEDALKOLCCEFCIEAA.tim.one@home.com>
Message-ID: <20001217220008.D29681@xs4all.nl>

On Sun, Dec 17, 2000 at 02:50:55PM -0500, Tim Peters wrote:
> Starting last night, I get this msg whenever I update Python code w/
> CVSROOT=:ext:tim_one at cvs.python.sourceforge.net:/cvsroot/python:

> """
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> @       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
> Someone could be eavesdropping on you right now (man-in-the-middle attack)!
> It is also possible that the host key has just been changed.
> Please contact your system administrator.
> Add correct host key in C:\Code/.ssh/known_hosts to get rid of this message.
> Password authentication is disabled to avoid trojan horses.
> """

> This is SourceForge's doing, and is permanent (they've changed keys on their
> end).  Here's a link to a thread that may or may not make sense to you:

> http://sourceforge.net/forum/forum.php?forum_id=52867

> Deleting the sourceforge entries from my .ssh/known_hosts file worked for
> me.  But everyone in the thread above who tried it says that they haven't
> been able to get scp working again (I haven't tried it yet ...).

What sourceforge did was switch Linux distributions, and upgrade. The switch
doesn't really matter for the SSH problem, because recent Debian and recent
RedHat releases both use a new ssh, the OpenBSD ssh implementation.
Apparently, it isn't entirely backwards compatible to old versions of
F-secure ssh. For one thing, it doesn't support the 'idea' cypher. This
might or might not be your problem; if it is, you should get a
relatively clear message such as 'cypher type 'idea' not supported'.
You should be able to pass the '-c' option to scp/ssh to use
a different cypher, like 3des (aka triple-des.) Or maybe the windows
versions have a menu to configure that kind of thing :) 
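For illustration, in an OpenSSH-style client config the same override can be made permanent for one host (the host name is taken from the report above; the exact option name depends on your client and protocol version):

```
# ~/.ssh/config -- force triple-DES for the SourceForge CVS host
Host cvs.python.sourceforge.net
    Cipher 3des
```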

Another possible problem is that it might not have good support for older
protocol versions. The 'current' protocol version, at least for 'ssh1', is
1.5. The one message on the sourceforge thread above that actually mentions
a version in the *cough* bugreport is using an older ssh that only supports
protocol version 1.4. Since that particular version of F-secure ssh has
known problems (why else would they release 16 more versions?) I'd suggest
anyone with problems first try a newer version. I hope that doesn't break
WinCVS, but it would suck if it did :P

If that doesn't work, which is entirely possible, it might be an honest bug
in the OpenBSD ssh that Sourceforge is using. If anyone cared, we could do a
bit of experimenting with the openssh-2.0 betas installed by Debian woody
(unstable) to see if the problem occurs there as well.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From greg at cosc.canterbury.ac.nz  Mon Dec 18 00:05:41 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 18 Dec 2000 12:05:41 +1300 (NZDT)
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <20001216014341.5BA97A82E@darjeeling.zadka.site.co.il>
Message-ID: <200012172305.MAA02512@s454.cosc.canterbury.ac.nz>

Moshe Zadka <moshez at zadka.site.co.il>:

> Perl and Scheme permit implicit shadowing too.

But Scheme always requires declarations!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From martin at loewis.home.cs.tu-berlin.de  Mon Dec 18 00:45:56 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 18 Dec 2000 00:45:56 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID: <200012172345.AAA00877@loewis.home.cs.tu-berlin.de>

> But with all due respect, there are already _lots_ of places in
> Python that provide at least two ways to do something already.

Exactly. My favourite one here is string exceptions, which have quite
some analogy to the string module.

At some time, there were only string exceptions. Then, instance
exceptions were added, some releases later they were considered the
better choice, so the standard library was converted to use them.
Still, there is no sign whatsoever that anybody plans to deprecate
string exceptions.
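The two styles Martin contrasts look like this (the bare string form was
legal in the Python of the day; it is shown only in a comment here, and the
exception name is made up for the illustration):

```python
# Instance (class-based) exceptions -- the style the standard
# library was converted to:
class MyError(Exception):
    pass

try:
    raise MyError("something failed")
except MyError as exc:
    caught = str(exc)

# The older string-exception style, caught by object identity:
#     error = "my module's error"
#     raise error
```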

I believe the string module will become less important over
time. Comparing it with string exceptions, that may well take 5 years.
It seems there are two ways of "deprecation": a loud "we will remove
that, change your code", and a silent "strings have methods"
(i.e. don't mention the module when educating people). The latter
approach requires educators to agree that the module is
"uninteresting", and people to really not use it once they find out it
exists.

I think deprecation should only be attempted once there is a clear
sign that people don't use it massively for new code anymore. Removal
should only occur if keeping the module is less pain than maintaining it.

Regards,
Martin




From skip at mojam.com  Mon Dec 18 00:55:10 2000
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 17 Dec 2000 17:55:10 -0600 (CST)
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
Message-ID: <14909.21086.92774.940814@beluga.mojam.com>

I executed cvs update today (removing the sourceforge machines from
.ssh/known_hosts worked fine for me, btw) followed by a configure and a make
clean.  The last step failed with this output:

    ...
    make[1]: Entering directory `/home/beluga/skip/src/python/dist/src/Modules'
    Makefile.pre.in:20: *** missing separator.  Stop.
    make[1]: Leaving directory `/home/beluga/skip/src/python/dist/src/Modules'
    make: [clean] Error 2 (ignored)

I found the following at line 20 of Modules/Makefile.pre.in:

    @SET_CXX@

I then tried a cvs annotate on that file but saw that line 20 had been there
since rev 1.60 (16-Dec-99).  I then checked the top-level Makefile.in
thinking something must have changed in the clean target recently, but cvs
annotate shows no recent changes there either:

    1.1          (guido    24-Dec-93): clean:		localclean
    1.1          (guido    24-Dec-93): 		-for i in $(SUBDIRS); do \
    1.74         (guido    19-May-98): 		    if test -d $$i; then \
    1.24         (guido    20-Jun-96): 			(echo making clean in subdirectory $$i; cd $$i; \
    1.4          (guido    01-Aug-94): 			 if test -f Makefile; \
    1.4          (guido    01-Aug-94): 			 then $(MAKE) clean; \
    1.4          (guido    01-Aug-94): 			 else $(MAKE) -f Makefile.*in clean; \
    1.4          (guido    01-Aug-94): 			 fi); \
    1.74         (guido    19-May-98): 		    else true; fi; \
    1.1          (guido    24-Dec-93): 		done

Make distclean succeeded so I tried the following:

    make distclean
    ./configure
    make clean

but the last step still failed.  Any idea why make clean is now failing (for
me)?  Can anyone else reproduce this problem?

Skip



From greg at cosc.canterbury.ac.nz  Mon Dec 18 01:02:32 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 18 Dec 2000 13:02:32 +1300 (NZDT)
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: <3A3A9A22.E9BA9551@lemburg.com>
Message-ID: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>

"M.-A. Lemburg" <mal at lemburg.com>:

> Format characters will always
> be ASCII and thus 7-bit -- there's really no need to expand the
> set of possibilities beyond 8 bits ;-)

But the error message is being produced because the
character is NOT a valid format character. One of the
reasons for that might be because it's not in the
7-bit range!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From MarkH at ActiveState.com  Mon Dec 18 07:02:27 2000
From: MarkH at ActiveState.com (Mark Hammond)
Date: Mon, 18 Dec 2000 17:02:27 +1100
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
In-Reply-To: <14909.21086.92774.940814@beluga.mojam.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPIEJJCMAA.MarkH@ActiveState.com>

> I found the following at line 20 of Modules/Makefile.pre.in:
>
>     @SET_CXX@

I don't have time to investigate this specific problem, but I definitely had
problems with SET_CXX around 6 months back.  This was trying to build an
external C++ application, so may be different.  My message and other
followups at the time implied no one really knew, and everyone agreed it was
likely SET_CXX was broken :-(  I even referenced the CVS checkin that I
thought broke it.

Mark.




From mal at lemburg.com  Mon Dec 18 10:58:37 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 10:58:37 +0100
Subject: [Python-Dev] Pragmas and warnings
References: <3A3D1C4B.8F08A744@ActiveState.com>
Message-ID: <3A3DDFCD.34AB05B2@lemburg.com>

Paul Prescod wrote:
> 
> A couple of other threads started me to thinking that there are a couple
> of things missing from our warnings framework.
> 
> Many languages have pragmas that allow you turn warnings on and off in
> code. For instance, I should be able to put a pragma at the top of a
> module that uses string functions to say: "I know that this module
> doesn't adhere to the latest Python conventions. Please don't warn me
> about it." I should also be able to put a declaration that says: "I'm
> really paranoid about shadowing globals and builtins. Please warn me
> when I do that."
> 
> Batch and visual linters could also use the declarations to customize
> their behaviors.
> 
> And of course we have a stack of other features that could use pragmas:
> 
>  * type signatures
>  * Unicode syntax declarations
>  * external object model language binding hints
>  * ...
> 
> A case could be made that warning pragmas could use a totally different
> syntax from "user-defined" pragmas. I don't care much.

There was a long thread about this some months ago. We agreed
to add a new keyword to the language (I think it was "define")
which then uses a very simple syntax which can be interpreted 
at compile time to modify the behaviour of the compiler, e.g.

define <identifier> = <literal>

There was also a discussion about allowing limited forms of
expressions instead of the constant literal.

define source_encoding = "utf-8"

was the original motivation for this, but (as always ;) the
usefulness for other application areas was quickly recognized, e.g.
to enable compilation in optimization mode on a per module
basis.

PS: "define" is perhaps not obscure enough as keyword...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Dec 18 11:04:08 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:04:08 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>
Message-ID: <3A3DE118.3355896D@lemburg.com>

Greg Ewing wrote:
> 
> "M.-A. Lemburg" <mal at lemburg.com>:
> 
> > Format characters will always
> > be ASCII and thus 7-bit -- there's really no need to expand the
> > set of possibilities beyond 8 bits ;-)
> 
> But the error message is being produced because the
> character is NOT a valid format character. One of the
> reasons for that might be because it's not in the
> 7-bit range!

True. 

I think removing %c completely in that case is the right
solution (in case you don't want to convert the Unicode char
using the default encoding to a string first).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Dec 18 11:09:16 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:09:16 +0100
Subject: [Python-Dev] What to do about PEP 229?
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com> <20001216191739.B6703@kronos.cnri.reston.va.us>
Message-ID: <3A3DE24C.DA0B2F6C@lemburg.com>

Andrew Kuchling wrote:
> 
> On Fri, Dec 15, 2000 at 11:39:18PM +0100, M.-A. Lemburg wrote:
> >Can't distutils try both and then settle for the working combination ?
> 
> I'm worried about subtle problems; what if an unneeded -lfoo drags in
> a customized malloc, or has symbols which conflict with some other
> library.

In that case, I think the user will have to decide. setup.py should
then default to not integrating the module in question and issue
a warning telling the user what to look for and how to call setup.py
in order to add the right combination of libs.
 
> >... BTW, where is Greg ? I haven't heard from him in quite a while.]
> 
> Still around; he just hasn't been posting much these days.

Good to know :)
 
> >Why not parse Setup and use it as input to distutils setup.py ?
> 
> That was option 1.  The existing Setup format doesn't really contain
> enough intelligence, though; the intelligence is usually in comments
> such as "Uncomment the following line for Solaris".  So either the
> Setup format is modified (bad, since we'd break existing 3rd-party
> packages that still use a Makefile.pre.in), or I give up and just do
> everything in a setup.py.

I would still like a simple input to setup.py -- one that doesn't
require hacking setup.py just to enable a few more modules.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fredrik at effbot.org  Mon Dec 18 11:15:26 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 18 Dec 2000 11:15:26 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz> <3A3DE118.3355896D@lemburg.com>
Message-ID: <004a01c068db$72403170$3c6340d5@hagrid>

mal wrote:

> > But the error message is being produced because the
> > character is NOT a valid format character. One of the
> > reasons for that might be because it's not in the
> > 7-bit range!
> 
> True. 
> 
> I think removing %c completely in that case is the right
> solution (in case you don't want to convert the Unicode char
> using the default encoding to a string first).

how likely is it that a human programmer will use a bad formatting
character that's not in the ASCII range?

-1 on removing it -- people shouldn't have to learn the octal ASCII
table just to be able to fix trivial typos.

+1 on mapping the character back to a string in the same way as
"repr" -- that is, print ASCII characters as is, map anything else to
an octal escape.

+0 on leaving it as it is, or mapping non-printables to "?".
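Fredrik's +1 option can be sketched as a small helper (the name and exact
cutoffs are mine, not a proposal for the actual C code):

```python
def fmt_char(c):
    # Show printable ASCII as-is; map everything else to an octal
    # escape, the way repr() of that era did for strings.
    if 32 <= ord(c) < 127:
        return c
    return "\\%03o" % ord(c)
```

So `fmt_char("z")` stays `"z"`, while a stray Latin-1 character comes back
as a readable escape like `"\351"` instead of vanishing into the message.
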

</F>




From mal at lemburg.com  Mon Dec 18 11:34:02 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:34:02 +0100
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com>
Message-ID: <3A3DE81A.4B825D89@lemburg.com>

> Here some results, dictionaries have 1000 entries:
> 
> timing for strings              old=  5.097 new= 5.088
> timing for bad integers (<<10)  old=101.540 new=12.610
> timing for bad integers (<<16)  old=571.210 new=19.220

Even though I think concentrating on string keys would provide more
performance boost for Python in general, I think you have a point
there. +1 from here.

BTW, would changing the hash function on strings from the simple
XOR scheme to something a little smarter help improve the performance
too (e.g. most strings used in programming never use the 8-th
bit) ?

I also think that we could inline the string compare function
in dictobject:lookdict_string to achieve even better performance.
Currently it uses a function which doesn't trigger compiler
inlining.

And finally: I think a generic PyString_Compare() API would
be useful in a lot of places where strings are being compared
(e.g. dictionaries and keyword parameters). Unicode already
has such an API (along with dozens of other useful APIs which
are not available for strings).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Dec 18 11:41:38 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:41:38 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz> <3A3DE118.3355896D@lemburg.com> <004a01c068db$72403170$3c6340d5@hagrid>
Message-ID: <3A3DE9E2.77FF0FA9@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> 
> > > But the error message is being produced because the
> > > character is NOT a valid format character. One of the
> > > reasons for that might be because it's not in the
> > > 7-bit range!
> >
> > True.
> >
> > I think removing %c completely in that case is the right
> > solution (in case you don't want to convert the Unicode char
> > using the default encoding to a string first).
> 
> how likely is it that a human programmer will use a bad formatting
> character that's not in the ASCII range?

Not very likely... the most common case of this error is probably
the use of % as percent sign in a formatting string. The next
character in those cases is usually whitespace.
 
> -1 on removing it -- people shouldn't have to learn the octal ASCII
> table just to be able to fix trivial typos.
> 
> +1 on mapping the character back to a string in the same way as
> "repr" -- that is, print ASCII characters as is, map anything else to
> an octal escape.
> 
> +0 on leaving it as it is, or mapping non-printables to "?".

Agreed.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tismer at tismer.com  Mon Dec 18 12:08:34 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 18 Dec 2000 13:08:34 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com> <3A3DE81A.4B825D89@lemburg.com>
Message-ID: <3A3DF032.5F86AD15@tismer.com>


"M.-A. Lemburg" wrote:
> 
> > Here some results, dictionaries have 1000 entries:
> >
> > timing for strings              old=  5.097 new= 5.088
> > timing for bad integers (<<10)  old=101.540 new=12.610
> > timing for bad integers (<<16)  old=571.210 new=19.220
> 
> Even though I think concentrating on string keys would provide more
> performance boost for Python in general, I think you have a point
> there. +1 from here.
> 
> BTW, would changing the hash function on strings from the simple
> XOR scheme to something a little smarter help improve the performance
> too (e.g. most strings used in programming never use the 8-th
> bit) ?

Yes, it would. I spent the rest of last night to do more
accurate tests, also refined the implementation (using
longs for the shifts etc), and turned from timing over to
trip counting, i.e. a dict counts every round through the
re-hash. That showed two things:
- The bits used from the string hash are not well distributed
- using a "warmup wheel" on the hash to suck all bits in
  gives the same quality of hashes like random numbers.

I will publish some results later today.

> I also think that we could inline the string compare function
> in dictobject:lookdict_string to achieve even better performance.
> Currently it uses a function which doesn't trigger compiler
> inlining.

Sure!

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From guido at python.org  Mon Dec 18 15:20:22 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 09:20:22 -0500
Subject: [Python-Dev] The Dictionary Gem is polished!
In-Reply-To: Your message of "Sun, 17 Dec 2000 19:38:31 +0200."
             <3A3CFA17.ED26F51A@tismer.com> 
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com>  
            <3A3CFA17.ED26F51A@tismer.com> 
Message-ID: <200012181420.JAA25063@cj20424-a.reston1.va.home.com>

> Problem: There are hash functions which are "good" in this sense,
> but they do not spread their randomness uniformly over the
> 32 bits.
> 
> Example: Integers use their own value as hash.
> This is ok, as far as the integers are uniformly distributed.
> But if they all contain a high power of two, for instance,
> the low bits give a very bad hash function.
> 
> Take a dictionary with integers range(1000) as keys and access
> all entries. Then use a dictionary with the integers shifted
> left by 16.
> Access time is slowed down by a factor of 100, since every
> access is a linear search now.
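The effect Christian describes can be reproduced by hand: CPython hashes a
small integer to its own value, and a power-of-two table of size 2**n is
indexed with the low n bits of the hash. The mask-and-index below mimics
that lookup directly rather than poking at dict internals:

```python
# After "i << 16" every key has all-zero low bits.
keys = [i << 16 for i in range(8)]

table_size = 8  # dict tables are powers of two; index = hash & (size - 1)
slots = {hash(k) & (table_size - 1) for k in keys}

# All keys land in the same slot, so each access degenerates into a
# linear probe over every entry -- the factor-of-100 slowdown above.
```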

Ai.  I think what happened is this: long ago, the hash table sizes
were primes, or at least not powers of two!

I'll leave it to the more mathematically-inclined to judge your
solution...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Dec 18 15:52:35 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 09:52:35 -0500
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
In-Reply-To: Your message of "Sun, 17 Dec 2000 17:55:10 CST."
             <14909.21086.92774.940814@beluga.mojam.com> 
References: <14909.21086.92774.940814@beluga.mojam.com> 
Message-ID: <200012181452.JAA04372@cj20424-a.reston1.va.home.com>

> Make distclean succeeded so I tried the following:
> 
>     make distclean
>     ./configure
>     make clean
> 
> but the last step still failed.  Any idea why make clean is now failing (for
> me)?  Can anyone else reproduce this problem?

Yes.  I don't understand it, but this takes care of it:

    make distclean
    ./configure
    make Makefiles		# <--------- !!!
    make clean

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Dec 18 15:54:20 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 09:54:20 -0500
Subject: [Python-Dev] Pragmas and warnings
In-Reply-To: Your message of "Mon, 18 Dec 2000 10:58:37 +0100."
             <3A3DDFCD.34AB05B2@lemburg.com> 
References: <3A3D1C4B.8F08A744@ActiveState.com>  
            <3A3DDFCD.34AB05B2@lemburg.com> 
Message-ID: <200012181454.JAA04394@cj20424-a.reston1.va.home.com>

> There was a long thread about this some months ago. We agreed
> to add a new keyword to the language (I think it was "define")

I don't recall agreeing.  :-)

This is PEP material.  For 2.2, please!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Mon Dec 18 15:56:33 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 15:56:33 +0100
Subject: [Python-Dev] Pragmas and warnings
References: <3A3D1C4B.8F08A744@ActiveState.com>  
	            <3A3DDFCD.34AB05B2@lemburg.com> <200012181454.JAA04394@cj20424-a.reston1.va.home.com>
Message-ID: <3A3E25A1.DFD2BDBF@lemburg.com>

Guido van Rossum wrote:
> 
> > There was a long thread about this some months ago. We agreed
> > to add a new keyword to the language (I think it was "define")
> 
> I don't recall agreeing.  :-)

Well, maybe it was a misinterpretation on my part... you said
something like "add a new keyword and live with the consequences".
AFAIR, of course :-)

> This is PEP material.  For 2.2, please!

Ok.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at python.org  Mon Dec 18 16:15:26 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 10:15:26 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Sun, 17 Dec 2000 12:09:47 MST."
             <001f01c0685c$ef555200$7bdb5da6@vaio> 
References: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com> <200012152123.QAA03566@cj20424-a.reston1.va.home.com>  
            <001f01c0685c$ef555200$7bdb5da6@vaio> 
Message-ID: <200012181515.KAA04571@cj20424-a.reston1.va.home.com>

[Mark Lutz]
> So please: can we keep string around?  Like it or not, we're 
> way past the point of removing such core modules at this point.

Of course we're keeping string around.  I already said that for
backwards compatibility reasons it would not disappear before Py3K.

I think there's a misunderstanding about the meaning of deprecation,
too.  That word doesn't mean to remove a feature.  It doesn't even
necessarily mean to warn every time a feature is used.  It just means
(to me) that at some point in the future the feature will change or
disappear, there's a new and better way to do it, and that we
encourage users to start using the new way, to save them from work
later.

In my mind, there's no reason to start emitting warnings about every
deprecated feature.  The warnings are only needed late in the
deprecation cycle.  PEP 5 says "There must be at least a one-year
transition period between the release of the transitional version of
Python and the release of the backwards incompatible version."

Can we now stop getting all bent out of shape over this?  String
methods *are* recommended over equivalent string functions.  Those
string functions *are* already deprecated, in the informal sense
(i.e. just that it is recommended to use string methods instead).
This *should* (take notice, Fred!) be documented per 2.1.  We won't
however be issuing run-time warnings about the use of string functions
until much later.  (Lint-style tools may start warning sooner --
that's up to the author of the lint tool to decide.)
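The recommendation side by side with the informally deprecated spelling
(the module-function forms are shown only in comments):

```python
s = "spam and eggs"

# Recommended: string methods
parts = s.split()
joined = "-".join(parts)
upper = s.upper()

# Informally deprecated equivalents from the string module:
#     string.split(s)
#     string.join(parts, "-")
#     string.upper(s)
```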

Note that I believe Java makes a useful distinction that PEP 5 misses:
it defines both deprecated features and obsolete features.
*Deprecated* features are simply features for which a better
alternative exists.  *Obsolete* features are features that are only
being kept around for backwards compatibility.  Deprecated features
may also be (and usually are) *obsolescent*, meaning they will become
obsolete in the future.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Dec 18 16:22:09 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 10:22:09 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Mon, 18 Dec 2000 00:45:56 +0100."
             <200012172345.AAA00877@loewis.home.cs.tu-berlin.de> 
References: <200012172345.AAA00877@loewis.home.cs.tu-berlin.de> 
Message-ID: <200012181522.KAA04597@cj20424-a.reston1.va.home.com>

> At some time, there were only string exceptions. Then, instance
> exceptions were added, some releases later they were considered the
> better choice, so the standard library was converted to use them.
> Still, there is no sign whatsoever that anybody plans to deprecate
> string exceptions.

Now there is: I hereby state that I officially deprecate string
exceptions.  Py3K won't support them, and it *may* even require that
all exception classes are derived from Exception.

> I believe the string module will get less importance over
> time. Comparing it with string exception, that may be well 5 years.
> It seems there are two ways of "deprecation": a loud "we will remove
> that, change your code", and a silent "strings have methods"
> (i.e. don't mention the module when educating people). The latter
> approach requires educators to agree that the module is
> "uninteresting", and people to really not use once they find out it
> exists.

Exactly.  This is what I hope will happen.  I certainly hope that Mark
Lutz has already started teaching string methods!

> I think deprecation should be only attempted once there is a clear
> sign that people don't use it massively for new code anymore.

Right.  So now we're on the first step: get the word out!

> Removal should only occur if keeping the module [is] less pain than
> maintaining it.

Exactly.  Guess where the string module falls today. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From Barrett at stsci.edu  Mon Dec 18 17:50:49 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Mon, 18 Dec 2000 11:50:49 -0500 (EST)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons 
Message-ID: <14910.16431.554136.374725@nem-srvr.stsci.edu>

Guido van Rossum writes:
 > > 
 > > 1. The current boolean operator behavior does not have to change, and
 > >    hence will be backward compatible.
 > 
 > What incompatibility do you see in the current proposal?

You have to choose between using rich comparisons and boolean
comparisons.  You can't use both for the same (rich/complex) object.

 > > 2. It eliminates the need to decide whether or not rich comparisons
 > >    takes precedence over boolean comparisons.
 > 
 > Only if you want different semantics -- that's only an issue for NumPy.

No. I think NumPy is the tip of the iceberg, when discussing new
semantics.  Most users don't consider these broader semantic issues,
because Python doesn't give them the opportunity to do so.  I can see
possible scenarios of using both boolean and non-boolean comparisons
for Python lists and dictionaries in addition to NumPy.

I chose to use Python because it provides a richer framework than
other languages.  When Python fails to provide such benefits, I'll
move to another language.  I moved from PERL to Python because the
multi-dimensional array syntax is vastly better in Python than PERL,
though as a novice I don't have to know that it exists.  What I'm
proposing here is in a similar vein.

 > > 3. The new operators add additional behavior without directly impacting 
 > >    current behavior and the use of them is unambiguous, at least in
 > >    relation to current Python behavior.  You know by the operator what 
 > >    type of comparison will be returned.  This should appease Jim
 > >    Fulton, based on his arguments in 1998 about comparison operators
 > >    always returning a boolean value.
 > 
 > As you know, I'm now pretty close to Jim. :-)  He seemed pretty mellow
 > about this now.

Yes, I would hope so!

It appears though that you misunderstand me.  My point was that I tend
to agree with Jim Fulton's arguments for a limited interpretation of
the current comparison operators.  I too expect them to return a
boolean result.  I have never felt comfortable using such comparison
operators in an array context, e.g. as in the array language, IDL. It
just looks wrong.  So my suggestion is to create new ones whose
implicit meaning is to provide element-wise or rich comparison
behavior.  And to add similar behavior for the other operators for
consistency.

Can someone provide an example in mathematics where comparison
operators are used in a non-boolean, i.e. rich comparison, context?
If so, this might shut me up!

 > > 4. Compound objects, such as lists, could implement both rich
 > >    and boolean comparisons.  The boolean comparison would remain as
 > >    is, while the rich comparison would return a list of boolean
 > >    values.  Current behavior doesn't change; just a new feature, which
 > >    you may or may not choose to use, is added.
 > > 
 > > If we go one step further and add the matrix-style operators along
 > > with the comparison operators, we can provide a consistent user
 > > interface to array/complex operations without changing current Python
 > > behavior.  If a user has no need for these new operators, he doesn't
 > > have to use them or even know about them.  All we've done is made
 > > Python richer, but I believe without making it more complex.  For

 > > example, all element-wise operations could have a ':' appended to
 > > them, e.g. '+:', '<:', etc.; and will define element-wise addition,
 > > element-wise less-than, etc.  The traditional '*', '/', etc. operators
 > > can then be used for matrix operations, which will appease the Matlab
 > > people.
 > > 
 > > Therefore, I don't think rich comparisons and matrix-type operators
 > > should be considered separable.  I really think you should consider
 > > this suggestion.  It appeases many groups while providing a consistent 
 > > and clear user interface, while not greatly impacting current Python
 > > behavior.

 > > 
 > > Always-causing-havoc-at-the-last-moment-ly Yours,
 > 
 > I think you misunderstand.  Rich comparisons are mostly about allowing
 > the separate overloading of <, <=, ==, !=, >, and >=.  This is useful
 > in its own light.
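The separate overloading Guido refers to can be sketched with a toy
NumPy-like class (an illustration only, not NumPy's actual implementation):

```python
class Vec:
    """Toy array that overloads < elementwise, returning a list of
    booleans instead of a single truth value."""

    def __init__(self, data):
        self.data = list(data)

    def __lt__(self, other):
        # Rich comparison: free to return any object, not just a bool.
        return [a < b for a, b in zip(self.data, other.data)]

result = Vec([1, 5, 3]) < Vec([2, 4, 6])
```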

No, I do understand.  I've read most of the early discussions on this
issue and one of those issues was about having to choose between
boolean and rich comparisons and what should take precedence, when
both may be appropriate.  I'm suggesting an alternative here.

 > If you don't want to use this overloading facility for elementwise
 > comparisons in NumPy, that's fine with me.  Nobody says you have to --
 > it's just that you *could*.

Yes, I understand.

 > Read my lips: there won't be *any* new operators in 2.1.

OK, I didn't expect this to make it into 2.1.

 > There will be a better way to overload the existing Boolean operators,
 > and they will be able to return non-Boolean results.  That's useful in
 > other situations besides NumPy.

Yes, I agree, this should be done anyway.  I'm just not sure that the
implicit meaning that these comparison operators are being given is
the best one.  I'm just looking for ways to incorporate rich
comparisons into a broader framework; NumPy just currently happens to
be the primary example of this proposal.

Assuming the current comparison operator overloading is already
implemented and has been used to implement rich comparisons for some
objects, then my rich comparison proposal would cause confusion.  This 
is what I'm trying to avoid.

 > Feel free to lobby for elementwise operators -- but based on the
 > discussion about this subject so far, I don't give it much of a chance
 > even past Python 2.1.  They would add a lot of baggage to the language
 > (e.g. the table of operators in all Python books would be about twice
 > as long) and by far the most users don't care about them.  (Read the
 > intro to PEP 211 for some of the concerns -- this PEP tries to make the
 > addition palatable by adding exactly *one* new operator.)

So!  Introductory books don't have to discuss these additional
operators.  I don't have to know about XML and socket modules to start
using Python effectively, nor do I have to know about 'zip' or list
comprehensions.  These additions decrease the code size and increase
efficiency, but don't really add any expressive power beyond what a
'for' loop can already do.

I'll try to convince myself that this suggestion is crazy and not
bother you with this issue for awhile.

Cheers,
Paul




From guido at python.org  Mon Dec 18 18:18:11 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 12:18:11 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: Your message of "Mon, 18 Dec 2000 11:50:49 EST."
             <14910.16431.554136.374725@nem-srvr.stsci.edu> 
References: <14910.16431.554136.374725@nem-srvr.stsci.edu> 
Message-ID: <200012181718.MAA14030@cj20424-a.reston1.va.home.com>

Paul Barret:
>  > > 1. The current boolean operator behavior does not have to change, and
>  > >    hence will be backward compatible.

Guido van Rossum:
>  > What incompatibility do you see in the current proposal?

Paul Barret:
> You have to choose between using rich comparisons or boolean
> comparisons.  You can't use both for the same (rich/complex) object.

Sure.  I thought that the NumPy folks were happy with this.  Certainly
two years ago they seemed to be.

>  > > 2. It eliminates the need to decide whether or not rich comparisons
>  > >    takes precedence over boolean comparisons.
>  > 
>  > Only if you want different semantics -- that's only an issue for NumPy.
> 
> No. I think NumPy is the tip of the iceberg, when discussing new
> semantics.  Most users don't consider these broader semantic issues,
> because Python doesn't give them the opportunity to do so.  I can see
> possible scenarios of using both boolean and non-boolean comparisons
> for Python lists and dictionaries in addition to NumPy.

That's the same argument that has been made for new operators all
along.  I've explained already why they are not on the table for 2.1.

> I chose to use Python because it provides a richer framework than
> other languages.  When Python fails to provide such benefits, I'll
> move to another language.  I moved from PERL to Python because the
> multi-dimensional array syntax is vastly better in Python than PERL,
> though as a novice I don't have to know that it exists.  What I'm
> proposing here is in a similar vein.
> 
>  > > 3. The new operators add additional behavior without directly impacting 
>  > >    current behavior and the use of them is unambiguous, at least in
>  > >    relation to current Python behavior.  You know by the operator what 
>  > >    type of comparison will be returned.  This should appease Jim
>  > >    Fulton, based on his arguments in 1998 about comparison operators
>  > >    always returning a boolean value.
>  > 
>  > As you know, I'm now pretty close to Jim. :-)  He seemed pretty mellow
>  > about this now.
> 
> Yes, I would hope so!
> 
> It appears though that you misunderstand me.  My point was that I tend
> to agree with Jim Fulton's arguments for a limited interpretation of
> the current comparison operators.  I too expect them to return a
> boolean result.  I have never felt comfortable using such comparison
> operators in an array context, e.g. as in the array language, IDL. It
> just looks wrong.  So my suggestion is to create new ones whose
> implicit meaning is to provide element-wise or rich comparison
> behavior.  And to add similar behavior for the other operators for
> consistency.
> 
> Can someone provide an example in mathematics where comparison
> operators are used in a non-boolean, i.e. rich comparison, context?
> If so, this might shut me up!

Not me (I no longer consider myself a mathematician :-).  Why are you
requiring an example from math though?

Again, you will be able to make this argument to the NumPy folks when
they are ready to change the meaning of A<B to mean an array of
Booleans rather than a single Boolean.  Since you're part of the
design team for NumPy NG, you're in a pretty good position to hold out
for elementwise operators!

However, what would you do if elementwise operators were turned down
for ever (which is a realistic possibility)?

In the mean time, I see no harm in *allowing* the comparison operators
to be overridden to return something else besides a Boolean.  Someone
else may find this useful.  (Note that standard types won't use this
new freedom, so I'm not imposing this on anybody -- I'm only giving a
new option.)
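In today's syntax, that freedom looks roughly like this (a minimal sketch: the Vec class is hypothetical, but the per-operator hook is exactly what PEP 207 adds):

```python
# Minimal sketch of what PEP 207 permits: __lt__ is free to return
# any object, e.g. an elementwise list of booleans.  (Vec is a
# hypothetical class, not part of NumPy or the stdlib.)
class Vec:
    def __init__(self, data):
        self.data = list(data)
    def __lt__(self, other):
        return [a < b for a, b in zip(self.data, other.data)]

print(Vec([1, 5, 3]) < Vec([2, 4, 9]))   # [True, False, True]
```

Standard types keep returning plain Booleans; only a type that opts in sees any difference.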

>  > > 4. Compound objects, such as lists, could implement both rich
>  > >    and boolean comparisons.  The boolean comparison would remain as
>  > >    is, while the rich comparison would return a list of boolean
>  > >    values.  Current behavior doesn't change; just a new feature, which
>  > >    you may or may not choose to use, is added.
>  > > 
>  > > If we go one step further and add the matrix-style operators along
>  > > with the comparison operators, we can provide a consistent user
>  > > interface to array/complex operations without changing current Python
>  > > behavior.  If a user has no need for these new operators, he doesn't
>  > > have to use them or even know about them.  All we've done is made
>  > > Python richer, but I believe with making it more complex.  For
> 
> Phrase should be: "but I believe without making it more complex.".
>                                  -------
> 
>  > > example, all element-wise operations could have a ':' appended to
>  > > them, e.g. '+:', '<:', etc.; and will define element-wise addition,
>  > > element-wise less-than, etc.  The traditional '*', '/', etc. operators
>  > > can then be used for matrix operations, which will appease the Matlab
>  > > people.
>  > > 
>  > > Therefore, I don't think rich comparisons and matrix-type operators
>  > > should be considered separable.  I really think you should consider
>  > > this suggestion.  It appeases many groups while providing a consistent 
>  > > and clear user interface, while greatly impacting current Python
>  > > behavior. 
> 
> The last phrase should read: "while not greatly impacting current
>                                     ---
> Python behavior."

I don't see any argument for elementwise operators here that I haven't
heard before, and AFAIK it's all in the two PEPs.

>  > > Always-causing-havoc-at-the-last-moment-ly Yours,
>  > 
>  > I think you misunderstand.  Rich comparisons are mostly about allowing
>  > the separate overloading of <, <=, ==, !=, >, and >=.  This is useful
>  > in its own light.
> 
> No, I do understand.  I've read most of the early discussions on this
> issue and one of those issues was about having to choose between
> boolean and rich comparisons and what should take precedence, when
> both may be appropriate.  I'm suggesting an alternative here.

Note that Python doesn't decide which should take precedence.  The
implementer of an individual extension type decides what his
comparison operators will return.

>  > If you don't want to use this overloading facility for elementwise
>  > comparisons in NumPy, that's fine with me.  Nobody says you have to --
>  > it's just that you *could*.
> 
> Yes, I understand.
> 
>  > Read my lips: there won't be *any* new operators in 2.1.
> 
> OK, I didn't expect this to make it into 2.1.
> 
>  > There will be a better way to overload the existing Boolean operators,
>  > and they will be able to return non-Boolean results.  That's useful in
>  > other situations besides NumPy.
> 
> Yes, I agree, this should be done anyway.  I'm just not sure that the
> implicit meaning that these comparison operators are being given is
> the best one.  I'm just looking for ways to incorporate rich
> comparisons into a broader framework; NumPy just currently happens to
> be the primary example of this proposal.
> 
> Assuming the current comparison operator overloading is already
> implemented and has been used to implement rich comparisons for some
> objects, then my rich comparison proposal would cause confusion.  This 
> is what I'm trying to avoid.

AFAIK, rich comparisons haven't been used anywhere to return
non-Boolean results.

>  > Feel free to lobby for elementwise operators -- but based on the
>  > discussion about this subject so far, I don't give it much of a chance
>  > even past Python 2.1.  They would add a lot of baggage to the language
>  > (e.g. the table of operators in all Python books would be about twice
>  > as long) and by far the most users don't care about them.  (Read the
>  > intro to PEP 211 for some of the concerns -- this PEP tries to make the
>  > addition palatable by adding exactly *one* new operator.)
> 
> So!  Introductory books don't have to discuss these additional
> operators.  I don't have to know about XML and socket modules to start
> using Python effectively, nor do I have to know about 'zip' or list
> comprehensions.  These additions decrease the code size and increase
> efficiency, but don't really add any expressive power beyond what a
> 'for' loop can already do.
> 
> I'll try to convince myself that this suggestion is crazy and not
> bother you with this issue for awhile.

Happy holidays nevertheless. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Mon Dec 18 19:38:13 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 18 Dec 2000 13:38:13 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons 
In-Reply-To: <14910.16431.554136.374725@nem-srvr.stsci.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEGLIEAA.tim.one@home.com>

[Paul Barrett]
> ...
> Can someone provide an example in mathematics where comparison
> operators are used in a non-boolean, i.e. rich comparison, context?
> If so, this might shut me up!

By my informal accounting, over the years there have been more requests for
three-outcome comparison operators than for elementwise ones, although the
three-outcome lobby isn't organized so is less visible.  It's a natural
request for anyone working with partial orderings (a < b -> one of {yes, no,
unordered}).  Another large group of requests comes from people working with
variants of fuzzy logic, where it's desired that the comparison operators be
definable to return floats (intuitively corresponding to the probability
that the stated relation "is true").  Another desire comes from the symbolic
math camp, which would like to be able to-- as is possible for "+", "*",
etc --define "<" so that e.g. "x < y" returns an object capturing that
somebody *asked* for "x < y"; they're not interested in numeric or Boolean
results so much as symbolic expressions.  "<" is used for all these things
in the literature too.
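The partial-ordering case can be sketched with today's builtin sets, where "<=" is subset inclusion (the cmp3 helper is made up for illustration, not a proposed builtin):

```python
# Sets under inclusion form a partial order: two sets can compare
# "less", "greater", or simply be unordered.  cmp3 is a made-up
# three-outcome helper.
def cmp3(a, b):
    if a <= b:        # a is a subset of b
        return 'le'
    if b <= a:        # b is a subset of a
        return 'ge'
    return 'unordered'

print(cmp3({1}, {1, 2}))   # le
print(cmp3({1}, {2}))      # unordered
```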

Whatever.  "<" and friends are just collections of pixels.  Add 300 new
operator symbols, and people will want to redefine all of them at will too.

draw-a-line-in-the-sand-and-the-wind-blows-it-away-ly y'rs  - tim




From tim.one at home.com  Mon Dec 18 21:37:13 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 18 Dec 2000 15:37:13 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHBIEAA.tim.one@home.com>

[Guido]
> ...
> If you're saying that we should give users ample time for the
> transition, I'm with you.

Then we're with each other, for suitably large values of "ample" <wink>.

> If you're saying that you think the string module is too prominent to
> ever start deprecating its use, I'm afraid we have a problem.

We may.  Time will tell.  It needs a conversion tool, else I think it's
unsellable.

> ...
> I'd also like to note that using the string module's wrappers incurs
> the overhead of a Python function call -- using string methods is
> faster.
>
> Finally, I like the look of fields[i].strip().lower() much better than
> that of string.lower(string.strip(fields[i])) -- an actual example
> from mimetools.py.

I happen to like string methods better myself; I don't think that's at issue
(except that loads of people apparently don't like "join" as a string
method -- idiots <wink>).

The issue to me is purely breaking old code someday -- "string" is in very
heavy use, and unlike as when deprecating regex in favor of re (either pre
or especially sre), string methods aren't orders of magnitude better than
the old way; and also unlike regex-vs-re it's not the case that the string
module has become unmaintainable (to the contrary, string.py has become
trivial).  IOW, this one would be unprecedented fiddling.

> ...
> Note that I believe Java makes a useful distinction that PEP 5 misses:
> it defines both deprecated features and obsolete features.
> *Deprecated* features are simply features for which a better
> alternative exists.  *Obsolete* features are features that are only
> being kept around for backwards compatibility.  Deprecated features
> may also be (and usually are) *obsolescent*, meaning they will become
> obsolete in the future.

I agree it would be useful to define these terms, although those particular
definitions appear to be missing the most important point from the user's
POV (not a one says "going away someday").  A Google search on "java
obsolete obsolescent deprecated" doesn't turn up anything useful, so I doubt
the usages you have in mind come from Java (it has "deprecated", but doesn't
appear to have any well-defined meaning for the others).

In keeping with the religious nature of the battle-- and religion offers
precise terms for degrees of damnation! --I suggest:

    struggling -- a supported feature; the initial state of
        all features; may transition to Anathematized

    anathematized -- this feature is now cursed, but is supported;
        may transition to Condemned or Struggling; intimacy with
        Anathematized features is perilous

    condemned -- a feature scheduled for crucifixion; may transition
        to Crucified, Anathematized (this transition is called "a pardon"),
        or Struggling (this transition is called "a miracle"); intimacy
        with Condemned features is suicidal

    crucified -- a feature that is no longer supported; may transition
        to Resurrected

    resurrected -- a once-Crucified feature that is again supported;
        may transition to Condemned, Anathematized or Struggling;
        although since Resurrection is a state of grace, there may be
        no point in human time at which a feature is identifiably
        Resurrected (i.e., it may *appear*, to the unenlightened, that
        a feature moved directly from Crucified to Anathematized or
        Struggling or Condemned -- although saying so out loud is heresy).




From tismer at tismer.com  Mon Dec 18 23:58:03 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 18 Dec 2000 23:58:03 +0100
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com>  
	            <3A3CFA17.ED26F51A@tismer.com> <200012181420.JAA25063@cj20424-a.reston1.va.home.com>
Message-ID: <3A3E967B.BE404114@tismer.com>


Guido van Rossum wrote:
[me, expanding on hashes, integers,and how to tame them cheaply]

> Ai.  I think what happened is this: long ago, the hash table sizes
> were primes, or at least not powers of two!

At some time I will wake up and they tell me that I'm reducible :-)

> I'll leave it to the more mathematically-inclined to judge your
> solution...

I love small lists! - ciao - chris

+1   (being a member, hopefully)

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From greg at cosc.canterbury.ac.nz  Tue Dec 19 00:04:42 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 19 Dec 2000 12:04:42 +1300 (NZDT)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEGLIEAA.tim.one@home.com>
Message-ID: <200012182304.MAA02642@s454.cosc.canterbury.ac.nz>

[Paul Barrett]
> ...
> Can someone provide an example in mathematics where comparison
> operators are used in a non-boolean, i.e. rich comparison, context?
> If so, this might shut me up!

Not exactly mathematical, but some day I'd like to create
a database access module which lets you say things like

  mydb = OpenDB("inventory")
  parts = mydb.parts
  tuples = mydb.retrieve(parts.name, parts.number).where(parts.quantity >= 42)

Of course, to really make this work I need to be able
to overload "and" and "or" as well, but that's a whole
'nother PEP...
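The comparison half of that idea already works with rich comparisons: ">=" on a column object can record a condition instead of testing one. A toy sketch (Column and Cond are invented names, not a real database API):

```python
# Toy sketch of the query idea: ">=" builds a condition object
# instead of computing a truth value.
class Cond:
    def __init__(self, col, op, value):
        self.col, self.op, self.value = col, op, value
    def sql(self):
        return '%s %s %r' % (self.col, self.op, self.value)

class Column:
    def __init__(self, name):
        self.name = name
    def __ge__(self, value):
        return Cond(self.name, '>=', value)

print((Column('quantity') >= 42).sql())   # quantity >= 42
```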

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From guido at python.org  Tue Dec 19 00:32:51 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 18:32:51 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: Your message of "Tue, 19 Dec 2000 12:04:42 +1300."
             <200012182304.MAA02642@s454.cosc.canterbury.ac.nz> 
References: <200012182304.MAA02642@s454.cosc.canterbury.ac.nz> 
Message-ID: <200012182332.SAA18456@cj20424-a.reston1.va.home.com>

> Not exactly mathematical, but some day I'd like to create
> a database access module which lets you say things like
> 
>   mydb = OpenDB("inventory")
>   parts = mydb.parts
>   tuples = mydb.retrieve(parts.name, parts.number).where(parts.quantity >= 42)
> 
> Of course, to really make this work I need to be able
> to overload "and" and "or" as well, but that's a whole
> 'nother PEP...

Believe it or not, in 1998 we already had a suggestion for overloading
these too.  This is hinted at in David Ascher's proposal (the Appendix
of PEP 208) where objects could define __boolean_and__ to overload
x<y<z.  It doesn't get all the details right: it's not enough to check
if the left operand is true, since that leaves 'or' out in the cold,
but a different test (i.e. the presence of __boolean_and__) would
work.

I am leaving this out of the current PEP because the bytecode you have
to generate for this is very hairy.  A simple expression like ``f()
and g()'' would become something like:

  outcome = f()
  if hasattr(outcome, '__boolean_and__'):
      outcome = outcome.__boolean_and__(g())
  elif outcome:
      outcome = g()

The problem I have with this is that the code to evaluate g() has to
be generated twice!  In general, g() could be an arbitrary expression.
We can't evaluate g() ahead of time, because it should not be
evaluated at all when outcome is false and doesn't define
__boolean_and__().

For the same reason the current PEP doesn't support x<y<z when x<y
doesn't return a Boolean result; a similar solution would be possible.
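For reference, the chaining semantics any overloading scheme would have to preserve: in x < y < z the middle operand is evaluated exactly once, and the second comparison is skipped when the first fails. A quick demonstration:

```python
# x < y < z behaves roughly like:  t = y;  (x < t) and (t < z)
# with y evaluated once and short-circuit evaluation of the rest.
calls = []
def mid():
    calls.append(1)
    return 5

assert (1 < mid() < 9) is True
assert len(calls) == 1        # middle operand evaluated once

calls = []
assert (7 < mid() < 9) is False
assert len(calls) == 1        # still called once; the second
                              # comparison was skipped entirely
```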

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Tue Dec 19 01:09:35 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 18 Dec 2000 19:09:35 -0500
Subject: [Python-Dev] RE: The Dictionary Gem is polished!
In-Reply-To: <3A3CFA17.ED26F51A@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHJIEAA.tim.one@home.com>

Sounds good to me!  It's a very cheap way to get the high bits into play.

>        i = (~_hash) & mask

The ~ here seems like pure superstition to me (and the comments in the C
code don't justify it at all -- I added a nag of my own about that the last
time I checked in dictobject.c -- and see below for a bad consequence of
doing ~).

>            # note that we do not mask!
>            # even the shifting may not be worth it.
>            incr = _hash ^ (_hash >> 3)

The shifting was just another cheap trick to get *some* influence from the
high bits.  It's very limited, though.  Toss it (it appears to be from the
"random operations yield random results" <wink/sigh> matchbook school of
design).

[MAL]
> BTW, would changing the hash function on strings from the simple
> XOR scheme to something a little smarter help improve the performance
> too (e.g. most strings used in programming never use the 8-th
> bit) ?

Don't understand -- the string hash uses multiplication:

    x = (1000003*x) ^ *p++;

in a loop.  Replacing "^" there by "+" should yield slightly better results.
As is, string hashes are a lot like integer hashes, in that "consecutive"
strings

   J001
   J002
   J003
   J004
   ...

yield hashes very close together in value.  But, because the current dict
algorithm uses ~ on the full hash but does not use ~ on the initial
increment, (~hash)+incr too often yields the same result for distinct hashes
(i.e., there's a systematic (but weak) form of clustering).
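A Python rendering of that C loop makes the clustering easy to see (a sketch: the real CPython function also folds in the first character and the string length, omitted here):

```python
# Sketch of the C loop  x = (1000003*x) ^ *p++;  quoted above.
# The real function also mixes in the first character and the
# length; both are omitted for clarity.
def string_hash(s):
    x = 0
    for ch in s:
        x = (1000003 * x) ^ ord(ch)
    return x

# Strings differing only in the last character differ only in the
# low bits of the hash:
print(string_hash('J001') ^ string_hash('J002'))   # 3
```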

Note that Python is doing something very unusual:  hashes are *usually*
designed to yield an approximation to randomness across all bits.  But
Python's hashes never achieve that.  This drives theoreticians mad (like the
fellow who originally came up with the GF idea), but tends to work "better
than random" in practice (e.g., a truly random hash function would almost
certainly produce many collisions when fed a fat range of consecutive
integers but still less than half the table size; but Python's trivial
"identity" integer hash produces no collisions in that common case).
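In CPython that identity hash is easy to verify for small non-negative ints:

```python
# CPython hashes small non-negative ints to themselves, so a run of
# consecutive keys maps to distinct slots: zero collisions in a
# power-of-two table larger than the run.
assert all(hash(i) == i for i in range(1000))
assert len({hash(i) % 1024 for i in range(1000)}) == 1000
```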

[Christian]
> - The bits used from the string hash are not well distributed
> - using a "warmup wheel" on the hash to suck all bits in
>   gives the same quality of hashes like random numbers.

See above and be very cautious:  none of Python's hash functions produce
well-distributed bits, and-- in effect --that's why Python dicts often
perform "better than random" on common data.  Even what you've done so far
appears to provide marginally worse statistics for Guido's favorite kind of
test case ("worse" in two senses:  total number of collisions (a measure of
amortized lookup cost), and maximum collision chain length (a measure of
worst-case lookup cost)):

   d = {}
   for i in range(N):
       d[repr(i)] = i

check-in-one-thing-then-let-it-simmer-ly y'rs  - tim




From tismer at tismer.com  Tue Dec 19 02:16:27 2000
From: tismer at tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 02:16:27 +0100
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <Pine.LNX.4.10.10012180848140.20569-100000@akbar.nevex.com>
Message-ID: <3A3EB6EB.C79A3896@tismer.com>



Greg Wilson wrote:
> 
> > > > Here some results, dictionaries have 1000 entries:
> > I will publish some results later today.
> 
> In Doctor Dobb's Journal, right? :-)  We'd *really* like this article...

Well, the results are not so bad:

I stopped testing computation time for the Python dictionary
implementation, in favor of "trips". How many trips does
the re-hash take in a dictionary?

Tests were run for dictionaries of size 1000, 2000, 3000, 4000.

Dictionary 1 consists of i, formatted as string.
Dictionary 2 consists of strings containing the binary of i.
Dictionary 3 consists of random numbers.
Dictionary 4 consists of i << 16.

Algorithms:
old is the original dictionary algorithm implemented
in Python (probably quite correct now, using longs :-)

new is the proposed incremental bits-suck-in-division
algorithm.

new2 is a version of new, where all extra bits of the
hash function are wheeled in in advance. The computation
time of this is not negligible, so please use this result
for reference only.


Here the results:
(bad integers (old) not computed for N>1000)

"""
D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293 new=302 new2=221
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=499500 new=13187 new2=999
trips for random integers old=377 new=371 new2=393
trips for windows names old=230 new=207 new2=200
N=2000
trips for strings old=1093 new=1109 new2=786
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=26455 new2=1999
trips for random integers old=691 new=710 new2=741
trips for windows names old=503 new=542 new2=564
N=3000
trips for strings old=810 new=839 new2=609
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=38681 new2=2999
trips for random integers old=762 new=740 new2=735
trips for windows names old=712 new=711 new2=691
N=4000
trips for strings old=1850 new=1843 new2=1375
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=52994 new2=3999
trips for random integers old=1440 new=1450 new2=1414
trips for windows names old=1449 new=1434 new2=1457

D:\crml_doc\platf\py>
"""

Interpretation:
---------------
Short numeric strings show a slightly elevated trip count.
This means that the hash() function could be enhanced.
But the gain would be below 10 percent compared to
random hashes, and is therefore not worth it.

Binary representations of numbers as strings still create
perfect hash numbers.

Bad integers (complete hash clash due to high power of 2)
are handled fairly well by the new algorithm. "new2"
shows that they can be brought down to nearly perfect
hashes just by applying the "hash melting wheel".

Windows names are mostly upper case, and fairly verbose.
They appear to perform nearly as well as random numbers.
This means: The Python string hash function is very good
for a wide area of applications.

In Summary: I would try to modify the string hash function
slightly for short strings, but only if this does not
negatively affect the results above.

Summary of summary:
There is no really low hanging fruit in string hashing.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]
polys = map(long, polys)

class NULL: pass

class Dictionary:
    dummy = "<dummy key>"
    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg
        mp.warmup = newalg>1
        mp.trips = 0

    def getTrips(self):
        trips = self.trips
        self.trips = 0
        return trips

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # the shifting is worth it in the incremental case.

            ## added after posting to python-dev:
            uhash = _hash & 0xffffffffl
            if mp.warmup:
                incr = uhash
                mask2 = 0xffffffffl ^ mask
                while mask2 > mask:
                    if (incr & 1):
                        incr = incr ^ mp.ma_poly
                    incr = incr >> 1
                    mask2 = mask2>>1
                # this loop *can* be sped up by tables
                # with precomputed multiple shifts.
                # But I'm not sure if it is worth it at all.
            else:
                incr = uhash ^ (uhash >> 3)

###### TO HERE
            
        if (not incr):
            incr = mask
        while 1:
            mp.trips = mp.trips+1
            
            ep = ep0[int((i+incr)&mask)]
            if (ep[me_key] is NULL) :
                if (freeslot is not NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                 (ep[me_hash] == _hash and
                  cmp(ep[me_key], key) == 0)) :
                return ep

            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots

        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))      

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots

        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
##      /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    for i in range(n):
        s = str(i) #* 5
        #s = chr(i%256) + chr(i>>8)##
        d1[s] = d2[s] = d3[s] = i
    return d1, d2, d3

def istring_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    for i in range(n):
        s = chr(i%256) + chr(i>>8)
        d1[s] = d2[s] = d3[s] = i
    return d1, d2, d3

def random_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    from whrandom import randint
    import sys
    keys = []
    for i in range(n):
        keys.append(randint(0, sys.maxint-1))
    for i in keys:
        d1[i] = d2[i] = d3[i] = i
    return d1, d2, d3

def badnum_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(n):
        bad = i << shift
        d2[bad] = d3[bad] = i
        if n <= 1000: d1[bad] = i
    return d1, d2, d3

def names_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    import win32con
    keys = win32con.__dict__.keys()
    if len(keys) < n:
        keys = []
    for s in keys[:n]:
        d1[s] = d2[s] = d3[s] = s
    return d1, d2, d3

def do_test(dict):
    keys = dict.keys()
    dict.getTrips() # reset
    test(keys, dict)
    return dict.getTrips()

EXTREME=1

if __name__ == "__main__":

    for N in (1000,2000,3000,4000):    

        sdold, sdnew, sdnew2 = string_dicts(N)
        idold, idnew, idnew2 = istring_dicts(N)
        bdold, bdnew, bdnew2 = badnum_dicts(N)
        rdold, rdnew, rdnew2 = random_dicts(N)
        ndold, ndnew, ndnew2 = names_dicts(N)

        print "N=%d" %N        
        print "trips for strings old=%d new=%d new2=%d" % tuple(
            map(do_test, (sdold, sdnew, sdnew2)) )
        print "trips for bin strings old=%d new=%d new2=%d" % tuple(
            map(do_test, (idold, idnew, idnew2)) )
        print "trips for bad integers old=%d new=%d new2=%d" % tuple(
            map(do_test, (bdold, bdnew, bdnew2)))
        print "trips for random integers old=%d new=%d new2=%d" % tuple(
            map(do_test, (rdold, rdnew, rdnew2)))
        print "trips for windows names old=%d new=%d new2=%d" % tuple(
            map(do_test, (ndold, ndnew, ndnew2)))

"""
Results with a shift of 10 (EXTREME=0):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""


From tismer at tismer.com  Tue Dec 19 02:51:32 2000
From: tismer at tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 02:51:32 +0100
Subject: [Python-Dev] Re: The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCGEHJIEAA.tim.one@home.com>
Message-ID: <3A3EBF23.750CF761@tismer.com>


Tim Peters wrote:
> 
> Sounds good to me!  It's a very cheap way to get the high bits into play.

That's what I wanted to hear. It's also the reason why I try
to stay conservative: Just do an obviously useful bit, but
do not break any of the inherent benefits, like those
"better than random" amenities.
Python's dictionary algorithm appears to be "near perfect"
and of "never touch but veery carefully or redo it completely".
I tried the tightrope walk of just adding a tiny topping.

> >        i = (~_hash) & mask

Yes that stuff was 2 hours last nite :-)
I just decided to not touch it. Arbitrary crap!
Although an XOR with hash >> number of mask bits
would perform much better (in many cases but not all).
Anyway, simple shifting cannot solve general bit
distribution problems. Nor can I :-)

> The ~ here seems like pure superstition to me (and the comments in the C
> code don't justify it at all -- I added a nag of my own about that the last
> time I checked in dictobject.c -- and see below for a bad consequence of
> doing ~).
> 
> >            # note that we do not mask!
> >            # even the shifting may not be worth it.
> >            incr = _hash ^ (_hash >> 3)
> 
> The shifting was just another cheap trick to get *some* influence from the
> high bits.  It's very limited, though.  Toss it (it appears to be from the
> "random operations yield random results" <wink/sigh> matchbook school of
> design).

Now, comment it out, and you see my new algorithm perform much worse.
I just kept it since it had an advantage on "my case". (bad guy I know).
And I wanted to have an argument for my change to get accepted.
"No cost, just profit, nearly the same" was what I tried to sell.

> [MAL]
> > BTW, would changing the hash function on strings from the simple
> > XOR scheme to something a little smarter help improve the performance
> > too (e.g. most strings used in programming never use the 8-th
> > bit) ?
> 
> Don't understand -- the string hash uses multiplication:
> 
>     x = (1000003*x) ^ *p++;
> 
> in a loop.  Replacing "^" there by "+" should yield slightly better results.

For short strings, this prime has bad influence on the low bits,
making it perform suboptimally for small dicts.
See the new2 algo which funnily corrects for that.
The reason is obvious: Just look at the bit pattern
of 1000003:  '0xf4243'

Without giving proof, this smells like bad bit distribution on small
strings to me. You smell it too, right?
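The claim about the bit pattern is easy to check (a quick verification
added here, not part of the original mail):

```python
prime = 1000003
assert hex(prime) == '0xf4243'
# Only a handful of the multiplier's low bits are set, so for very short
# strings the multiply stirs the low bits of the hash rather little.
assert bin(prime & 0xff) == '0b1000011'
```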

> As is, string hashes are a lot like integer hashes, in that "consecutive"
> strings
> 
>    J001
>    J002
>    J003
>    J004
>    ...
> 
> yield hashes very close together in value. 

A bad generator in that case. I'll look for a better one.

> But, because the current dict
> algorithm uses ~ on the full hash but does not use ~ on the initial
> increment, (~hash)+incr too often yields the same result for distinct hashes
> (i.e., there's a systematic (but weak) form of clustering).

You name it.

> Note that Python is doing something very unusual:  hashes are *usually*
> designed to yield an approximation to randomness across all bits.  But
> Python's hashes never achieve that.  This drives theoreticians mad (like the
> fellow who originally came up with the GF idea), but tends to work "better
> than random" in practice (e.g., a truly random hash function would almost
> certainly produce many collisions when fed a fat range of consecutive
> integers but still less than half the table size; but Python's trivial
> "identity" integer hash produces no collisions in that common case).

A good reason to be careful with changes(ahem).

> [Christian]
> > - The bits used from the string hash are not well distributed
> > - using a "warmup wheel" on the hash to suck all bits in
> >   gives the same quality of hashes like random numbers.
> 
> See above and be very cautious:  none of Python's hash functions produce
> well-distributed bits, and-- in effect --that's why Python dicts often
> perform "better than random" on common data.  Even what you've done so far
> appears to provide marginally worse statistics for Guido's favorite kind of
> test case ("worse" in two senses:  total number of collisions (a measure of
> amortized lookup cost), and maximum collision chain length (a measure of
> worst-case lookup cost)):
> 
>    d = {}
>    for i in range(N):
>        d[repr(i)] = i

Nah, I did quite a lot of tests, and the trip number shows a
variation of about 10%, without judging old or new for better.
This is just the randomness inside.

> check-in-one-thing-then-let-it-simmer-ly y'rs  - tim

This is why I think to be even more conservative:
Try to use a division wheel, but with the inverses
of the original primitive roots, just in order to
get at Guido's results :-)

making-perfect-hashes-of-interneds-still-looks-promising - ly y'rs

   - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From greg at cosc.canterbury.ac.nz  Tue Dec 19 04:07:56 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 19 Dec 2000 16:07:56 +1300 (NZDT)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <200012182332.SAA18456@cj20424-a.reston1.va.home.com>
Message-ID: <200012190307.QAA02663@s454.cosc.canterbury.ac.nz>

> The problem I have with this is that the code to evaluate g() has to
> be generated twice!

I have an idea how to fix that. There need to be two methods,
__boolean_and_1__ and __boolean_and_2__. The first operand
is evaluated and passed to __boolean_and_1__. If it returns
a result, that becomes the result of the expression, and the
second operand is short-circuited.

If __boolean_and_1__ raises a NeedOtherOperand exception
(or there is no __boolean_and_1__ method), the second operand 
is evaluated, and both operands are passed to __boolean_and_2__.

The bytecode would look something like

        <evaluate first operand>
        BOOLEAN_AND_1 label
        <evaluate second operand>
        BOOLEAN_AND_2
label:  ...

I'll make a PEP out of this one day if I get enthusiastic
enough.
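The proposed protocol is easy to prototype in plain Python (a sketch:
the method names come from the proposal above and are hypothetical, and
the thunk stands in for the not-yet-evaluated second operand):

```python
class NeedOtherOperand(Exception):
    """Raised by __boolean_and_1__ to request the second operand."""

def boolean_and(first, second_thunk):
    # What BOOLEAN_AND_1/BOOLEAN_AND_2 would do: let the first operand
    # decide alone, and evaluate the second operand only if it declines.
    meth1 = getattr(type(first), '__boolean_and_1__', None)
    if meth1 is not None:
        try:
            return meth1(first)
        except NeedOtherOperand:
            pass
    second = second_thunk()          # operand two evaluated at most once
    meth2 = getattr(type(first), '__boolean_and_2__', None)
    if meth2 is not None:
        return meth2(first, second)
    return first and second          # no hooks: the ordinary result

class Flag:
    def __init__(self, value):
        self.value = value
    def __boolean_and_1__(self):
        if not self.value:
            return self              # falsy: short-circuit operand two
        raise NeedOtherOperand
    def __boolean_and_2__(self, other):
        return other

evaluated = []
def rhs():
    evaluated.append(True)
    return "right"

assert boolean_and(Flag(False), rhs).value is False
assert evaluated == []               # second operand never ran
assert boolean_and(Flag(True), rhs) == "right"
assert evaluated == [True]
```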

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From tim.one at home.com  Tue Dec 19 05:55:33 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 18 Dec 2000 23:55:33 -0500
Subject: [Python-Dev] The Dictionary Gem is polished!
In-Reply-To: <3A3EB6EB.C79A3896@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEIAIEAA.tim.one@home.com>

Something else to ponder:  my tests show that the current ("old") algorithm
performs much better (somewhat worse than "new2" == new algorithm + warmup)
if incr is simply initialized like so instead:

        if mp.oldalg:
            incr = (_hash & 0xffffffffL) % (mp.ma_size - 1)

That's another way to get all the bits to contribute to the result.  Note
that a mod by size-1 is analogous to "casting out nines" in decimal:  it's
the same as breaking hash into fixed-sized pieces from the right (10 bits
each if size=2**10, etc), adding the pieces together, and repeating that
process until only one piece remains.  IOW, it's a degenerate form of
division, but works well all the same.  It didn't improve over that when I
tried a mod by the largest prime less than the table size (which suggests
we're sucking all we can out of the *probe* sequence given a sometimes-poor
starting index).
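The "casting out nines" analogy can be checked in a few lines (a sketch
added here, not Tim's code):

```python
def chunk_sum_mod(n, bits):
    # Break n into fixed-size pieces from the right (each `bits` wide),
    # add the pieces, and repeat until one piece remains -- base-2**bits
    # "casting out nines".  The result equals n % (2**bits - 1).
    mask = (1 << bits) - 1
    while n > mask:
        total = 0
        while n:
            total += n & mask
            n >>= bits
        n = total
    return n % mask  # 2**bits - 1 itself folds to 0

# For a table of size 2**10, mod by size-1 is exactly this digit-folding.
for h in (0, 1, 12345, 0xdeadbeef, 2**32 - 1):
    assert chunk_sum_mod(h, 10) == h % (2**10 - 1)
```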

However, it's subject to the same weak clustering phenomenon as the old
method due to the ill-advised "~hash" operation in computing the initial
index.  If ~ is also thrown away, it's as good as new2 (here I've tossed out
the "windows names", and "old" == existing algorithm except (a) get rid of ~
when computing index and (b) do mod by size-1 when computing incr):

N=1000
trips for strings old=230 new=261 new2=239
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=999 new=13212 new2=999
trips for random integers old=399 new=421 new2=410
N=2000
trips for strings old=787 new=1066 new2=827
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=26481 new2=1999
trips for random integers old=652 new=733 new2=650
N=3000
trips for strings old=547 new=760 new2=617
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=38701 new2=2999
trips for random integers old=724 new=743 new2=768
N=4000
trips for strings old=1311 new=1657 new2=1315
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=53014 new2=3999
trips for random integers old=1476 new=1587 new2=1493

The new and new2 values differ in minor ways from the ones you posted
because I got rid of the ~ (the ~ has a bad interaction with "additive"
means of computing incr, because the ~ tends to move the index in the
opposite direction, and these moves in opposite directions tend to cancel
out when computing incr+index the first time).

too-bad-mod-is-expensive!-ly y'rs  - tim




From tim.one at home.com  Tue Dec 19 06:50:01 2000
From: tim.one at home.com (Tim Peters)
Date: Tue, 19 Dec 2000 00:50:01 -0500
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: <20001217220008.D29681@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIDIEAA.tim.one@home.com>

[Tim]
> Starting last night, I get this msg whenever I update Python code w/
> CVSROOT=:ext:tim_one at cvs.python.sourceforge.net:/cvsroot/python:
>
> """
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> @       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
> Someone could be eavesdropping on you right now
> (man-in-the-middle attack)!
> It is also possible that the host key has just been changed.
> Please contact your system administrator.
> Add correct host key in C:\Code/.ssh/known_hosts to get rid of
> this message.
> Password authentication is disabled to avoid trojan horses.
> """
>
> This is SourceForge's doing, and is permanent (they've changed
> keys on their end). ...

[Thomas Wouters]
> What sourceforge did was switch Linux distributions, and upgrade.
> The switch doesn't really matter for the SSH problem, because recent
> Debian and recent RedHat releases both use a new ssh, the OpenBSD
> ssh implementation.
> Apparently, it isn't entirely backwards compatible to old versions of
> F-secure ssh. For one thing, it doesn't support the 'idea' cypher. This
> might or might not be your problem; if it is, you should get a decent
> message that gives a relatively clear message such as 'cypher type 'idea'
> not supported'.
> ... [and quite a bit more] ...

I hope you're feeling better today <wink>.  "The problem" was the one the
warning msg spelled out:  "It is also possible that the host key has just been
changed.".  SF changed keys.  That's the whole banana right there.  Deleting
the sourceforge keys from known_hosts fixed it (== convinced ssh to install
new SF keys the next time I connected).




From tim.one at home.com  Tue Dec 19 06:58:45 2000
From: tim.one at home.com (Tim Peters)
Date: Tue, 19 Dec 2000 00:58:45 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <200012171438.JAA21603@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIEIEAA.tim.one@home.com>

[Tim]
> I expect it would do less harm to introduce a compile-time warning for
> locals that are never referenced (such as the "a" in "set").

[Guido]
> Another warning that would be quite useful (and trap similar cases)
> would be "local variable used before set".

Java elevated that last one to a compile-time error, via its "definite
assignment" rules:  you not only have to make sure a local is bound before
reference, you have to make it *obvious* to the compiler that it's bound
before reference.  I think this is a Good Thing, because with intense
training, people can learn to think like a compiler too <wink>.

Seriously, in several of the cases where gcc warned about "maybe used before
set" in the Python implementation, the warnings were bogus but it was
non-trivial to deduce that.  Such code is very brittle under modification,
and the definite assignment rules make that path to error a non-starter.

Example:

def f(N):
    if N > 0:
        for i in range(N):
            if i == 0:
                j = 42
            else:
                f2(i)
    elif N <= 0:
        j = 24
    return j

It's a Crime Against Humanity to make the code reader *deduce* that j is
always bound by the time "return" is executed.





From guido at python.org  Tue Dec 19 07:08:14 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 01:08:14 -0500
Subject: [Python-Dev] Error: syncmail script missing
Message-ID: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>

I just checked in the documentation for the warnings module.  (Check
it out!)

When I ran "cvs commit" in the Doc directory, it said, amongst other
things:

    sh: /cvsroot/python/CVSROOT/syncmail: No such file or directory

I suppose this may be a side effect of the transition to new hardware
of the SourceForge CVS archive.  (Which, by the way, has dramatically
improved the performance of typical CVS operations -- I am no longer
afraid to do a cvs diff or cvs log in Emacs, or to do a cvs update
just to be sure.)

Could some of the Powers That Be (Fred or Barry :-) check into what
happened to the syncmail script?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Tue Dec 19 07:10:04 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 19 Dec 2000 01:10:04 -0500 (EST)
Subject: [Python-Dev] Error: syncmail script missing
In-Reply-To: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
Message-ID: <14910.64444.662460.48236@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > Could some of the Powers That Be (Fred or Barry :-) check into what
 > happened to the syncmail script?

  We've seen this before, but I'm not sure what it was.  Barry, do you
recall?  Had the Python interpreter landed in a different directory?
Or perhaps the location of the CVS repository is different, so
syncmail isn't where loginfo says.
  Tomorrow... scp to SF appears broken as well.  ;(


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tim.one at home.com  Tue Dec 19 07:16:15 2000
From: tim.one at home.com (Tim Peters)
Date: Tue, 19 Dec 2000 01:16:15 -0500
Subject: [Python-Dev] Error: syncmail script missing
In-Reply-To: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIFIEAA.tim.one@home.com>

[Guido]
> I just checked in the documentation for the warnings module.  (Check
> it out!)

Everyone should note that this means Guido will be taking his traditional
post-release vacation almost immediately <wink -- but he is about to
leave!>.

> When I ran "cvs commit" in the Doc directory, it said, amongst other
> things:
>
>     sh: /cvsroot/python/CVSROOT/syncmail: No such file or directory
>
> I suppose this may be a side effect of the transition to new hardware
> of the SourceForge CVS archive.

The lack of checkin mail was first noted on a Jython list.  Finn wisely
replied that he'd just sit back and wait for the CPython people to figure
out how to fix it.

> ...
> Could some of the Powers That Be (Fred or Barry :-) check into what
> happened to the syncmail script?

Don't worry, I'll do my part by nagging them in your absence <wink>.  Bon
holiday voyage!




From cgw at fnal.gov  Tue Dec 19 07:32:15 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Tue, 19 Dec 2000 00:32:15 -0600 (CST)
Subject: [Python-Dev] cycle-GC question
Message-ID: <14911.239.12288.546710@buffalo.fnal.gov>

The following program:

import rexec
while 1:
      x = rexec.RExec()
      del x

leaks memory at a fantastic rate.

It seems clear (?) that this is due to the call to "set_rexec" at
rexec.py:140, which creates a circular reference between the `rexec'
and `hooks' objects.  (There's even a nice comment to that effect).

I'm curious however as to why the spiffy new cyclic-garbage collector
doesn't pick this up?
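The shape of that cycle is easy to reproduce in isolation (a sketch
with illustrative names; on an interpreter where the collector is doing
its job, collect() reclaims the pair):

```python
import gc

class Obj:
    pass

rexec_like, hooks_like = Obj(), Obj()
rexec_like.hooks = hooks_like      # same shape as set_rexec's cross-links
hooks_like.rexec = rexec_like
del rexec_like, hooks_like

found = gc.collect()               # cycle detector should find the pair
assert found >= 2
assert gc.garbage == []            # nothing uncollectable left behind
```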

Just-wondering-ly y'rs,
		  cgw




From tim_one at email.msn.com  Tue Dec 19 10:24:18 2000
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 19 Dec 2000 04:24:18 -0500
Subject: [Python-Dev] RE: The Dictionary Gem is polished!
In-Reply-To: <3A3EBF23.750CF761@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEIKIEAA.tim_one@email.msn.com>

[Christian Tismer]
> ...
> For short strings, this prime has bad influence on the low bits,
> making it perform suboptimally for small dicts.
> See the new2 algo which funnily corrects for that.
> The reason is obvious: Just look at the bit pattern
> of 1000003:  '0xf4243'
>
> Without giving proof, this smells like bad bit distribution on small
> strings to me. You smell it too, right?
> ...

[Tim]
> As is, string hashes are a lot like integer hashes, in that
> "consecutive" strings
>
>    J001
>    J002
>    J003
>    J004
>    ...
>
> yield hashes very close together in value.

[back to Christian]
> A bad generator in that case. I'll look for a better one.

Not necessarily!  It's for that same reason "consecutive strings" can have
"better than random" behavior today.  And consecutive strings-- like
consecutive ints --are a common case.
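That closeness is mechanical, not accidental.  A stripped-down sketch of
the hash loop makes it visible (the seed and final length-mix of the real
C version are omitted here, so the exact values differ, but the clustering
doesn't depend on them):

```python
def loop_hash(s, combine):
    # Core recurrence of the old string hash, truncated to 32 bits:
    # x = (1000003*x) OP c for each character c.
    x = 0
    for ch in s:
        x = (1000003 * x) & 0xFFFFFFFF
        x = combine(x, ord(ch)) & 0xFFFFFFFF
    return x

xor_hash = lambda s: loop_hash(s, lambda x, c: x ^ c)
add_hash = lambda s: loop_hash(s, lambda x, c: x + c)

# "J001" and "J002" share everything but the last character, so the xor
# variant's hashes differ in exactly the bits where '1' and '2' differ:
assert xor_hash("J001") ^ xor_hash("J002") == ord("1") ^ ord("2")
# ... and the additive variant's hashes are consecutive (mod 2**32):
assert (add_hash("J002") - add_hash("J001")) % 2**32 == 1
```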

Here are the numbers for the synthesized string cases:

N=1000
trips for strings old=293 new=302 new2=221
trips for bin strings old=0 new=0 new2=0
N=2000
trips for strings old=1093 new=1109 new2=786
trips for bin strings old=0 new=0 new2=0
N=3000
trips for strings old=810 new=839 new2=609
trips for bin strings old=0 new=0 new2=0
N=4000
trips for strings old=1850 new=1843 new2=1375
trips for bin strings old=0 new=0 new2=0

Here they are again, after doing nothing except changing the "^" to "+" in
the string hash, i.e. replacing

		x = (1000003*x) ^ *p++;
by
		x = (1000003*x) + *p++;

N=1000
trips for strings old=140 new=127 new2=108
trips for bin strings old=0 new=0 new2=0
N=2000
trips for strings old=480 new=434 new2=411
trips for bin strings old=0 new=0 new2=0
N=3000
trips for strings old=821 new=857 new2=631
trips for bin strings old=0 new=0 new2=0
N=4000
trips for strings old=1892 new=1852 new2=1476
trips for bin strings old=0 new=0 new2=0

The first two sizes are dramatically better, the last two a wash.  If you
want to see a real disaster, replace the "+" with "*" <wink>:

N=1000
trips for strings old=71429 new=6449 new2=2048
trips for bin strings old=81187 new=41117 new2=41584
N=2000
trips for strings old=26882 new=9300 new2=6103
trips for bin strings old=96018 new=46932 new2=42408

I got tired of waiting at that point ...

suspecting-a-better-string-hash-is-hard-to-find-ly y'rs  - tim





From martin at loewis.home.cs.tu-berlin.de  Tue Dec 19 12:58:17 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 12:58:17 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>

> I agree it would be useful to define these terms, although those
> particular definitions appear to be missing the most important point
> from the user's POV (not a one says "going away someday").

PEP 4 says

# Usage of a module may be `deprecated', which means that it may be
# removed from a future Python release.

Proposals for better wording are welcome (and yes, I still have to get
the comments that I got into the document).

Regards,
Martin



From guido at python.org  Tue Dec 19 15:48:47 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 09:48:47 -0500
Subject: [Python-Dev] cycle-GC question
In-Reply-To: Your message of "Tue, 19 Dec 2000 00:32:15 CST."
             <14911.239.12288.546710@buffalo.fnal.gov> 
References: <14911.239.12288.546710@buffalo.fnal.gov> 
Message-ID: <200012191448.JAA28737@cj20424-a.reston1.va.home.com>

> The following program:
> 
> import rexec
> while 1:
>       x = rexec.RExec()
>       del x
> 
> leaks memory at a fantastic rate.
> 
> It seems clear (?) that this is due to the call to "set_rexec" at
> rexec.py:140, which creates a circular reference between the `rexec'
> and `hooks' objects.  (There's even a nice comment to that effect).
> 
> I'm curious however as to why the spiffy new cyclic-garbage collector
> doesn't pick this up?

Me too.  I turned on gc debugging (gc.set_debug(077) :-) and got
messages suggesting that it is not collecting everything.  The
output looks like this:

   .
   .
   .
gc: collecting generation 0...
gc: objects in each generation: 764 6726 89174
gc: done.
gc: collecting generation 1...
gc: objects in each generation: 0 8179 89174
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 764 0 97235
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 757 747 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 764 1386 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 757 2082 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 764 2721 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 757 3417 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 764 4056 97184
gc: done.
   .
   .
   .

With the third number growing each time a "generation 1" collection is
done.

Maybe Neil can shed some light?  The gc.garbage list is empty.

This is about as much as I know about the GC stuff...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From petrilli at amber.org  Tue Dec 19 16:25:18 2000
From: petrilli at amber.org (Christopher Petrilli)
Date: Tue, 19 Dec 2000 10:25:18 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Dec 19, 2000 at 12:58:17PM +0100
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>
Message-ID: <20001219102518.A14288@trump.amber.org>

So I was thinking about this whole thing, and wondering why it was
that seeing things like:

     " ".join(aList)

bugged me to no end, while:

     aString.lower()

didn't seem to look wrong. I finally put my finger on it, and I
haven't seen anyone mention it, so I guess I'll do so.  To me, the
concept of "join" on a string is just not quite kosher, instead it
should be something like this:

     aList.join(" ")

or if you want it without the indirection:

     ['item', 'item', 'item'].join(" ")

Now *THAT* looks right to me.  The example of a join method on a
string just doesn't quite gel in my head, and I did some thinking and
digging, and well, when I pulled up my Smalltalk browser, things like
join are done on Collections, not on Strings.  You're joining the
collection, not the string.

Perhaps in a rush to move some things that were "string related" in
the string module into methods on the strings themselves (something I
whole-heartedly support), we moved a few too many things
there---things that semantically don't really belong as methods on a
string object.

How this gets resolved, I don't know... but I know a lot of people
have looked at the string methods---and they each keep coming back to
1 or 2 that bug them... and I think it's those that really aren't
methods of a string, but instead something that operates with strings, 
but expects other things.

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org



From guido at python.org  Tue Dec 19 16:37:15 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 10:37:15 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Tue, 19 Dec 2000 10:25:18 EST."
             <20001219102518.A14288@trump.amber.org> 
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>  
            <20001219102518.A14288@trump.amber.org> 
Message-ID: <200012191537.KAA28909@cj20424-a.reston1.va.home.com>

> So I was thinking about this whole thing, and wondering why it was
> that seeing things like:
> 
>      " ".join(aList)
> 
> bugged me to no end, while:
> 
>      aString.lower()
> 
> didn't seem to look wrong. I finally put my finger on it, and I
> haven't seen anyone mention it, so I guess I'll do so.  To me, the
> concept of "join" on a string is just not quite kosher, instead it
> should be something like this:
> 
>      aList.join(" ")
> 
> or if you want it without the indirection:
> 
>      ['item', 'item', 'item'].join(" ")
> 
> Now *THAT* looks right to me.  The example of a join method on a
> string just doesn't quite gel in my head, and I did some thinking and
> digging, and well, when I pulled up my Smalltalk browser, things like
> join are done on Collections, not on Strings.  You're joining the
> collection, not the string.
> 
> Perhaps in a rush to move some things that were "string related" in
> the string module into methods on the strings themselves (something I
> whole-heartedly support), we moved a few too many things
> there---things that semantically don't really belong as methods on a
> string object.
> 
> How this gets resolved, I don't know... but I know a lot of people
> have looked at the string methods---and they each keep coming back to
> 1 or 2 that bug them... and I think it's those that really aren't
> methods of a string, but instead something that operates with strings, 
> but expects other things.

Boy, are you stirring up a can of worms that we've been through many
times before!  Nothing you say hasn't been said at least a hundred
times before, on this list as well as on c.l.py.

The problem is that if you want to make this a method on lists, you'll
also have to make it a method on tuples, and on arrays, and on NumPy
arrays, and on any user-defined type that implements the sequence
protocol...  That's just not reasonable to expect.

There really seem to be only two possibilities that don't have this
problem: (1) make it a built-in, or (2) make it a method on strings.

We chose (2) for uniformity, and to avoid the potential confusion with
os.path.join(), which is sometimes imported as a local.  If
" ".join(L) bugs you, try this:

    space = " "	 # This could be a global
     .
     .
     .
    s = space.join(L)
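
Guido's uniformity argument can be seen directly: because join() is a
method on the separator string, the same call works unchanged on lists,
tuples, and (in current Python) any iterable, with no per-type join
method needed:

```python
space = " "

# one string method covers every sequence type
assert space.join(["a", "b"]) == "a b"                  # list
assert space.join(("a", "b")) == "a b"                  # tuple
assert space.join(chr(c) for c in (97, 98)) == "a b"    # any iterable
```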

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Tue Dec 19 16:46:55 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 10:46:55 -0500
Subject: [Python-Dev] Death to string functions!
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>
	<20001219102518.A14288@trump.amber.org>
Message-ID: <14911.33519.764029.306876@anthem.concentric.net>

>>>>> "CP" == Christopher Petrilli <petrilli at amber.org> writes:

    CP> So I was thinking about this whole thing, and wondering why it
    CP> was that seeing things like:

    CP>      " ".join(aList)

    CP> bugged me to no end, while:

    CP>      aString.lower()

    CP> didn't seem to look wrong. I finally put my finger on it, and
    CP> I haven't seen anyone mention it, so I guess I'll do so.

Actually, it has been debated to death. ;)  This looks better:

    SPACE = ' '
    SPACE.join(aList)

That reads good to me ("space-join this list") and that's how I always
write it.  That said, there are certainly lots of people who agree
with you.  You can't put join() on sequences though, until you have
builtin base-classes, or interfaces, or protocols or some such
construct, because otherwise you'd have to add it to EVERY sequence,
including classes that act like sequences.

One idea that I believe has merit is to consider adding join() to the
builtins, probably with a signature like:

    join(aList, aString) -> aString

This horse has been whacked pretty good too, but I don't remember
seeing a patch or a pronouncement.
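
A minimal sketch of what such a builtin might look like, assuming it is
just string.join() moved into the builtins (the name, argument order,
and default separator here mirror the old string module; nothing was
ever pronounced):

```python
def join(words, sep=' '):
    # same contract as string.join(): concatenate the items of words,
    # separated by sep
    return sep.join(words)

assert join(['a', 'b', 'c']) == 'a b c'       # the one-arg form
assert join(['a', 'b', 'c'], '-') == 'a-b-c'
```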

-Barry



From nas at arctrix.com  Tue Dec 19 09:53:36 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 19 Dec 2000 00:53:36 -0800
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <200012191448.JAA28737@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Dec 19, 2000 at 09:48:47AM -0500
References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com>
Message-ID: <20001219005336.A303@glacier.fnational.com>

On Tue, Dec 19, 2000 at 09:48:47AM -0500, Guido van Rossum wrote:
> > import rexec
> > while 1:
> >       x = rexec.RExec()
> >       del x
> > 
> > leaks memory at a fantastic rate.
> > 
> > It seems clear (?) that this is due to the call to "set_rexec" at
> > rexec.py:140, which creates a circular reference between the `rexec'
> > and `hooks' objects.  (There's even a nice comment to that effect).

Line 140 is not the only place a circular reference is created.
There is another one which is trickier to find:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        m.__builtins__ = self.modules['__builtin__']
        return m

If the module being added is __builtin__ then m.__builtins__ = m.
The GC currently doesn't track modules.  I guess it should.  It
might be possible to avoid this circular reference but I don't
know enough about how RExec works.  Would something like:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        if mname != '__builtin__':
            m.__builtins__ = self.modules['__builtin__']
        return m
    
do the trick?
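
The self-reference Neil points out is easy to reproduce with a bare
module object.  Note that this thread predates module tracking: the 2.0
collector did not traverse modules, while current CPython does and
collects the cycle:

```python
import gc
import types
import weakref

m = types.ModuleType('__builtin__')
m.__builtins__ = m            # the self-reference created by add_module()
ref = weakref.ref(m)
del m
gc.collect()                  # current CPython tracks modules, so the cycle dies
assert ref() is None
```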

  Neil



From fredrik at effbot.org  Tue Dec 19 16:39:49 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 19 Dec 2000 16:39:49 +0100
Subject: [Python-Dev] Death to string functions!
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> <20001219102518.A14288@trump.amber.org>
Message-ID: <008301c069d3$76560a20$3c6340d5@hagrid>

"Christopher Petrilli" wrote:
> didn't seem to look wrong. I finally put my finger on it, and I
> haven't seen anyone mention it, so I guess I'll do so.  To me, the
> concept of "join" on a string is just not quite kosher, instead it
> should be something like this:
> 
>      aList.join(" ")
> 
> or if you want it without the indirection:
> 
>      ['item', 'item', 'item'].join(" ")
> 
> Now *THAT* looks right to me.

why do we keep coming back to this?

aString.join can do anything string.join can do, but aList.join
cannot.  if you don't understand why, check the archives.

</F>




From martin at loewis.home.cs.tu-berlin.de  Tue Dec 19 16:44:48 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 16:44:48 +0100
Subject: [Python-Dev] cycle-GC question
Message-ID: <200012191544.QAA11408@loewis.home.cs.tu-berlin.de>

> It seems clear (?) that this is due to the call to "set_rexec" at
> rexec.py:140, which creates a circular reference between the `rexec'
> and `hooks' objects.  (There's even a nice comment to that effect).

It's not all that clear that *this* is the cycle. In fact, it is not.

> I'm curious however as to why the spiffy new cyclic-garbage
> collector doesn't pick this up?

It's an interesting problem, so I spent this afternoon investigating
it. I soon found that I need a tool, so I introduced a new function
gc.getreferents which, when given an object, returns a list of objects
referring to that object. The patch for that feature is in

http://sourceforge.net/patch/?func=detailpatch&patch_id=102925&group_id=5470

Applying that function recursively, I can get an output that looks
like this:

<rexec.RExec instance at 0x81f5dcc>
 <method RExec.r_import of RExec instance at 0x81f5dcc>
  dictionary 0x81f4f24
 <method RExec.r_reload of RExec instance at 0x81f5dcc>
  dictionary 0x81f4f24 (seen)
 <method RExec.r_open of RExec instance at 0x81f5dcc>
  dictionary 0x81f4f24 (seen)
 <method RExec.r_exc_info of RExec instance at 0x81f5dcc>
  dictionary 0x8213bc4
 dictionary 0x820869c
  <rexec.RHooks instance at 0x8216cbc>
   dictionary 0x820866c
    <rexec.RExec instance at 0x81f5dcc> (seen)
   dictionary 0x8213bf4
    <ihooks.FancyModuleLoader instance at 0x81f7464>
     dictionary 0x820866c (seen)
     dictionary 0x8214144
      <ihooks.ModuleImporter instance at 0x8214124>
       dictionary 0x820866c (seen)

Each indentation level shows the objects which refer to the outer-next
object, e.g. the dictionary 0x820869c refers to the RExec instance,
and the RHooks instance refers to that dictionary. Clearly, the
dictionary 0x820869c is the RHooks' __dict__, and the reference
belongs to the 'rexec' key in that dictionary.

The recursion stops only when an object has been seen before (so it's a
cycle, or other non-tree graph), or if there are no referents (the
lists created to do the iteration are ignored).
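
In current Python the equivalent of the gc.getreferents patch is
gc.get_referrers() (today's gc.get_referents() walks the opposite
direction).  A rough sketch of the recursive dump, with interpreter
frames skipped and repeats marked, assuming nothing about Martin's
exact patch:

```python
import gc

def referrer_tree(obj, depth=0, seen=None, max_depth=3):
    # Indented dump of the objects referring to obj, stopping at
    # previously seen objects -- roughly Martin's recursive traversal.
    if seen is None:
        seen = set()
    tag = " (seen)" if id(obj) in seen else ""
    lines = ["  " * depth + type(obj).__name__ + tag]
    if tag or depth >= max_depth:
        return lines
    seen.add(id(obj))
    for ref in gc.get_referrers(obj):
        if type(ref).__name__ == "frame":   # ignore the interpreter's own frames
            continue
        lines.extend(referrer_tree(ref, depth + 1, seen, max_depth))
    return lines
```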

So it appears that the r_import method is referenced from some
dictionary, but that dictionary is not referenced anywhere???

Checking the actual structures shows that rexec creates a __builtin__
module, which has a dictionary that has an __import__ key. So the
reference to the method comes from the __builtin__ module, which in
turn is referenced as the RExec's .modules attribute, giving another
cycle.

However, module objects don't participate in garbage
collection. Therefore, gc.getreferents cannot traverse a module, and
the garbage collector won't find a cycle involving a garbage module.
I just submitted a bug report,

http://sourceforge.net/bugs/?func=detailbug&bug_id=126345&group_id=5470

which suggests that modules should also participate in garbage
collection.

Regards,
Martin



From guido at python.org  Tue Dec 19 17:01:46 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 11:01:46 -0500
Subject: [Python-Dev] cycle-GC question
In-Reply-To: Your message of "Tue, 19 Dec 2000 00:53:36 PST."
             <20001219005336.A303@glacier.fnational.com> 
References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com>  
            <20001219005336.A303@glacier.fnational.com> 
Message-ID: <200012191601.LAA29015@cj20424-a.reston1.va.home.com>

> might be possible to avoid this circular reference but I don't
> know enough about how RExec works.  Would something like:
> 
>     def add_module(self, mname):
>         if self.modules.has_key(mname):
>             return self.modules[mname]
>         self.modules[mname] = m = self.hooks.new_module(mname)
>         if mname != '__builtin__':
>             m.__builtins__ = self.modules['__builtin__']
>         return m
>     
> do the trick?

That's certainly a good thing to do (__builtin__ has no business
having a __builtins__!), but (in my feeble experiment) it doesn't make
the leaks go away.

Note that almost every module participates heavily in cycles: whenever
you define a function f(), f.func_globals is the module's __dict__,
which also contains a reference to f.  Similar for classes, with an
extra hop via the class object and its __dict__.
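
Guido's observation is easy to check; func_globals is spelled
__globals__ in current Python.  A small sketch using a fresh namespace:

```python
ns = {}
exec("def f():\n    pass", ns)        # define a function in a fresh namespace

f = ns['f']
assert f.__globals__ is ns            # the function holds its module namespace
assert f.__globals__['f'] is f        # ...which in turn holds the function: a cycle
```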

--Guido van Rossum (home page: http://www.python.org/~guido/)



From cgw at fnal.gov  Tue Dec 19 17:06:06 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Tue, 19 Dec 2000 10:06:06 -0600 (CST)
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <20001219005336.A303@glacier.fnational.com>
References: <14911.239.12288.546710@buffalo.fnal.gov>
	<200012191448.JAA28737@cj20424-a.reston1.va.home.com>
	<20001219005336.A303@glacier.fnational.com>
Message-ID: <14911.34670.664178.418523@buffalo.fnal.gov>

Neil Schemenauer writes:
 > 
 > Line 140 is not the only place a circular reference is created.
 > There is another one which is trickier to find:
 > 
 >     def add_module(self, mname):
 >         if self.modules.has_key(mname):
 >             return self.modules[mname]
 >         self.modules[mname] = m = self.hooks.new_module(mname)
 >         m.__builtins__ = self.modules['__builtin__']
 >         return m
 > 
 > If the module being added is __builtin__ then m.__builtins__ = m.
 > The GC currently doesn't track modules.  I guess it should.  It
 > might be possible to avoid this circular reference but I don't
 > know enough about how RExec works.  Would something like:
 > 
 >     def add_module(self, mname):
 >         if self.modules.has_key(mname):
 >             return self.modules[mname]
 >         self.modules[mname] = m = self.hooks.new_module(mname)
 >         if mname != '__builtin__':
 >             m.__builtins__ = self.modules['__builtin__']
 >         return m
 >     
 > do the trick?

No... if you change "add_module" in exactly the way you suggest
(without worrying about whether it breaks the functionality of rexec!)
and run the test

while 1:
      rexec.RExec()

you will find that it still leaks memory at a prodigious rate.

So, (unless there is yet another module-level cyclic reference) I
don't think this theory explains the problem.



From martin at loewis.home.cs.tu-berlin.de  Tue Dec 19 17:07:04 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 17:07:04 +0100
Subject: [Python-Dev] cycle-GC question
Message-ID: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>

> There is another one which is trickier to find:
[__builtin__.__builtins__ == __builtin__]

> Would something like:
[do not add __builtins__ to __builtin__]
> work?

No, because there is another one that is even trickier to find :-)

>>> print r
<rexec.RExec instance at 0x81f7cac>
>>> print r.modules['__builtin__'].open.im_self
<rexec.RExec instance at 0x81f7cac>

Please see my other message; I think modules should be gc'ed.

Regards,
Martin



From nas at arctrix.com  Tue Dec 19 10:24:29 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 19 Dec 2000 01:24:29 -0800
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Dec 19, 2000 at 05:07:04PM +0100
References: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>
Message-ID: <20001219012429.A520@glacier.fnational.com>

On Tue, Dec 19, 2000 at 05:07:04PM +0100, Martin v. Loewis wrote:
> I think modules should be gc'ed.

I agree.  Its easy to do.  If no one does over Christmas I will
do it before 2.1 is released.

  Neil



From tismer at tismer.com  Tue Dec 19 16:48:58 2000
From: tismer at tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 17:48:58 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCMEIAIEAA.tim.one@home.com>
Message-ID: <3A3F836A.DEDF1011@tismer.com>


Tim Peters wrote:
> 
> Something else to ponder:  my tests show that the current ("old") algorithm
> performs much better (somewhat worse than "new2" == new algorithm + warmup)
> if incr is simply initialized like so instead:
> 
>         if mp.oldalg:
>             incr = (_hash & 0xffffffffL) % (mp.ma_size - 1)

Sure.  I did this as well, but didn't consider a division
since it was said to be too slow.  But this is very platform
dependent.  On Pentiums it might not be noticeable.

> That's another way to get all the bits to contribute to the result.  Note
> that a mod by size-1 is analogous to "casting out nines" in decimal:  it's
> the same as breaking hash into fixed-sized pieces from the right (10 bits
> each if size=2**10, etc), adding the pieces together, and repeating that
> process until only one piece remains.  IOW, it's a degenerate form of
> division, but works well all the same.  It didn't improve over that when I
> tried a mod by the largest prime less than the table size (which suggests
> we're sucking all we can out of the *probe* sequence given a sometimes-poor
> starting index).

Again, I tried this too.  Instead of the largest nearby prime I used
the nearest prime.  Remarkably, the nearest prime is identical
to the primitive element in a lot of cases.
But no improvement over the modulus.

> 
> However, it's subject to the same weak clustering phenomenon as the old
> method due to the ill-advised "~hash" operation in computing the initial
> index.  If ~ is also thrown away, it's as good as new2 (here I've tossed out
> the "windows names", and "old" == existing algorithm except (a) get rid of ~
> when computing index and (b) do mod by size-1 when computing incr):
...
> The new and new2 values differ in minor ways from the ones you posted
> because I got rid of the ~ (the ~ has a bad interaction with "additive"
> means of computing incr, because the ~ tends to move the index in the
> opposite direction, and these moves in opposite directions tend to cancel
> out when computing incr+index the first time).

Remarkable.

> too-bad-mod-is-expensive!-ly y'rs  - tim

Yes. The wheel is cheapest yet.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From just at letterror.com  Tue Dec 19 18:11:55 2000
From: just at letterror.com (Just van Rossum)
Date: Tue, 19 Dec 2000 18:11:55 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID: <l03102804b6653dd31c4e@[193.78.237.123]>

Barry wrote:
>Actually, it has been debated to death. ;)  This looks better:
>
>    SPACE = ' '
>    SPACE.join(aList)
>
>That reads good to me ("space-join this list") and that's how I always
>write it.

I just did a quick scan through the 1.5.2 library, and _most_
occurrences of string.join() are used with a string constant
for the second argument. There is a whole bunch of one-arg
string.join()'s, too. Recommending replacing all of these (not
to mention all the code "out there") with named constants seems
plain silly.

Sure, " ".join() is the most "logical" choice for Python as it
stands, but it's definitely not the most intuitive, as evidenced
by the number of times this comes up on c.l.py: to many people
it simply "looks wrong". Maybe this is the deal: joiner.join()
makes a whole lot of sense from an _implementation_ standpoint,
but a whole lot less as a public interface.

It's easy to explain why join() can't be a method of sequences
(in Python), but that alone doesn't justify a string method.
string.join() is not unlike map() and friends:
map() wouldn't be so bad as a sequence method, but that isn't
practical for exactly the same reasons: so it's a builtin.
(And not a function method...)

So, making join() a builtin makes a whole lot of sense. Not
doing this because people sometimes use a local reference to
os.path.join seems awfully backward. Hm, maybe joiner.join()
could become a special method: joiner.__join__(), that way
other objects could define their own implementation for
join(). (Hm, wouldn't be the worst thing for, say, a file
path object...)
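
A sketch of that idea (purely hypothetical; a __join__ protocol was
never adopted): a builtin-style join() that defers to a __join__ hook
when the joiner defines one, falling back to plain string joining:

```python
def join(joiner, seq):
    # hypothetical protocol: let the joiner override how joining works
    hook = getattr(joiner, '__join__', None)
    if hook is not None:
        return hook(seq)
    return joiner.join(seq)

class PathSep:
    # toy example: a path-ish object with its own join semantics
    def __join__(self, parts):
        return '/'.join(parts)

assert join(' ', ['a', 'b']) == 'a b'
assert join(PathSep(), ['usr', 'local']) == 'usr/local'
```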

Just





From barry at digicool.com  Tue Dec 19 18:20:07 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 12:20:07 -0500
Subject: [Python-Dev] Death to string functions!
References: <l03102804b6653dd31c4e@[193.78.237.123]>
Message-ID: <14911.39111.710940.342986@anthem.concentric.net>

>>>>> "JvR" == Just van Rossum <just at letterror.com> writes:

    JvR> Recommending replacing all of these (not to mention all the
    JvR> code "out there") with named constants seems plain silly.

Until there's a tool to do the migration, I don't (personally)
recommend wholesale migration.  For new code I write though, I usually
do it the way I described (which is intuitive to me, but then so is
moving your fingers at a blinding speed up and down 5 long strips of
metal to cause low bowel-tickling rumbly noises).

    JvR> So, making join() a builtin makes a whole lot of sense. Not
    JvR> doing this because people sometimes use a local reference to
    JvR> os.path.join seems awfully backward.

I agree.  Have we agreed on the semantics and signature of builtin
join() though?  Is it just string.join() stuck in builtins?

-Barry



From fredrik at effbot.org  Tue Dec 19 18:25:49 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 19 Dec 2000 18:25:49 +0100
Subject: [Python-Dev] Death to string functions!
References: <l03102804b6653dd31c4e@[193.78.237.123]> <14911.39111.710940.342986@anthem.concentric.net>
Message-ID: <012901c069e0$bd724fb0$3c6340d5@hagrid>

Barry wrote:
>     JvR> So, making join() a builtin makes a whole lot of sense. Not
>     JvR> doing this because people sometimes use a local reference to
>     JvR> os.path.join seems awfully backward.
> 
> I agree.  Have we agreed on the semantics and signature of builtin
> join() though?  Is it just string.join() stuck in builtins?

+1

(let's leave the __join__ slot and other super-generalized
variants for 2.2)

</F>




From thomas at xs4all.net  Tue Dec 19 18:54:34 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 19 Dec 2000 18:54:34 +0100
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEIDIEAA.tim.one@home.com>; from tim.one@home.com on Tue, Dec 19, 2000 at 12:50:01AM -0500
References: <20001217220008.D29681@xs4all.nl> <LNBBLJKPBEHFEDALKOLCIEIDIEAA.tim.one@home.com>
Message-ID: <20001219185434.E29681@xs4all.nl>

On Tue, Dec 19, 2000 at 12:50:01AM -0500, Tim Peters wrote:

> [Thomas Wouters]
> > What sourceforge did was switch Linux distributions, and upgrade.
> > ... [and quite a bit more] ...

I hope you're feeling better today <wink>.  "The problem" was the one the
warning msg spelled out:  "It is also possible that the host key has just been
> changed.".  SF changed keys.  That's the whole banana right there.  Deleting
> the sourceforge keys from known_hosts fixed it (== convinced ssh to install
> new SF keys the next time I connected).

Well, if you'd read the thread <wink>, you'll notice that other people had
problems even after that. I'm glad you're not one of them, though :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From barry at digicool.com  Tue Dec 19 19:22:19 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 13:22:19 -0500
Subject: [Python-Dev] Error: syncmail script missing
References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCIEIFIEAA.tim.one@home.com>
Message-ID: <14911.42843.284822.935268@anthem.concentric.net>

Folks,

Python wasn't installed on the new SF CVS machine, which was why
syncmail was broken.  My thanks to the SF guys for quickly remedying
this situation!

Please give it a test.
-Barry



From barry at digicool.com  Tue Dec 19 19:23:32 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 13:23:32 -0500
Subject: [Python-Dev] Error: syncmail script missing
References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCIEIFIEAA.tim.one@home.com>
	<14911.42843.284822.935268@anthem.concentric.net>
Message-ID: <14911.42916.573600.922606@anthem.concentric.net>

>>>>> "BAW" == Barry A Warsaw <barry at digicool.com> writes:

    BAW> Python wasn't installed on the new SF CVS machine, which was
    BAW> why syncmail was broken.  My thanks to the SF guys for
    BAW> quickly remedying this situation!

BTW, it's currently Python 1.5.2.



From tismer at tismer.com  Tue Dec 19 18:34:14 2000
From: tismer at tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 19:34:14 +0200
Subject: [Python-Dev] Re: The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCGEHJIEAA.tim.one@home.com>
Message-ID: <3A3F9C16.562F9D9F@tismer.com>

Again...

Tim Peters wrote:
> 
> Sounds good to me!  It's a very cheap way to get the high bits into play.
...
> [Christian]
> > - The bits used from the string hash are not well distributed
> > - using a "warmup wheel" on the hash to suck all bits in
> >   gives the same quality of hashes like random numbers.
> 
> See above and be very cautious:  none of Python's hash functions produce
> well-distributed bits, and-- in effect --that's why Python dicts often
> perform "better than random" on common data.  Even what you've done so far
> appears to provide marginally worse statistics for Guido's favorite kind of
> test case ("worse" in two senses:  total number of collisions (a measure of
> amortized lookup cost), and maximum collision chain length (a measure of
> worst-case lookup cost)):
> 
>    d = {}
>    for i in range(N):
>        d[repr(i)] = i

I will look into this.

> check-in-one-thing-then-let-it-simmer-ly y'rs  - tim

Are you saying I should check the thing in? Really?


In another reply to this message I was saying
"""
This is why I think to be even more conservative:
Try to use a division wheel, but with the inverses
of the original primitive roots, just in order to
get at Guido's results :-)
"""

This was a religious desire, but such an inverse cannot exist.
Well, all inverses exist, but it is an error to think
that they can produce similar bit patterns.  Changing the
root means changing the whole system, since we have just
a *representation* of a group, via polynomial coefficients.

A simple example which renders my thought useless is this:
There is no general pattern that can turn a physical right
shift into a left shift, for all bit combinations.

Anyway, how can I produce a nearly complete scheme like today
with the same "cheaper than random" properties?

Ok, we have to stick with the given polynomials to stay
compatible, and we also have to shift left.
How do we then rotate the random bits in?
Well, we can in fact do a rotation of the whole index, moving
the highest bit into the lowest.
Too bad that this isn't supported in C. It is a native
machine instruction on X86 machines.

We would then have:

                incr = ROTATE_LEFT(incr, 1)
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
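
The rotation above can be sketched in Python (C has no rotate operator,
though x86 has one as a native instruction, as noted):

```python
def rotate_left32(x, n=1):
    # rotate a 32-bit value: the high bits wrap around into the low bits
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

assert rotate_left32(0x80000000) == 0x00000001   # the high bit wraps to bit 0
assert rotate_left32(0x40000001) == 0x80000002
```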

The effect is similar to the "old" algorithm: bits are shifted
left.  Only if the hash happens to have high bits do they appear
in the modulus.
On the current "faster than random" cases, I assume that
high bits in the hash are less likely than low bits, so it is
more likely that an entry finds its good place in the dict
before bits are rotated in.  Hence the "good" cases would be kept.

I did all tests again, now including maximum trip length, and
added a "rotate-left" version as well:

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293/9 new=302/7 new2=221/7 rot=278/5
trips for bad integers old=499500/999 new=13187/31 new2=999/1 rot=16754/31
trips for random integers old=360/8 new=369/8 new2=358/6 rot=356/7
trips for windows names old=230/5 new=207/7 new2=200/5 rot=225/5
N=2000
trips for strings old=1093/11 new=1109/10 new2=786/6 rot=1082/8
trips for bad integers old=0/0 new=26455/32 new2=1999/1 rot=33524/34
trips for random integers old=704/7 new=686/8 new2=685/7 rot=693/7
trips for windows names old=503/8 new=542/9 new2=564/6 rot=529/7
N=3000
trips for strings old=810/5 new=839/6 new2=609/5 rot=796/5
trips for bad integers old=0/0 new=38681/36 new2=2999/1 rot=49828/38
trips for random integers old=708/5 new=723/7 new2=724/5 rot=722/6
trips for windows names old=712/6 new=711/5 new2=691/5 rot=738/9
N=4000
trips for strings old=1850/9 new=1843/8 new2=1375/11 rot=1848/10
trips for bad integers old=0/0 new=52994/39 new2=3999/1 rot=66356/38
trips for random integers old=1395/9 new=1397/8 new2=1435/9 rot=1394/13
trips for windows names old=1449/8 new=1434/8 new2=1457/11 rot=1513/9

D:\crml_doc\platf\py>

Concerning trip length, rotate is better than old in most cases.
Random integers seem to withstand any of these procedures.
For bad integers, rot naturally takes more trips than new, since
the path to the bits is longer.

All in all I don't see more than marginal differences between
the approaches, and I tend to stick with "new", since it is
the cheapest to implement.
(It does not cost anything and might instead be a little cheaper
for some compilers, since it does not reference the mask variable.)

I'd say let's do the patch   --   ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]
polys = map(long, polys)

class NULL: pass

class Dictionary:
    dummy = "<dummy key>"
    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg
        mp.warmup = newalg==2
        mp.rotleft = newalg==3
        mp.trips = 0
        mp.tripmax = 0

    def getTrips(self):
        trips, tripmax = self.trips, self.tripmax
        self.trips = self.tripmax = 0
        return trips, tripmax

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # the shifting is worth it in the incremental case.

            ## added after posting to python-dev:
            uhash = _hash & 0xffffffffl
            if mp.warmup:
                incr = uhash
                mask2 = 0xffffffffl ^ mask
                while mask2 > mask:
                    if (incr & 1):
                        incr = incr ^ mp.ma_poly
                    incr = incr >> 1
                    mask2 = mask2>>1
                # this loop *can* be sped up by tables
                # with precomputed multiple shifts.
                # But I'm not sure if it is worth it at all.
            else:
                incr = uhash ^ (uhash >> 3)

###### TO HERE
            
        if (not incr):
            incr = mask

        triplen = 0            
        while 1:
            mp.trips = mp.trips+1
            triplen = triplen+1
            if triplen > mp.tripmax:
                mp.tripmax = triplen
            
            ep = ep0[int((i+incr)&mask)]
            if (ep[me_key] is NULL) :
                if (freeslot is not NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                 (ep[me_hash] == _hash and
                  cmp(ep[me_key], key) == 0)) :
                return ep

            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            elif mp.rotleft:
                if incr &0x80000000L:
                    incr = (incr << 1) | 1
                else:
                    incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots

        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = [_nullentry[:] for i in range(newsize)]  # one fresh record per slot

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots

        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
##      /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = str(i) #* 5
        #s = chr(i%256) + chr(i>>8)##
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def istring_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = chr(i%256) + chr(i>>8)
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def random_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    from whrandom import randint
    import sys
    keys = []
    for i in range(n):
        keys.append(randint(0, sys.maxint-1))
    for i in keys:
        d1[i] = d2[i] = d3[i] = d4[i] = i
    return d1, d2, d3, d4

def badnum_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(n):
        bad = i << 16
        d2[bad] = d3[bad] = d4[bad] = i
        if n <= 1000: d1[bad] = i
    return d1, d2, d3, d4

def names_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    import win32con
    keys = win32con.__dict__.keys()
    if len(keys) < n:
        keys = []
    for s in keys[:n]:
        d1[s] = d2[s] = d3[s] = d4[s] = s
    return d1, d2, d3, d4

def do_test(dict):
    keys = dict.keys()
    dict.getTrips() # reset
    test(keys, dict)
    return "%d/%d" % dict.getTrips()

EXTREME=1

if __name__ == "__main__":

    for N in (1000,2000,3000,4000):    

        sdold, sdnew, sdnew2, sdrot = string_dicts(N)
        #idold, idnew, idnew2, idrot = istring_dicts(N)
        bdold, bdnew, bdnew2, bdrot = badnum_dicts(N)
        rdold, rdnew, rdnew2, rdrot = random_dicts(N)
        ndold, ndnew, ndnew2, ndrot = names_dicts(N)
        fmt = "old=%s new=%s new2=%s rot=%s"
        print "N=%d" %N        
        print ("trips for strings "+fmt) % tuple(
            map(do_test, (sdold, sdnew, sdnew2, sdrot)) )
        #print ("trips for bin strings "+fmt) % tuple(
        #    map(do_test, (idold, idnew, idnew2, idrot)) )
        print ("trips for bad integers "+fmt) % tuple(
            map(do_test, (bdold, bdnew, bdnew2, bdrot)))
        print ("trips for random integers "+fmt) % tuple(
            map(do_test, (rdold, rdnew, rdnew2, rdrot)))
        print ("trips for windows names "+fmt) % tuple(
            map(do_test, (ndold, ndnew, ndnew2, ndrot)))

"""
Results with a shift of 10 (EXTREME=0):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""

From just at letterror.com  Tue Dec 19 19:46:18 2000
From: just at letterror.com (Just van Rossum)
Date: Tue, 19 Dec 2000 19:46:18 +0100
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <14911.39111.710940.342986@anthem.concentric.net>
References: <l03102804b6653dd31c4e@[193.78.237.123]>
Message-ID: <l03102806b6655cbd62fa@[193.78.237.123]>

At 12:20 PM -0500 19-12-2000, Barry A. Warsaw wrote:
>I agree.  Have we agreed on the semantics and signature of builtin
>join() though?  Is it just string.join() stuck in builtins?

Yep. I'm with /F that further generalization can be done later. Oh, does
this mean that "".join() becomes deprecated? (Nice test case for the
warning framework...)

Just





From barry at digicool.com  Tue Dec 19 19:56:45 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 13:56:45 -0500
Subject: [Python-Dev] Death to string functions!
References: <l03102804b6653dd31c4e@[193.78.237.123]>
	<l03102806b6655cbd62fa@[193.78.237.123]>
Message-ID: <14911.44909.414520.788073@anthem.concentric.net>

>>>>> "JvR" == Just van Rossum <just at letterror.com> writes:

    JvR> Oh, does this mean that "".join() becomes deprecated?

Please, no.




From guido at python.org  Tue Dec 19 19:56:39 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 13:56:39 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Tue, 19 Dec 2000 13:56:45 EST."
             <14911.44909.414520.788073@anthem.concentric.net> 
References: <l03102804b6653dd31c4e@[193.78.237.123]> <l03102806b6655cbd62fa@[193.78.237.123]>  
            <14911.44909.414520.788073@anthem.concentric.net> 
Message-ID: <200012191856.NAA30524@cj20424-a.reston1.va.home.com>

> >>>>> "JvR" == Just van Rossum <just at letterror.com> writes:
> 
>     JvR> Oh, does this mean that "".join() becomes deprecated?
> 
> Please, no.

No.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From just at letterror.com  Tue Dec 19 20:15:19 2000
From: just at letterror.com (Just van Rossum)
Date: Tue, 19 Dec 2000 20:15:19 +0100
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <14911.44909.414520.788073@anthem.concentric.net>
References: <l03102804b6653dd31c4e@[193.78.237.123]>
 <l03102806b6655cbd62fa@[193.78.237.123]>
Message-ID: <l03102808b665629fc5bf@[193.78.237.123]>

At 1:56 PM -0500 19-12-2000, Barry A. Warsaw wrote:
>>>>>> "JvR" == Just van Rossum <just at letterror.com> writes:
>
>    JvR> Oh, does this mean that "".join() becomes deprecated?
>
>Please, no.

And keep two non-deprecated ways to do the same thing? I'm not saying it
should be removed, just that the powers that be declare that _one_ of them
is the preferred way.

And-if-that-one-isn't-builtin-join()-I-don't-know-why-to-even-bother y'rs
-- Just





From greg at cosc.canterbury.ac.nz  Tue Dec 19 23:35:05 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 20 Dec 2000 11:35:05 +1300 (NZDT)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012191537.KAA28909@cj20424-a.reston1.va.home.com>
Message-ID: <200012192235.LAA02763@s454.cosc.canterbury.ac.nz>

Guido:

> Boy, are you stirring up a can of worms that we've been through many
> times before!  Nothing you say hasn't been said at least a hundred
> times before, on this list as well as on c.l.py.

And I'll wager you'll continue to hear them said at regular intervals
for a long time to come, because you've done something which a lot of
people feel very strongly was a mistake, and they have some very
rational arguments as to why it was a mistake, whereas you don't seem
to have any arguments to the contrary which those people are likely to
find convincing.

> There really seem to be only two possibilities that don't have this
> problem: (1) make it a built-in, or (2) make it a method on strings.

False dichotomy. Some other possibilities:

(3) Use an operator.

(4) Leave it in the string module! Really, I don't see what
would be so bad about that. You still need somewhere to put
all the string-related constants, so why not keep the string
module for those, plus the few functions that don't have
any other obvious place?

> If " ".join(L) bugs you, try this:
>
>    space = " "	 # This could be a global
>     .
>     .
>     .
>    s = space.join(L)

Surely you must realise that this completely fails to
address Mr. Petrilli's concern?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From akuchlin at mems-exchange.org  Wed Dec 20 15:40:58 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 20 Dec 2000 09:40:58 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <E148ZW1-0000xd-00@usw-sf-web3.sourceforge.net>; from noreply@sourceforge.net on Tue, Dec 19, 2000 at 07:02:05PM -0800
References: <E148ZW1-0000xd-00@usw-sf-web3.sourceforge.net>
Message-ID: <20001220094058.A17623@kronos.cnri.reston.va.us>

On Tue, Dec 19, 2000 at 07:02:05PM -0800, noreply at sourceforge.net wrote:
>Date: 2000-Dec-19 19:02
>By: tim_one

>Unrelated to your patch but in the same area: the other msg, "ord()
>expected string or Unicode character", doesn't read right.  The type
>names in question are "string" and "unicode":
>
>>>> type("")
><type 'string'>
>>>> type(u"")
><type 'unicode'>
>>>>
>
>"character" is out of place, or not in enough places.  Just thought I'd mention that, since *you're* so cute!

Is it OK to refer to 8-bit strings under that name?
How about "expected an 8-bit string or Unicode string" when the
object passed to ord() isn't of the right type?

Similarly, when the value is of the right type but has length > 1,
the message is "ord() expected a character, length-%d string found".
Should that be "length-%d (string / unicode) found"?

And should the type names be changed to '8-bit string'/'Unicode
string', maybe?

--amk



From barry at digicool.com  Wed Dec 20 16:39:30 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Wed, 20 Dec 2000 10:39:30 -0500
Subject: [Python-Dev] IGNORE - this is only a test
Message-ID: <14912.53938.280864.596141@anthem.concentric.net>

Testing the new MX for python.org...



From fdrake at acm.org  Wed Dec 20 17:57:09 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 20 Dec 2000 11:57:09 -0500 (EST)
Subject: [Python-Dev] scp with SourceForge
Message-ID: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>

  I've not been able to get scp to work with SourceForge since they
upgraded their machines.  ssh works fine.  Is this related to the
protocol mismatch problem that was discussed earlier?  My ssh tells me
"SSH Version OpenSSH-1.2.2, protocol version 1.5.", and the remote
sshd is sending its version as "Remote protocol version 1.99, remote
software version OpenSSH_2.2.0p1".
  Was there a reasonable way to deal with this?  I'm running
Linux-Mandrake 7.1 with very little customization or extra stuff.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tismer at tismer.com  Wed Dec 20 17:31:00 2000
From: tismer at tismer.com (Christian Tismer)
Date: Wed, 20 Dec 2000 18:31:00 +0200
Subject: [Python-Dev] Re: The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCGEHJIEAA.tim.one@home.com> <3A3F9C16.562F9D9F@tismer.com>
Message-ID: <3A40DEC4.5F659E8E@tismer.com>


Christian Tismer wrote:
...
When talking about left rotation, an error crept in. Sorry!

> We would then have:
> 
>                 incr = ROTATE_LEFT(incr, 1)
>                 if (incr > mask):
>                     incr = incr ^ mp.ma_poly

If incr contains the high bits of the hash, then the
above must be replaced by

                incr = ROTATE_LEFT(incr, 1)
                if (incr & (mask+1)):
                    incr = incr ^ mp.ma_poly

or the multiplicative group is not guaranteed to be
generated, obviously.
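(ROTATE_LEFT itself isn't spelled out in the thread; below is a sketch of it
and of the corrected update, assuming 32-bit increments -- a transcription of
the snippet above, not code from the original script:)

```python
MASK32 = 0xFFFFFFFF

def rotate_left(x, n=1):
    """Rotate a 32-bit unsigned value left by n bits."""
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32

def next_incr(incr, mask, poly):
    # corrected update: fold in the polynomial when the rotated value
    # carries into the bit just above the table mask
    incr = rotate_left(incr)
    if incr & (mask + 1):
        incr ^= poly
    return incr

# the high bit wraps around to bit 0
assert rotate_left(0x80000000) == 1
```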

This doesn't change my results, rotating right is still my choice.

ciao - chris

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293/9 new=302/7 new2=221/7 rot=272/8
trips for bad integers old=499500/999 new=13187/31 new2=999/1 rot=16982/27
trips for random integers old=339/9 new=337/7 new2=343/10 rot=342/8
trips for windows names old=230/5 new=207/7 new2=200/5 rot=225/6
N=2000
trips for strings old=1093/11 new=1109/10 new2=786/6 rot=1090/9
trips for bad integers old=0/0 new=26455/32 new2=1999/1 rot=33985/31
trips for random integers old=747/10 new=733/7 new2=734/7 rot=728/8
trips for windows names old=503/8 new=542/9 new2=564/6 rot=521/11
N=3000
trips for strings old=810/5 new=839/6 new2=609/5 rot=820/6
trips for bad integers old=0/0 new=38681/36 new2=2999/1 rot=50985/26
trips for random integers old=709/4 new=728/5 new2=767/5 rot=711/6
trips for windows names old=712/6 new=711/5 new2=691/5 rot=727/7
N=4000
trips for strings old=1850/9 new=1843/8 new2=1375/11 rot=1861/9
trips for bad integers old=0/0 new=52994/39 new2=3999/1 rot=67986/26
trips for random integers old=1584/9 new=1606/8 new2=1505/9 rot=1579/8
trips for windows names old=1449/8 new=1434/8 new2=1457/11 rot=1476/7
-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From tim.one at home.com  Wed Dec 20 20:52:40 2000
From: tim.one at home.com (Tim Peters)
Date: Wed, 20 Dec 2000 14:52:40 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>

[Fred L. Drake, Jr.]
>   I've not been able to get scp to work with SourceForge since they
> upgraded their machines.  ssh works fine.

Same here.  In particular, I can use ssh to log in to shell.sourceforge.net,
but attempts to scp there act like this (breaking long lines by hand with
\n\t):

> scp -v pep-0042.html
	tim_one at shell.sourceforge.net:/home/groups/python/htdocs/peps
Executing: host shell.sourceforge.net, user tim_one,
	command scp -v -t /home/groups/python/htdocs/peps
SSH Version 1.2.14 [winnt-4.0-x86], protocol version 1.4.
Standard version.  Does not use RSAREF.
ssh_connect: getuid 0 geteuid 0 anon 0
Connecting to shell.sourceforge.net [216.136.171.201] port 22.
Connection established.
Remote protocol version 1.99, remote software version OpenSSH_2.2.0p1
Waiting for server public key.
Received server public key (768 bits) and host key (1024 bits).
Host 'shell.sourceforge.net' is known and matches the host key.
Initializing random; seed file C:\Code/.ssh/random_seed
IDEA not supported, using 3des instead.
Encryption type: 3des
Sent encrypted session key.
Received encrypted confirmation.
Trying RSA authentication with key 'sourceforge'
Server refused our key.
Doing password authentication.
Password:  **** here tim enteredth his password ****
Sending command: scp -v -t /home/groups/python/htdocs/peps
Entering interactive session.

And there it sits forever.  Several others report the same symptom on SF
forums, and assorted unresolved SF Support and Bug reports.  We don't know
what your symptom is!

> Is this related to the protocol mismatch problem that was discussed
> earlier?

Doubt it.  Most commentators pin the blame elsewhere.

> ...
>   Was there a reasonable way to deal with this?

A new note was added to

http://sourceforge.net/support/?func=detailsupport&support_id=110235&group_i
d=1

today, including:

"""
Re: Shell server

We're also aware of the number of problems on the shell server with respect
to restrictive permissions on some programs - and sourcing of shell
environments.  We're also aware of the troubles with scp and transferring
files.  As a work around, we recommend either editing files on the shell
server, or scping files to the shell server from external hosts to the shell
server, whilst logged in to the shell server.
"""

So there you go:  scp files to the shell server from external hosts to the
shell server whilst logged in to the shell server <wink>.

Is scp working for *anyone*???




From fdrake at acm.org  Wed Dec 20 21:17:58 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 20 Dec 2000 15:17:58 -0500 (EST)
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>
References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>
Message-ID: <14913.5110.271684.107030@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > And there it sits forever.  Several others report the same symptom on SF
 > forums, and assorted unresolved SF Support and Bug reports.  We don't know
 > what your symptom is!

  Exactly the same.

 > So there you go:  scp files to the shell server from external hosts to the
 > shell server whilst logged in to the shell server <wink>.

  Yeah, that really helps.... NOT!  All I want to be able to do is
post a new development version of the documentation.  ;-(


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From bckfnn at worldonline.dk  Wed Dec 20 21:23:33 2000
From: bckfnn at worldonline.dk (Finn Bock)
Date: Wed, 20 Dec 2000 20:23:33 GMT
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
Message-ID: <3a411449.5247545@smtp.worldonline.dk>

[Fred L. Drake]

>  I've not been able to get scp to work with SourceForge since they
>upgraded their machines.  ssh works fine.  Is this related to the
>protocol mismatch problem that was discussed earlier?  My ssh tells me
>"SSH Version OpenSSH-1.2.2, protocol version 1.5.", and the remote
>sshd is sending its version as "Remote protocol version 1.99, remote
>software version OpenSSH_2.2.0p1".
>  Was there a reasonable way to deal with this?  I'm running
>Linux-Mandrake 7.1 with very little customization or extra stuff.

I managed to update the jython website by logging into the shell machine
by ssh and doing an ftp back to my machine (using the IP number). That
isn't exactly reasonable, but I was desperate.

regards,
finn



From tim.one at home.com  Wed Dec 20 21:42:11 2000
From: tim.one at home.com (Tim Peters)
Date: Wed, 20 Dec 2000 15:42:11 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14913.5110.271684.107030@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENBIEAA.tim.one@home.com>

[Tim]
> So there you go:  scp files to the shell server from external
> hosts to the shell server whilst logged in to the shell server <wink>.

[Fred]
>   Yeah, that really helps.... NOT!  All I want to be able to do is
> post a new development version of the documentation.  ;-(

All I want to do is make a measly change to a PEP -- I'm afraid it doesn't
ask how trivial your intents are.  If some suck^H^H^H^Hdeveloper admits that
scp works for them, maybe we can mail them stuff and have *them* copy it
over.

no-takers-so-far-though-ly y'rs  - tim




From barry at digicool.com  Wed Dec 20 21:49:00 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Wed, 20 Dec 2000 15:49:00 -0500
Subject: [Python-Dev] scp with SourceForge
References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>
Message-ID: <14913.6972.934625.840781@anthem.concentric.net>

>>>>> "TP" == Tim Peters <tim.one at home.com> writes:

    TP> So there you go: scp files to the shell server from external
    TP> hosts to the shell server whilst logged in to the shell server
    TP> <wink>.

Psheesh, /that/ was obvious.  Did you even have to ask?

    TP> Is scp working for *anyone*???

Nope, same thing happens to me; it just hangs.
-Barry



From tim.one at home.com  Wed Dec 20 21:53:38 2000
From: tim.one at home.com (Tim Peters)
Date: Wed, 20 Dec 2000 15:53:38 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14913.6972.934625.840781@anthem.concentric.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKENEIEAA.tim.one@home.com>

[Tim, quoting a bit of immortal SF support prose]
>     TP> So there you go: scp files to the shell server from external
>     TP> hosts to the shell server whilst logged in to the shell server
>     TP> <wink>.

[Barry]
> Psheesh, /that/ was obvious.  Did you even have to ask?

Actually, isn't this easy to do on Linux?  That is, run an ssh server
(whatever) on your home machine, log in to the SF shell (which everyone
seems able to do), then

   scp  whatever  your_home_IP_address:your_home_path

from the SF shell?  Heck, I can even get that to work on Windows, except I
don't know how to set up anything on my end to accept the connection <wink>.

>     TP> Is scp working for *anyone*???

> Nope, same thing happens to me; it just hangs.

That's good to know -- since nobody else mentioned this, Fred probably
figured he was unique.

not-that-he-isn't-it's-just-that-he's-not-ly y'rs  - tim




From fdrake at acm.org  Wed Dec 20 21:52:10 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 20 Dec 2000 15:52:10 -0500 (EST)
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKENEIEAA.tim.one@home.com>
References: <14913.6972.934625.840781@anthem.concentric.net>
	<LNBBLJKPBEHFEDALKOLCKENEIEAA.tim.one@home.com>
Message-ID: <14913.7162.824838.63143@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > Actually, isn't this easy to do on Linux?  That is, run an ssh server
 > (whatever) on your home machine, log in to the SF shell (which everyone
 > seems able to do), then
 > 
 >    scp  whatever  your_home_IP_address:your_home_path
 > 
 > from the SF shell?  Heck, I can even get that to work on Windows, except I
 > don't know how to set up anything on my end to accept the connection <wink>.

  Err, yes, that's easy to do, but... that means putting your private
key on SourceForge.  They're a great bunch of guys, but they can't
have my private key!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tim.one at home.com  Wed Dec 20 22:06:07 2000
From: tim.one at home.com (Tim Peters)
Date: Wed, 20 Dec 2000 16:06:07 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14913.7162.824838.63143@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENGIEAA.tim.one@home.com>

[Fred]
>   Err, yes, that's easy to do, but... that means putting your private
> key on SourceForge.  They're a great bunch of guys, but they can't
> have my private key!

So generate a unique one-shot key pair for the life of the copy.  I can do
that for you on Windows if you lack a real OS <snort>.




From thomas at xs4all.net  Wed Dec 20 23:59:49 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 20 Dec 2000 23:59:49 +0100
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>; from tim.one@home.com on Wed, Dec 20, 2000 at 02:52:40PM -0500
References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>
Message-ID: <20001220235949.F29681@xs4all.nl>

On Wed, Dec 20, 2000 at 02:52:40PM -0500, Tim Peters wrote:

> So there you go:  scp files to the shell server from external hosts to the
> shell server whilst logged in to the shell server <wink>.

> Is scp working for *anyone*???

Not for me, anyway. And I'm not just saying that to avoid scp-duty :) And
I'm using the same ssh version, which works fine on all other machines. It
probably has to do with the funky setup Sourceforge uses. (Try looking at
'df' and 'cat /proc/mounts', and comparing the two -- you'll see what I mean
:) That also means I'm not tempted to try and reproduce it, obviously :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tim.one at home.com  Thu Dec 21 04:24:12 2000
From: tim.one at home.com (Tim Peters)
Date: Wed, 20 Dec 2000 22:24:12 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012192235.LAA02763@s454.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEOEIEAA.tim.one@home.com>

[Guido]
>> Boy, are you stirring up a can of worms that we've been through many
>> times before!  Nothing you say hasn't been said at least a hundred
>> times before, on this list as well as on c.l.py.

[Greg Ewing]
> And I'll wager you'll continue to hear them said at regular intervals
> for a long time to come, because you've done something which a lot of
> people feel very strongly was a mistake, and they have some very
> rational arguments as to why it was a mistake, whereas you don't seem
> to have any arguments to the contrary which those people are likely to
> find convincing.

Then it's a wash:  Guido doesn't find their arguments convincing either, and
ties favor the status quo even in the absence of BDFLness.

>> There really seem to be only two possibilities that don't have this
>> problem: (1) make it a built-in, or (2) make it a method on strings.

> False dichotomy. Some other possibilities:
>
> (3) Use an operator.

Oh, that's likely <wink>.

> (4) Leave it in the string module! Really, I don't see what
> would be so bad about that. You still need somewhere to put
> all the string-related constants, so why not keep the string
> module for those, plus the few functions that don't have
> any other obvious place?

Guido said he wants to deprecate the entire string module, so that Python
can eventually warn on the mere presence of "import string".  That's what he
said when I earlier ranted in favor of keeping the string module around.

My guess is that making it a builtin is the only alternative that stands any
chance at this point.

>> If " ".join(L) bugs you, try this:
>>
>>    space = " "	 # This could be a global
>>     .
>>     .
>>     .
>>    s = space.join(L)

> Surely you must realise that this completely fails to
> address Mr. Petrilli's concern?

Don't know about Guido, but I don't realize that, and we haven't heard back
from Charles.  His objections were raised the first day " ".join was
suggested, space.join was suggested almost immediately after, and that
latter suggestion did seem to pacify at least several objectors.  Don't know
whether it makes Charles happier, but since it *has* made others happier in
the past, it's not unreasonable to imagine that Charles might like it too.
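(For reference, the spellings under discussion all produce the same result;
this little sketch exercises only the method form and the space.join idiom,
since those are what the quoted suggestion shows:)

```python
words = ["Death", "to", "string", "functions!"]

# the much-debated method form
assert " ".join(words) == "Death to string functions!"

# the suggested idiom: name the separator once, then reuse it
space = " "
assert space.join(words) == " ".join(words)
```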

if-we're-to-be-swayed-by-his-continued-outrage-afraid-it-will-
    have-to-come-from-him-ly y'rs  - tim




From tim.one at home.com  Thu Dec 21 08:44:19 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 21 Dec 2000 02:44:19 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <20001220094058.A17623@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEOMIEAA.tim.one@home.com>

[Andrew Kuchling]
> Is it OK to refer to 8-bit strings under that name?
> How about "expected an 8-bit string or Unicode string", when the
> object passed to ord() isn't of the right type.
>
> Similarly, when the value is of the right type but has length>1,
> the message is "ord() expected a character, length-%d string found".
> Should that be "length-%d (string / unicode) found)"
>
> And should the type names be changed to '8-bit string'/'Unicode
> string', maybe?

Actually, upon reflection I think it was a mistake to add all these "or
Unicode" clauses to the error msgs to begin with.  Python used to have only
one string type, we're saying that's also a hope for the future, and in the
meantime I know I'd have no trouble understanding "string" as including both
8-bit strings and Unicode strings.

So we should say "8-bit string" or "Unicode string" when *only* one of those
is allowable.  So

    "ord() expected string ..."

instead of (even a repaired version of)

    "ord() expected string or Unicode character ..."

but-i'm-not-even-motivated-enough-to-finish-this-sig-




From tim.one at home.com  Thu Dec 21 09:52:54 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 21 Dec 2000 03:52:54 -0500
Subject: [Python-Dev] RE: The Dictionary Gem is polished!
In-Reply-To: <3A3F9C16.562F9D9F@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPAIEAA.tim.one@home.com>

[Christian Tismer]
> Are you saying I should check the thing in? Really?

Of course.  The first thing you talked about showed a major improvement in
some bad cases, did no harm in the others, and both results were more than
just plausible -- they made compelling sense and were backed by simulation.
So why not check it in?  It's a clear net win!

Stuff since then has been a spattering of maybe-good maybe-bad maybe-neutral
ideas that hasn't gotten anywhere conclusive.  What I want to avoid is
another "Unicode compression" scenario, where we avoid grabbing a clear win
for months just because it may not be the best possible of all conceivable
compression schemes -- and then mistakes get made in a last-second rush to
get *any* improvement.

Checking in a clear improvement today does not preclude checking in a better
one next week <wink>.

> ...
> Ok, we have to stick with the given polynomials to stay
> compatible,

Na, feel free to explore that too, if you like.  It really should get some
study!  The polys there now are utterly arbitrary:  of all polys that happen
to be irreducible and that have x as a primitive root in the induced
multiplicative group, these are simply the smallest when viewed as binary
integers.  That's because they were *found* by trying all odd binary ints
with odd parity (even ints and ints with even parity necessarily correspond
to reducible polys), starting with 2**N+3 and going up until finding the
first one that was both irreducible and had x as a primitive root.  There's
no theory at all that I know of to say that any such poly is any better for
this purpose than any other.  And they weren't tested for that either --
they're simply the first ones "that worked at all" in a brute search.
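(That brute search is easy to transcribe -- a sketch, not the original
generator: try odd candidates with an odd number of set bits, starting at
2**N+3, and keep the first one for which multiplying by x cycles through all
2**N - 1 nonzero field elements:)

```python
def x_has_full_order(poly, n):
    """True if x generates all 2**n - 1 nonzero elements of
    GF(2**n) modulo poly (poly includes the degree bit)."""
    mask = (1 << n) - 1
    order = (1 << n) - 1
    elem = 1
    for count in range(1, order + 1):
        elem <<= 1             # multiply by x
        if elem > mask:
            elem ^= poly       # reduce modulo the polynomial
        if elem == 1:
            return count == order
    return False

def first_poly(n):
    cand = (1 << n) + 3        # start of the search, per the text
    while True:
        # odd parity: an even number of set bits means a reducible poly
        if bin(cand).count("1") % 2 == 1 and x_has_full_order(cand, n):
            return cand
        cand += 2              # stay odd

# 0b1011 = x**3 + x + 1 is the first hit for an 8-slot table
assert first_poly(3) == 0b1011
```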

Besides, Python's "better than random" dict behavior-- when it obtains! --is
almost entirely due to the fact that its hash functions produce distinct starting
indices more often than a random hash function would.  The contribution of
the GF-based probe sequence in case of collision is to avoid the terrible
behavior most other forms of probe sequence would cause given that Python's
hash functions also tend to fill solid contiguous slices of the table more
often than would a random hash function.
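The "better than random" effect is easiest to see with ints, whose CPython hash is the identity for small non-negative values, so consecutive keys claim consecutive table slots with no collisions at all.  A toy sketch (function name is mine; the ~hash twist discussed below is deliberately omitted):

```python
def toy_indices(keys, nbits):
    # First-probe slot for each key in a 2**nbits table, taking the low
    # bits of hash() the way the dict takes its initial index.
    mask = (1 << nbits) - 1
    return [hash(k) & mask for k in keys]

# For range(256) in a 256-slot table, every slot comes out distinct --
# a random hash function would almost surely collide well before that
# (birthday bound: collisions expected after ~20 keys).
```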

[stuff about rotation]
> ...
> Too bad that this isn't supported in C. It is a native
> machine instruction on X86 machines.

Guido long ago rejected hash functions based on rotation for this reason;
he's not likely to look any more kindly on rotations in the probe sequence
<wink>.

A similar frustration is that almost all modern CPUs have a fast instruction to
get at the high 32 bits of a 32x32->64 bit multiply:  another way to get the
high bits of the hash code into play is to multiply the 32-bit hash code by
a 32-bit constant (see Knuth for "Fibonacci hashing" details), and take the
least-significant N bits of the *upper* 32 bits of the 64-bit product as the
initial table index.  If the constant is chosen correctly, this defines a
permutation on the space of 32-bit unsigned ints, and can be very effective
at "scrambling" arithmetic progressions (which Python's hash functions often
produce).  But C doesn't give a decent way to get at that either.

> ...
> On the current "faster than random" cases, I assume that
> high bits in the hash are less likely than low bits,

I'm not sure what this means.  As the comment in dictobject.c says, it's
common for Python's hash functions to return a result with lots of leading
zeroes.  But the lookup currently applies ~ to those first (which is a bad
idea -- see earlier msgs), so the actual hash that gets *used* often has
lots of leading ones.

> so it is more likely that an entry finds its good place in the dict,
> before bits are rotated in.  Hence the "good" cases would be kept.

I can agree with this easily if I read the above as asserting that in the
very good cases today, the low bits of hashes (whether or not ~ is applied)
vary more than the high bits.

> ...
> Random integers seem to withstand any of these procedures.

If you wanted to, you could *define* random this way <wink>.

> ...
> I'd say let's do the patch   --   ciao - chris

full-circle-ly y'rs  - tim




From mal at lemburg.com  Thu Dec 21 12:16:27 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 21 Dec 2000 12:16:27 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
References: <LNBBLJKPBEHFEDALKOLCKEOMIEAA.tim.one@home.com>
Message-ID: <3A41E68B.6B12CD71@lemburg.com>

Tim Peters wrote:
> 
> [Andrew Kuchling]
> > Is it OK to refer to 8-bit strings under that name?
> > How about "expected an 8-bit string or Unicode string", when the
> > object passed to ord() isn't of the right type.
> >
> > Similarly, when the value is of the right type but has length>1,
> > the message is "ord() expected a character, length-%d string found".
> > Should that be "length-%d (string / unicode) found)"
> >
> > And should the type names be changed to '8-bit string'/'Unicode
> > string', maybe?
> 
> Actually, upon reflection I think it was a mistake to add all these "or
> Unicode" clauses to the error msgs to begin with.  Python used to have only
> one string type, we're saying that's also a hope for the future, and in the
> meantime I know I'd have no trouble understanding "string" as including both
> 8-bit strings and Unicode strings.
> 
> So we should say "8-bit string" or "Unicode string" when *only* one of those
> is allowable.  So
> 
>     "ord() expected string ..."
> 
> instead of (even a repaired version of)
> 
>     "ord() expected string or Unicode character ..."

I think this has to do with understanding that there are two
string types in Python 2.0 -- a novice won't notice this until
she sees the error message.

My understanding is similar to yours, "string" should mean
"any string object" and in cases where the difference between
8-bit string and Unicode matters, these should be referred to
as "8-bit string" and "Unicode string".

Still, I think it is a good idea to make people aware of the
possibility of passing Unicode objects to these functions, so
perhaps the idea of adding both possibilities to error messages
is not such a bad idea for 2.1. The next phases would be converting
all messages back to "string" and then convert all strings to
Unicode ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From akuchlin at mems-exchange.org  Thu Dec 21 19:37:19 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 21 Dec 2000 13:37:19 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEOMIEAA.tim.one@home.com>; from tim.one@home.com on Thu, Dec 21, 2000 at 02:44:19AM -0500
References: <20001220094058.A17623@kronos.cnri.reston.va.us> <LNBBLJKPBEHFEDALKOLCKEOMIEAA.tim.one@home.com>
Message-ID: <20001221133719.B11880@kronos.cnri.reston.va.us>

On Thu, Dec 21, 2000 at 02:44:19AM -0500, Tim Peters wrote:
>So we should say "8-bit string" or "Unicode string" when *only* one of those
>is allowable.  So

OK... how about this patch?

Index: bltinmodule.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v
retrieving revision 2.185
diff -u -r2.185 bltinmodule.c
--- bltinmodule.c	2000/12/20 15:07:34	2.185
+++ bltinmodule.c	2000/12/21 18:36:54
@@ -1524,13 +1524,14 @@
 		}
 	} else {
 		PyErr_Format(PyExc_TypeError,
-			     "ord() expected string or Unicode character, " \
+			     "ord() expected string of length 1, but " \
 			     "%.200s found", obj->ob_type->tp_name);
 		return NULL;
 	}
 
 	PyErr_Format(PyExc_TypeError, 
-		     "ord() expected a character, length-%d string found",
+		     "ord() expected a character, "
+                     "but string of length %d found",
 		     size);
 	return NULL;
 }
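For reference, the two branches this patch rewords can be exercised from Python itself.  (The exact message text has been reworded again in later versions, so only the branch that fires is shown, not the 2.0 wording:)

```python
# The patch above touches two distinct ord() failure modes:
# a non-string argument, and a string of the wrong length.
for bad in ({}, "ab"):
    try:
        ord(bad)
    except TypeError as exc:
        print("%r -> TypeError: %s" % (bad, exc))
```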



From thomas at xs4all.net  Fri Dec 22 16:21:43 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 22 Dec 2000 16:21:43 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>; from noreply@sourceforge.net on Fri, Dec 22, 2000 at 07:07:03AM -0800
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
Message-ID: <20001222162143.A5515@xs4all.nl>

On Fri, Dec 22, 2000 at 07:07:03AM -0800, noreply at sourceforge.net wrote:

>   * Guido-style:  8-column hard-tab indents.
>   * New style:  4-column space-only indents.

Hm, I must have missed this... Is 'new style' the preferred style, as its
name suggests, or is Guido mounting a rebellion to adhere to the One True
Style (or rather his own version of it, which just has the * in pointer
type declarations wrong ? :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From fdrake at acm.org  Fri Dec 22 16:31:21 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 10:31:21 -0500 (EST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <20001222162143.A5515@xs4all.nl>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222162143.A5515@xs4all.nl>
Message-ID: <14915.29641.806901.661707@cj42289-a.reston1.va.home.com>

Thomas Wouters writes:
 > Hm, I must have missed this... Is 'new style' the preferred style, as its
 > name suggests, or is Guido mounting a rebellion to adhere to the One True
 > Style (or rather his own version of it, which just has the * in pointer
 > type declarations wrong ? :)

  Guido has grudgingly granted that new code in the "New style" is
acceptable, mostly because many people complain that "Guido style"
causes too much code to get scrunched up on the right margin.  The
"New style" is more like the recommendations for Python code as well,
so it's easier for Python programmers to read (Tabs are hard to read
clearly! ;).


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From cgw at fnal.gov  Fri Dec 22 16:43:45 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Fri, 22 Dec 2000 09:43:45 -0600 (CST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add
 panel support
In-Reply-To: <14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222162143.A5515@xs4all.nl>
	<14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
Message-ID: <14915.30385.201343.360880@buffalo.fnal.gov>

Fred L. Drake, Jr. writes:
 > 
 >   Guido has grudgingly granted that new code in the "New style" is
 > acceptable, mostly because many people complain that "Guido style"
 > causes too much code to get scrunched up on the right margin.

I am reminded of Linus Torvalds comments on this subject (see
/usr/src/linux/Documentation/CodingStyle):

  Now, some people will claim that having 8-character indentations
  makes the code move too far to the right, and makes it hard to read
  on a 80-character terminal screen.  The answer to that is that if
  you need more than 3 levels of indentation, you're screwed anyway,
  and should fix your program.




From fdrake at acm.org  Fri Dec 22 16:58:56 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 10:58:56 -0500 (EST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add
 panel support
In-Reply-To: <14915.30385.201343.360880@buffalo.fnal.gov>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222162143.A5515@xs4all.nl>
	<14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
	<14915.30385.201343.360880@buffalo.fnal.gov>
Message-ID: <14915.31296.56181.260479@cj42289-a.reston1.va.home.com>

Charles G Waldman writes:
 > I am reminded of Linus Torvalds comments on this subject (see
 > /usr/src/linux/Documentation/CodingStyle):

  The catch, of course, is Python/ceval.c, where breaking it up can
hurt performance.  People scream when you do things like that....


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From cgw at fnal.gov  Fri Dec 22 17:07:47 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Fri, 22 Dec 2000 10:07:47 -0600 (CST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add
 panel support
In-Reply-To: <14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222162143.A5515@xs4all.nl>
	<14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
	<14915.30385.201343.360880@buffalo.fnal.gov>
	<14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
Message-ID: <14915.31827.250987.283364@buffalo.fnal.gov>

Fred L. Drake, Jr. writes:
 > 
 >   The catch, of course, is Python/ceval.c, where breaking it up can
 > hurt performance.  People scream when you do things like that....

Quoting again from the same source:

 Use helper functions with descriptive names (you can ask the compiler
 to in-line them if you think it's performance-critical, and it will
 probably do a better job of it than you would have done).

But I should have pointed out that I was quoting the great Linus
mostly for entertainment/cultural value, and was not really trying to
add fuel to the fire.  In other words, a message that I thought was
amusing, but probably shouldn't have sent ;-)



From fdrake at acm.org  Fri Dec 22 17:20:52 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 11:20:52 -0500 (EST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add
 panel support
In-Reply-To: <14915.31827.250987.283364@buffalo.fnal.gov>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222162143.A5515@xs4all.nl>
	<14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
	<14915.30385.201343.360880@buffalo.fnal.gov>
	<14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
	<14915.31827.250987.283364@buffalo.fnal.gov>
Message-ID: <14915.32612.252115.562296@cj42289-a.reston1.va.home.com>

Charles G Waldman writes:
 > But I should have pointed out that I was quoting the great Linus
 > mostly for entertainment/cultural value, and was not really trying to
 > add fuel to the fire.  In other words, a message that I thought was
 > amusing, but probably shouldn't have sent ;-)

  I understood the intent; I think he's really got a point.  There are
a few places in Python where it would really help to break things up!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From fredrik at effbot.org  Fri Dec 22 17:33:37 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 22 Dec 2000 17:33:37 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net><20001222162143.A5515@xs4all.nl><14915.29641.806901.661707@cj42289-a.reston1.va.home.com><14915.30385.201343.360880@buffalo.fnal.gov><14915.31296.56181.260479@cj42289-a.reston1.va.home.com><14915.31827.250987.283364@buffalo.fnal.gov> <14915.32612.252115.562296@cj42289-a.reston1.va.home.com>
Message-ID: <004b01c06c34$f08151c0$e46940d5@hagrid>

Fred wrote:
>   I understood the intent; I think he's really got a point.  There are
> a few places in Python where it would really help to break things up!

if that's what you want, maybe you could start by
putting the INLINE stuff back again? <halfwink>

(if C/C++ compatibility is a problem, put it inside a
cplusplus ifdef, and mark it as "for internal use only.
don't use inline on public interfaces")

</F>




From fdrake at acm.org  Fri Dec 22 17:36:15 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 11:36:15 -0500 (EST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <004b01c06c34$f08151c0$e46940d5@hagrid>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222162143.A5515@xs4all.nl>
	<14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
	<14915.30385.201343.360880@buffalo.fnal.gov>
	<14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
	<14915.31827.250987.283364@buffalo.fnal.gov>
	<14915.32612.252115.562296@cj42289-a.reston1.va.home.com>
	<004b01c06c34$f08151c0$e46940d5@hagrid>
Message-ID: <14915.33535.520957.215310@cj42289-a.reston1.va.home.com>

Fredrik Lundh writes:
 > if that's what you want, maybe you could start by
 > putting the INLINE stuff back again? <halfwink>

  I could not see the value in the inline stuff that configure was
setting up, and still don't.

 > (if C/C++ compatibility is a problem, put it inside a
 > cplusplus ifdef, and mark it as "for internal use only.
 > don't use inline on public interfaces")

  We should be able to come up with something reasonable, but I don't
have time right now, and my head isn't currently wrapped around C
compilers.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From akuchlin at cnri.reston.va.us  Fri Dec 22 19:01:43 2000
From: akuchlin at cnri.reston.va.us (Andrew Kuchling)
Date: Fri, 22 Dec 2000 13:01:43 -0500
Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>; from noreply@sourceforge.net on Fri, Dec 22, 2000 at 07:07:03AM -0800
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
Message-ID: <20001222130143.B7127@newcnri.cnri.reston.va.us>

On Fri, Dec 22, 2000 at 07:07:03AM -0800, noreply at sourceforge.net wrote:
>  * Guido-style:  8-column hard-tab indents.
>  * New style:  4-column space-only indents.
>  * _curses style: 2 column indents.
>
>I'd prefer "New style", myself.

New style it is.  (Barry, is the "python" style in cc-mode.el going to
be changed to new style, or a "python2" style added?)

I've been wanting to reformat _cursesmodule.c to match the Python
style for some time.  Probably I'll do that a little while after the
panel module has settled down a bit.

Fred, did you look at the use of the CObject for exposing the API?
Did that look reasonable?  Also, should py_curses.h go in the Include/
subdirectory instead of Modules/?

--amk



From fredrik at effbot.org  Fri Dec 22 19:03:43 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 22 Dec 2000 19:03:43 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net><20001222162143.A5515@xs4all.nl><14915.29641.806901.661707@cj42289-a.reston1.va.home.com><14915.30385.201343.360880@buffalo.fnal.gov><14915.31296.56181.260479@cj42289-a.reston1.va.home.com><14915.31827.250987.283364@buffalo.fnal.gov><14915.32612.252115.562296@cj42289-a.reston1.va.home.com><004b01c06c34$f08151c0$e46940d5@hagrid> <14915.33535.520957.215310@cj42289-a.reston1.va.home.com>
Message-ID: <006701c06c41$896a1a00$e46940d5@hagrid>

Fred wrote:
>  > if that's what you want, maybe you could start by
>  > putting the INLINE stuff back again? <halfwink>
> 
>   I could not see the value in the inline stuff that configure was
> setting up, and still don't.

the INLINE stuff guarantees that "inline" is defined to be
whatever directive the compiler uses for explicit inlining.
quoting the autoconf docs:

    If the C compiler supports the keyword inline,
    do nothing. Otherwise define inline to __inline__
    or __inline if it accepts one of those, otherwise
    define inline to be empty

as a result, you can always use "inline" in your code, and
have it do the right thing on all compilers that support
explicit inlining (all modern C compilers, in practice).

:::

to deal with people compiling Python with a C compiler, but
linking it with a C++ compiler, the config.h.in file could be
written as:

/* Define "inline" to be whatever the C compiler calls it.
    To avoid problems when mixing C and C++, make sure
    to only use "inline" for internal interfaces. */
#ifndef __cplusplus
#undef inline
#endif

</F>




From akuchlin at mems-exchange.org  Fri Dec 22 20:40:15 2000
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Fri, 22 Dec 2000 14:40:15 -0500
Subject: [Python-Dev] PEP 222 draft
Message-ID: <200012221940.OAA01936@207-172-57-45.s45.tnt2.ann.va.dialup.rcn.com>

I've completed a draft of PEP 222 (sort of -- note the XXX comments in
the text for things that still need to be resolved).  This is being
posted to python-dev, python-web-modules, and
python-list/comp.lang.python, to get comments on the proposed
interface.  I'm on all three lists, but would prefer to see followups
on python-list/comp.lang.python, so if you can reply there, please do
so.

--amk

Abstract

    This PEP proposes a set of enhancements to the CGI development
    facilities in the Python standard library.  Enhancements might be
    new features, new modules for tasks such as cookie support, or
    removal of obsolete code.

    The intent is to incorporate the proposals emerging from this
    document into Python 2.1, due to be released in the first half of
    2001.


Open Issues

    This section lists changes that have been suggested, but about
    which no firm decision has yet been made.  In the final version of
    this PEP, this section should be empty, as all the changes should
    be classified as accepted or rejected.

    cgi.py: We should not be told to create our own subclass just so
    we can handle file uploads. As a practical matter, I have yet to
    find the time to do this right, so I end up reading cgi.py's temp
    file into, at best, another file. Some of our legacy code actually
    reads it into a second temp file, then into a final destination!
    And even if we did, that would mean creating yet another object
    with its __init__ call and associated overhead.

    cgi.py: Currently, query data with no `=' are ignored.  Even if
    keep_blank_values is set, queries like `...?value=&...' are
    returned with blank values but queries like `...?value&...' are
    completely lost.  It would be great if such data were made
    available through the FieldStorage interface, either as entries
    with None as values, or in a separate list.

    Utility function: build a query string from a list of 2-tuples

    Dictionary-related utility classes: NoKeyErrors (returns an empty
    string, never a KeyError), PartialStringSubstitution (returns 
    the original key string, never a KeyError)
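The query-string utility suggested above is small enough to sketch.  (The PEP never fixed a signature, so the name and interface here are illustrative only; modern urllib.parse names are used:)

```python
from urllib.parse import quote_plus

def build_query(pairs):
    # Build a query string from a list of 2-tuples, quoting each name
    # and value; e.g. [('a', '1'), ('b c', '2&3')] -> 'a=1&b+c=2%263'
    return '&'.join('%s=%s' % (quote_plus(str(name)), quote_plus(str(value)))
                    for name, value in pairs)
```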


    
New Modules

    This section lists details about entire new packages or modules
    that should be added to the Python standard library.

    * fcgi.py : A new module adding support for the FastCGI protocol.
      Robin Dunn's code needs to be ported to Windows, though.

Major Changes to Existing Modules

    This section lists details of major changes to existing modules,
    whether in implementation or in interface.  The changes in this
    section therefore carry greater degrees of risk, either in
    introducing bugs or a backward incompatibility.

    The cgi.py module would be deprecated.  (XXX A new module or
    package name hasn't been chosen yet: 'web'?  'cgilib'?)

Minor Changes to Existing Modules

    This section lists details of minor changes to existing modules.
    These changes should have relatively small implementations, and
    have little risk of introducing incompatibilities with previous
    versions.


Rejected Changes

    The changes listed in this section were proposed for Python 2.1,
    but were rejected as unsuitable.  For each rejected change, a
    rationale is given describing why the change was deemed
    inappropriate.

    * An HTML generation module is not part of this PEP.  Several such
      modules exist, ranging from HTMLgen's purely programming
      interface to ASP-inspired simple templating to DTML's complex
      templating.  There's no indication of which templating module to
      enshrine in the standard library, and that probably means that
      no module should be so chosen.

    * cgi.py: Allowing a combination of query data and POST data.
      This doesn't seem to be standard at all, and therefore is
      dubious practice.

Proposed Interface

    XXX open issues: naming convention (studlycaps or
    underline-separated?); need to look at the cgi.parse*() functions
    and see if they can be simplified, too.

    Parsing functions: carry over most of the parse* functions from
    cgi.py
    
    # The Response class borrows most of its methods from Zope's
    # HTTPResponse class.
    
    class Response:
        """
        Attributes:
        status: HTTP status code to return
        headers: dictionary of response headers
        body: string containing the body of the HTTP response
        """
        
        def __init__(self, status=200, headers={}, body=""):
            pass
    
        def setStatus(self, status, reason=None):
            "Set the numeric HTTP response code"
            pass
    
        def setHeader(self, name, value):
            "Set an HTTP header"
            pass
    
        def setBody(self, body):
            "Set the body of the response"
            pass
    
        def setCookie(self, name, value,
                      path = '/',  
                      comment = None, 
                      domain = None, 
                      max_age = None,
                      expires = None,
                      secure = 0
                      ):
            "Set a cookie"
            pass
    
        def expireCookie(self, name):
            "Remove a cookie from the user"
            pass
    
        def redirect(self, url):
            "Redirect the browser to another URL"
            pass
    
        def __str__(self):
            "Convert entire response to a string"
            pass
    
        def dump(self):
            "Return a string representation useful for debugging"
            pass
            
        # XXX methods for specific classes of error:serverError, badRequest, etc.?
    
    
    class Request:
    
        """
        Attributes: 

        XXX should these be dictionaries, or dictionary-like objects?
        .headers : dictionary containing HTTP headers
        .cookies : dictionary of cookies
        .fields  : data from the form
        .env     : environment dictionary
        """
        
        def __init__(self, environ=os.environ, stdin=sys.stdin,
                     keep_blank_values=1, strict_parsing=0):
            """Initialize the request object, using the provided environment
            and standard input."""
            pass
    
        # Should people just use the dictionaries directly?
        def getHeader(self, name, default=None):
            pass
    
        def getCookie(self, name, default=None):
            pass
    
        def getField(self, name, default=None):
            "Return field's value as a string (even if it's an uploaded file)"
            pass
            
        def getUploadedFile(self, name):
            """Returns a file object that can be read to obtain the contents
            of an uploaded file.  XXX should this report an error if the 
            field isn't actually an uploaded file?  Or should it wrap
            a StringIO around simple fields for consistency?
            """
            
        def getURL(self, n=0, query_string=0):
            """Return the URL of the current request, chopping off 'n' path
            components from the right.  Eg. if the URL is
            "http://foo.com/bar/baz/quux", n=2 would return
            "http://foo.com/bar".  Does not include the query string (if
            any)
            """

        def getBaseURL(self, n=0):
            """Return the base URL of the current request, adding 'n' path
            components to the end to recreate more of the whole URL.  
            
            Eg. if the request URL is
            "http://foo.com/q/bar/baz/qux", n=0 would return
            "http://foo.com/", and n=2 "http://foo.com/q/bar".
            
            Returned URL does not include the query string, if any.
            """
        
        def dump(self):
            "String representation suitable for debugging output"
            pass
    
        # Possibilities?  I don't know if these are worth doing in the 
        # basic objects.
        def getBrowser(self):
            "Returns Mozilla/IE/Lynx/Opera/whatever"
    
        def isSecure(self):
            "Return true if this is an SSLified request"
            

    # Module-level function        
    def wrapper(func, logfile=sys.stderr):
        """
        Calls the function 'func', passing it the arguments
        (request, response, logfile).  Exceptions are trapped and
        sent to the file 'logfile'.  
        """
        # This wrapper will detect if it's being called from the command-line,
        # and if so, it will run in a debugging mode; name=value pairs 
        # can be entered on standard input to set field values.
        # (XXX how to do file uploads in this syntax?)

    
Copyright
    
    This document has been placed in the public domain.




From tim.one at home.com  Fri Dec 22 20:31:07 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 22 Dec 2000 14:31:07 -0500
Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support)
In-Reply-To: <20001222162143.A5515@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECIIFAA.tim.one@home.com>

[Thomas Wouters]
>>   * Guido-style:  8-column hard-tab indents.
>>   * New style:  4-column space-only indents.
>
> Hm, I must have missed this... Is 'new style' the preferred style, as
> its name suggests, or is Guido mounting a rebellion to adhere to the
> One True Style (or rather his own version of it, which just has
> the * in pointer type declarations wrong ? :)

Every time this comes up wrt C code,

1. Fred repeats that he thinks Guido caved in (but doesn't supply a
reference to anything saying so).

2. Guido repeats that he prefers old-style (but in a wishy-washy way that
leaves it uncertain (*)).

3. Fredrik and/or I repeat a request for a BDFL Pronouncement.

4. And there the thread ends.

It's *very* hard to find this history in the Python-Dev archives because
these threads always have subject lines like this one originally had ("RE:
[Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel
support").

Fred already did the #1 bit in this thread.

You can consider this msg the repeat of #3.

Since Guido is out of town, we can skip #2 and go straight to #4 early
<wink>.


(*) Two examples of #2 from this year:

Subject: Re: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/
Modules mmapmodule.c,2.1,2.2
From: Guido van Rossum <guido at python.org>
Date: Fri, 31 Mar 2000 07:10:45 -0500
> Can we change the 8-space-tab rule for all new C code that goes in?  I
> know that we can't practically change existing code right now, but for
> new C code, I propose we use no tab characters, and we use a 4-space
> block indentation.

Actually, this one was formatted for 8-space indents but using 4-space
tabs, so in my editor it looked like 16-space indents!

Given that we don't want to change existing code, I'd prefer to stick
with 1-tab 8-space indents.



Subject: Re: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules
linuxaudiodev.c,2.2,2.3
From: Guido van Rossum <guido at beopen.com>
Date: Sat, 08 Jul 2000 09:39:51 -0500

> Aren't tabs preferred as C-source indents, instead of 4-spaces ? At
> least, that's what I see in Python/*.c and Object/*.c, but I only
> vaguely recall it from the style document...

Yes, you're right.




From fredrik at effbot.org  Fri Dec 22 21:37:35 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 22 Dec 2000 21:37:35 +0100
Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support)
References: <LNBBLJKPBEHFEDALKOLCAECIIFAA.tim.one@home.com>
Message-ID: <00e201c06c57$052fff00$e46940d5@hagrid>

> 3. Fredrik and/or I repeat a request for a BDFL Pronouncement.

and.

</F>




From akuchlin at mems-exchange.org  Fri Dec 22 22:09:47 2000
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Fri, 22 Dec 2000 16:09:47 -0500
Subject: [Python-Dev] Reviving the bookstore
Message-ID: <200012222109.QAA02737@207-172-57-45.s45.tnt2.ann.va.dialup.rcn.com>

Since the PSA isn't doing anything for us any longer, I've been
working on reviving the bookstore at a new location with a new
affiliate code.  

A draft version is up at its new home,
http://www.kuchling.com/bookstore/ .  Please take a look and offer
comments.  Book authors, please take a look at the entry for your book
and let me know about any corrections.  Links to reviews of books
would also be really welcomed.

I'd like to abolish having book listings with no description or
review, so if you notice a book that you've read has no description,
please feel free to submit a description and/or review.

--amk



From tim.one at home.com  Sat Dec 23 08:15:59 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 23 Dec 2000 02:15:59 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <3A41E68B.6B12CD71@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDJIFAA.tim.one@home.com>

[Tim]
> ...
> So we should say "8-bit string" or "Unicode string" when *only*
> one of those is allowable.  So
>
>     "ord() expected string ..."
>
> instead of (even a repaired version of)
>
>     "ord() expected string or Unicode character ..."

[MAL]
> I think this has to do with understanding that there are two
> string types in Python 2.0 -- a novice won't notice this until
> she sees the error message.

Except that this error msg has nothing to do with how many string types
there are:  they didn't pass *any* flavor of string when they get this msg.
At the time they pass (say) a float to ord(), that there are currently two
flavors of string is more information than they need to know.

> My understanding is similar to yours, "string" should mean
> "any string object" and in cases where the difference between
> 8-bit string and Unicode matters, these should be referred to
> as "8-bit string" and "Unicode string".

In that happy case of universal harmony, the msg above should say just
"string" and leave it at that.

> Still, I think it is a good idea to make people aware of the
> possibility of passing Unicode objects to these functions,

Me too.

> so perhaps the idea of adding both possibilies to error messages
> is not such a bad idea for 2.1.

But not that.  The user is trying to track down their problem.  Advertising
an irrelevant (to their problem) distinction at that time of crisis is
simply spam.

    TypeError:  ord() requires an 8-bit string or a Unicode string.
                On the other hand, you'd be surprised to discover
                all the things you can pass to chr():  it's not just
                ints.  Long ints are also accepted, by design, and
                due to an obscure bug in the Python internals, you
                can also pass floats, which get truncated to ints.

> The next phases would be converting all messages back to "string"
> and then convert all strings to Unicode ;-)

Then we'll save a lot of work by skipping the need for the first half of
that -- unless you're volunteering to do all of it <wink>.





From tim.one at home.com  Sat Dec 23 08:16:29 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 23 Dec 2000 02:16:29 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <20001221133719.B11880@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEDKIFAA.tim.one@home.com>

[Tim]
> So we should say "8-bit string" or "Unicode string" when *only*
> one of those is allowable.

[Andrew]
> OK... how about this patch?

+1 from me.  And maybe if you offer to send a royalty to Marc-Andre each
time it's printed, he'll back down from wanting to use the error msgs as a
billboard <wink>.

> Index: bltinmodule.c
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v
> retrieving revision 2.185
> diff -u -r2.185 bltinmodule.c
> --- bltinmodule.c	2000/12/20 15:07:34	2.185
> +++ bltinmodule.c	2000/12/21 18:36:54
> @@ -1524,13 +1524,14 @@
>  		}
>  	} else {
>  		PyErr_Format(PyExc_TypeError,
> -			     "ord() expected string or Unicode character, " \
> +			     "ord() expected string of length 1, but " \
>  			     "%.200s found", obj->ob_type->tp_name);
>  		return NULL;
>  	}
>
>  	PyErr_Format(PyExc_TypeError,
> -		     "ord() expected a character, length-%d string found",
> +		     "ord() expected a character, "
> +                     "but string of length %d found",
>  		     size);
>  	return NULL;
>  }
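
As it happens, the wording in Andrew's patch proved durable: recent CPython 3.x
still raises essentially these messages from ord().  A quick sanity check (the
substring tests keep this robust against minor wording drift between versions):

```python
# "ord() expected string of length 1, but float found"
try:
    ord(1.5)
except TypeError as exc:
    assert "float found" in str(exc)

# "ord() expected a character, but string of length 2 found"
try:
    ord("ab")
except TypeError as exc:
    assert "string of length 2" in str(exc)
```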




From barry at digicool.com  Sat Dec 23 17:43:37 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 23 Dec 2000 11:43:37 -0500
Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222130143.B7127@newcnri.cnri.reston.va.us>
Message-ID: <14916.54841.418495.194558@anthem.concentric.net>

>>>>> "AK" == Andrew Kuchling <akuchlin at cnri.reston.va.us> writes:

    AK> New style it is.  (Barry, is the "python" style in cc-mode.el
    AK> going to be changed to new style, or a "python2" style added?)

There should probably be a second style added to cc-mode.el.  I
haven't maintained that package in a long time, but I'll work out a
patch and send it to the current maintainer.  Let's call it
"python2".

-Barry




From cgw at fnal.gov  Sat Dec 23 18:09:57 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Sat, 23 Dec 2000 11:09:57 -0600 (CST)
Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <14916.54841.418495.194558@anthem.concentric.net>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222130143.B7127@newcnri.cnri.reston.va.us>
	<14916.54841.418495.194558@anthem.concentric.net>
Message-ID: <14916.56421.370499.762023@buffalo.fnal.gov>

Barry A. Warsaw writes:

 > There should probably be a second style added to cc-mode.el.  I
 > haven't maintained that package in a long time, but I'll work out a
 > patch and send it to the current maintainer.  Let's call it
 > "python2".

Maybe we should wait for the BDFL's pronouncement?



From barry at digicool.com  Sat Dec 23 20:24:42 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 23 Dec 2000 14:24:42 -0500
Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222130143.B7127@newcnri.cnri.reston.va.us>
	<14916.54841.418495.194558@anthem.concentric.net>
	<14916.56421.370499.762023@buffalo.fnal.gov>
Message-ID: <14916.64506.56351.443287@anthem.concentric.net>

>>>>> "CGW" == Charles G Waldman <cgw at fnal.gov> writes:

    CGW> Barry A. Warsaw writes:

    >> There should probably be a second style added to cc-mode.el.  I
    >> haven't maintained that package in a long time, but I'll work
    >> out a patch and send it to the current maintainer.  Let's call
    >> it "python2".

    CGW> Maybe we should wait for the BDFL's pronouncement?

Sure, at least before submitting a patch.  Here's the simple one liner
you can add to your .emacs file to play with the new style in the
meantime.

-Barry

(c-add-style "python2" '("python" (c-basic-offset . 4)))




From tim.one at home.com  Sun Dec 24 05:04:47 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 23 Dec 2000 23:04:47 -0500
Subject: [Python-Dev] PEP 208 and __coerce__
In-Reply-To: <20001209033006.A3737@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEEMIFAA.tim.one@home.com>

[Neil Schemenauer
 Saturday, December 09, 2000 6:30 AM]

> While working on the implementation of PEP 208, I discovered that
> __coerce__ has some surprising properties.  Initially I
> implemented __coerce__ so that the numeric operation currently
> being performed was called on the values returned by __coerce__.
> This caused test_class to blow up due to code like this:
>
>     class Test:
>         def __coerce__(self, other):
>             return (self, other)
>
> The 2.0 "solves" this by not calling __coerce__ again if the
> objects returned by __coerce__ are instances.

If C.__coerce__ doesn't *know* it can do the full job, it should return
None.   This is what's documented, too:  a coerce method should return a
pair consisting of objects of the same type, or return None.

It's always going to be somewhat clumsy since what you really want is double
(or, in the case of pow, sometimes triple) dispatch.

Now there's a deliberate cheat that may not have gotten documented
comprehensibly:  when __coerce__ returns a pair, Python does not check to
verify both elements are of the same class.  That's because "a pair
consisting of objects of the same type" is often not what you *want* from
coerce.  For example, if I've got a matrix class M, then in

    M() + 42

I really don't want M.__coerce__ "promoting" 42 to a multi-gigabyte matrix
matching the shape and size of M().  M.__add__ can deal with that much more
efficiently if it gets 42 directly.  OTOH, M.__coerce__ may want to coerce
types other than scalar numbers to conform to the shape and size of self, or
fiddle self to conform to some other type.

What Python accepts back from __coerce__ has to be flexible enough to allow
all of those without further interference from the interpreter (just ask MAL
<wink>:  the *real* problem in practice is making coerce more of a help than
a burden to the end user; outside of int->long->float->complex (which is
itself partly broken, because long->float can lose precision or even fail
outright), "coercion to a common type" is almost never quite right; note
that C99 introduces distinct imaginary and complex types, because even
auto-conversion of imaginary->complex can be a royal PITA!).

> This has the effect of making code like:
>
>     class A:
>         def __coerce__(self, other):
>             return B(), other
>
>     class B:
>         def __coerce__(self, other):
>             return 1, other
>
>     A() + 1
>
> fail to work in the expected way.

I have no idea how you expected that to work.  Neither coerce() method looks
reasonable:  they don't follow the rules for coerce methods.  If A thinks it
needs to create a B() and have coercion "start over from scratch" with that,
then it should do so explicitly:

    class A:
        def __coerce__(self, other):
            return coerce(B(), other)

> The question is: how should __coerce__ work?

This can't be answered by a one-liner:  the intended behavior is documented
by a complex set of rules at the bottom of Lang Ref 3.3.6 ("Emulating
numeric types").  Alternatives should be written up as a diff against those
rules, which Guido worked hard on in years past -- more than once, too
<wink>.
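
The contract Tim describes (return a pair of like objects when you can do the
*full* job, or None to signal "I can't handle this") can be sketched in plain
Python.  This is an illustrative model only -- the class and helper names are
made up, and it uses an ordinary method plus NotImplemented rather than the
real __coerce__ hook:

```python
class Meters:
    """Toy numeric type following the documented coercion contract."""
    def __init__(self, value):
        self.value = value

    def _coerce(self, other):
        # Return a pair of like objects if the full job is doable,
        # or None to say "not my problem" -- mirroring the rules at
        # the bottom of Lang Ref 3.3.6.
        if isinstance(other, (int, float)):
            return self, Meters(other)
        return None

    def __add__(self, other):
        pair = self._coerce(other)
        if pair is None:
            return NotImplemented  # let the other operand have a try
        a, b = pair
        return Meters(a.value + b.value)

total = Meters(3) + 4
assert total.value == 7
assert Meters(3).__add__("x") is NotImplemented
```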




From esr at thyrsus.com  Mon Dec 25 10:17:23 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 25 Dec 2000 04:17:23 -0500
Subject: [Python-Dev] Tkinter support under RH 7.0?
Message-ID: <20001225041723.A9567@thyrsus.com>

I just upgraded to Red Hat 7.0 and installed Python 2.0.  Anybody have
a recipe for making Tkinter support work in this environment?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Government is not reason, it is not eloquence, it is force; like fire, a
troublesome servant and a fearful master. Never for a moment should it be left
to irresponsible action."
	-- George Washington, in a speech of January 7, 1790



From thomas at xs4all.net  Mon Dec 25 11:59:45 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 25 Dec 2000 11:59:45 +0100
Subject: [Python-Dev] Tkinter support under RH 7.0?
In-Reply-To: <20001225041723.A9567@thyrsus.com>; from esr@thyrsus.com on Mon, Dec 25, 2000 at 04:17:23AM -0500
References: <20001225041723.A9567@thyrsus.com>
Message-ID: <20001225115945.A25820@xs4all.nl>

On Mon, Dec 25, 2000 at 04:17:23AM -0500, Eric S. Raymond wrote:

> I just upgraded to Red Hat 7.0 and installed Python 2.0.  Anybody have
> a recipe for making Tkinter support work in this environment?

I installed Python 2.0 + Tkinter both from the BeOpen rpms and later
from source (for various reasons) and both were a breeze. I didn't really
use the 2.0+tkinter rpm version until I needed Numpy and various other
things and had to revert to the self-compiled version, but it seemed to work
fine.

As far as I can recall, there's only two things you have to keep in mind:
the tcl/tk version that comes with RedHat 7.0 is 8.3, so you have to adjust
the Tkinter section of Modules/Setup accordingly, and some of the
RedHat-supplied scripts stop working because they use deprecated modules (at
least 'rand') and use the socket.socket call wrong.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From esr at thyrsus.com  Wed Dec 27 20:37:50 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 27 Dec 2000 14:37:50 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
Message-ID: <20001227143750.A26894@thyrsus.com>

I have 2.0 up and running on RH7.0, compiled from sources.  In the process, 
I discovered a couple of issues:

1. The curses module is commented out in the default Modules/Setup
file.  This is not good, as it may lead careless distribution builders
to ship Python 2.0s that will not be able to support the curses front
end in CML2.  Supporting CML2 (and thus getting Python the "design
win" of being involved in the Linux kernel build) was the major point
of integrating the curses module into the Python core.  It is possible
that one little "#" may have blown that.

2. The default Modules/Setup file assumes that various Tkinter-related libraries
are in /usr/local.  But /usr would be a more appropriate choice under most
circumstances.  Most Linux users now install their Tcl/Tk stuff from RPMs
or .deb packages that place the binaries and libraries under /usr.  Under
most other Unixes (e.g. Solaris) they were there to begin with.

3. The configure machinery could be made to deduce more about the contents
of Modules/Setup than it does now.  In particular, it's silly that the person
building Python has to fill in the locations of X libraries when
configure is in principle perfectly capable of finding them.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Our society won't be truly free until "None of the Above" is always an option.



From guido at digicool.com  Wed Dec 27 22:04:27 2000
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 27 Dec 2000 16:04:27 -0500
Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support)
In-Reply-To: Your message of "Fri, 22 Dec 2000 14:31:07 EST."
             <LNBBLJKPBEHFEDALKOLCAECIIFAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCAECIIFAA.tim.one@home.com> 
Message-ID: <200012272104.QAA22278@cj20424-a.reston1.va.home.com>

> 2. Guido repeats that he prefers old-style (but in a wishy-washy way that
> leaves it uncertain (*)).

OK, since a pronouncement is obviously needed, here goes: Python C
source code should be indented using tabs only.

Exceptions:

(1) If 3rd party code is already written using a different style, it
    can stay that way, especially if it's a large volume that would be
    hard to reformat.  But only if it is consistent within a file or
    set of files (e.g. a 3rd party patch will have to conform to the
    prevailing style in the patched file).

(2) Occasionally (e.g. in ceval.c) there is code that's very deeply
    nested.  I will allow 4-space indents for the innermost nesting
    levels here.

Other C whitespace nits:

- Always place spaces around assignment operators, comparisons, &&, ||.

- No space between function name and left parenthesis.

- Always a space between a keyword ('if', 'for' etc.) and left paren.

- No space inside parentheses, brackets etc.

- No space before a comma or semicolon.

- Always a space after a comma (and semicolon, if not at end of line).

- Use ``return x;'' instead of ``return(x)''.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From cgw at fnal.gov  Wed Dec 27 23:17:31 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Wed, 27 Dec 2000 16:17:31 -0600 (CST)
Subject: [Python-Dev] sourceforge: problems with bug list?
Message-ID: <14922.27259.456364.750295@buffalo.fnal.gov>

Is it just me, or is anybody else getting this error when trying to
access the bug list?

 > An error occured in the logger. ERROR: pg_atoi: error in "5470/":
 > can't parse "/" 





From akuchlin at mems-exchange.org  Wed Dec 27 23:39:35 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 27 Dec 2000 17:39:35 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001227143750.A26894@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 27, 2000 at 02:37:50PM -0500
References: <20001227143750.A26894@thyrsus.com>
Message-ID: <20001227173935.A25605@kronos.cnri.reston.va.us>

On Wed, Dec 27, 2000 at 02:37:50PM -0500, Eric S. Raymond wrote:
>1. The curses module is commented out in the default Modules/Setup
>file.  This is not good, as it may lead careless distribution builders

It always has been commented out.  Good distributions ship with most
of the available modules enabled; I can't say if RH7.0 counts as a
good distribution or not (still on 6.2).

>3. The configure machinery could be made to deduce more about the contents
>of Modules/Setup than it does now.  In particular, it's silly that the person

This is the point of PEP 229 and patch #102588, which uses a setup.py
script to build extension modules.  (I need to upload an updated
version of the patch which actually includes setup.py -- thought I did
that, but apparently not...)  The patch is still extremely green,
but I think it's the best course; witness the tissue of
hackery required to get the bsddb module automatically detected and
built.

--amk



From guido at digicool.com  Wed Dec 27 23:54:26 2000
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 27 Dec 2000 17:54:26 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: Your message of "Fri, 22 Dec 2000 10:58:56 EST."
             <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> 
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net> <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov>  
            <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> 
Message-ID: <200012272254.RAA22931@cj20424-a.reston1.va.home.com>

> Charles G Waldman writes:
>  > I am reminded of Linus Torvalds comments on this subject (see
>  > /usr/src/linux/Documentation/CodingStyle):

Fred replied:
>   The catch, of course, is Python/cevel.c, where breaking it up can
> hurt performance.  People scream when you do things like that....

Funny, Jeremy is doing just that, and it doesn't seem to be hurting
performance at all.  See

 http://sourceforge.net/patch/?func=detailpatch&patch_id=102337&group_id=5470

(though this is not quite finished).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Thu Dec 28 00:05:46 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 27 Dec 2000 18:05:46 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001227173935.A25605@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Wed, Dec 27, 2000 at 05:39:35PM -0500
References: <20001227143750.A26894@thyrsus.com> <20001227173935.A25605@kronos.cnri.reston.va.us>
Message-ID: <20001227180546.A4365@thyrsus.com>

Andrew Kuchling <akuchlin at mems-exchange.org>:
> >1. The curses module is commented out in the default Modules/Setup
> >file.  This is not good, as it may lead careless distribution builders
> 
> It always has been commented out.  Good distributions ship with most
> of the available modules enabled; I can't say if RH7.0 counts as a
> good distribution or not (still on 6.2).

I think this needs to change.  If curses is a core facility now, the
default build should treat it as one.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

If a thousand men were not to pay their tax-bills this year, that would
... [be] the definition of a peaceable revolution, if any such is possible.
	-- Henry David Thoreau



From tim.one at home.com  Thu Dec 28 01:44:29 2000
From: tim.one at home.com (Tim Peters)
Date: Wed, 27 Dec 2000 19:44:29 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Misc python-mode.el,3.108,3.109
In-Reply-To: <E14BKaD-0004JB-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKFIFAA.tim.one@home.com>

[Barry Warsaw]
> Modified Files:
> 	python-mode.el
> Log Message:
> (python-font-lock-keywords): Add highlighting of `as' as a keyword,
> but only in "import foo as bar" statements (including optional
> preceding `from' clause).

Oh, that's right, try to make IDLE look bad, will you?  I've got half a mind
to take up the challenge.  Unfortunately, I only have half a mind in total,
so you may get away with this backstabbing for a while <wink>.




From thomas at xs4all.net  Thu Dec 28 10:53:31 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 28 Dec 2000 10:53:31 +0100
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001227143750.A26894@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 27, 2000 at 02:37:50PM -0500
References: <20001227143750.A26894@thyrsus.com>
Message-ID: <20001228105331.A6042@xs4all.nl>

On Wed, Dec 27, 2000 at 02:37:50PM -0500, Eric S. Raymond wrote:
> I have 2.0 up and running on RH7.0, compiled from sources.  In the process, 
> I discovered a couple of issues:

> 1. The curses module is commented out in the default Modules/Setup
> file.  This is not good, as it may lead careless distribution builders
> to ship Python 2.0s that will not be able to support the curses front
> end in CML2.  Supporting CML2 (and thus getting Python the "design
> win" of being involved in the Linux kernel build) was the major point
> of integrating the curses module into the Python core.  It is possible
> that one little "#" may have blown that.

Note that Tkinter is off by default too. And readline. And ssl. And the use
of shared libraries. We *can't* enable the cursesmodule by default, because
we don't know what the system's curses library is called. We'd have to
auto-detect that before we can enable it (and lots of other modules)
automatically, and that's a lot of work. I personally favour autoconf for
the job, but since amk is already busy on using distutils, I'm not going to
work on that.

> 2. The default Modules/Setup file assumes that various Tkinter-related libraries
> are in /usr/local.  But /usr would be a more appropriate choice under most
> circumstances.  Most Linux users now install their Tcl/Tk stuff from RPMs
> or .deb packages that place the binaries and libraries under /usr.  Under
> most other Unixes (e.g. Solaris) they were there to begin with.

This is nonsense. The line above it specifically states 'edit to reflect
where your Tcl/Tk headers are'. And aside from the question of whether they
are usually found in /usr (I don't believe so, not even on Solaris, though
'my' Solaris box doesn't even have tcl/tk), /usr/local is a perfectly sane
choice, since /usr is already included in the include path, but /usr/local
usually is not.

> 3. The configure machinery could be made to deduce more about the contents
> of Modules/Setup than it does now.  In particular, it's silly that the person
> building Python has to fill in the locations of X libraries when 
> configure is in principle perfectly capable of finding them.

In principle, I agree. It's a lot of work, though. For instance, Debian
stores the Tcl/Tk headers in /usr/include/tcl<version>, which means you can
compile for more than one tcl version, by just changing your include path
and the library you link with. And there are undoubtedly several other
variants out there.

Should we really make the Setup file default to Linux, and leave other
operating systems in the dark about what it might be on their system? I
think people with Linux and without clue are the least likely people to
compile their own Python, since Linux distributions already come with a
decent enough Python. And, please, let's assume the people assembling those
know how to read?

Maybe we just need a HOWTO document covering Setup?

(Besides, won't this all be fixed when CML2 comes with a distribution, Eric?
They'll *have* to have working curses/tkinter then :-)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From MarkH at ActiveState.com  Thu Dec 28 13:34:09 2000
From: MarkH at ActiveState.com (Mark Hammond)
Date: Thu, 28 Dec 2000 23:34:09 +1100
Subject: [Python-Dev] Fwd: try...else
Message-ID: <3A4B3341.5010707@ActiveState.com>

Spotted on c.l.python.  Although Pythonwin is mentioned, python.exe 
gives the same results - as does 1.5.2.

Seems a reasonable question...

[Also, if Robin hasn't been invited to join us here, I think it could 
make some sense...]

Mark.
-------- Original Message --------
Subject: try...else
Date: Fri, 22 Dec 2000 18:02:27 +0000
From: Robin Becker <robin at jessikat.fsnet.co.uk>
Newsgroups: comp.lang.python

I had expected that in try: except: else
the else clause always got executed, but it seems not for return

PythonWin 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on
win32.Portions Copyright 1994-2000 Mark Hammond (MarkH at ActiveState.com)
- see 'Help/About PythonWin' for further copyright information.
 >>> def bang():
....     try:
....             return 'return value'
....     except:
....             print 'bang failed'
....     else:
....             print 'bang succeeded'
....
  >>> bang()
'return value'
 >>>

is this a 'feature' or bug. The 2.0 docs seem not to mention
return/continue except for try finally.
-- 
Robin Becker




From mal at lemburg.com  Thu Dec 28 15:45:49 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Dec 2000 15:45:49 +0100
Subject: [Python-Dev] Fwd: try...else
References: <3A4B3341.5010707@ActiveState.com>
Message-ID: <3A4B521D.4372224A@lemburg.com>

Mark Hammond wrote:
> 
> Spotted on c.l.python.  Although Pythonwin is mentioned, python.exe
> gives the same results - as does 1.5.2.
> 
> Seems a reasonable question...
> 
> [Also, if Robin hasn't been invited to join us here, I think it could
> make some sense...]
> 
> Mark.
> -------- Original Message --------
> Subject: try...else
> Date: Fri, 22 Dec 2000 18:02:27 +0000
> From: Robin Becker <robin at jessikat.fsnet.co.uk>
> Newsgroups: comp.lang.python
> 
> I had expected that in try: except: else
> the else clause always got executed, but it seems not for return

I think Robin mixed up try...finally with try...except...else.
The finally clause is executed even in case an exception occurred.

He does have a point, however, that 'return' will bypass a
try...else clause (a try...finally clause, by contrast, is run
even on return). I don't think we can change that behaviour,
though, as it would break code.
 
> PythonWin 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on
> win32.Portions Copyright 1994-2000 Mark Hammond (MarkH at ActiveState.com)
> - see 'Help/About PythonWin' for further copyright information.
>  >>> def bang():
> ....     try:
> ....             return 'return value'
> ....     except:
> ....             print 'bang failed'
> ....     else:
> ....             print 'bang succeeded'
> ....
>   >>> bang()
> 'return value'
>  >>>
> 
> is this a 'feature' or bug. The 2.0 docs seem not to mention
> return/continue except for try finally.
> --
> Robin Becker
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://www.python.org/mailman/listinfo/python-dev
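
In present-day syntax, the behaviour Robin observed can be pinned down like
this -- the else clause is skipped when the try suite returns, while a
finally clause still runs:

```python
events = []

def bang():
    try:
        return 'return value'
    except Exception:
        events.append('bang failed')
    else:
        # Never reached: the return in the try suite leaves the
        # function before the else clause is considered.
        events.append('bang succeeded')
    finally:
        # A finally clause, by contrast, always runs -- even on return.
        events.append('cleanup')

assert bang() == 'return value'
assert events == ['cleanup']
```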

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Thu Dec 28 16:04:23 2000
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 28 Dec 2000 10:04:23 -0500
Subject: [Python-Dev] chomp()?
Message-ID: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>

Someone just posted a patch to implement s.chomp() as a string method:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103029&group_id=5470

Pseudo code (for those not aware of the Perl function by that name):

def chomp(s):
    if s[-2:] == '\r\n':
        return s[:-2]
    if s[-1:] == '\r' or s[-1:] == '\n':
        return s[:-1]
    return s

I.e. it removes a trailing \r\n, \r, or \n.

Any comments?  Is this needed given that we have s.rstrip() already?
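
The practical difference from rstrip() is that chomp() removes at most one
line ending and leaves any other trailing whitespace alone.  Spelling out the
pseudo code above and contrasting the two:

```python
def chomp(s):
    # Remove at most one trailing line ending: \r\n, \r, or \n.
    if s.endswith('\r\n'):
        return s[:-2]
    if s.endswith('\r') or s.endswith('\n'):
        return s[:-1]
    return s

assert chomp('data\r\n') == 'data'
assert chomp('data\n\n') == 'data\n'    # one ending removed, not all
assert chomp('data\t\n') == 'data\t'    # other whitespace preserved
assert 'data\n\n'.rstrip() == 'data'    # rstrip() is more aggressive
```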

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Thu Dec 28 16:30:57 2000
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 28 Dec 2000 10:30:57 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: Your message of "Wed, 27 Dec 2000 14:37:50 EST."
             <20001227143750.A26894@thyrsus.com> 
References: <20001227143750.A26894@thyrsus.com> 
Message-ID: <200012281530.KAA26049@cj20424-a.reston1.va.home.com>

Eric,

I think your recent posts have shown a worldview that's a bit too
Eric-centered. :-)

Not all the world is Linux.  CML2 isn't the only Python application
that matters.  Python world domination is not a goal.  There is no
Eric conspiracy! :-)

That said, I think that the future is bright: Andrew is already
working on a much more intelligent configuration manager.

I believe it would be a mistake to enable curses by default using the
current approach to module configuration: it doesn't compile out of
the box on every platform, and you wouldn't believe how much email I
get from clueless Unix users trying to build Python when there's a
problem like that in the distribution.

So I'd rather wait for Andrew's work.  You could do worse than help
him with that, to further your goal!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Thu Dec 28 16:41:23 2000
From: fdrake at acm.org (Fred L. Drake)
Date: Thu, 28 Dec 2000 10:41:23 -0500
Subject: [Python-Dev] chomp()?
In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <web-403062@digicool.com>

On Thu, 28 Dec 2000 10:04:23 -0500, Guido
<guido at digicool.com> wrote:
 > Someone just posted a patch to implement s.chomp() as a
 > string method:
...
 > Any comments?  Is this needed given that we have
 > s.rstrip() already?

  I've always considered this a different operation from
rstrip().  When you intend to be as surgical in your changes
as possible, it is important *not* to use rstrip().
  I don't feel strongly that it needs to be implemented in
C, though I imagine people who do a lot of string processing
feel otherwise.  It's just hard to beat the performance
difference if you are doing this a lot.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From barry at digicool.com  Thu Dec 28 17:00:36 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Thu, 28 Dec 2000 11:00:36 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Misc python-mode.el,3.108,3.109
References: <E14BKaD-0004JB-00@usw-pr-cvs1.sourceforge.net>
	<LNBBLJKPBEHFEDALKOLCGEKFIFAA.tim.one@home.com>
Message-ID: <14923.25508.668453.186209@anthem.concentric.net>

>>>>> "TP" == Tim Peters <tim.one at home.com> writes:

    TP> [Barry Warsaw]
    >> Modified Files: python-mode.el Log Message:
    >> (python-font-lock-keywords): Add highlighting of `as' as a
    >> keyword, but only in "import foo as bar" statements (including
    >> optional preceding `from' clause).

    TP> Oh, that's right, try to make IDLE look bad, will you?  I've
    TP> got half a mind to take up the challenge.  Unfortunately, I
    TP> only have half a mind in total, so you may get away with this
    TP> backstabbing for a while <wink>.

With my current network (un)connectivity, I feel like a nuclear sub
which can only surface once a month to receive low frequency orders
from some remote antenna farm out in New Brunswick.  Just think of me
as a rogue commander who tries to do as much damage as possible when
he's not joyriding in the draft-wake of giant squids.

rehoming-all-remaining-missiles-at-the-Kingdom-of-Timbotia-ly y'rs,
-Barry




From esr at thyrsus.com  Thu Dec 28 17:01:54 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 11:01:54 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <200012281530.KAA26049@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 10:30:57AM -0500
References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com>
Message-ID: <20001228110154.D32394@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Not all the world is Linux.  CML2 isn't the only Python application
> that matters.  Python world domination is not a goal.  There is no
> Eric conspiracy! :-)

Perhaps I misunderstood you, then.  I thought you considered CML2 a
potentially important design win, and that was why curses didn't get
dropped from the core.  Have you changed your mind about this?

If Python world domination is not a goal then I can only conclude that
you haven't had your morning coffee yet :-).

There's a more general question here about what it means for something
to be in the core language.  Developers need to have a clear,
bright-line picture of what they can count on to be present.  To me
this implies that it's the job of the Python maintainers to make sure
that a facility declared "core" by its presence in the standard
library documentation is always present, for maximum "batteries are
included" effect.  

Yes, dealing with cross-platform variations in linking curses is a
pain -- but dealing with that kind of pain so the Python user doesn't
have to is precisely our job.  Or so I understand it, anyway.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Conservatism is the blind and fear-filled worship of dead radicals.



From moshez at zadka.site.co.il  Thu Dec 28 17:51:32 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: 28 Dec 2000 16:51:32 -0000
Subject: [Python-Dev] chomp()?
In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <20001228165132.8025.qmail@stimpy.scso.com>

On Thu, 28 Dec 2000, Guido van Rossum <guido at digicool.com> wrote:

> Someone just posted a patch to implement s.chomp() as a string method:
...
> Any comments?  Is this needed given that we have s.rstrip() already?

Yes.

import fileinput

i = 0
for line in fileinput.input():
	print '%d: %s' % (i, line.chomp())
	i += 1

I want that operation to be invertable by

sed 's/^[0-9]*: //'
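
Moshe's invertibility point is exactly where rstrip() falls short:
rstrip() eats *all* trailing whitespace, while the proposed chomp()
removes at most one line ending.  A minimal sketch of the proposed
semantics (chomp is not an actual string method; this just follows the
pseudo-code from the patch):

```python
def chomp(s):
    # Proposed semantics: strip at most one trailing line ending.
    if s.endswith('\r\n'):
        return s[:-2]
    if s.endswith('\r') or s.endswith('\n'):
        return s[:-1]
    return s

# rstrip() is not invertible: trailing spaces vanish along with the newline.
print(repr(chomp('data   \n')))     # 'data   '
print(repr('data   \n'.rstrip()))   # 'data'
```

With chomp(), prepending '%d: ' and later stripping it with sed recovers
the original line byte-for-byte; with rstrip() it does not.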



From guido at digicool.com  Thu Dec 28 18:08:18 2000
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 28 Dec 2000 12:08:18 -0500
Subject: [Python-Dev] scp to sourceforge
Message-ID: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>

I've seen a thread on this but there was no conclusive answer, so I'm
reopening this.

I can't SCP updated PEPs to the SourceForge machine.  The "pep2html.py
-i" command just hangs.  I can ssh into shell.sourceforge.net just
fine, but scp just hangs.  "scp -v" prints a bunch of things
suggesting that it can authenticate itself just fine, ending with
these three lines:

  cj20424-a.reston1.va.home.com: RSA authentication accepted by server.
  cj20424-a.reston1.va.home.com: Sending command: scp -v -t .
  cj20424-a.reston1.va.home.com: Entering interactive session.

and then nothing.  It just sits there.

Would somebody please figure out a way to update the PEPs?  It's kind
of pathetic to see the website not have the latest versions...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Thu Dec 28 17:28:07 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: 28 Dec 2000 16:28:07 -0000
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <3A4B521D.4372224A@lemburg.com>
References: <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com>
Message-ID: <20001228162807.7229.qmail@stimpy.scso.com>

On Thu, 28 Dec 2000, "M.-A. Lemburg" <mal at lemburg.com> wrote:

> He does have a point however that 'return' will bypass 
> try...else and try...finally clauses. I don't think we can change
> that behaviour, though, as it would break code.

It doesn't bypass try..finally:

>>> def foo():
...     try:
...             print "hello"
...             return
...     finally:
...             print "goodbye"
...
>>> foo()
hello
goodbye




From guido at digicool.com  Thu Dec 28 17:43:26 2000
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 28 Dec 2000 11:43:26 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: Your message of "Thu, 28 Dec 2000 11:01:54 EST."
             <20001228110154.D32394@thyrsus.com> 
References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com>  
            <20001228110154.D32394@thyrsus.com> 
Message-ID: <200012281643.LAA26687@cj20424-a.reston1.va.home.com>

> Guido van Rossum <guido at digicool.com>:
> > Not all the world is Linux.  CML2 isn't the only Python application
> > that matters.  Python world domination is not a goal.  There is no
> > Eric conspiracy! :-)
> 
> Perhaps I misunderstood you, then.  I thought you considered CML2 a
> potentially important design win, and that was why curses didn't get
> dropped from the core.  Have you changed your mind about this?

Supporting CML2 was one of the reasons to keep curses in the core, but
not the only one.  Linux kernel configuration is so far removed from
my daily use of computers that I don't have a good way to judge its
importance in the grand scheme of things.  Since you obviously
consider it very important, and since I generally trust your judgement
(except on the issue of firearms :-), your plea for keeping, and
improving, curses support in the Python core made a difference in my
decision.  And don't worry, I don't expect to change that decision
-- though I personally still find it curious that curses is so important.
I find curses-style user interfaces pretty pathetic, and wished that
Linux migrated to a real GUI for configuration.  (And the linuxconf
approach does *not* qualify as a real GUI. :-)

> If Python world domination is not a goal then I can only conclude that
> you haven't had your morning coffee yet :-).

Sorry to disappoint you, Eric.  I gave up coffee years ago. :-)

I was totally serious though: my personal satisfaction doesn't come
from Python world domination.  Others seem to have that goal, and if it
doesn't inconvenience me too much I'll play along, but in the end I've
got some goals in my personal life that are much more important.

> There's a more general question here about what it means for something
> to be in the core language.  Developers need to have a clear,
> bright-line picture of what they can count on to be present.  To me
> this implies that it's the job of the Python maintainers to make sure
> that a facility declared "core" by its presence in the standard
> library documentation is always present, for maximum "batteries are
> included" effect.  

We do the best we can.  Using the current module configuration system,
it's a one-character edit to enable curses if you need it.  With
Andrew's new scheme, it will be automatic.

> Yes, dealing with cross-platform variations in linking curses is a
> pain -- but dealing with that kind of pain so the Python user doesn't
> have to is precisely our job.  Or so I understand it, anyway.

So help Andrew: http://python.sourceforge.net/peps/pep-0229.html

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Thu Dec 28 17:52:36 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Dec 2000 17:52:36 +0100
Subject: [Python-Dev] Fwd: try...else
References: <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com>
Message-ID: <3A4B6FD3.9B576E9A@lemburg.com>

Moshe Zadka wrote:
> 
> On Thu, 28 Dec 2000, "M.-A. Lemburg" <mal at lemburg.com> wrote:
> 
> > He does have a point however that 'return' will bypass
> > try...else and try...finally clauses. I don't think we can change
> > that behaviour, though, as it would break code.
> 
> It doesn't bypass try..finally:
> 
> >>> def foo():
> ...     try:
> ...             print "hello"
> ...             return
> ...     finally:
> ...             print "goodbye"
> ...
> >>> foo()
> hello
> goodbye

Hmm, that must have changed between Python 1.5 and more recent
versions:

Python 1.5:
>>> def f():
...     try:
...             return 1
...     finally:
...             print 'finally'
... 
>>> f()
1
>>> 

Python 2.0:
>>> def f():
...     try:
...             return 1
...     finally:
...             print 'finally'
... 
>>> f()
finally
1
>>>

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From moshez at stimpy.scso.com  Thu Dec 28 17:59:32 2000
From: moshez at stimpy.scso.com (Moshe Zadka)
Date: 28 Dec 2000 16:59:32 -0000
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <3A4B6FD3.9B576E9A@lemburg.com>
References: <3A4B6FD3.9B576E9A@lemburg.com>, <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com>
Message-ID: <20001228165932.8143.qmail@stimpy.scso.com>

On Thu, 28 Dec 2000 17:52:36 +0100, "M.-A. Lemburg" <mal at lemburg.com> wrote:

[about try..finally not playing well with return]
> Hmm, that must have changed between Python 1.5 and more recent
> versions:

I posted a 1.5.2 test. So it changed between 1.5 and 1.5.2?



From esr at thyrsus.com  Thu Dec 28 18:20:48 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 12:20:48 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001228105331.A6042@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 10:53:31AM +0100
References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl>
Message-ID: <20001228122048.A1381@thyrsus.com>

Thomas Wouters <thomas at xs4all.net>:
> > 1. The curses module is commented out in the default Modules/Setup
> > file.  This is not good, as it may lead careless distribution builders
> > to ship Python 2.0s that will not be able to support the curses front
> > end in CML2.  Supporting CML2 (and thus getting Python the "design
> > win" of being involved in the Linux kernel build) was the major point
> > of integrating the curses module into the Python core.  It is possible
> > that one little "#" may have blown that.
> 
> Note that Tkinter is off by default too. And readline. And ssl. And the use
> of shared libraries.

IMO ssl isn't an issue because it's not documented as being in the standard
module set.  Readline is a minor issue because raw_input()'s functionality
changes somewhat if it's not linked, but I think we can live with this -- the
change isn't visible to calling programs.  

Hm.  It appears tkinter isn't documented in the standard set of modules 
either.  Interesting.  Technically this means I don't have a problem with
it not being built in by default, but I think there is a problem here...

My more general point is that right now Python has three classes of modules:

1. Documented as being in the core and built in by default.
2. Not documented as being in the core and not built in by default.
3. Documented as being in the core but not built in by default.

My more general claim is that the existence of class 3 is a problem,
because it compromises the "batteries are included" effect -- it means
Python users don't have a bright-line test for what will be present in
every Python (or at least every Python on an operating system
theoretically feature-compatible with theirs).

My struggle to get CML2 adopted brings this problem into particularly
sharp focus because the kernel group is allergic to big footprints or
having to download extension modules to do a build.  But the issue is
really broader than that.  I think we ought to be migrating stuff out
of class 3 into class 1 where possible and to class 2 only where
unavoidable.

>         We *can't* enable the cursesmodule by default, because
> we don't know what the system's curses library is called. We'd have to
> auto-detect that before we can enable it (and lots of other modules)
> automatically, and that's a lot of work. I personally favour autoconf for
> the job, but since amk is already busy on using distutils, I'm not going to
> work on that.

Yes, we need to do a lot more autodetection -- this is a large part of
my point.  I have nothing against distutils, but I don't see how it
solves this problem unless we assume that we'll always have Python
already available on any platform where we're building Python.

I'm willing to put my effort where my mouth is on this.  I have a lot
of experience with autoconf; I'm willing to write some of these nasty
config tests.

> > 2.The default Modules/Setup file assumes that various Tkinter-related libraries
> > are in /usr/local.  But /usr would be a more appropriate choice under most
> > circumstances.  Most Linux users now install their Tcl/Tk stuff from RPMs
> > or .deb packages that place the binaries and libraries under /usr.  Under
> > most other Unixes (e.g. Solaris) they were there to begin with.
> 
> This is nonsense. The line above it specifically states 'edit to reflect
> where your Tcl/Tk headers are'. And aside from the issue of whether they are
> usually found in /usr (I don't believe so, not even on Solaris, but 'my'
> Solaris box doesn't even have tcl/tk,) /usr/local is a perfectly sane
> choice, since /usr is already included in the include-path, but /usr/local
> usually is not.

Is it?  That is not clear from the comment.  Perhaps this is just a 
documentation problem.  I'll look again.
 
> > 3. The configure machinery could be made to deduce more about the contents
> > of Modules/Setup than it does now.  In particular, it's silly that the
> > person building Python has to fill in the locations of X libraries when 
> > configure is in principle perfectly capable of finding them.
> 
> In principle, I agree. It's a lot of work, though. For instance, Debian
> stores the Tcl/Tk headers in /usr/include/tcl<version>, which means you can
> compile for more than one tcl version, by just changing your include path
> and the library you link with. And there are undoubtedly several other
> variants out there.

As I said to Guido, I think it is exactly our job to deal with this sort
of grottiness.  One of Python's major selling points is supposed to be
cross-platform consistency of the API.  If we fail to do what you're
describing, we're failing to meet Python users' reasonable expectations
for the language.

> Should we really make the Setup file default to Linux, and leave other
> operating systems in the dark about what it might be on their system ? I
> think people with Linux and without clue are the least likely people to
> compile their own Python, since Linux distributions already come with a
> decent enough Python. And, please, let's assume the people assembling those
> know how to read ?

Please note that I am specifically *not* advocating making the build defaults
Linux-centric.  That's not my point at all.

> Maybe we just need a HOWTO document covering Setup ?

That would be a good idea.

> (Besides, won't this all be fixed when CML2 comes with a distribution, Eric ?
> They'll *have* to have working curses/tkinter then :-)

I'm concerned that it will work the other way around, that CML2 won't happen
if the core does not reliably include these facilities.  In itself CML2 
not happening wouldn't be the end of the world of course, but I'm pushing on
this because I think the larger issue of class 3 modules is actually important
to the health of Python and needs to be attacked seriously.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The Bible is not my book, and Christianity is not my religion.  I could never
give assent to the long, complicated statements of Christian dogma.
	-- Abraham Lincoln



From cgw at fnal.gov  Thu Dec 28 18:36:06 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Thu, 28 Dec 2000 11:36:06 -0600 (CST)
Subject: [Python-Dev] chomp()?
In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <14923.31238.65155.496546@buffalo.fnal.gov>

Guido van Rossum writes:
 > Someone just posted a patch to implement s.chomp() as a string method:
 > I.e. it removes a trailing \r\n, \r, or \n.
 > 
 > Any comments?  Is this needed given that we have s.rstrip() already?

-1 from me.  P=NP (Python is not Perl).  "Chomp" is an excessively cute name.
And like you said, this is too much like "rstrip" to merit a separate
method.





From esr at thyrsus.com  Thu Dec 28 18:41:17 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 12:41:17 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <200012281643.LAA26687@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 11:43:26AM -0500
References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com> <20001228110154.D32394@thyrsus.com> <200012281643.LAA26687@cj20424-a.reston1.va.home.com>
Message-ID: <20001228124117.B1381@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Supporting CML2 was one of the reasons to keep curses in the core, but
> not the only one.  Linux kernel configuration is so far removed from
> my daily use of computers that I don't have a good way to judge its
> importance in the grand scheme of things.  Since you obviously
> consider it very important, and since I generally trust your judgement
> (except on the issue of firearms :-), your plea for keeping, and
> improving, curses support in the Python core made a difference in my
> decision.  And don't worry, I don't expect to change that decision
> -- though I personally still find it curious that curses is so important.
> I find curses-style user interfaces pretty pathetic, and wished that
> Linux migrated to a real GUI for configuration.  (And the linuxconf
> approach does *not* qualify as a real GUI. :-)

Thank you, that makes your priorities much clearer.

Actually I agree with you that curses interfaces are mostly pretty
pathetic.  A lot of people still like them, though, because they tend
to be fast and lightweight.  Then, too, a really well-designed curses
interface can in fact be good enough that the usability gain from
GUIizing is marginal.  My favorite examples of this are mutt and slrn.
The fact that GUI programs have failed to make much headway against
this is not simply due to user conservatism, it's genuinely hard to
see how a GUI interface could be made significantly better.

And unfortunately, there is a niche where it is still important to
support curses interfacing independently of anyone's preferences in
interface style -- early in the system-configuration process before
one has bootstrapped to the point where X is reliably available.  I
hasten to add that this is not just *my* problem -- one of your
more important Python constituencies in a practical sense is 
the guys who maintain Red Hat's installer.

> I was totally serious though: my personal satisfaction doesn't come
> from Python world domination.  Others seem to have that goal, and if it
> doesn't inconvenience me too much I'll play along, but in the end I've
> got some goals in my personal life that are much more important.

There speaks the new husband :-).  OK.  So what *do* you want from Python?

Personally, BTW, my goal is not exactly Python world domination either
-- it's that the world should be dominated by the language that has
the least tendency to produce grotty fragile code (remember that I
tend to obsess about the software-quality problem :-)).  Right now
that's Python.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The people of the various provinces are strictly forbidden to have in their
possession any swords, short swords, bows, spears, firearms, or other types
of arms. The possession of unnecessary implements makes difficult the
collection of taxes and dues and tends to foment uprisings.
        -- Toyotomi Hideyoshi, dictator of Japan, August 1588



From mal at lemburg.com  Thu Dec 28 18:43:13 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Dec 2000 18:43:13 +0100
Subject: [Python-Dev] chomp()?
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <3A4B7BB1.F09660ED@lemburg.com>

Guido van Rossum wrote:
> 
> Someone just posted a patch to implement s.chomp() as a string method:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=103029&group_id=5470
> 
> Pseudo code (for those not aware of the Perl function by that name):
> 
> def chomp(s):
>     if s[-2:] == '\r\n':
>         return s[:-2]
>     if s[-1:] == '\r' or s[-1:] == '\n':
>         return s[:-1]
>     return s
> 
> I.e. it removes a trailing \r\n, \r, or \n.
> 
> Any comments?  Is this needed given that we have s.rstrip() already?

We already have .splitlines() which does the above (remove
line breaks) not only for a single line, but for many lines at once.

Even better: .splitlines() also does the right thing for Unicode.
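
For instance (a quick illustration; splitlines() recognizes \n, \r,
and \r\n uniformly):

```python
# splitlines() removes one line ending per line, whatever style it is,
# so it covers the chomp() use case for single lines and whole files alike.
text = 'one\r\ntwo\rthree\n'
print(text.splitlines())        # ['one', 'two', 'three']
print('line\r\n'.splitlines())  # ['line']
```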

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Thu Dec 28 20:06:33 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Dec 2000 20:06:33 +0100
Subject: [Python-Dev] Fwd: try...else
References: <3A4B6FD3.9B576E9A@lemburg.com>, <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com> <20001228165932.8143.qmail@stimpy.scso.com>
Message-ID: <3A4B8F39.58C64EFB@lemburg.com>

Moshe Zadka wrote:
> 
> On Thu, 28 Dec 2000 17:52:36 +0100, "M.-A. Lemburg" <mal at lemburg.com> wrote:
> 
> [about try..finally not playing well with return]
> > Hmm, that must have changed between Python 1.5 and more recent
> > versions:
> 
> I posted a 1.5.2 test. So it changed between 1.5 and 1.5.2?

Sorry, false alarm: there was a bug in my patched 1.5 version.
The original 1.5 version does not show the described behaviour.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Thu Dec 28 21:21:15 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 28 Dec 2000 21:21:15 +0100
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <3A4B521D.4372224A@lemburg.com>; from mal@lemburg.com on Thu, Dec 28, 2000 at 03:45:49PM +0100
References: <3A4B3341.5010707@ActiveState.com> <3A4B521D.4372224A@lemburg.com>
Message-ID: <20001228212115.C1811@xs4all.nl>

On Thu, Dec 28, 2000 at 03:45:49PM +0100, M.-A. Lemburg wrote:

> > I had expected that in try: except: else
> > the else clause always got executed, but it seems not for return

> I think Robin mixed up try...finally with try...except...else.
> The finally clause is executed even in case an exception occurred.

(MAL and I already discussed this in private mail: Robin did mean
try/except/else, and 'finally' already executes when returning directly from
the 'try' block, even in Python 1.5)

> He does have a point however that 'return' will bypass 
> try...else and try...finally clauses. I don't think we can change
> that behaviour, though, as it would break code.

This code:

try:
   return
except:
   pass
else:
   print "returning"

will indeed not print 'returning', but I believe it's by design. I'm against
changing it, in any case, and not just because it'd break code :) If you
want something that always executes, use a 'finally'. Or don't return from
the 'try', but return in the 'else' clause. 

The 'except' clause is documented to execute if a matching exception occurs,
and 'else' if no exception occurs. Maybe the intent of the 'else' clause
would be clearer if it was documented to 'execute if the try: clause
finishes without an exception being raised' ? The 'else' clause isn't
executed when you 'break' or (after applying my continue-in-try patch ;)
'continue' out of the 'try', either.
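
The difference is easy to demonstrate (a small sketch; the function
names are made up for illustration):

```python
def via_else():
    try:
        return 'try'      # returning here bypasses the else clause
    except Exception:
        pass
    else:
        return 'else'     # only runs if the try block completes normally

def via_finally():
    events = []
    try:
        return events     # finally still runs before the value is returned
    finally:
        events.append('finally')

print(via_else())     # 'try'
print(via_finally())  # ['finally'] -- finally executed on the way out
```

So 'finally' is the construct for code that must always run; 'else' is
simply the no-exception continuation of the try block.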

Robin... Did I already reply to this, on python-list or to you directly ? I
distinctly remember writing that post, but I'm not sure if it arrived. Maybe
I didn't send it after all, or maybe something on mail.python.org is
detaining it ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Thu Dec 28 19:19:06 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 28 Dec 2000 19:19:06 +0100
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001228122048.A1381@thyrsus.com>; from esr@thyrsus.com on Thu, Dec 28, 2000 at 12:20:48PM -0500
References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> <20001228122048.A1381@thyrsus.com>
Message-ID: <20001228191906.F1281@xs4all.nl>

On Thu, Dec 28, 2000 at 12:20:48PM -0500, Eric S. Raymond wrote:

> My more general point is that right now Python has three classes of
> modules:

> 1. Documented as being in the core and built in by default.
> 2. Not documented as being in the core and not built in by default.
> 3. Documented as being in the core but not built in by default.

> My more general claim is that the existence of class 3 is a problem,
> because it compromises the "batteries are included" effect -- it means
> Python users don't have a bright-line test for what will be present in
> every Python (or at least every Python on an operating system
> theoretically feature-compatible with theirs).

It depends on your definition of 'being in the core'. Some of the things
that are 'in the core' are simply not possible on all platforms. So if you
want really portable code, you don't want to use them. Other features are
available on all systems that matter [to you], so you don't really care
about it, just use them, and at best document that they need feature X.

There is also the subtle difference between a Python user and a Python
compiler/assembler (excuse my overloading of the terms, but you know what I
mean). People who choose to compile their own Python should realize that
they might disable or misconfigure some parts of it. I personally trust most
people that assemble OS distributions to compile a proper Python binary +
modules, but I think a HOWTO isn't a bad idea -- unless we autoconf
everything.

> I think we ought to be migrating stuff out
> of class 3 into class 1 where possible and to class 2 only where
> unavoidable.

[ and ]

> I'm willing to put my effort where my mouth is on this.  I have a lot
> of experience with autoconf; I'm willing to write some of these nasty
> config tests.

[ and ]

> As I said to Guido, I think it is exactly our job to deal with this sort
> of grottiness.  One of Python's major selling points is supposed to be
> cross-platform consistency of the API.  If we fail to do what you're
> describing, we're failing to meet Python users' reasonable expectations
> for the language.

[ and ]

> Please note that I am specifically *not* advocating making the build defaults
> Linux-centric.  That's not my point at all.

I apologize for the tone of my previous post, and the above snippet. I'm not
trying to block progress here ;) I'm actually all for autodetecting as much
as possible, and more than willing to put effort into it as well (as long as
it's deemed useful, and isn't supplanted by a distutils variant weeks
later.) And I personally have my doubts about the distutils variant, too,
but that's partly because I have little experience with distutils. If we can
work out a deal where both autoconf and distutils are an option, I'm happy
to write a few, if not all, autoconf tests for the currently disabled
modules.

So, Eric, let's split the work. I'll do Tkinter if you do curses. :)

However, I'm also keeping those oddball platforms that just don't support
some features in mind. If you want truly portable code, you have to work at
it. I think it's perfectly okay to say "your Python needs to have the curses
module or the tkinter module compiled in -- contact your administrator if it
has neither". There will still be platforms that don't have curses, or
syslog, or crypt(), though hopefully none of them will be Linux.

Oh, and I also apologize for possibly duplicating what has already been said
by others. I haven't seen anything but this post (which was CC'd to me
directly) since I posted my reply to Eric, due to the ululating bouts of
delay on mail.python.org. Maybe DC should hire some *real* sysadmins,
instead of those silly programmer-kniggits ? >:->

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mwh21 at cam.ac.uk  Thu Dec 28 19:27:48 2000
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: Thu, 28 Dec 2000 18:27:48 +0000 (GMT)
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <3A4B521D.4372224A@lemburg.com>
Message-ID: <Pine.SOL.4.21.0012281820240.3801-100000@yellow.csi.cam.ac.uk>

On Thu, 28 Dec 2000, M.-A. Lemburg wrote:

> I think Robin mixed up try...finally with try...except...else.

I think so too.

> The finally clause is executed even in case an exception occurred.
> 
> He does have a point however that 'return' will bypass 
> try...else and try...finally clauses. I don't think we can change
> that behaviour, though, as it would break code.

return does not skip finally clauses[1].  In my not especially humble
opinion, the current behaviour is the Right Thing.  I'd have to think for
a moment before saying what Robin's example would print, but I think the
alternative would disturb me far more.

Cheers,
M.

[1] In fact the flow of control on return is very similar to that of an
    exception - ooh, look at that implementation...




From esr at thyrsus.com  Thu Dec 28 20:17:51 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 14:17:51 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001228191906.F1281@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 07:19:06PM +0100
References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> <20001228122048.A1381@thyrsus.com> <20001228191906.F1281@xs4all.nl>
Message-ID: <20001228141751.B2528@thyrsus.com>

Thomas Wouters <thomas at xs4all.net>:
> > My more general claim is that the existence of class 3 is a problem,
> > because it compromises the "batteries are included" effect -- it means
> > Python users don't have a bright-line test for what will be present in
> > every Python (or at least every Python on an operating system
> > theoretically feature-compatible with theirs).
> 
> It depends on your definition of 'being in the core'. Some of the things
> that are 'in the core' are simply not possible on all platforms. So if you
> want really portable code, you don't want to use them. Other features are
> available on all systems that matter [to you], so you don't really care
> about it, just use them, and at best document that they need feature X.

I understand.  We can't, for example, guarantee to duplicate the Windows-
specific stuff in the Unix port (nor would we want to in most cases :-)).
However, I think "we build in curses/Tkinter everywhere the corresponding
libraries exist" is a guarantee we can and should make.  Similarly for
other modules presently in class 3.
 
> There is also the subtle difference between a Python user and a Python
> compiler/assembler (excuse my overloading of the terms, but you know what I
> mean).

Yes.  We have three categories here:

1. People who use python for applications (what I've been calling users)
2. People who configure Python binary packages for distribution (what
   you call a "compiler/assembler" and I think of as a "builder").
3. People who hack Python itself.

Problem is that "developer" is very ambiguous in this context...

>           People who choose to compile their own Python should realize that
> they might disable or misconfigure some parts of it. I personally trust most
> people that assemble OS distributions to compile a proper Python binary +
> modules, but I think a HOWTO isn't a bad idea -- unless we autoconf
> everything.

I'd like to see both things happen (HOWTO and autoconfing) and am willing to
work on both.

> I apologize for the tone of my previous post, and the above snippet.

No offense taken at all, I assure you.

>                                                                    I'm not
> trying to block progress here ;) I'm actually all for autodetecting as much
> as possible, and more than willing to put effort into it as well (as long as
> it's deemed useful, and isn't supplanted by a distutils variant weeks
> later.) And I personally have my doubts about the distutils variant, too,
> but that's partly because I have little experience with distutils. If we can
> work out a deal where both autoconf and distutils are an option, I'm happy
> to write a few, if not all, autoconf tests for the currently disabled
> modules.

I admit I'm not very clear on the scope of what distutils is supposed to
handle, and how.  Perhaps amk can enlighten us?

> So, Eric, let's split the work. I'll do Tkinter if you do curses. :)

You've got a deal.  I'll start looking at the autoconf code.  I've already
got a fair idea how to do this.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

No one who's seen it in action can say the phrase "government help" without
either laughing or crying.



From tim.one at home.com  Fri Dec 29 03:59:53 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 28 Dec 2000 21:59:53 -0500
Subject: [Python-Dev] scp to sourceforge
In-Reply-To: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEMKIFAA.tim.one@home.com>

[Guido]
> I've seen a thread on this but there was no conclusive answer, so I'm
> reopening this.

It hasn't budged an inch since then:  your "Entering interactive session"
problem is the same one everyone has; it gets reported on SF's bug and/or
support managers at least daily; SF has not fixed it yet; these days they
don't even respond to scp bug reports anymore; the cause appears to be SF's
custom sfshell, and only SF can change that; the only known workarounds are
to (a) modify files on SF directly (they suggest vi <wink>), or (b) initiate
scp *from* SF, using your local machine as a server (if you can do that -- I
cannot, or at least haven't succeeded).




From martin at loewis.home.cs.tu-berlin.de  Thu Dec 28 23:52:02 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 28 Dec 2000 23:52:02 +0100
Subject: [Python-Dev] curses in the core?
Message-ID: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>

> If curses is a core facility now, the default build should treat it
> as one.
...
> IMO ssl isn't an issue because it's not documented as being in the
> standard module set.
...
> 3. Documented as being in the core but not built in by default.
> My more general claim is that the existence of class 3 is a problem

In the case of curses, I believe there is a documentation error in the
2.0 documentation. The curses package is listed under "Generic
Operating System Services". I believe this is wrong; it should be listed
under "Unix Specific Services".

Unless I'm mistaken, the curses module is not available on the Mac and
on Windows. With that change, the curses module would then fall into
Eric's category 2 (Not documented as being in the core and not built
in by default).

That documentation change should be carried out even if curses is
autoconfigured; autoconf is used on Unix only, anyway.

Regards,
Martin

P.S. The "Python Library Reference" content page does not mention the
word "core" at all, except as part of asyncore...



From thomas at xs4all.net  Thu Dec 28 23:58:25 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 28 Dec 2000 23:58:25 +0100
Subject: [Python-Dev] scp to sourceforge
In-Reply-To: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 12:08:18PM -0500
References: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>
Message-ID: <20001228235824.E1811@xs4all.nl>

On Thu, Dec 28, 2000 at 12:08:18PM -0500, Guido van Rossum wrote:

> I've seen a thread on this but there was no conclusive answer, so I'm
> reopening this.

Actually there was: it's all SourceForge's fault. (At least that's my
professional opinion ;) They honestly have a strange setup, though how
strange and to what end I cannot tell.

> Would somebody please figure out a way to update the PEPs?  It's kind
> of pathetic to see the website not have the latest versions...

The way to update the peps is by ssh'ing into shell.sourceforge.net, and
then scp'ing the files from your work repository to the htdocs/peps
directory. That is, until SF fixes the scp problem. This method works (I
just updated all PEPs to up-to-date CVS versions) but it's a bit cumbersome.
And it only works if you have ssh access to your work environment. And it's
damned hard to script; I tried playing with a single ssh command that did
all the work, but between shell weirdness, scp weirdness and a genuine bash
bug I couldn't figure it out.

I assume that SF is aware of the severity of this problem, and is working on
something akin to a fix or workaround. Until then, I can do an occasional
update of the PEPs, for those that can't themselves.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Fri Dec 29 00:05:28 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 29 Dec 2000 00:05:28 +0100
Subject: [Python-Dev] scp to sourceforge
In-Reply-To: <20001228235824.E1811@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 11:58:25PM +0100
References: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> <20001228235824.E1811@xs4all.nl>
Message-ID: <20001229000528.F1811@xs4all.nl>

On Thu, Dec 28, 2000 at 11:58:25PM +0100, Thomas Wouters wrote:
> On Thu, Dec 28, 2000 at 12:08:18PM -0500, Guido van Rossum wrote:

> > Would somebody please figure out a way to update the PEPs?  It's kind
> > of pathetic to see the website not have the latest versions...
> 
> The way to update the peps is by ssh'ing into shell.sourceforge.net, and
> then scp'ing the files from your work repository to the htdocs/peps

[ blah blah ]

And then they fixed it ! At least, for me, direct scp now works fine. (I
should've tested that before posting my blah blah, sorry.) Anybody else,
like people using F-secure ssh (unix or windows) experience the same ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From MarkH at ActiveState.com  Fri Dec 29 00:15:01 2000
From: MarkH at ActiveState.com (Mark Hammond)
Date: Fri, 29 Dec 2000 10:15:01 +1100
Subject: [Python-Dev] chomp()?
In-Reply-To: <14923.31238.65155.496546@buffalo.fnal.gov>
Message-ID: <LCEPIIGDJPKCOIHOBJEPIEILCNAA.MarkH@ActiveState.com>

> -1 from me.  P=NP (Python is not Perl).  "Chomp" is an
> excessively cute name.
> And like you said, this is too much like "rstrip" to merit a separate
> method.

My thoughts exactly.  I can't remember _ever_ wanting to chomp() when
rstrip() wasn't perfectly suitable.  I'm sure it happens, but not often
enough to introduce an ambiguous new function purely for "feature parity"
with Perl.
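For comparison, a quick sketch of how rstrip() covers the chomp() use case (the string values are just illustrative; note that the optional chars argument to rstrip() arrived in later Python versions):

```python
line = "some text\r\n"

# chomp-like behavior: strip only trailing line-end characters
assert line.rstrip("\r\n") == "some text"

# plain rstrip() is broader: it also eats trailing whitespace
assert "some text  \n".rstrip() == "some text"
```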

Mark.




From esr at thyrsus.com  Fri Dec 29 00:25:28 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 18:25:28 -0500
Subject: [Python-Dev] Re: curses in the core?
In-Reply-To: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Thu, Dec 28, 2000 at 11:52:02PM +0100
References: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>
Message-ID: <20001228182528.A10743@thyrsus.com>

Martin v. Loewis <martin at loewis.home.cs.tu-berlin.de>:
> In the case of curses, I believe there is a documentation error in the
> 2.0 documentation. The curses package is listed under "Generic
> Operating System Services". I believe this is wrong, it should be listed
> as "Unix Specific Services".

I agree that this is an error and should be fixed.
 
> Unless I'm mistaken, the curses module is not available on the Mac and
> on Windows. With that change, the curses module would then fall into
> Eric's category 2 (Not documented as being in the core and not built
> in by default).

Well...that's a definitional question that is part of the larger issue here.

What does being in the Python core mean?  There are two potential definitions:

1. Documentation says it's available on all platforms.

2. Documentation restricts it to one of the three platform groups 
   (Unix/Windows/Mac) but implies that it will be available on any
   OS in that group.  

I think the second one is closer to what application programmers
thinking about which batteries are included expect.  But I could be
persuaded otherwise by a good argument.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The difference between death and taxes is death doesn't get worse
every time Congress meets
	-- Will Rogers



From akuchlin at mems-exchange.org  Fri Dec 29 01:33:36 2000
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Thu, 28 Dec 2000 19:33:36 -0500
Subject: [Python-Dev] Bookstore completed
Message-ID: <200012290033.TAA01295@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com>

OK, I think I'm ready to declare the Python bookstore complete enough
to go public.  Before I set up redirects from www.python.org, please
take another look.  (More book descriptions would be helpful...)

http://www.amk.ca/bookstore/

--amk





From akuchlin at mems-exchange.org  Fri Dec 29 01:46:16 2000
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Thu, 28 Dec 2000 19:46:16 -0500
Subject: [Python-Dev] Help wanted with setup.py script
Message-ID: <200012290046.TAA01346@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com>

Want to help with the laudable goal of automating the Python build
process?  It'll need lots of testing on many different platforms, and
I'd like to start the process now.

First, download the setup.py script from  
       http://www.amk.ca/files/python/setup.py

Next, drop it in the root directory of your Python source tree and run
"python setup.py build".  

If it dies with an exception, let me know.  (Replies to this list are
OK.)

If it runs to completion, look in the Modules/build/lib.<something>
directory to see which modules got built.  (On my system, <something>
is "linux-i686-2.0", but of course this will depend on your platform.)

Is anything missing that should have been built?  (_tkinter.so is the
prime candidate; the autodetection code is far too simple at the
moment and assumes one particular version of Tcl and Tk.)  Did an
attempt at building a module fail?  These indicate problems
autodetecting something, so if you can figure out how to find the
required library or include file, let me know what to do.

--amk



From fdrake at acm.org  Fri Dec 29 05:12:18 2000
From: fdrake at acm.org (Fred L. Drake)
Date: Thu, 28 Dec 2000 23:12:18 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <20001228212115.C1811@xs4all.nl>
Message-ID: <web-404134@digicool.com>

On Thu, 28 Dec 2000 21:21:15 +0100,
 Thomas Wouters <thomas at xs4all.net> wrote:
 > The 'except' clause is documented to execute if a
 > matching exception occurs,
 > and 'else' if no exception occurs. Maybe the intent of
 > the 'else' clause

  This can certainly be clarified in the documentation --
please file a bug report at http://sourceforge.net/projects/python/.
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From tim.one at home.com  Fri Dec 29 05:25:44 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 28 Dec 2000 23:25:44 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <20001228212115.C1811@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEMMIFAA.tim.one@home.com>

[Fred, suggested doc change near the end]

[Thomas Wouters]
> (MAL and I already discussed this in private mail: Robin did mean
> try/except/else, and 'finally' already executes when returning
> directly from the 'try' block, even in Python 1.5)
>
> This code:
>
> try:
>    return
> except:
>    pass
> else:
>    print "returning"
>
> will indeed not print 'returning', but I believe it's by design.
> I'm against changing it, in any case, and not just because it'd
> break code :) If you want something that always executes, use a
> 'finally'. Or don't return from the 'try', but return in the
> 'else' clause.

Guido's out of town again, so I'll channel him:  Thomas is correct on all
counts.  In try/else, the "else" clause should execute if and only if
control "falls off the end" of the "try" block.

IOW, consider:

    try:
        arbitrary stuff
    x = 1

An "else" clause added to that "try" should execute when and only when the
code as written executes the "x = 1" after the block.  When "arbitrary
stuff" == "return", control does not fall off the end, so "else" shouldn't
trigger.  Same thing if "arbitrary stuff" == "break" and we're inside a
loop, or "continue" after Thomas's patch gets accepted.

> The 'except' clause is documented to execute if a matching
> exception occurs, and 'else' if no exception occurs.

Yup, and that's imprecise:  the same words are used to describe (part of)
when 'finally' executes, but they weren't intended to be the same.

> Maybe the intent of the 'else' clause would be clearer if it
> was documented to 'execute if the try: clause finishes without
> an exception being raised' ?

Sorry, I don't find that any clearer.  Let's be explicit:

    The optional 'else' clause is executed when the 'try' clause
    terminates by any means other than an exception or executing a
    'return', 'continue' or 'break' statement.  Exceptions in the
    'else' clause are not handled by the preceding 'except' clauses.

> The 'else' clause isn't executed when you 'break' or (after
> applying my continue-in-try patch ;) 'continue' out of the
> 'try', either.

Hey, now you're channeling me <wink>!  Be afraid -- be very afraid.
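The rule Tim spells out can be checked with a small sketch (the function names are just for illustration):

```python
def fall_through():
    events = []
    try:
        x = 1            # control falls off the end of the try block
    except Exception:
        events.append("except")
    else:
        events.append("else")
    return events

def early_return():
    events = []
    try:
        return events    # leaves the try block, so 'else' must not run
    except Exception:
        events.append("except")
    else:
        events.append("else")

# 'else' triggers only when the try block completes normally
assert fall_through() == ["else"]
assert early_return() == []
```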




From moshez at zadka.site.co.il  Fri Dec 29 15:42:44 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 29 Dec 2000 16:42:44 +0200 (IST)
Subject: [Python-Dev] chomp()?
In-Reply-To: <3A4B7BB1.F09660ED@lemburg.com>
References: <3A4B7BB1.F09660ED@lemburg.com>, <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <20001229144244.D5AD0A84F@darjeeling.zadka.site.co.il>

On Thu, 28 Dec 2000, "M.-A. Lemburg" <mal at lemburg.com> wrote:

[about chomp]
> We already have .splitlines() which does the above (remove
> line breaks) not only for a single line, but for many lines at once.
> 
> Even better: .splitlines() also does the right thing for Unicode.

OK, I retract my earlier +1, and instead I move that this be added
to the FAQ. Where is the FAQ maintained nowadays? The grail link
doesn't work anymore.
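For reference, a quick sketch of the .splitlines() behavior MAL describes: it removes the line breaks and handles several line-end conventions, including Unicode line separators:

```python
text = "one\ntwo\r\nthree\r"

# line breaks are removed from the resulting parts
assert text.splitlines() == ["one", "two", "three"]

# and it does the right thing for Unicode line separators too
assert "a\u2028b".splitlines() == ["a", "b"]
```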

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From loewis at informatik.hu-berlin.de  Fri Dec 29 17:52:13 2000
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Fri, 29 Dec 2000 17:52:13 +0100 (MET)
Subject: [Python-Dev] Re: [Patch #103002] Fix for #116285: Properly raise UnicodeErrors
Message-ID: <200012291652.RAA20251@pandora.informatik.hu-berlin.de>

[resent since python.org ran out of disk space]
> My only problem with it is your copyright notice. AFAIK, patches to
> the Python core cannot contain copyright notices without proper
> license information. OTOH, I don't think that these minor changes
> really warrant adding a complete license paragraph.

I'd like to get an "official" clarification on this question. Is it
the case that patches containing copyright notices are only accepted
if they are accompanied with license information?

I agree that the changes are minor, I also believe that I hold the
copyright to the changes whether I attach a notice or not (at least
according to our local copyright law).

What concerns me is that without such a notice, gencodec.py looks as if
CNRI holds the copyright to it. I'm not willing to assign the
copyright of my changes to CNRI, and I'd like to avoid the impression
of doing so.

What is even more concerning is that CNRI also holds the copyright to
the generated files, even though they are derived from information
made available by the Unicode consortium!

Regards,
Martin



From tim.one at home.com  Fri Dec 29 20:56:36 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 29 Dec 2000 14:56:36 -0500
Subject: [Python-Dev] scp to sourceforge
In-Reply-To: <20001229000528.F1811@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEOBIFAA.tim.one@home.com>

[Thomas Wouters]
> And then they fixed it ! At least, for me, direct scp now works
> fine. (I should've tested that before posting my blah blah, sorry.)

I tried it immediately before posting my blah-blah yesterday, and it was
still hanging.

> Anybody else, like people using F-secure ssh (unix or windows)
> experience the same ?

Same here:  I tried it again just now (under Win98 cmdline ssh/scp) and it
worked fine!  We're in business again.  Thanks for fixing it, Thomas <wink>.

now-if-only-we-could-get-python-dev-email-on-an-approximation-to-the-
    same-day-it's-sent-ly y'rs  - tim




From tim.one at home.com  Fri Dec 29 21:27:40 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 29 Dec 2000 15:27:40 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <EC$An3AHZGT6EwJP@jessikat.fsnet.co.uk>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEODIFAA.tim.one@home.com>

[Robin Becker]
> The 2.0 docs clearly state 'The optional else clause is executed when no
> exception occurs in the try clause.' This makes it sound as though it
> gets executed on the 'way out'.

Of course.  That's not what the docs meant, though, and Guido is not going
to change the implementation now because that would break code that relies
on how Python has always *worked* in these cases.  The way Python works is
also the way Guido intended it to work (I'm allowed to channel him when he's
on vacation <0.9 wink>).

Indeed, that's why I suggested a specific doc change.  If your friend would
also be confused by that, then we still have a problem; else we don't.




From tim.one at home.com  Fri Dec 29 21:37:08 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 29 Dec 2000 15:37:08 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <web-404134@digicool.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEOFIFAA.tim.one@home.com>

[Fred]
>   This can certainly be clarified in the documentation --
> please file a bug report at http://sourceforge.net/projects/python/.

Here you go:

https://sourceforge.net/bugs/?func=detailbug&bug_id=127098&group_id=5470




From thomas at xs4all.net  Fri Dec 29 21:59:16 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 29 Dec 2000 21:59:16 +0100
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEODIFAA.tim.one@home.com>; from tim.one@home.com on Fri, Dec 29, 2000 at 03:27:40PM -0500
References: <EC$An3AHZGT6EwJP@jessikat.fsnet.co.uk> <LNBBLJKPBEHFEDALKOLCKEODIFAA.tim.one@home.com>
Message-ID: <20001229215915.L1281@xs4all.nl>

On Fri, Dec 29, 2000 at 03:27:40PM -0500, Tim Peters wrote:

> Indeed, that's why I suggested a specific doc change.  If your friend would
> also be confused by that, then we still have a problem; else we don't.

Note that I already uploaded a patch to fix the docs, assigned to fdrake,
using Tim's wording exactly. (patch #103045)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From moshez at zadka.site.co.il  Sun Dec 31 01:33:30 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Sun, 31 Dec 2000 02:33:30 +0200 (IST)
Subject: [Python-Dev] FAQ Horribly Out Of Date
Message-ID: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>

Hi!
The current FAQ is horribly out of date. I think the FAQ-Wizard method
has proven itself not very efficient (for example, apparently no one
noticed until now that it's not working <0.2 wink>). Is there any 
hope of putting the FAQ in Misc/, having a script which scp's it
to the SF page, and making that the official FAQ?

On a related note, what is the current status of the PSA? Is it officially
dead?
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tim.one at home.com  Sat Dec 30 21:48:08 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 30 Dec 2000 15:48:08 -0500
Subject: [Python-Dev] Most everything is busted
Message-ID: <LNBBLJKPBEHFEDALKOLCCEOMIFAA.tim.one@home.com>

Add this error to the pot:

"""
http://www.python.org/cgi-bin/moinmoin

Proxy Error
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /cgi-bin/moinmoin.

Reason: Document contains no data
-------------------------------------------------------------------

Apache/1.3.9 Server at www.python.org Port 80
"""

Also, as far as I can tell:

+ news->mail for c.l.py hasn't delivered anything for well over 24 hours.

+ No mail to Python-Dev has showed up in the archives (let alone been
delivered) since Fri, 29 Dec 2000 16:42:44 +0200 (IST).

+ The other Python mailing lists appear equally dead.

time-for-a-new-year!-ly y'rs  - tim




From barry at wooz.org  Sun Dec 31 02:06:23 2000
From: barry at wooz.org (Barry A. Warsaw)
Date: Sat, 30 Dec 2000 20:06:23 -0500
Subject: [Python-Dev] Re: Most everything is busted
References: <LNBBLJKPBEHFEDALKOLCCEOMIFAA.tim.one@home.com>
Message-ID: <14926.34447.60988.553140@anthem.concentric.net>

>>>>> "TP" == Tim Peters <tim.one at home.com> writes:

    TP> + news->mail for c.l.py hasn't delivered anything for well
    TP> over 24 hours.

    TP> + No mail to Python-Dev has showed up in the archives (let
    TP> alone been delivered) since Fri, 29 Dec 2000 16:42:44 +0200
    TP> (IST).

    TP> + The other Python mailing lists appear equally dead.

There's a stupid, stupid bug in Mailman 2.0, which I've just fixed and
(hopefully) unjammed things on the Mailman end[1].  We're still
probably subject to the Postfix delays unfortunately; I think those
are DNS related, and I've gotten a few other reports of DNS oddities,
which I've forwarded off to the DC sysadmins.  I don't think that
particular problem will be fixed until after the New Year.

relax-and-enjoy-the-quiet-ly y'rs,
-Barry

[1] For those who care: there's a resource throttle in qrunner which
limits the number of files any single qrunner process will handle.
qrunner does a listdir() on the qfiles directory and ignores any .msg
file it finds (it only does the bulk of the processing on the
corresponding .db files).  But it performs the throttle check on every
file in listdir() so depending on the order that listdir() returns and
the number of files in the qfiles directory, the throttle check might
get triggered before any .db file is seen.  Wedge city.  This is
serious enough to warrant a Mailman 2.0.1 release, probably mid-next
week.
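The wedge Barry describes can be sketched roughly like this (the names and the limit are made up; only the control flow matters):

```python
def buggy_qrunner(filenames, limit):
    """Process .db queue entries -- the bug: the throttle counts *every*
    directory entry, so skipped .msg files still count against the limit."""
    handled = []
    for i, name in enumerate(filenames):
        if i >= limit:          # throttle check happens before the skip
            break
        if not name.endswith(".db"):
            continue            # .msg companions are ignored
        handled.append(name)
    return handled

# 20 .msg files sort ahead of the lone .db file: nothing gets handled
queue = ["%04d.msg" % i for i in range(20)] + ["0000.db"]
assert buggy_qrunner(queue, limit=10) == []   # wedge city

def fixed_qrunner(filenames, limit):
    # counting only the files actually processed avoids the wedge
    handled = []
    for name in filenames:
        if len(handled) >= limit:
            break
        if name.endswith(".db"):
            handled.append(name)
    return handled

assert fixed_qrunner(queue, limit=10) == ["0000.db"]
```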




From gstein at lyra.org  Sun Dec 31 11:19:50 2000
From: gstein at lyra.org (Greg Stein)
Date: Sun, 31 Dec 2000 02:19:50 -0800
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Sun, Dec 31, 2000 at 02:33:30AM +0200
References: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
Message-ID: <20001231021950.M28628@lyra.org>

On Sun, Dec 31, 2000 at 02:33:30AM +0200, Moshe Zadka wrote:
>...
> On a related note, what is the current status of the PSA? Is it officially
> dead?

The PSA was always kind of a (legal) fiction with the basic intent to help
provide some funding for Python development. Since that isn't occurring at
CNRI any more, the PSA is a bit moot. There was always some idea that maybe
the PSA would be the "sponsor" (and possibly the beneficiary) of the
conferences. That wasn't ever really formalized either.


From akuchlin at cnri.reston.va.us  Sun Dec 31 16:58:12 2000
From: akuchlin at cnri.reston.va.us (Andrew Kuchling)
Date: Sun, 31 Dec 2000 10:58:12 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Sun, Dec 31, 2000 at 02:33:30AM +0200
References: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
Message-ID: <20001231105812.A12168@newcnri.cnri.reston.va.us>

On Sun, Dec 31, 2000 at 02:33:30AM +0200, Moshe Zadka wrote:
>The current FAQ is horribly out of date. I think the FAQ-Wizard method
>has proven itself not very efficient (for example, apparently no one
>noticed until now that it's not working <0.2 wink>). Is there any 

It also leads to one section of the FAQ (#3, I think) having something
like 60 questions jumbled together.  IMHO the FAQ should be a text
file, perhaps in the PEP format so it can be converted to HTML, and it
should have an editor who'll arrange it into smaller sections.  Any
volunteers?  (Must ... resist ...  urge to volunteer myself...  help
me, Spock...)

--amk





From skip at mojam.com  Sun Dec 31 20:25:18 2000
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 31 Dec 2000 13:25:18 -0600 (CST)
Subject: [Python-Dev] plz test bsddb using shared linkage
Message-ID: <14927.34846.153117.764547@beluga.mojam.com>

A bug was filed on SF contending that the default linkage for bsddb should
be shared instead of static because some Linux systems ship multiple
versions of libdb.

Would those of you who can and do build bsddb (probably only unixoids of
some variety) please give this simple test a try?  Uncomment the *shared*
line in Modules/Setup.config.in, re-run configure, build Python and then
try:

    import bsddb
    db = bsddb.btopen("/tmp/dbtest.db", "c")
    db["1"] = "1"
    print db["1"]
    db.close()
    del db

If this doesn't fail for anyone I'll check the change in and close the bug
report, otherwise I'll add a(nother) comment to the bug report that *shared*
breaks bsddb for others and close the bug report.

Thx,

Skip




From skip at mojam.com  Sun Dec 31 20:26:16 2000
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 31 Dec 2000 13:26:16 -0600 (CST)
Subject: [Python-Dev] plz test bsddb using shared linkage
Message-ID: <14927.34904.20832.319647@beluga.mojam.com>

oops, forgot the bug report is at

  http://sourceforge.net/bugs/?func=detailbug&bug_id=126564&group_id=5470

for those of you who do not monitor python-bugs-list.

S



From tim.one at home.com  Sun Dec 31 21:28:47 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 31 Dec 2000 15:28:47 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEBBIGAA.tim.one@home.com>

[Moshe Zadka]
> The current FAQ is horribly out of date.

The password is Spam.  Fix it <wink>.

> I think the FAQ-Wizard method has proven itself not very
> efficient (for example, apparently no one noticed until now
> that it's not working <0.2 wink>).

I'm afraid almost nothing on python.org with an active component works today
(not searches, not the FAQ Wizard, not the 2.0 Wiki, ...).  If history is
any clue, these will remain broken until Guido gets back from vacation.

> Is there any hope putting the FAQ in Misc/, having a script
> which scp's it to the SF page, and making that the official FAQ?

Would be OK by me.  I'm more concerned that the whole of python.org has
barely been updated since March; huge chunks of the FAQ are still relevant,
but, e.g., the Job Board hasn't been touched in over 3 months; the News got
so out of date Guido deleted the whole section; etc.

> On a related note, what is the current status of the PSA? Is it
> officially dead?

It appears that CNRI can only think about one thing at a time <0.5 wink>.
For the last 6 months, that thing has been the license.  If they ever
resolve the GPL compatibility issue, maybe they can be persuaded to think
about the PSA.  In the meantime, I'd suggest you not renew <ahem>.




From tim.one at home.com  Sun Dec 31 23:12:43 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 31 Dec 2000 17:12:43 -0500
Subject: [Python-Dev] plz test bsddb using shared linkage
In-Reply-To: <14927.34846.153117.764547@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGECEIGAA.tim.one@home.com>

[Skip Montanaro]
> ...
> Would those of you who can and do build bsddb (probably only
> unixoids of some variety) please give this simple test a try?

Just noting that bsddb already ships with the Windows installer as a
(shared) DLL.  But it's an old (1.85?) Windows port from Sam Rushing.




From gward at mems-exchange.org  Fri Dec  1 00:14:39 2000
From: gward at mems-exchange.org (Greg Ward)
Date: Thu, 30 Nov 2000 18:14:39 -0500
Subject: [Python-Dev] PEP 229 and 222
In-Reply-To: <20001128215748.A22105@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Tue, Nov 28, 2000 at 09:57:48PM -0500
References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us>
Message-ID: <20001130181438.A21596@ludwig.cnri.reston.va.us>

On 28 November 2000, Andrew Kuchling said:
> On Tue, Nov 28, 2000 at 06:01:38PM -0500, Guido van Rossum wrote:
> >- Always shared libs.  What about Unixish systems that don't have
> >  shared libs?  What if you just want something to be hardcoded as
> >  statically linked, e.g. for security reasons?  (On the other hand
> 
> Beats me.  I'm not even sure if the Distutils offers a way to compile
> a static Python binary.  (GPW: well, does it?)

It's in the CCompiler interface, but hasn't been exposed to the outside
world.  (IOW, it's mainly a question of designing the right setup
script/command line interface: the implementation should be fairly
straightforward, assuming the existing CCompiler classes do the right
thing for generating binary executables.)

        Greg



From gward at mems-exchange.org  Fri Dec  1 00:19:38 2000
From: gward at mems-exchange.org (Greg Ward)
Date: Thu, 30 Nov 2000 18:19:38 -0500
Subject: [Python-Dev] A house upon the sand
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEMDIBAA.tim.one@home.com>; from tim.one@home.com on Wed, Nov 29, 2000 at 01:23:10AM -0500
References: <200011281510.KAA03475@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEMDIBAA.tim.one@home.com>
Message-ID: <20001130181937.B21596@ludwig.cnri.reston.va.us>

On 29 November 2000, Tim Peters said:
> [Guido]
> > ...
> > Because of its importance, the deprecation time of the string module
> > will be longer than that of most deprecated modules.  I expect it
> > won't be removed until Python 3000.
> 
> I see nothing in the 2.0 docs, code, or "what's new" web pages saying that
> it's deprecated.  So I don't think you can even start the clock on this one
> before 2.1 (a fuzzy stmt on the web page for the unused 1.6 release doesn't
> count ...).

FWIW, I would argue against *ever* removing (much less "deprecating",
ie. threatening to remove) the string module.  To a rough approximation,
every piece of Python code in existence prior to Python 1.6 depends
on the string module.  I for one do not want to have to change all
occurrences of string.foo(x) to x.foo() -- it just doesn't buy enough to
make it worth changing all that code.  

Not only does the amount of code to change mean the change would be
non-trivial, it's not always the right thing, especially if you happen
to be one of the people who dislikes the "delim.join(list)" idiom.  (I'm
still undecided.)

        Greg



From fredrik at effbot.org  Fri Dec  1 07:39:57 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 1 Dec 2000 07:39:57 +0100
Subject: [Python-Dev] TypeError: foo, bar
Message-ID: <008f01c05b61$877263b0$3c6340d5@hagrid>

just stumbled upon yet another (high-profile) python newbie
confused by a "TypeError: read-only character buffer, dictionary"
message.

how about changing "read-only character buffer" to "string
or read-only character buffer", and the "foo, bar" format to
"expected foo, found bar", so we get:

    "TypeError: expected string or read-only character
    buffer, found dictionary"

</F>




From tim.one at home.com  Fri Dec  1 07:58:53 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 1 Dec 2000 01:58:53 -0500
Subject: [Python-Dev] TypeError: foo, bar
In-Reply-To: <008f01c05b61$877263b0$3c6340d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEAOICAA.tim.one@home.com>

[Fredrik Lundh]
> just stumbled upon yet another (high-profile) python newbie
> confused a "TypeError: read-only character buffer, dictionary"
> message.
>
> how about changing "read-only character buffer" to "string
> or read-only character buffer", and the "foo, bar" format to
> "expected foo, found bar", so we get:
> 
>     "TypeError: expected string or read-only character
>     buffer, found dictionary"

+0.  +1 if "found" is changed to "got".

"found"-implies-a-search-ly y'rs  - tim




From thomas.heller at ion-tof.com  Fri Dec  1 09:10:21 2000
From: thomas.heller at ion-tof.com (Thomas Heller)
Date: Fri, 1 Dec 2000 09:10:21 +0100
Subject: [Python-Dev] PEP 229 and 222
References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us> <20001130181438.A21596@ludwig.cnri.reston.va.us>
Message-ID: <014301c05b6e$269716a0$e000a8c0@thomasnotebook>

> > Beats me.  I'm not even sure if the Distutils offers a way to compile
> > a static Python binary.  (GPW: well, does it?)
> 
> It's in the CCompiler interface, but hasn't been exposed to the outside
> world.  (IOW, it's mainly a question of desiging the right setup
> script/command line interface: the implementation should be fairly
> straightforward, assuming the existing CCompiler classes do the right
> thing for generating binary executables.)
Distutils currently only supports build_*** commands for
C-libraries and Python extensions.

Shouldn't there also be build commands for shared libraries,
executable programs and static Python binaries?

Thomas

BTW: Distutils-sig seems pretty dead these days...





From ping at lfw.org  Fri Dec  1 11:23:56 2000
From: ping at lfw.org (Ka-Ping Yee)
Date: Fri, 1 Dec 2000 02:23:56 -0800 (PST)
Subject: [Python-Dev] Cryptic error messages
Message-ID: <Pine.LNX.4.10.10011181405020.504-100000@skuld.kingmanhall.org>

An attempt to use sockets for the first time yesterday left a
friend of mine bewildered:

    >>> import socket
    >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    >>> s.connect('localhost:234')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: 2-sequence, 13-sequence
    >>> 

"What the heck does '2-sequence, 13-sequence' mean?" he rightfully asked.


I see in getargs.c (line 275) that this type of message is documented:

    /* Convert a tuple argument.
    [...]
       If the argument is invalid:
    [...]
          *msgbuf contains an error message, whose format is:
             "<typename1>, <typename2>", where:
                <typename1> is the name of the expected type, and
                <typename2> is the name of the actual type,
             (so you can surround it by "expected ... found"),
          and msgbuf is returned.
    */

It's clear that the socketmodule is not prepending "expected" and
appending "found", as the author of converttuple intended.

But when i grepped through the source code, i couldn't find anyone
applying this "expected %s found" % msgbuf convention outside of
getargs.c.  Is it really in use?

Could we just change getargs.c so that converttuple() returns a
message like "expected ..., got ..." instead of seterror()?

Additionally it would be nice to say '13-character string' instead
of '13-sequence'...
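In Python terms, the message format Ping is asking for is simply the following (a trivial illustrative sketch; the actual change would live in converttuple()/seterror() in getargs.c, and the helper name here is made up):

```python
def format_type_error(expected, got):
    """Build the friendlier message proposed in this thread."""
    return "expected %s, got %s" % (expected, got)

print(format_type_error("2-sequence", "13-character string"))
# expected 2-sequence, got 13-character string
```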


-- ?!ng

"All models are wrong; some models are useful."
    -- George Box




From mwh21 at cam.ac.uk  Fri Dec  1 12:20:23 2000
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 01 Dec 2000 11:20:23 +0000
Subject: [Python-Dev] Cryptic error messages
In-Reply-To: Ka-Ping Yee's message of "Fri, 1 Dec 2000 02:23:56 -0800 (PST)"
References: <Pine.LNX.4.10.10011181405020.504-100000@skuld.kingmanhall.org>
Message-ID: <m37l5k5qx4.fsf@atrus.jesus.cam.ac.uk>

Ka-Ping Yee <ping at lfw.org> writes:

> An attempt to use sockets for the first time yesterday left a
> friend of mine bewildered:
> 
>     >>> import socket
>     >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>     >>> s.connect('localhost:234')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     TypeError: 2-sequence, 13-sequence
>     >>> 
> 
> "What the heck does '2-sequence, 13-sequence' mean?" he rightfully asked.
> 

I'm not sure about the general case, but in this case you could do
something like:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102599&group_id=5470

Now you get an error message like:

TypeError: getsockaddrarg: AF_INET address must be tuple, not string

Cheers,
M.

-- 
  I have gathered a posie of other men's flowers, and nothing but the
  thread that binds them is my own.                       -- Montaigne




From guido at python.org  Fri Dec  1 14:02:02 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 01 Dec 2000 08:02:02 -0500
Subject: [Python-Dev] TypeError: foo, bar
In-Reply-To: Your message of "Fri, 01 Dec 2000 07:39:57 +0100."
             <008f01c05b61$877263b0$3c6340d5@hagrid> 
References: <008f01c05b61$877263b0$3c6340d5@hagrid> 
Message-ID: <200012011302.IAA31609@cj20424-a.reston1.va.home.com>

> just stumbled upon yet another (high-profile) python newbie
> confused a "TypeError: read-only character buffer, dictionary"
> message.
> 
> how about changing "read-only character buffer" to "string
> or read-only character buffer", and the "foo, bar" format to
> "expected foo, found bar", so we get:
> 
>     "TypeError: expected string or read-only character
>     buffer, found dictionary"

The first was easy, and I've done it.  The second one, for some
reason, is hard.  I forget why.  Sorry.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From cgw at fnal.gov  Fri Dec  1 14:41:04 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Fri, 1 Dec 2000 07:41:04 -0600 (CST)
Subject: [Python-Dev] TypeError: foo, bar
In-Reply-To: <008f01c05b61$877263b0$3c6340d5@hagrid>
References: <008f01c05b61$877263b0$3c6340d5@hagrid>
Message-ID: <14887.43632.812342.414156@buffalo.fnal.gov>

Fredrik Lundh writes:

 > how about changing "read-only character buffer" to "string
 > or read-only character buffer", and the "foo, bar" format to
 > "expected foo, found bar", so we get:
 > 
 >     "TypeError: expected string or read-only character
 >     buffer, found dictionary"

+100.  Recently, I've been teaching Python to some beginners and they
find this message absolutely inscrutable.

Also agree with Tim about "found" vs. "got", but this is of secondary
importance.




From moshez at math.huji.ac.il  Fri Dec  1 15:26:03 2000
From: moshez at math.huji.ac.il (Moshe Zadka)
Date: Fri, 1 Dec 2000 16:26:03 +0200 (IST)
Subject: [Python-Dev] [OT] Change of Address
Message-ID: <Pine.GSO.4.10.10012011624170.1366-100000@sundial>

I'm sorry to bother you all with this, but from time to time you might
need to reach me by e-mail...
30 days from now, this e-mail address will no longer be valid.
Please use anything at zadka.site.co.il to reach me.

Thank you for your time.
--
Moshe Zadka <moshez at zadka.site.co.il> -- 95855124
http://advogato.org/person/moshez




From gward at mems-exchange.org  Fri Dec  1 16:14:53 2000
From: gward at mems-exchange.org (Greg Ward)
Date: Fri, 1 Dec 2000 10:14:53 -0500
Subject: [Python-Dev] PEP 229 and 222
In-Reply-To: <014301c05b6e$269716a0$e000a8c0@thomasnotebook>; from thomas.heller@ion-tof.com on Fri, Dec 01, 2000 at 09:10:21AM +0100
References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us> <20001130181438.A21596@ludwig.cnri.reston.va.us> <014301c05b6e$269716a0$e000a8c0@thomasnotebook>
Message-ID: <20001201101452.A26074@ludwig.cnri.reston.va.us>

On 01 December 2000, Thomas Heller said:
> Distutils currently only supports build_*** commands for
> C-libraries and Python extensions.
> 
> Shouldn't there also be build commands for shared libraries,
> executable programs and static Python binaries?

Andrew and I talked about this a bit yesterday, and the proposed
interface is as follows:

    python setup.py build_ext --static

will compile all extensions in the current module distribution, but
instead of creating a .so (.pyd) file for each one, will create a new
python binary in build/bin.<plat>.

Issue to be resolved: what to call the new python binary, especially
when installing it (presumably we *don't* want to clobber the stock
binary, but supplement it with (eg.) "foopython").

Note that there is no provision for selectively building some extensions
as shared.  This means that Andrew's Distutil-ization of the standard
library will have to override the build_ext command and have some extra
way to select extensions for shared/static.  Neither of us considered
this a problem.
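The shared/static split that such an overridden build_ext would need can be sketched roughly as follows (the Extension and command classes here are simplified stand-ins, not the real Distutils API; the per-extension `static` flag is hypothetical):

```python
# Sketch: partition extensions into shared (.so/.pyd) and static
# (linked into a new python binary) groups, as an overridden
# build_ext command might do.

class Extension:
    def __init__(self, name, sources, static=False):
        self.name = name
        self.sources = sources
        self.static = static          # hypothetical per-extension flag

class BuildExtSketch:
    def __init__(self, extensions, static_default=False):
        self.extensions = extensions
        self.static_default = static_default   # the --static switch

    def partition(self):
        """Split extensions into (shared, static) lists."""
        shared, static = [], []
        for ext in self.extensions:
            if ext.static or self.static_default:
                static.append(ext)    # to be linked into the binary
            else:
                shared.append(ext)    # built as a shared object, as usual
        return shared, static

cmd = BuildExtSketch(
    [Extension("_socket", ["socketmodule.c"]),
     Extension("posix", ["posixmodule.c"], static=True)])
shared, static = cmd.partition()
print([e.name for e in shared], [e.name for e in static])
# ['_socket'] ['posix']
```

The static list would then feed whatever linker step produces the supplemental binary in build/bin.&lt;plat&gt;.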

> BTW: Distutils-sig seems pretty dead these days...

Yeah, that's a combination of me playing on other things and python.net
email being dead for over a week.  I'll cc the sig on this and see if
this interface proposal gets anyone's attention.

        Greg



From jeremy at alum.mit.edu  Fri Dec  1 20:27:14 2000
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 1 Dec 2000 14:27:14 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
Message-ID: <14887.64402.88530.714821@bitdiddle.concentric.net>

There was recently some idle chatter in Guido's living room about
using a unit testing framework (like PyUnit) for the Python regression
test suite.  We're also writing tests for some DC projects, and need
to decide what framework to use.

Does anyone have opinions on test frameworks?  A quick web search
turned up PyUnit (pyunit.sourceforge.net) and a script by Tres Seaver
that implements xUnit-style unit tests.  Are there other tools
we should consider?

Is anyone else interested in migrating the current test suite to a new
framework?  I hope the new framework will allow us to improve the test
suite in a number of ways:

    - run an entire test suite to completion instead of stopping on
      the first failure

    - clearer reporting of what went wrong

    - better support for conditional tests, e.g. write a test for
      httplib that only runs if the network is up.  This is tied into
      better error reporting, since the current test suite could only
      report that httplib succeeded or failed.
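For comparison, all three wishes map directly onto xUnit-style tests; a minimal sketch using today's stdlib unittest (the network check and test names are made up for illustration):

```python
import io
import unittest

def network_is_up():
    # Stand-in check; a real test might attempt a loopback connection.
    return False

class HttplibTest(unittest.TestCase):
    @unittest.skipUnless(network_is_up(), "network is down")
    def test_fetch_over_network(self):
        self.fail("only reached when the network check succeeds")

    def test_always_runs(self):
        # A failure here would be reported with details, but would not
        # stop the rest of the suite from running.
        self.assertEqual(1 + 1, 2)

suite = unittest.TestLoader().loadTestsFromTestCase(HttplibTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.testsRun, len(result.skipped), len(result.failures))  # 2 1 0
```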

Jeremy



From fdrake at acm.org  Fri Dec  1 20:24:46 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 1 Dec 2000 14:24:46 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net>
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
Message-ID: <14887.64254.399477.935828@cj42289-a.reston1.va.home.com>

Jeremy Hylton writes:
 >     - better support for conditional tests, e.g. write a test for
 >       httplib that only runs if the network is up.  This is tied into
 >       better error reporting, since the current test suite could only
 >       report that httplib succeeded or failed.

  There is a TestSkipped exception that can be raised with an
explanation of why.  It's used in the largefile test (at least).  I
think it is documented in the README.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From akuchlin at mems-exchange.org  Fri Dec  1 20:58:27 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Fri, 1 Dec 2000 14:58:27 -0500
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 02:27:14PM -0500
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
Message-ID: <20001201145827.D16751@kronos.cnri.reston.va.us>

On Fri, Dec 01, 2000 at 02:27:14PM -0500, Jeremy Hylton wrote:
>There was recently some idle chatter in Guido's living room about
>using a unit testing framework (like PyUnit) for the Python regression
>test suite.  We're also writing tests for some DC projects, and need

Someone remembered my post of 23 Nov, I see...  The only other test
framework I know of is the unittest.py inside Quixote, written because
we thought PyUnit was kind of clunky.  Greg Ward, who primarily wrote
it, used more sneaky interpreter tricks to make the interface more
natural, though it still worked with Jython last time we checked (some
time ago).  No GUI, but it can optionally show the code coverage of a
test suite, too.

See http://x63.deja.com/=usenet/getdoc.xp?AN=683946404 for some notes
on using it.  Obviously I think the Quixote unittest.py is the best
choice for the stdlib.

--amk



From jeremy at alum.mit.edu  Fri Dec  1 21:55:28 2000
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 1 Dec 2000 15:55:28 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <20001201145827.D16751@kronos.cnri.reston.va.us>
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
	<20001201145827.D16751@kronos.cnri.reston.va.us>
Message-ID: <14888.4160.838336.537708@bitdiddle.concentric.net>

Is there any documentation for the Quixote unittest tool?  The Example
page is helpful, but it feels like there are some details that are not
explained.

Jeremy



From akuchlin at mems-exchange.org  Fri Dec  1 22:12:12 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Fri, 1 Dec 2000 16:12:12 -0500
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14888.4160.838336.537708@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 03:55:28PM -0500
References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net>
Message-ID: <20001201161212.A12372@kronos.cnri.reston.va.us>

On Fri, Dec 01, 2000 at 03:55:28PM -0500, Jeremy Hylton wrote:
>Is there any documentation for the Quixote unittest tool?  The Example
>page is helpful, but it feels like there are some details that are not
>explained.

I don't believe we've written docs at all for internal use.  What
details seem to be missing?

--amk




From jeremy at alum.mit.edu  Fri Dec  1 22:21:27 2000
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 1 Dec 2000 16:21:27 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <20001201161212.A12372@kronos.cnri.reston.va.us>
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
	<20001201145827.D16751@kronos.cnri.reston.va.us>
	<14888.4160.838336.537708@bitdiddle.concentric.net>
	<20001201161212.A12372@kronos.cnri.reston.va.us>
Message-ID: <14888.5719.844387.435471@bitdiddle.concentric.net>

>>>>> "AMK" == Andrew Kuchling <akuchlin at mems-exchange.org> writes:

  AMK> On Fri, Dec 01, 2000 at 03:55:28PM -0500, Jeremy Hylton wrote:
  >> Is there any documentation for the Quixote unittest tool?  The
  >> Example page is helpful, but it feels like there are some details
  >> that are not explained.

  AMK> I don't believe we've written docs at all for internal use.
  AMK> What details seem to be missing?

Details:

   - I assume setup/shutdown are equivalent to setUp/tearDown 
   - Is it possible to override constructor for TestScenario?
   - Is there something equivalent to PyUnit self.assert_
   - What does parse_args() do?
   - What does run_scenarios() do?
   - If I have multiple scenarios, how do I get them to run?

Jeremy




From akuchlin at mems-exchange.org  Fri Dec  1 22:34:30 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Fri, 1 Dec 2000 16:34:30 -0500
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14888.5719.844387.435471@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 04:21:27PM -0500
References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net> <20001201161212.A12372@kronos.cnri.reston.va.us> <14888.5719.844387.435471@bitdiddle.concentric.net>
Message-ID: <20001201163430.A12417@kronos.cnri.reston.va.us>

On Fri, Dec 01, 2000 at 04:21:27PM -0500, Jeremy Hylton wrote:
>   - I assume setup/shutdown are equivalent to setUp/tearDown 

Correct.

>   - Is it possible to override constructor for TestScenario?

Beats me; I see no reason why you couldn't, though.

>   - Is there something equivalent to PyUnit self.assert_

Probably test_bool(), I guess: self.test_bool('self.run.is_draft()')
asserts that self.run.is_draft() will return true.  Or does
self.assert_() do something more?

>   - What does parse_args() do?
>   - What does run_scenarios() do?
>   - If I have multiple scenarios, how do I get them to run?

These 3 questions are all related, really.  At the bottom of our test
scripts, we have the following stereotyped code:

if __name__ == "__main__":
    (scenarios, options) = parse_args()
    run_scenarios (scenarios, options)

parse_args() ensures consistent arguments to test scripts; -c measures
code coverage, -v is verbose, etc.  It also looks in the __main__
module and finds all subclasses of TestScenario, so you can do:  

python test_process_run.py  # Runs all N scenarios
python test_process_run.py ProcessRunTest # Runs all cases for 1 scenario
python test_process_run.py ProcessRunTest:check_access # Runs one test case
                                                       # in one scenario class
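The subclass discovery amk describes could be sketched like this (TestScenario and the discovery helper here are stand-ins, not the actual Quixote code):

```python
class TestScenario:
    """Stand-in base class for Quixote-style test scenarios."""

def find_scenarios(namespace):
    """Collect every TestScenario subclass defined in a module
    namespace (as parse_args() reportedly does for __main__)."""
    return [obj for obj in namespace.values()
            if isinstance(obj, type)
            and issubclass(obj, TestScenario)
            and obj is not TestScenario]

class ProcessRunTest(TestScenario):
    def check_access(self):
        return True

scenarios = find_scenarios(globals())
print([cls.__name__ for cls in scenarios])  # ['ProcessRunTest']
```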

--amk




From tim.one at home.com  Fri Dec  1 22:47:54 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 1 Dec 2000 16:47:54 -0500
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECFICAA.tim.one@home.com>

[Jeremy Hylton]
> There was recently some idle chatter in Guido's living room about
> using a unit testing framework (like PyUnit) for the Python regression
> test suite.  We're also writing tests for some DC projects, and need
> to decide what framework to use.
>
> Does anyone have opinions on test frameworks?  A quick web search
> turned up PyUnit (pyunit.sourceforge.net) and a script by Tres Seaver
> that allows implements xUnit-style unit tests.  Are there other tools
> we should consider?

My own doctest is loved by people other than just me <wink>, but is aimed at
ensuring that examples in docstrings work exactly as shown (which is why it
starts with "doc" instead of "test").

> Is anyone else interested in migrating the current test suite to a new
> framework?

Yes.

> I hope the new framework will allow us to improve the test
> suite in a number of ways:
>
>     - run an entire test suite to completion instead of stopping on
>       the first failure

doctest does that.

>     - clearer reporting of what went wrong

Ditto.

>     - better support for conditional tests, e.g. write a test for
>       httplib that only runs if the network is up.  This is tied into
>       better error reporting, since the current test suite could only
>       report that httplib succeeded or failed.

A doctest test is simply an interactive Python session pasted into a
docstring (or more than one session, and/or interspersed with prose).  If
you can write an example in the interactive shell, doctest will verify it
still works as advertised.  This allows for embedding unit tests into the
docs for each function, method and class.  Nothing about them "looks like"
an artificial test tacked on:  the examples in the docs *are* the test
cases.
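The kind of docstring session Tim describes, checked mechanically with the stdlib doctest machinery (the gcd function is made up for illustration):

```python
import doctest

def gcd(a, b):
    """Return the greatest common divisor of a and b.

    The examples below are documentation and test cases at once:

    >>> gcd(12, 8)
    4
    >>> gcd(7, 5)
    1
    """
    while b:
        a, b = b, a % b
    return a

# Run just this function's docstring examples and report the tally.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(gcd):
    runner.run(test)
print(runner.failures, runner.tries)  # 0 2
```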

I need to try the other frameworks.  I dare say doctest is ideal for
computational functions, where the intended input->output relationship can
be clearly explicated via examples.  It's useless for GUIs.  Usefulness
varies accordingly between those extremes (doctest is natural exactly to the
extent that a captured interactive session is helpful for documentation
purposes).

testing-ain't-easy-ly y'rs  - tim




From barry at digicool.com  Sat Dec  2 04:52:29 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Fri, 1 Dec 2000 22:52:29 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
Message-ID: <14888.29181.355023.669030@anthem.concentric.net>

I've just uploaded PEP 231, which describes a new hook in the instance
access mechanism, called __findattr__() after a similar mechanism that
exists in Jython (but is not exposed at the Python layer).

You can do all kinds of interesting things with __findattr__(),
including implement the __of__() protocol of ExtensionClass, and thus
implicit and explicit acquisitions, in pure Python.  You can also do
Java Bean-like interfaces and C++-like access control.  The PEP
contains sample implementations of all of these, although the latter
isn't as clean as I'd like, due to other restrictions in Python.

My hope is that __findattr__() would eliminate most, if not all, the
need for ExtensionClass, at least within the Zope and ZODB contexts.
I haven't tried to implement Persistent using it though.

Since it's a long PEP, I won't include it here.  You can read about it
at this URL

    http://python.sourceforge.net/peps/pep-0231.html

It includes a link to the patch implementing this feature on
SourceForge.

Enjoy,
-Barry



From moshez at math.huji.ac.il  Sat Dec  2 10:11:50 2000
From: moshez at math.huji.ac.il (Moshe Zadka)
Date: Sat, 2 Dec 2000 11:11:50 +0200 (IST)
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <14888.29181.355023.669030@anthem.concentric.net>
Message-ID: <Pine.GSO.4.10.10012021109320.1366-100000@sundial>

On Fri, 1 Dec 2000, Barry A. Warsaw wrote:

> I've just uploaded PEP 231, which describes a new hook in the instance
> access mechanism, called __findattr__() after a similar mechanism that
> exists in Jython (but is not exposed at the Python layer).

There's one thing that bothers me about this: what exactly is "the
call stack"? Let me clarify: what happens when you have threads.
Both machine-level threads and stackless threads confuse the issues
here, not to mention stackless continuations. Can you add a few
words to the PEP about dealing with those?




From mal at lemburg.com  Sat Dec  2 11:03:11 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 02 Dec 2000 11:03:11 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
Message-ID: <3A28C8DF.E430484F@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> I've just uploaded PEP 231, which describes a new hook in the instance
> access mechanism, called __findattr__() after a similar mechanism that
> exists in Jython (but is not exposed at the Python layer).
> 
> You can do all kinds of interesting things with __findattr__(),
> including implement the __of__() protocol of ExtensionClass, and thus
> implicit and explicit acquisitions, in pure Python.  You can also do
> Java Bean-like interfaces and C++-like access control.  The PEP
> contains sample implementations of all of these, although the latter
> isn't as clean as I'd like, due to other restrictions in Python.
> 
> My hope is that __findattr__() would eliminate most, if not all, the
> need for ExtensionClass, at least within the Zope and ZODB contexts.
> I haven't tried to implement Persistent using it though.

The PEP does define when and how __findattr__() is called,
but makes no statement about what it should do or return...

Here's a slightly different idea:

Given the name, I would expect it to go look for an attribute
and then return the attribute and its container (this
doesn't seem to be what you have in mind here, though).

An alternative approach given the semantics above would
then be to first try a __getattr__() lookup and revert
to __findattr__() in case this fails. I don't think there
is any need to overload __setattr__() in such a way, because
you cannot be sure which object actually gets the new attribute.

By exposing the functionality using a new builtin, findattr(),
this could be used for all the examples you give too.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From barry at digicool.com  Sat Dec  2 17:50:02 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 2 Dec 2000 11:50:02 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
	<3A28C8DF.E430484F@lemburg.com>
Message-ID: <14889.10298.621133.961677@anthem.concentric.net>

>>>>> "M" == M  <mal at lemburg.com> writes:

    M> The PEP does define when and how __findattr__() is called,
    M> but makes no statement about what it should do or return...

Good point.  I've clarified that in the PEP.

    M> Here's a slightly different idea:

    M> Given the name, I would expect it to go look for an attribute
    M> and then return the attribute and its container (this doesn't
    M> seem to be what you have in mind here, though).

No, because some applications won't need a wrapped object.  E.g. in
the Java bean example, it just returns the attribute (which is stored
with a slightly different name).

    M> An alternative approach given the semantics above would then be
    M> to first try a __getattr__() lookup and revert to
    M> __findattr__() in case this fails.

I don't think this is as useful.  What would that buy you that you
can't already do today?

The key concept here is that you want to give the class first crack to
interpose on every attribute access.  You want this hook to get called
before anybody else can get at, or set, your attributes.  That gives
you (the class) total control to implement whatever policy is useful.
    
    M> I don't think there is any need to overload __setattr__() in
    M> such a way, because you cannot be sure which object actually
    M> gets the new attribute.

    M> By exposing the functionality using a new builtin, findattr(),
    M> this could be used for all the examples you give too.

No, because then people couldn't use the object in the normal
dot-notational way.
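In present-day Python, the "first crack at every attribute access" idea can be approximated with __getattribute__ (a rough analogue only, not the __findattr__ mechanism of PEP 231; the class and attribute names are hypothetical):

```python
class Guarded:
    """Every dotted access goes through __getattribute__ first, giving
    the class control over its own access policy."""

    def __init__(self):
        object.__setattr__(self, "_x_private", 42)

    def __getattribute__(self, name):
        # C++-like access control: hide single-underscore names from
        # outside callers.
        if name.startswith("_") and not name.startswith("__"):
            raise AttributeError("private attribute: %s" % name)
        return object.__getattribute__(self, name)

    def get_private(self):
        # Internal access bypasses the policy deliberately.
        return object.__getattribute__(self, "_x_private")

g = Guarded()
print(g.get_private())  # 42
try:
    g._x_private
except AttributeError as exc:
    print(exc)          # private attribute: _x_private
```

Note that callers still use the normal dot notation; the policy lives entirely in the class, which is the property Barry is after.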

-Barry



From tismer at tismer.com  Sat Dec  2 17:27:33 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sat, 02 Dec 2000 18:27:33 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
Message-ID: <3A2922F5.C2E0D10@tismer.com>

Hi Barry,

"Barry A. Warsaw" wrote:
> 
> I've just uploaded PEP 231, which describes a new hook in the instance
> access mechanism, called __findattr__() after a similar mechanism that
> exists in Jython (but is not exposed at the Python layer).
> 
> You can do all kinds of interesting things with __findattr__(),
> including implement the __of__() protocol of ExtensionClass, and thus
> implicit and explicit acquisitions, in pure Python.  You can also do
> Java Bean-like interfaces and C++-like access control.  The PEP
> contains sample implementations of all of these, although the latter
> isn't as clean as I'd like, due to other restrictions in Python.
> 
> My hope is that __findattr__() would eliminate most, if not all, the
> need for ExtensionClass, at least within the Zope and ZODB contexts.
> I haven't tried to implement Persistent using it though.

I have been using ExtensionClass for quite a long time, and
I have to say that you indeed eliminate most of its need
through this super-elegant idea. Congratulations!

Besides acquisition and persistency interception,
wrapping plain C objects and giving them Class-like behavior
while retaining fast access to internal properties but being
able to override methods by Python methods was my other use
of ExtensionClass. I assume this is the other "20%" part you
mention, which is much harder to achieve?
But that part also looks easier to implement now, by the support
of the __findattr__ method.

> Since it's a long PEP, I won't include it here.  You can read about it
> at this URL
> 
>     http://python.sourceforge.net/peps/pep-0231.html

Great. I had to read it twice, but it was fun.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From tismer at tismer.com  Sat Dec  2 17:55:21 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sat, 02 Dec 2000 18:55:21 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <Pine.GSO.4.10.10012021109320.1366-100000@sundial>
Message-ID: <3A292979.60BB1731@tismer.com>


Moshe Zadka wrote:
> 
> On Fri, 1 Dec 2000, Barry A. Warsaw wrote:
> 
> > I've just uploaded PEP 231, which describes a new hook in the instance
> > access mechanism, called __findattr__() after a similar mechanism that
> > exists in Jython (but is not exposed at the Python layer).
> 
> There's one thing that bothers me about this: what exactly is "the
> call stack"? Let me clarify: what happens when you have threads.
> Either machine-level threads and stackless threads confuse the issues
> here, not to talk about stackless continuations. Can you add a few
> words to the PEP about dealing with those?

As far as I understood the patch (just skimmed), there is no
stack involved directly, but the instance increments and decrements
a variable infindattr.

+       if (v != NULL && !inst->infindaddr &&
+           (func = inst->in_class->cl_findattr))
+       {
+               PyObject *args, *res;
+               args = Py_BuildValue("(OOO)", inst, name, v);
+               if (args == NULL)
+                       return -1;
+               ++inst->infindaddr;
+               res = PyEval_CallObject(func, args);
+               --inst->infindaddr;

This is: The call modifies the instance's state, while calling
the findattr method.
You are right: I see a serious problem with this. It doesn't
even need continuations to get things messed up. Guido's
proposed coroutines, together with uThread-Switching, might
be able to enter the same instance twice with ease.

Barry, after second thought, I feel this can become
a problem in the future. This infindattr attribute
only works correctly if we are guaranteed to use
strict stack order of execution.
What you're *intending* to do is to tell the PyEval_CallObject
that it should not find the __findattr__ attribute. But
this should be done only for this call and all of its descendants,
but no *fresh* access from elsewhere.

The hard way to get out of this would be to stop scheduling
in that case. Maybe this is very cheap, but quite inelegant.

We have a quite peculiar system state here: A function call
acts like an escape, to make all subsequent calls behave
differently, until this call is finished.

Without blocking microthreads, a clean way to do this would be
a search up in the frame chain, if there is a running __findattr__
method of this object. Fairly expensive. Well, the problem
also exists with real threads, if they are allowed to switch
in such a context.

I fear it is necessary to either block this stuff until it is
ready, or to maintain some thread-wise structure for the
state of this object.

Ok, after thinking some more, I'll start an extra message
to Barry on this topic.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From tismer at tismer.com  Sat Dec  2 18:21:18 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sat, 02 Dec 2000 19:21:18 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
Message-ID: <3A292F8D.7C616449@tismer.com>


"Barry A. Warsaw" wrote:
> 
> I've just uploaded PEP 231, which describes a new hook in the instance
> access mechanism, called __findattr__() after a similar mechanism that
> exists in Jython (but is not exposed at the Python layer).

Ok, as I announced already, here some thoughts on __findattr__,
system state, and how it could work.

Looking at your patch, I realize that you are blocking __findattr__
for your whole instance, until this call ends.
This is not what you want to do, I guess: it effectively changes
the whole system state when threads are involved.
Also you cannot use __findattr__ on any other attribute during
this call.

What you most probably want is this:
__findattr__ should not be invoked again for this instance,
with this attribute name, for this "thread", until you are done.

The correct way to find out whether __findattr__ is active or
not would be to look upwards the frame chain and inspect it.
Moshe also asked about continuations: I think this would resolve
quite cleanly. However we jump around, the current chain of frames
dictates the semantics of __findattr__. It even applies to
Guido's tamed coroutines, given that an explicit switch were
allowed in the context of __findattr__.

In a sense, we get some kind of dynamic context here, since
we need to do a lookup for something in the dynamic call chain.
I guess this would be quite messy to implement, and inefficient.

Isn't there a way to accomplish the desired effect without
modifying the instance? In the context of __findattr__, *we*
know that we don't want to get a recursive call.
Let's assume __getattr__ and __setattr__ had yet another
optional parameter: infindattr, defaulting to 0.
We would then have to pass a positive value in this context,
which would tell object.c not to try to invoke __findattr__
again.
With explicit passing of state, no problems with threads
can occur. Readability might improve as well.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From moshez at math.huji.ac.il  Sun Dec  3 14:14:43 2000
From: moshez at math.huji.ac.il (Moshe Zadka)
Date: Sun, 3 Dec 2000 15:14:43 +0200 (IST)
Subject: [Python-Dev] Another Python Developer Missing
Message-ID: <Pine.GSO.4.10.10012031512430.7826-100000@sundial>

Gordon McMillan is not a possible assignee in the assign_to field.


--
Moshe Zadka <moshez at zadka.site.co.il> -- 95855124
http://moshez.org




From tim.one at home.com  Sun Dec  3 18:35:36 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 3 Dec 2000 12:35:36 -0500
Subject: [Python-Dev] Another Python Developer Missing
In-Reply-To: <Pine.GSO.4.10.10012031512430.7826-100000@sundial>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEDOICAA.tim.one@home.com>

[Moshe Zadka]
> Gordon McMillan is not a possible assignee in the assign_to field.

We almost never add people as Python developers unless they ask for that,
since it comes with responsibility as well as riches beyond the dreams of
avarice.  If Gordon would like to apply, we won't charge him any interest
until 2001 <wink>.




From mal at lemburg.com  Sun Dec  3 20:21:11 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 03 Dec 2000 20:21:11 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib urllib.py,1.107,1.108
References: <200012031830.KAA30620@slayer.i.sourceforge.net>
Message-ID: <3A2A9D27.AF43D665@lemburg.com>

"Martin v. Löwis" wrote:
> 
> Update of /cvsroot/python/python/dist/src/Lib
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv30506
> 
> Modified Files:
>         urllib.py
> Log Message:
> Convert Unicode strings to byte strings before passing them into specific
> protocols. Closes bug #119822.
> 
> ...
> +
> + def toBytes(url):
> +     """toBytes(u"URL") --> 'URL'."""
> +     # Most URL schemes require ASCII. If that changes, the conversion
> +     # can be relaxed
> +     if type(url) is types.UnicodeType:
> +         try:
> +             url = url.encode("ASCII")

You should make this: 'ascii' -- encoding names are lower case 
per convention (and the implementation has a short-cut to speed up
conversion to 'ascii' -- not for 'ASCII').

> +         except UnicodeError:
> +             raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters")

Would it be better to use a simple ValueError here? (UnicodeError
is a subclass of ValueError, but the error doesn't really have anything to
do with Unicode conversions...)

> +     return url
> 
>   def unwrap(url):
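For reference, the quoted helper does roughly the following (a Python 3-flavored sketch with the lower-case 'ascii' spelling applied; the function name is adapted and not the stdlib's):

```python
def to_bytes(url):
    """Return url as a byte string, rejecting non-ASCII input."""
    if isinstance(url, str):
        try:
            # lower-case codec name, per the codec-registry convention
            return url.encode("ascii")
        except UnicodeError:
            raise UnicodeError(
                "URL " + repr(url) + " contains non-ASCII characters")
    return url
```

Byte-string input passes through untouched; only Unicode strings are converted, and any non-ASCII character triggers the UnicodeError under discussion.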


-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tismer at tismer.com  Sun Dec  3 21:01:07 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 03 Dec 2000 22:01:07 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib filecmp.py,1.6,1.7
References: <200012032048.MAA10353@slayer.i.sourceforge.net>
Message-ID: <3A2AA683.3840AA8A@tismer.com>


Moshe Zadka wrote:
> 
> Update of /cvsroot/python/python/dist/src/Lib
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv9465
> 
> Modified Files:
>         filecmp.py
> Log Message:
> Call of _cmp had wrong number of parameters.
> Fixed definition of _cmp.

...

> !         return not abs(cmp(a, b, sh, st))
>       except os.error:
>           return 2

Ugh! Wouldn't that be a fine chance to rename the cmp
function in this module? Overriding a built-in
is really not nice to have in a library.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From moshez at math.huji.ac.il  Sun Dec  3 22:01:07 2000
From: moshez at math.huji.ac.il (Moshe Zadka)
Date: Sun, 3 Dec 2000 23:01:07 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 filecmp.py,1.6,1.7
In-Reply-To: <3A2AA683.3840AA8A@tismer.com>
Message-ID: <Pine.GSO.4.10.10012032300100.7608-100000@sundial>

On Sun, 3 Dec 2000, Christian Tismer wrote:

> Ugh! Wouldn't that be a fine chance to rename the cmp
> function in this module? Overriding a built-in
> is really not nice to have in a library.

The fine chance was when we moved cmp.py->filecmp.py. 
Now it would just break backwards compatibility.
--
Moshe Zadka <moshez at zadka.site.co.il> -- 95855124
http://moshez.org




From tismer at tismer.com  Sun Dec  3 21:12:15 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 03 Dec 2000 22:12:15 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: 
 python/dist/src/Libfilecmp.py,1.6,1.7
References: <Pine.GSO.4.10.10012032300100.7608-100000@sundial>
Message-ID: <3A2AA91F.843E2BAE@tismer.com>


Moshe Zadka wrote:
> 
> On Sun, 3 Dec 2000, Christian Tismer wrote:
> 
> > Ugh! Wouldn't that be a fine chance to rename the cmp
> > function in this module? Overriding a built-in
> > is really not nice to have in a library.
> 
> The fine chance was when we moved cmp.py->filecmp.py.
> Now it would just break backwards compatibility.

Yes, I see. cmp belongs to the module's interface.
Maybe it could be renamed anyway, and be assigned
to cmp at the very end of the file, but not using
cmp anywhere in the code. My first reaction on reading
the patch was "yuck!" since I didn't know this module.
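The rename-then-alias idea can be sketched like this (the internal name is invented here, and the body is a stand-in; the real filecmp.cmp compares stat signatures and file contents):

```python
def _compare_files(a, b, shallow=True):
    # stand-in body for illustration; the point is the naming pattern,
    # not the comparison logic
    return a == b

# Keep the historical public name as a plain alias at the very end of
# the module, so the implementation itself never shadows the builtin.
cmp = _compare_files
```

Callers keep using the module's documented `cmp` name, while the module's own code only ever refers to `_compare_files`.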

python-dev/null - ly y'rs - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From martin at loewis.home.cs.tu-berlin.de  Sun Dec  3 22:56:44 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 3 Dec 2000 22:56:44 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
Message-ID: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>

> Isn't there a way to accomplish the desired effect without modifying
> the instance? In the context of __findattr__, *we* know that we
> don't want to get a recursive call.  Let's assume __getattr__ and
> __setattr__ had yet another optional parameter: infindattr,
> defaulting to 0.  We would then have to pass a positive value in
> this context, which would tell object.c not to try to invoke
> __findattr__ again.

Who is "we" here? The Python code implementing __findattr__? How would
it pass a value to __setattr__? It doesn't call __setattr__, instead
it has "self.__myfoo = x"...

I agree that the current implementation is not thread-safe. To solve
that, you'd need to associate with each instance not a single
"infindattr" attribute, but a whole set of them - one per "thread of
execution" (which would be a thread-id in most threading systems). Of
course, that would need some cooperation from any thread scheme
(including uthreads), which would need to provide an identification
for a "calling context".
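That bookkeeping can be sketched as follows, assuming the threading module's thread ids serve as the "calling context" identification (illustrative Python, not the C implementation):

```python
import threading

class PerThreadGuard:
    """Guard hook re-entry per (instance, thread), not per instance."""

    def __init__(self):
        # ids of threads currently inside the hook for this instance
        object.__setattr__(self, "_active", set())

    def __setattr__(self, name, value):
        tid = threading.get_ident()
        if tid not in self._active:
            self._active.add(tid)
            try:
                value = ("hooked", value)   # stand-in for the hook body
            finally:
                self._active.discard(tid)
        object.__setattr__(self, name, value)

p = PerThreadGuard()
p.x = 1
```

Another thread storing to `p` at the same moment still runs the hook, since only its own id would block it; a uthread or coroutine scheme would still need to supply a comparable context id.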

Regards,
Martin



From martin at loewis.home.cs.tu-berlin.de  Sun Dec  3 23:07:17 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 3 Dec 2000 23:07:17 +0100
Subject: [Python-Dev] Re: CVS: python/dist/src/Lib urllib.py,1.107,1.108
Message-ID: <200012032207.XAA03394@loewis.home.cs.tu-berlin.de>

> You should make this: 'ascii' -- encoding names are lower case per
> convention (and the implementation has a short-cut to speed up
> conversion to 'ascii' -- not for 'ASCII').

With conventions, it is a difficult story. I'm pretty certain that
users typically see that particular American standard as ASCII (to the
extent of calling it "a s c two"), not ascii.

As for speed - feel free to change the code if you think it matters.

> +             raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters")

> Would it be better to use a simple ValueError here ? (UnicodeError
> is a subclass of ValueError, but the error doesn't really have
> something to do with Unicode conversions...)

Why does it not have to do with Unicode conversion? A conversion from
Unicode to ASCII was attempted, and failed.

I guess I would be more open to suggested changes if you had put them
into the patch manager at the time you've reviewed the patch...

Regards,
Martin



From tismer at tismer.com  Sun Dec  3 22:38:11 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 03 Dec 2000 23:38:11 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
Message-ID: <3A2ABD43.AB56BD60@tismer.com>


"Martin v. Loewis" wrote:
> 
> > Isn't there a way to accomplish the desired effect without modifying
> > the instance? In the context of __findattr__, *we* know that we
> > don't want to get a recursive call.  Let's assume __getattr__ and
> > __setattr__ had yet another optional parameter: infindattr,
> > defaulting to 0.  We would then have to pass a positive value in
> > this context, which would tell object.c not to try to invoke
> > __findattr__ again.
> 
> Who is "we" here? The Python code implementing __findattr__? How would
> it pass a value to __setattr__? It doesn't call __setattr__, instead
> it has "self.__myfoo = x"...

Ouch - right! Sorry :)

> I agree that the current implementation is not thread-safe. To solve
> that, you'd need to associate with each instance not a single
> "infindattr" attribute, but a whole set of them - one per "thread of
> execution" (which would be a thread-id in most threading systems). Of
> course, that would need some cooperation from any thread scheme
> (including uthreads), which would need to provide an identification
> for a "calling context".

Right, that is one possible way to do it. I also thought about
some alternatives, but they all sound too complicated to be
justified. Also, I don't think this is only thread-related,
since a mess can happen even with an explicit coroutine jump.
Furthermore, how do we deal with multiple attribute names?
The function misbehaves if __findattr__ tries to inspect
another attribute.

IMO, the state of the current interpreter changes here
(or should do so), and this changed state needs to be carried
down with all subsequent function calls.

confused - ly chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From mal at lemburg.com  Sun Dec  3 23:51:10 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 03 Dec 2000 23:51:10 +0100
Subject: [Python-Dev] Re: CVS: python/dist/src/Lib urllib.py,1.107,1.108
References: <200012032207.XAA03394@loewis.home.cs.tu-berlin.de>
Message-ID: <3A2ACE5E.A9F860A8@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > You should make this: 'ascii' -- encoding names are lower case per
> > convention (and the implementation has a short-cut to speed up
> > conversion to 'ascii' -- not for 'ASCII').
> 
> With conventions, it is a difficult story. I'm pretty certain that
> users typically see that particular American standard as ASCII (to the
> extent of calling it "a s c two"), not ascii.

It's a convention in the codec registry design and used as such
in the Unicode implementation.
 
> As for speed - feel free to change the code if you think it matters.

Hey... this was just a suggestion. I thought that you didn't
know of the internal short-cut and wanted to hint at it.
 
> > +             raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters")
> 
> > Would it be better to use a simple ValueError here ? (UnicodeError
> > is a subclass of ValueError, but the error doesn't really have
> > something to do with Unicode conversions...)
> 
> Why does it not have to do with Unicode conversion? A conversion from
> Unicode to ASCII was attempted, and failed.

Sure, but the fact that URLs have to be ASCII is not something
that is enforced by the Unicode implementation.
 
> I guess I would be more open to suggested changes if you had put them
> into the patch manager at the time you've reviewed the patch...

I didn't review the patch, only the summary...

Don't have much time to look into these things closely right now, so
all I can do is comment.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From barry at scottb.demon.co.uk  Mon Dec  4 01:55:32 2000
From: barry at scottb.demon.co.uk (Barry Scott)
Date: Mon, 4 Dec 2000 00:55:32 -0000
Subject: [Python-Dev] A house upon the sand
In-Reply-To: <20001130181937.B21596@ludwig.cnri.reston.va.us>
Message-ID: <000201c05d8c$e7a15b10$060210ac@private>

I fully support Greg Ward's view. If string were removed I'd not
update the old code but add in my own string module.

Given the effort you guys went to to keep the C extension protocol the
same (in the context of crashing on importing a 1.5 DLL into 2.0), I'm
amazed you think that string could be removed...

Could you split the lib into blessed and backward compatibility sections?
Then by some suitable mechanism I can choose the compatibility I need?

Oh and as for join obviously a method of a list...

	['thats','better'].join(' ')

		Barry




From fredrik at pythonware.com  Mon Dec  4 11:37:18 2000
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 4 Dec 2000 11:37:18 +0100
Subject: [Python-Dev] unit testing and Python regression test
References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us>
Message-ID: <00e701c05dde$2d77c240$0900a8c0@SPIFF>

andrew kuchling wrote:
> Someone remembered my post of 23 Nov, I see...  The only other test
> framework I know of is the unittest.py inside Quixote, written because
> we thought PyUnit was kind of clunky.

the pythonware teams agree -- we've been using an internal
reimplementation of Kent Beck's original Smalltalk work, but
we're switching to unittest.py.

> Obviously I think the Quixote unittest.py is the best choice for the stdlib.

+1 from here.

</F>




From mal at lemburg.com  Mon Dec  4 12:14:20 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 04 Dec 2000 12:14:20 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
References: <14888.29181.355023.669030@anthem.concentric.net>
		<3A28C8DF.E430484F@lemburg.com> <14889.10298.621133.961677@anthem.concentric.net>
Message-ID: <3A2B7C8C.D6B889EE@lemburg.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "M" == M  <mal at lemburg.com> writes:
> 
>     M> The PEP does define when and how __findattr__() is called,
>     M> but makes no statement about what it should do or return...
> 
> Good point.  I've clarified that in the PEP.
> 
>     M> Here's a slightly different idea:
> 
>     M> Given the name, I would expect it to go look for an attribute
>     M> and then return the attribute and its container (this doesn't
>     M> seem to be what you have in mind here, though).
> 
> No, because some applications won't need a wrapped object.  E.g. in
> the Java bean example, it just returns the attribute (which is stored
> with a slightly different name).

I was thinking of a standardised helper which could then be
used for all kinds of attribute retrieval techniques. Acquisition
would be easy to do, access control too. In most cases __findattr__
would simply return (self, self.attrname).
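A rough sketch of that helper shape (the `find_attr` name and the `parent` link are invented here for illustration): it walks a chain of containers and returns both the value and the container it was found on, which is exactly what an acquisition scheme needs.

```python
def find_attr(obj, name):
    """Return (container, value) for the first container defining name."""
    node = obj
    while node is not None:
        d = getattr(node, "__dict__", {})
        if name in d:
            return node, d[name]
        node = getattr(node, "parent", None)   # acquisition-style chain
    raise AttributeError(name)

class Node:
    def __init__(self, parent=None, **attrs):
        self.parent = parent
        self.__dict__.update(attrs)

root = Node(color="red")
leaf = Node(parent=root)
```

`find_attr(leaf, "color")` acquires the attribute from `root` and reports `root` as the container; in the common case the container is simply the object itself.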
 
>     M> An alternative approach given the semantics above would then be
>     M> to first try a __getattr__() lookup and revert to
>     M> __findattr__() in case this fails.
> 
> I don't think this is as useful.  What would that buy you that you
> can't already do today?

Forget that idea... *always* calling __findattr__ is the more
useful way, just like you intended.
 
> The key concept here is that you want to give the class first crack to
> interpose on every attribute access.  You want this hook to get called
> before anybody else can get at, or set, your attributes.  That gives
> you (the class) total control to implement whatever policy is useful.

Right.
 
>     M> I don't think there is any need to overload __setattr__() in
>     M> such a way, because you cannot be sure which object actually
>     M> gets the new attribute.
> 
>     M> By exposing the functionality using a new builtin, findattr(),
>     M> this could be used for all the examples you give too.
> 
> No, because then people couldn't use the object in the normal
> dot-notational way.

Uhm, why not ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From gvwilson at nevex.com  Mon Dec  4 15:40:58 2000
From: gvwilson at nevex.com (Greg Wilson)
Date: Mon, 4 Dec 2000 09:40:58 -0500
Subject: [Python-Dev] Q: Python standard library re-org plans/schedule?
In-Reply-To: <20001201145827.D16751@kronos.cnri.reston.va.us>
Message-ID: <NEBBIACCCGNFLECLCLMHCEADCBAA.gvwilson@nevex.com>

Hi, everyone.  A potential customer has asked whether there are any
plans to re-organize and rationalize the Python standard library.
If there are any firm plans, and a schedule (however tentative),
I'd be grateful for a pointer.

Thanks,
Greg



From barry at digicool.com  Mon Dec  4 16:13:23 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 4 Dec 2000 10:13:23 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
Message-ID: <14891.46227.785856.307437@anthem.concentric.net>

>>>>> "MvL" == Martin v Loewis <martin at loewis.home.cs.tu-berlin.de> writes:

    MvL> I agree that the current implementation is not
    MvL> thread-safe. To solve that, you'd need to associate with each
    MvL> instance not a single "infindattr" attribute, but a whole set
    MvL> of them - one per "thread of execution" (which would be a
    MvL> thread-id in most threading systems). Of course, that would
    MvL> need some cooperation from any thread scheme (including
    MvL> uthreads), which would need to provide an identification for
    MvL> a "calling context".

I'm still catching up on several hundred emails over the weekend.  I
had a sneaking suspicion that infindattr wasn't thread-safe, so I'm
convinced this is a bug in the implementation.  One approach might be
to store the info in the thread state object (isn't that how the
recursive repr stop flag is stored?)  That would also save having to
allocate an extra int for every instance (yuck) but might impose a bit
more of a performance overhead.
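In today's terms, the thread-state idea amounts to keeping the flag in per-thread storage rather than on the instance, e.g. (a sketch only, not the actual C implementation):

```python
import threading

# stands in for the per-thread state the interpreter already carries
_tstate = threading.local()

class Hooked:
    def __setattr__(self, name, value):
        if not getattr(_tstate, "in_findattr", False):
            _tstate.in_findattr = True
            try:
                value = ("hooked", value)   # stand-in for the hook body
            finally:
                _tstate.in_findattr = False
        object.__setattr__(self, name, value)

h = Hooked()
h.x = 1
```

No extra int per instance, and each thread sees its own flag, at the cost of a per-access lookup in the thread-local storage.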

I'll work more on this later today.
-Barry



From jeremy at alum.mit.edu  Mon Dec  4 16:23:10 2000
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 4 Dec 2000 10:23:10 -0500 (EST)
Subject: [Python-Dev] unit testing and Python regression test
In-Reply-To: <00e701c05dde$2d77c240$0900a8c0@SPIFF>
References: <14887.64402.88530.714821@bitdiddle.concentric.net>
	<20001201145827.D16751@kronos.cnri.reston.va.us>
	<00e701c05dde$2d77c240$0900a8c0@SPIFF>
Message-ID: <14891.46814.359333.76720@bitdiddle.concentric.net>

>>>>> "FL" == Fredrik Lundh <fredrik at pythonware.com> writes:

  FL> andrew kuchling wrote:
  >> Someone remembered my post of 23 Nov, I see...  The only other
  >> test framework I know of is the unittest.py inside Quixote,
  >> written because we thought PyUnit was kind of clunky.

  FL> the pythonware teams agree -- we've been using an internal
  FL> reimplementation of Kent Beck's original Smalltalk work, but
  FL> we're switching to unittest.py.

Can you provide any specifics about what you like about unittest.py
(perhaps as opposed to PyUnit)?

Jeremy



From guido at python.org  Mon Dec  4 16:20:11 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 04 Dec 2000 10:20:11 -0500
Subject: [Python-Dev] Q: Python standard library re-org plans/schedule?
In-Reply-To: Your message of "Mon, 04 Dec 2000 09:40:58 EST."
             <NEBBIACCCGNFLECLCLMHCEADCBAA.gvwilson@nevex.com> 
References: <NEBBIACCCGNFLECLCLMHCEADCBAA.gvwilson@nevex.com> 
Message-ID: <200012041520.KAA20979@cj20424-a.reston1.va.home.com>

> Hi, everyone.  A potential customer has asked whether there are any
> plans to re-organize and rationalize the Python standard library.
> If there are any firm plans, and a schedule (however tentative),
> I'd be grateful for a pointer.

Alas, none that I know of except the ineffable Python 3000
schedule. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Mon Dec  4 16:46:53 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 4 Dec 2000 10:46:53 -0500
Subject: [Python-Dev] Quixote unit testing docs (Was: unit testing)
In-Reply-To: <14891.46814.359333.76720@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Mon, Dec 04, 2000 at 10:23:10AM -0500
References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <00e701c05dde$2d77c240$0900a8c0@SPIFF> <14891.46814.359333.76720@bitdiddle.concentric.net>
Message-ID: <20001204104653.A19387@kronos.cnri.reston.va.us>

Prodded by Jeremy, I went and actually wrote some documentation for
the Quixote unittest.py; please see 
<URL:http://www.amk.ca/python/unittest.html>.

The HTML is from a manually hacked Library Reference, so ignore the
broken image links and other formatting goofyness.  In case anyone
needs it, the LaTeX is in /files/python/.  The plain text version
comes out to around 290 lines; I can post it to this list if that's
desired.

--amk




From pf at artcom-gmbh.de  Mon Dec  4 18:59:54 2000
From: pf at artcom-gmbh.de (Peter Funk)
Date: Mon, 4 Dec 2000 18:59:54 +0100 (MET)
Subject: Tim Peter's doctest compared to Quixote unit testing (was Re: [Python-Dev] Quixote unit testing docs)
In-Reply-To: <20001204104653.A19387@kronos.cnri.reston.va.us> from Andrew Kuchling at "Dec 4, 2000 10:46:53 am"
Message-ID: <m142zu6-000Dm8C@artcom0.artcom-gmbh.de>

Hi all,

Andrew Kuchling:
> ... I ... actually wrote some documentation for
> the Quixote unittest.py; please see 
> <URL:http://www.amk.ca/python/unittest.html>.
[...]
> comes out to around 290 lines; I can post it to this list if that's
> desired.

After reading Andrew's docs, I think Quixote basically offers
three additional features compared with Tim Peters' 'doctest':
 1. integration of Skip Montanaro's code coverage analysis.
 2. the idea of Scenario objects, useful to share the setup needed to
    test related functions or methods of a class (same start condition).
 3. some useful functions to check whether the result returned
    by some test fulfills certain properties, without having to be
    as explicit as a cut'n'paste from an interactive interpreter
    session would have been.

As I've pointed out before in private mail to Jeremy, I've used Tim
Peters' 'doctest.py' to accomplish all testing of Python apps in our
company.

In doctest, each doc string is an independent unit, which starts fresh.
Sometimes this leads to duplicated setup code, which is needed
to test each method of a set of related methods from a class.
This is distracting if you intend the test cases to play their
double role of at the same time being useful documentation examples
for the intended use of the provided API.

Tim_one: Do you read this?  What do you think about the idea of adding
something like the following two functions to 'doctest':
use_module_scenario() -- imports all objects created and preserved during
    execution of the module doc string examples.
use_class_scenario() -- imports all objects created and preserved during 
    the execution of doc string examples of a class.  Only allowed in doc
    string examples of methods.  

This would make it easy to provide the same setup scenario
for a group of related test cases.

As far as I understand, doctest handles test shutdown automatically,
iff the doc string test examples leave no persistent resources behind.
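The duplication is easy to see in a runnable toy (plain doctest, not Quixote; the Account class is invented for illustration): each method's docstring has to rebuild the same object before it can show anything.

```python
import doctest

class Account:
    def __init__(self, balance):
        self.balance = balance

    def deposit(self, n):
        """
        >>> a = Account(100)     # same setup line, repeated per method
        >>> a.deposit(10)
        110
        """
        self.balance += n
        return self.balance

    def withdraw(self, n):
        """
        >>> a = Account(100)     # ...and again here
        >>> a.withdraw(30)
        70
        """
        self.balance -= n
        return self.balance

# run the docstring examples programmatically
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
failed = sum(runner.run(t).failed
             for t in finder.find(Account, "Account", module=False,
                                  globs={"Account": Account}))
```

A use_class_scenario()-style hook, as proposed above, would let both method examples inherit one shared `a = Account(100)` setup instead.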

Regards, Peter



From moshez at zadka.site.co.il  Tue Dec  5 04:31:18 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue, 05 Dec 2000 05:31:18 +0200
Subject: [Python-Dev] PEP 231, __findattr__() 
In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw) 
   of "Mon, 04 Dec 2000 10:13:23 EST." <14891.46227.785856.307437@anthem.concentric.net> 
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>  <14891.46227.785856.307437@anthem.concentric.net> 
Message-ID: <20001205033118.9135CA817@darjeeling.zadka.site.co.il>

> I'm still catching up on several hundred emails over the weekend.  I
> had a sneaking suspicion that infindattr wasn't thread-safe, so I'm
> convinced this is a bug in the implementation.  One approach might be
> to store the info in the thread state object

I don't think this is a good idea -- continuations and coroutines might
mess it up. Maybe the right thing is to mess with the *compilation* of
__findattr__ so that it would call __setattr__ and __getattr__ with
special flags that stop them from calling __findattr__? This is 
ugly, but I can't think of a better way.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tismer at tismer.com  Mon Dec  4 19:35:19 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 04 Dec 2000 20:35:19 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>  <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il>
Message-ID: <3A2BE3E7.60A8E220@tismer.com>


Moshe Zadka wrote:
> 
> > I'm still catching up on several hundred emails over the weekend.  I
> > had a sneaking suspicion that infindattr wasn't thread-safe, so I'm
> > convinced this is a bug in the implementation.  One approach might be
> > to store the info in the thread state object
> 
> I don't think this is a good idea -- continuations and coroutines might
> mess it up. Maybe the right thing is to mess with the *compilation* of
> __findattr__ so that it would call __setattr__ and __getattr__ with
> special flags that stop them from calling __findattr__? This is
> ugly, but I can't think of a better way.

Yeah, this is what I tried to say by "different machine state";
compiling different behavior in the case of a special method
is an interesting idea. It is limited somewhat, since the
changed system state is not inherited by called functions.
But if __findattr__ performs its one, single task in its
body alone, we are fine.

still-thinking-of-alternatives - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From tismer at tismer.com  Mon Dec  4 19:52:43 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 04 Dec 2000 20:52:43 +0200
Subject: [Python-Dev] A house upon the sand
References: <000201c05d8c$e7a15b10$060210ac@private>
Message-ID: <3A2BE7FB.831F2F93@tismer.com>


Barry Scott wrote:
> 
> I fully support Greg Ward's view. If string were removed I'd not
> update the old code but add in my own string module.
> 
> Given the effort you guys went to to keep the C extension protocol the
> same (in the context of crashing on importing a 1.5 DLL into 2.0), I'm
> amazed you think that string could be removed...
> 
> Could you split the lib into blessed and backward compatibility sections?
> Then by some suitable mechanism I can choose the compatibility I need?
> 
> Oh and as for join obviously a method of a list...
> 
>         ['thats','better'].join(' ')

The above is the way it is defined in JavaScript. But in
JavaScript, the list join method performs an implicit str()
on the list elements.
As has been discussed some time ago, Python's lists are
too versatile to justify a string-centric method.

Marc André pointed out that one could do a reduction with the
semantics of the "+" operator, but Guido said that he wouldn't
like to see

      [2, 3, 5].join(7)

being reduced to 2+7+3+7+5 == 24.
That could only be avoided if there were a way to distinguish
numeric addition from concatenation.
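Spelled out (the helper's name is invented), that "+"-reduction is:

```python
from functools import reduce

def plus_join(sep, items):
    # Interleave the separator and reduce with "+" -- the semantics
    # under which [2, 3, 5].join(7) would come out as 24.
    interleaved = items[:1] + [x for item in items[1:] for x in (sep, item)]
    return reduce(lambda a, b: a + b, interleaved)
```

which does the right thing for strings but silently sums the numeric case.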

but-I-could-live-with-it - ly y'rs - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From barry at digicool.com  Mon Dec  4 22:23:00 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 4 Dec 2000 16:23:00 -0500
Subject: [Python-Dev] PEP 231, __findattr__() 
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
	<14891.46227.785856.307437@anthem.concentric.net>
	<20001205033118.9135CA817@darjeeling.zadka.site.co.il>
Message-ID: <14892.2868.982013.313562@anthem.concentric.net>

>>>>> "CT" == Christian Tismer <tismer at tismer.com> writes:

    CT> You most probably want to do this: __findattr__ should not be
    CT> invoked again for this instance, with this attribute name, for
    CT> this "thread", until you are done.

First, I think the rule should be "__findattr__ should not be invoked
again for this instance, in this thread, until you are done".
I.e. once in __findattr__, you want all subsequent attribute
references to bypass findattr, because presumably, your instance now
has complete control for all accesses in this thread.  You don't want
to limit it to just the currently named attribute.

Second, if "this thread" is defined as _PyThreadState_Current, then we
have a simple solution, as I mapped out earlier.  We do a
PyThreadState_GetDict() and store the instance in that dict on entry
to __findattr__ and remove it on exit from __findattr__.  If the
instance can be found in the current thread's dict, we bypass
__findattr__.
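In pure-Python terms the bookkeeping would amount to something like this
(the names here are invented; the real thing would be C code against the
thread-state dict):

```python
import threading

# Stand-in for the per-thread state dict: a per-thread set of the
# ids of instances currently inside their __findattr__ hook.
_state = threading.local()

def guarded_findattr(obj, hook, name):
    # Illustrative only: 'hook' plays the role of obj.__findattr__.
    active = getattr(_state, 'active', None)
    if active is None:
        active = _state.active = set()
    if id(obj) in active:
        return None               # already inside: bypass __findattr__
    active.add(id(obj))           # store the instance on entry
    try:
        return hook(name)
    finally:
        active.discard(id(obj))   # remove it on exit
```

A single per-thread flag wouldn't be enough, since several instances can
be inside __findattr__ at once in one thread; hence a set keyed by
instance.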

>>>>> "MZ" == Moshe Zadka <moshez at zadka.site.co.il> writes:

    MZ> I don't think this is a good idea -- continuations and
    MZ> coroutines might mess it up.

You might be right, but I'm not sure.

If we make __findattr__ thread safe according to the definition above,
and if uthread/coroutine/continuation safety can be accomplished by
the __findattr__ programmer's discipline, then I think that is enough.
IOW, if we can tell the __findattr__ author to not relinquish the
uthread explicitly during the __findattr__ call, we're cool.  Oh, and
as long as we're not somehow substantially reducing the utility of
__findattr__ by making that restriction.

What I worry about is re-entrancy that isn't under the programmer's
control, like the Real Thread-safety problem.

-Barry



From barry at digicool.com  Mon Dec  4 23:58:33 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 4 Dec 2000 17:58:33 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
	<14891.46227.785856.307437@anthem.concentric.net>
	<20001205033118.9135CA817@darjeeling.zadka.site.co.il>
	<14892.2868.982013.313562@anthem.concentric.net>
	<3A2C0E0D.E042D026@tismer.com>
Message-ID: <14892.8601.41178.81475@anthem.concentric.net>

>>>>> "CT" == Christian Tismer <tismer at tismer.com> writes:

    CT> Hmm. What do you think about Moshe's idea to change compiling
    CT> of the method? It has the nice advantage that there are no
    CT> Thread-safety problems by design. The only drawback is that
    CT> the contract of not-calling-myself only holds for this
    CT> function.

I'm not sure I understand what Moshe was proposing.  Moshe: are you
saying that we should change the way the compiler works, so that it
somehow recognizes this special case?  I'm not sure I like that
approach.  I think I want something more runtime-y, but I'm not sure
why (maybe just because I'm more comfortable mucking about in the
run-time than in the compiler).

-Barry



From guido at python.org  Tue Dec  5 00:16:17 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 04 Dec 2000 18:16:17 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: Your message of "Mon, 04 Dec 2000 16:23:00 EST."
             <14892.2868.982013.313562@anthem.concentric.net> 
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il>  
            <14892.2868.982013.313562@anthem.concentric.net> 
Message-ID: <200012042316.SAA23081@cj20424-a.reston1.va.home.com>

I'm unconvinced by the __findattr__ proposal as it now stands.

- Do you really think that JimF would do away with ExtensionClasses if
  __findattr__ was introduced?  I kinda doubt it.  See [*footnote].
  It seems that *using* __findattr__ is expensive (even if *not* using
  is cheap :-).

- Why is deletion not supported?  What if you want to enforce a policy
  on deletions too?

- It's ugly to use the same call for get and set.  The examples
  indicate that it's not such a great idea: every example has *two*
  tests whether it's get or set.  To share a policy, the proper thing
  to do is to write a method that either get or set can use.

- I think it would be sufficient to *only* use __findattr__ for
  getattr -- __setattr__ and __delattr__ already have full control.
  The "one routine to implement the policy" argument doesn't really
  hold, I think.

- The PEP says that the "in-findattr" flag is set on the instance.
  We've already determined that this is not thread-safe.  This is not
  just a bug in the implementation -- it's a bug in the specification.
  I also find it ugly.  But if we decide to do this, it can go in the
  thread-state -- if we ever add coroutines, we have to decide on what
  stuff to move from the thread state to the coroutine state anyway.

- It's also easy to conceive situations where recursive __findattr__
  calls on the same instance in the same thread/coroutine are
  perfectly desirable -- e.g. when __findattr__ ends up calling a
  method that uses a lot of internal machinery of the class.  You
  don't want all the machinery to have to be aware of the fact that it
  may be called with __findattr__ on the stack and without it.  So
  perhaps it may be better to only treat the body of __findattr__
  itself special, as Moshe suggested.  What does Jython do here?

- The code examples require a *lot* of effort to understand.  These
  are complicated issues!  (I rewrote the Bean example using
  __getattr__ and __setattr__ and found no need for __findattr__; the
  __getattr__ version is simpler and easier to understand.  I'm still
  studying the other __findattr__ examples.)

- The PEP really isn't that long, except for the code examples.  I
  recommend reading the patch first -- the patch is probably shorter
  than any specification of the feature can be.

--Guido van Rossum (home page: http://www.python.org/~guido/)

[*footnote]

  There's an easy way (that few people seem to know) to cause
  __getattr__ to be called for virtually all attribute accesses: put
  *all* (user-visible) attributes in a separate dictionary.  If you want
  to prevent access to this dictionary too (for Zope security
  enforcement), make it a global indexed by id() -- a
  destructor(__del__) can take care of deleting entries here.
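  Spelled out (the class and the global's names are invented):

```python
# Global shadow storage: id(instance) -> dict of user-visible attributes.
_attrs = {}

class Shadowed:
    def __init__(self):
        _attrs[id(self)] = {}

    def __getattr__(self, name):
        # Called for virtually every attribute access, because the
        # instance __dict__ itself stays empty.
        try:
            return _attrs[id(self)][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        _attrs[id(self)][name] = value

    def __del__(self):
        # The destructor deletes the global entry for this instance.
        _attrs.pop(id(self), None)
```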



From martin at loewis.home.cs.tu-berlin.de  Tue Dec  5 00:10:43 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 5 Dec 2000 00:10:43 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <14891.46227.785856.307437@anthem.concentric.net>
	(barry@digicool.com)
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net>
Message-ID: <200012042310.AAA00786@loewis.home.cs.tu-berlin.de>

> I'm still catching up on several hundred emails over the weekend.  I
> had a sneaking suspicion that infindattr wasn't thread-safe, so I'm
> convinced this is a bug in the implementation.  One approach might be
> to store the info in the thread state object (isn't that how the
> recursive repr stop flag is stored?)

Whether this works depends on how exactly the info is stored. A single
flag won't be sufficient, since multiple objects may have __findattr__
in progress in a given thread. With a set of instances, it would work,
though.

Regards,
Martin



From martin at loewis.home.cs.tu-berlin.de  Tue Dec  5 00:13:15 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 5 Dec 2000 00:13:15 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <20001205033118.9135CA817@darjeeling.zadka.site.co.il> (message
	from Moshe Zadka on Tue, 05 Dec 2000 05:31:18 +0200)
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>  <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il>
Message-ID: <200012042313.AAA00832@loewis.home.cs.tu-berlin.de>

> I don't think this is a good idea -- continuations and coroutines
> might mess it up.

If coroutines and continuations operate preemptively, then
they should present themselves as an implementation of the thread API;
perhaps the thread API needs to be extended to allow for such a feature.

If yielding control is in the hands of the implementation, it would be
easy to rule out a context switch while findattr is in progress.

Regards,
Martin




From martin at loewis.home.cs.tu-berlin.de  Tue Dec  5 00:19:37 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 5 Dec 2000 00:19:37 +0100
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <14892.8601.41178.81475@anthem.concentric.net>
	(barry@digicool.com)
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
	<14891.46227.785856.307437@anthem.concentric.net>
	<20001205033118.9135CA817@darjeeling.zadka.site.co.il>
	<14892.2868.982013.313562@anthem.concentric.net>
	<3A2C0E0D.E042D026@tismer.com> <14892.8601.41178.81475@anthem.concentric.net>
Message-ID: <200012042319.AAA00877@loewis.home.cs.tu-berlin.de>

> I'm not sure I understand what Moshe was proposing.  Moshe: are you
> saying that we should change the way the compiler works, so that it
> somehow recognizes this special case?  I'm not sure I like that
> approach.  I think I want something more runtime-y, but I'm not sure
> why (maybe just because I'm more comfortable mucking about in the
> run-time than in the compiler).

I guess you are also uncomfortable with the problem that the
compile-time analysis cannot "see" through levels of indirection.
E.g. if findattr is implemented as

   return self.compute_attribute(real_attribute)

then compile-time analysis could figure out to call compute_attribute
directly. However, that method may be implemented as 

  def compute_attribute(self,name):
    return self.mapping[name]

where the access to mapping could not be detected statically.

Regards,
Martin




From tismer at tismer.com  Mon Dec  4 22:35:09 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 04 Dec 2000 23:35:09 +0200
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
		<14891.46227.785856.307437@anthem.concentric.net>
		<20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net>
Message-ID: <3A2C0E0D.E042D026@tismer.com>


"Barry A. Warsaw" wrote:
> 
> >>>>> "CT" == Christian Tismer <tismer at tismer.com> writes:
> 
>     CT> You most probably want to do this: __findattr__ should not be
>     CT> invoked again for this instance, with this attribute name, for
>     CT> this "thread", until you are done.
> 
> First, I think the rule should be "__findattr__ should not be invoked
> again for this instance, in this thread, until you are done".

Maybe this is better. Surely easier. :)

[ThreadState solution - well fine so far]

>     MZ> I don't think this is a good idea -- continuations and
>     MZ> coroutines might mess it up.
> 
> You might be right, but I'm not sure.
> 
> If we make __findattr__ thread safe according to the definition above,
> and if uthread/coroutine/continuation safety can be accomplished by
> the __findattr__ programmer's discipline, then I think that is enough.
> IOW, if we can tell the __findattr__ author to not relinquish the
> uthread explicitly during the __findattr__ call, we're cool.  Oh, and
> as long as we're not somehow substantially reducing the utility of
> __findattr__ by making that restriction.
> 
> What I worry about is re-entrancy that isn't under the programmer's
> control, like the Real Thread-safety problem.

Hmm. What do you think about Moshe's idea to change compiling
of the method? It has the nice advantage that there are no
Thread-safety problems by design. The only drawback is that
the contract of not-calling-myself only holds for this function.

I don't know how ThreadState scales up when there are more things
like these invented. Well, for the moment, the simple solution
with Stackless would just be to let the interpreter recurse
in this call, the same as it happens during __init__ and
anything else that isn't easily turned into tail-recursion.
It just blocks :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From barry at digicool.com  Tue Dec  5 03:54:23 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 4 Dec 2000 21:54:23 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
	<14891.46227.785856.307437@anthem.concentric.net>
	<20001205033118.9135CA817@darjeeling.zadka.site.co.il>
	<14892.2868.982013.313562@anthem.concentric.net>
	<200012042316.SAA23081@cj20424-a.reston1.va.home.com>
Message-ID: <14892.22751.921264.156010@anthem.concentric.net>

>>>>> "GvR" == Guido van Rossum <guido at python.org> writes:

    GvR> - Do you really think that JimF would do away with
    GvR> ExtensionClasses if __findattr__ was introduced?  I kinda
    GvR> doubt it.  See [*footnote].  It seems that *using*
    GvR> __findattr__ is expensive (even if *not* using is cheap :-).

That's not even the real reason why JimF wouldn't stop using
ExtensionClass.  He's already got too much code invested in EC.
However EC can be a big pill to swallow for some applications because
it's a C extension (and because it has some surprising non-Pythonic
side effects).  In those situations, a pure Python approach, even
though slower, is useful.

    GvR> - Why is deletion not supported?  What if you want to enforce
    GvR> a policy on deletions too?

It could be, without much work.

    GvR> - It's ugly to use the same call for get and set.  The
    GvR> examples indicate that it's not such a great idea: every
    GvR> example has *two* tests whether it's get or set.  To share a
    GvR> policy, the proper thing to do is to write a method that
    GvR> either get or set can use.

I don't have strong feelings either way.

    GvR> - I think it would be sufficient to *only* use __findattr__
    GvR> for getattr -- __setattr__ and __delattr__ already have full
    GvR> control.  The "one routine to implement the policy" argument
    GvR> doesn't really hold, I think.

What about the ability to use "normal" x.name attribute access syntax
inside the hook?  Let me guess your answer. :)

    GvR> - The PEP says that the "in-findattr" flag is set on the
    GvR> instance.  We've already determined that this is not
    GvR> thread-safe.  This is not just a bug in the implementation --
    GvR> it's a bug in the specification.  I also find it ugly.  But
    GvR> if we decide to do this, it can go in the thread-state -- if
    GvR> we ever add coroutines, we have to decide on what stuff to
    GvR> move from the thread state to the coroutine state anyway.

Right.  That's where we've ended up in subsequent messages on this thread.

    GvR> - It's also easy to conceive situations where recursive
    GvR> __findattr__ calls on the same instance in the same
    GvR> thread/coroutine are perfectly desirable -- e.g. when
    GvR> __findattr__ ends up calling a method that uses a lot of
    GvR> internal machinery of the class.  You don't want all the
    GvR> machinery to have to be aware of the fact that it may be
    GvR> called with __findattr__ on the stack and without it.

Hmm, okay, I don't really understand your example.  I suppose I'm
envisioning __findattr__ as a way to provide an interface to clients
of the class.  Maybe it's a bean interface, maybe it's an acquisition
interface or an access control interface.  The internal machinery has
to know something about how that interface is implemented, so whether
__findattr__ is recursive or not doesn't seem to enter into it.

And also, allowing __findattr__ to be recursive will just impose
different constraints on the internal machinery methods, just like
__setattr__ currently does.  I.e. you better know that you're in
__setattr__ and not do self.name type things, or you'll recurse
forever. 

    GvR> So perhaps it may be better to only treat the body of
    GvR> __findattr__ itself special, as Moshe suggested.

Maybe I'm being dense, but I'm not sure exactly what this means, or
how you would do this.
    
    GvR> What does Jython do here?

It's not exactly equivalent, because Jython's __findattr__ can't call
back into Python.

    GvR> - The code examples require a *lot* of effort to understand.
    GvR> These are complicated issues!  (I rewrote the Bean example
    GvR> using __getattr__ and __setattr__ and found no need for
    GvR> __findattr__; the __getattr__ version is simpler and easier
    GvR> to understand.  I'm still studying the other __findattr__
    GvR> examples.)

Is it simpler because you separated out the set and get behavior?  If
__findattr__ only did getting, I think it would be a lot simpler too
(but I'd still be interested in seeing your __getattr__ only
example).  The acquisition examples are complicated because I wanted
to support the same interface that EC's acquisition classes support.
All that detail isn't necessary for example code.

    GvR> - The PEP really isn't that long, except for the code
    GvR> examples.  I recommend reading the patch first -- the patch
    GvR> is probably shorter than any specification of the feature can
    GvR> be.

Would it be more helpful to remove the examples?  If so, where would
you put them?  It's certainly useful to have examples someplace I
think.

    GvR>   There's an easy way (that few people seem to know) to cause
    GvR> __getattr__ to be called for virtually all attribute
    GvR> accesses: put *all* (user-visible) attributes in a separate
    GvR> dictionary.  If you want to prevent access to this dictionary
    GvR> too (for Zope security enforcement), make it a global indexed
    GvR> by id() -- a destructor(__del__) can take care of deleting
    GvR> entries here.

Presumably that'd be a module global, right?  Maybe within Zope that
could be protected, but outside of that, that global's always going to
be accessible.  So are methods, even if given private names.  And I
don't think that such code would be any more readable since instead of
self.name you'd see stuff like

    def __getattr__(self, name):
        global instdict
	mydict = instdict[id(self)]
	obj = mydict[name]
	...

    def __setattr__(self, name, val):
	global instdict
	mydict = instdict[id(self)]
	mydict[name] = val
	...

and that /might/ be a problem with Jython currently, because id()'s
may be reused.  And relying on __del__ may have unfortunate side
effects when viewed in conjunction with garbage collection.

You're probably still unconvinced <wink>, but are you dead-set against
it?  I can try implementing __findattr__() as a pre-__getattr__ hook
only.  Then we can live with the current __setattr__() restrictions
and see what the examples look like in that situation.

-Barry



From guido at python.org  Tue Dec  5 13:54:20 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 05 Dec 2000 07:54:20 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: Your message of "Mon, 04 Dec 2000 21:54:23 EST."
             <14892.22751.921264.156010@anthem.concentric.net> 
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com>  
            <14892.22751.921264.156010@anthem.concentric.net> 
Message-ID: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>

> >>>>> "GvR" == Guido van Rossum <guido at python.org> writes:
> 
>     GvR> - Do you really think that JimF would do away with
>     GvR> ExtensionClasses if __findattr__ was introduced?  I kinda
>     GvR> doubt it.  See [*footnote].  It seems that *using*
>     GvR> __findattr__ is expensive (even if *not* using is cheap :-).
> 
> That's not even the real reason why JimF wouldn't stop using
> ExtensionClass.  He's already got too much code invested in EC.
> However EC can be a big pill to swallow for some applications because
> it's a C extension (and because it has some surprising non-Pythonic
> side effects).  In those situations, a pure Python approach, even
> though slower, is useful.

Agreed.  But I'm still hoping to find the silver bullet that lets Jim
(and everybody else) do what ExtensionClass does without needing
another extension.

>     GvR> - Why is deletion not supported?  What if you want to enforce
>     GvR> a policy on deletions too?
> 
> It could be, without much work.

Then it should be -- except I prefer to do only getattr anyway, see
below.

>     GvR> - It's ugly to use the same call for get and set.  The
>     GvR> examples indicate that it's not such a great idea: every
>     GvR> example has *two* tests whether it's get or set.  To share a
>     GvR> policy, the proper thing to do is to write a method that
>     GvR> either get or set can use.
> 
> I don't have strong feelings either way.

What does Jython do?  I thought it only did set (hence the name :-).
I think there's no *need* for findattr to catch the setattr operation,
because __setattr__ *already* gets invoked on each set not just ones
where the attr doesn't yet exist.

>     GvR> - I think it would be sufficient to *only* use __findattr__
>     GvR> for getattr -- __setattr__ and __delattr__ already have full
>     GvR> control.  The "one routine to implement the policy" argument
>     GvR> doesn't really hold, I think.
> 
> What about the ability to use "normal" x.name attribute access syntax
> inside the hook?  Let me guess your answer. :)

Aha!  You got me there.  Clearly the REAL reason for wanting
__findattr__ is the no-recursive-calls rule -- which is also the most
uncooked feature...  Traditional getattr hooks don't need this as much
because they don't get called when the attribute already exists;
traditional setattr hooks deal with it by switching on the attribute
name.  The no-recursive-calls rule certainly SEEMS an attractive way
around this.  But I'm not sure that it really is...

I need to get my head around this more.  (The only reason I'm still
posting this reply is to test the new mailing lists setup via
mail.python.org.)

>     GvR> - The PEP says that the "in-findattr" flag is set on the
>     GvR> instance.  We've already determined that this is not
>     GvR> thread-safe.  This is not just a bug in the implementation --
>     GvR> it's a bug in the specification.  I also find it ugly.  But
>     GvR> if we decide to do this, it can go in the thread-state -- if
>     GvR> we ever add coroutines, we have to decide on what stuff to
>     GvR> move from the thread state to the coroutine state anyway.
> 
> Right.  That's where we've ended up in subsequent messages on this thread.
> 
>     GvR> - It's also easy to conceive situations where recursive
>     GvR> __findattr__ calls on the same instance in the same
>     GvR> thread/coroutine are perfectly desirable -- e.g. when
>     GvR> __findattr__ ends up calling a method that uses a lot of
>     GvR> internal machinery of the class.  You don't want all the
>     GvR> machinery to have to be aware of the fact that it may be
>     GvR> called with __findattr__ on the stack and without it.
> 
> Hmm, okay, I don't really understand your example.  I suppose I'm
> envisioning __findattr__ as a way to provide an interface to clients
> of the class.  Maybe it's a bean interface, maybe it's an acquisition
> interface or an access control interface.  The internal machinery has
> to know something about how that interface is implemented, so whether
> __findattr__ is recursive or not doesn't seem to enter into it.

But the class is also a client of itself, and not all cases where it
is a client of itself are inside a findattr call.  Take your bean
example.  Suppose your bean class also has a spam() method.  The
findattr code needs to account for this, e.g.:

    def __findattr__(self, name, *args):
	if name == "spam" and not args:
	    return self.spam
	...original body here...

Or you have to add a _get_spam() method:

    def _get_spam(self):
	return self.spam

Either solution gets tedious if there are a lot of methods; instead,
findattr could check if the attr is defined on the class, and then
return that:

    def __findattr__(self, name, *args):
        if not args and name[0] != '_' and hasattr(self.__class__, name):
	    return getattr(self, name)
	...original body here...

Anyway, let's go back to the spam method.  Suppose it references
self.foo.  The findattr machinery will access it.  Fine.  But now
consider another attribute (bar) with _set_bar() and _get_bar()
methods that do a little more.  Maybe bar is really calculated from
the value of self.foo.  Then _get_bar cannot use self.foo (because
it's inside findattr so findattr won't resolve it, and self.foo
doesn't actually exist on the instance) so it has to use self.__myfoo.
Fine -- after all this is inside a _get_* handler, which knows it's
being called from findattr.  But what if, instead of needing self.foo,
_get_bar wants to call self.spam()?  Then self.spam() is
being called from inside findattr, so when it accesses self.foo,
findattr isn't used -- and it fails with an AttributeError!

Sorry for the long detour, but *that's* the problem I was referring
to.  I think the scenario is quite realistic.

> And also, allowing __findattr__ to be recursive will just impose
> different constraints on the internal machinery methods, just like
> __setattr__ currently does.  I.e. you better know that you're in
> __setattr__ and not do self.name type things, or you'll recurse
> forever. 

Actually, this is usually solved by having __setattr__ check for
specific names only, and for others do self.__dict__[name] = value;
that way, recursive __setattr__ calls are okay.  Similar for
__getattr__ (which has to raise AttributeError for unrecognized
names).
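I.e. the usual pattern looks something like this (names invented):

```python
class Point:
    def __setattr__(self, name, value):
        if name in ('x', 'y'):        # recognized names: apply a policy
            value = float(value)
        self.__dict__[name] = value   # plain store: no recursion

    def __getattr__(self, name):
        # Only called when normal lookup fails; must raise
        # AttributeError for unrecognized names.
        if name == 'magnitude':
            return (self.x ** 2 + self.y ** 2) ** 0.5
        raise AttributeError(name)
```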

>     GvR> So perhaps it may be better to only treat the body of
>     GvR> __findattr__ itself special, as Moshe suggested.
> 
> Maybe I'm being dense, but I'm not sure exactly what this means, or
> how you would do this.

Read Moshe's messages (and Martin's replies) again.  I don't care that
much for it so I won't explain it again.

>     GvR> What does Jython do here?
> 
> It's not exactly equivalent, because Jython's __findattr__ can't call
> back into Python.

I'd say that Jython's __findattr__ is an entirely different beast than
what we have here.  Its main purpose in life appears to be to act as a
getattr equivalent that returns NULL instead of raising an exception
when the attribute isn't found -- which is reasonable because from
within Java, testing for null is much cheaper than checking for an
exception, and you often need to check whether a given attribute exists
and do some default action if not.  (In fact, I'd say that CPython could
also use a findattr of this kind...)

This is really too bad.  Based on the name similarity and things I
thought you'd said in private before, I thought that they would be
similar.  Then the experience with Jython would be a good argument for
adding a findattr hook to CPython.  But now that they are totally
different beasts it doesn't help at all.

>     GvR> - The code examples require a *lot* of effort to understand.
>     GvR> These are complicated issues!  (I rewrote the Bean example
>     GvR> using __getattr__ and __setattr__ and found no need for
>     GvR> __findattr__; the __getattr__ version is simpler and easier
>     GvR> to understand.  I'm still studying the other __findattr__
>     GvR> examples.)
> 
> Is it simpler because you separated out the set and get behavior?  If
> __findattr__ only did getting, I think it would be a lot simpler too
> (but I'd still be interested in seeing your __getattr__ only
> example).

Here's my getattr example.  It's more lines of code, but cleaner IMHO:

    class Bean:
	def __init__(self, x):
	    self.__myfoo = x

	def __isprivate(self, name):
	    return name.startswith('_')

	def __getattr__(self, name):
	    if self.__isprivate(name):
		raise AttributeError, name
	    return getattr(self, "_get_" + name)()

	def __setattr__(self, name, value):
	    if self.__isprivate(name):
		self.__dict__[name] = value
	    else:
		return getattr(self, "_set_" + name)(value)

	def _set_foo(self, x):
	    self.__myfoo = x

	def _get_foo(self):
	    return self.__myfoo


    b = Bean(3)
    print b.foo
    b.foo = 9
    print b.foo

> The acquisition examples are complicated because I wanted
> to support the same interface that EC's acquisition classes support.
> All that detail isn't necessary for example code.

I *still* have to study the examples... :-(  Will do next.

>     GvR> - The PEP really isn't that long, except for the code
>     GvR> examples.  I recommend reading the patch first -- the patch
>     GvR> is probably shorter than any specification of the feature can
>     GvR> be.
> 
> Would it be more helpful to remove the examples?  If so, where would
> you put them?  It's certainly useful to have examples someplace I
> think.

No, my point is that the examples need more explanation.  Right now
the EC example is over 200 lines of brain-exploding code! :-)

>     GvR>   There's an easy way (that few people seem to know) to cause
>     GvR> __getattr__ to be called for virtually all attribute
>     GvR> accesses: put *all* (user-visible) attributes in a separate
>     GvR> dictionary.  If you want to prevent access to this dictionary
>     GvR> too (for Zope security enforcement), make it a global indexed
>     GvR> by id() -- a destructor(__del__) can take care of deleting
>     GvR> entries here.
> 
> Presumably that'd be a module global, right?  Maybe within Zope that
> could be protected,

Yes.

> but outside of that, that global's always going to
> be accessible.  So are methods, even if given private names.

Aha!  Another thing that I expect has been on your agenda for a long
time, but which isn't explicit in the PEP (AFAICT): findattr gives
*total* control over attribute access, unlike __getattr__ and
__setattr__ and private name mangling, which can all be defeated.

And this may be one of the things that Jim is after with
ExtensionClasses in Zope.  Although I believe that in DTML, he doesn't
trust this: he uses source-level (or bytecode-level) transformations
to turn all X.Y operations into a call into a security manager.

So I'm not sure that the argument is very strong.

> And I
> don't think that such code would be any more readable since instead of
> self.name you'd see stuff like
> 
>     def __getattr__(self, name):
>         global instdict
> 	mydict = instdict[id(self)]
> 	obj = mydict[name]
> 	...
> 
>     def __setattr__(self, name, val):
> 	global instdict
> 	mydict = instdict[id(self)]
> 	instdict[name] = val
> 	...
> 
> and that /might/ be a problem with Jython currently, because id()'s
> may be reused.  And relying on __del__ may have unfortunate side
> effects when viewed in conjunction with garbage collection.

Fair enough.  I withdraw the suggestion, and propose restricted
execution instead.  There, you can use Bastions -- which have problems
of their own, but you do get total control.

> You're probably still unconvinced <wink>, but are you dead-set against
> it?  I can try implementing __findattr__() as a pre-__getattr__ hook
> only.  Then we can live with the current __setattr__() restrictions
> and see what the examples look like in that situation.

I am dead-set against introducing a feature that I don't fully
understand.  Let's continue this discussion.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From bckfnn at worldonline.dk  Tue Dec  5 16:40:10 2000
From: bckfnn at worldonline.dk (Finn Bock)
Date: Tue, 05 Dec 2000 15:40:10 GMT
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com>   <14892.22751.921264.156010@anthem.concentric.net>  <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
Message-ID: <3a2d0c29.242749@smtp.worldonline.dk>

On Tue, 05 Dec 2000 07:54:20 -0500, you wrote:

>>     GvR> What does Jython do here?
>> 
>> It's not exactly equivalent, because Jython's __findattr__ can't call
>> back into Python.
>
>I'd say that Jython's __findattr__ is an entirely different beast than
>what we have here.  Its main purpose in life appears to be to serve as
>a getattr equivalent that returns NULL instead of raising an exception
>when the attribute isn't found -- which is reasonable because from
>within Java, testing for null is much cheaper than checking for an
>exception, and you often need to check whether a given attribute
>exists and do some default action if not.

Correct.  It is also the method to override when making a new builtin
type, and it will be called on such a type subclass regardless of the
presence of any __getattr__ hook and __dict__ content.  So I think it
has some of the properties which Barry wants.


regards,
finn



From greg at cosc.canterbury.ac.nz  Wed Dec  6 00:07:06 2000
From: greg at cosc.canterbury.ac.nz (greg at cosc.canterbury.ac.nz)
Date: Wed, 06 Dec 2000 12:07:06 +1300 (NZDT)
Subject: Are you all mad? (Re: [Python-Dev] PEP 231, __findattr__())
In-Reply-To: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
Message-ID: <200012052307.MAA01082@s454.cosc.canterbury.ac.nz>

I can't believe you're even considering a magic
dynamically-scoped flag that invisibly changes the
semantics of fundamental operations. To me the
idea is utterly insane!

If I understand correctly, the problem is that if
you do something like

  def __findattr__(self, name):
    if name == 'spam':
      return self.__dict__['spam']

then self.__dict__ is going to trigger a recursive
__findattr__ call. 

It seems to me that if you're going to have some sort
of hook that is always called on any x.y reference,
you need some way of explicitly bypassing it and getting
at the underlying machinery.

I can think of a couple of ways:

1) Make the __dict__ attribute special, so that accessing
it always bypasses __findattr__.

2) Provide some other way of getting direct access to the
attributes of an object, e.g. new builtins called
peekattr() and pokeattr().

This assumes that you always know when you write a particular
access whether you want it to be a "normal" or "special"
one, so that you can use the appropriate mechanism.
Are there any cases where this is not true?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From barry at digicool.com  Wed Dec  6 03:20:40 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 5 Dec 2000 21:20:40 -0500
Subject: [Python-Dev] PEP 231, __findattr__()
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de>
	<14891.46227.785856.307437@anthem.concentric.net>
	<20001205033118.9135CA817@darjeeling.zadka.site.co.il>
	<14892.2868.982013.313562@anthem.concentric.net>
	<200012042316.SAA23081@cj20424-a.reston1.va.home.com>
	<14892.22751.921264.156010@anthem.concentric.net>
	<200012051254.HAA25502@cj20424-a.reston1.va.home.com>
	<3a2d0c29.242749@smtp.worldonline.dk>
Message-ID: <14893.41592.701128.58110@anthem.concentric.net>

>>>>> "FB" == Finn Bock <bckfnn at worldonline.dk> writes:

    FB> Correct. It is also the method to override when making a new
    FB> builtin type and it will be called on such a type subclass
    FB> regardless of the presence of any __getattr__ hook and
    FB> __dict__ content. So I think it has some of the properties
    FB> which Barry wants.

We had a discussion about this PEP at our group meeting today.  Rather
than write it all twice, I'm going to try to update the PEP and patch
tonight.  I think what we came up with will solve most of the problems
raised, and will be implementable in Jython (I'll try to work up a
Jython patch too, if I don't fall asleep first :)

-Barry



From barry at digicool.com  Wed Dec  6 03:54:36 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 5 Dec 2000 21:54:36 -0500
Subject: Are you all mad? (Re: [Python-Dev] PEP 231, __findattr__())
References: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
	<200012052307.MAA01082@s454.cosc.canterbury.ac.nz>
Message-ID: <14893.43628.61063.905227@anthem.concentric.net>

>>>>> "greg" ==   <greg at cosc.canterbury.ac.nz> writes:

    | 1) Make the __dict__ attribute special, so that accessing
    | it always bypasses __findattr__.

You're not far from what I came up with right after our delicious
lunch.  We're going to invent a new protocol which passes __dict__
into the method as an argument.  That way self.__dict__ doesn't need
to be special cased at all because you can get at all the attributes
via a local!  So no recursion stop hack is necessary.
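Since __findattr__ was never merged, the revised protocol can only be
illustrated by calling the hook by hand; the signature below is a guess
at what Barry describes (the runtime, not the caller, would supply the
dict):

```python
class Findable:
    def __init__(self):
        self.__dict__["spam"] = 1

    # Hypothetical signature: the interpreter passes the instance dict
    # in as an argument, so the hook reads a local instead of touching
    # self.__dict__ and cannot recurse into itself.
    def __findattr__(self, instdict, name):
        if name in instdict:
            return instdict[name]
        raise AttributeError(name)

f = Findable()
# Simulating the call the runtime would make for f.spam:
print(f.__findattr__(f.__dict__, "spam"))  # 1
```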

More in the updated PEP and patch.

-Barry



From dgoodger at bigfoot.com  Thu Dec  7 05:33:33 2000
From: dgoodger at bigfoot.com (David Goodger)
Date: Wed, 06 Dec 2000 23:33:33 -0500
Subject: [Python-Dev] unit testing and Python regression test
Message-ID: <B6547D4C.BE96%dgoodger@bigfoot.com>

There is another unit testing implementation out there, OmPyUnit, available
from:

    http://www.objectmentor.com/freeware/downloads.html

-- 
David Goodger    dgoodger at bigfoot.com    Open-source projects:
 - The Go Tools Project: http://gotools.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net (soon!)




From fdrake at users.sourceforge.net  Thu Dec  7 07:26:54 2000
From: fdrake at users.sourceforge.net (Fred L. Drake)
Date: Wed, 6 Dec 2000 22:26:54 -0800
Subject: [Python-Dev] [development doc updates]
Message-ID: <200012070626.WAA22103@orbital.p.sourceforge.net>

The development version of the documentation has been updated:

	http://python.sourceforge.net/devel-docs/


Lots of small changes, but most important, more DOM documentation:

	http://python.sourceforge.net/devel-docs/lib/module-xml.dom.html



From guido at python.org  Thu Dec  7 18:48:53 2000
From: guido at python.org (Guido van Rossum)
Date: Thu, 07 Dec 2000 12:48:53 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
Message-ID: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>

After perusing David Ascher's proposal, several versions of his
patches, and hundreds of emails exchanged on this subject (almost all
of this dated April or May of 1998), I've produced a reasonable
semblance of PEP 207.  Get it from CVS or here on the web:

  http://python.sourceforge.net/peps/pep-0207.html

I'd like to hear your comments, praise, and criticisms!

The PEP still needs work; in particular, the minority point of view
back then (that comparisons should return only Boolean results) is not
adequately represented (but I *did* work in a reference to tabnanny,
to ensure Tim's support :-).

I'd like to work on a patch next, but I think there will be
interference with Neil's coercion patch.  I'm not sure how to resolve
that yet; maybe I'll just wait until Neil's coercion patch is checked
in.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Thu Dec  7 18:54:51 2000
From: guido at python.org (Guido van Rossum)
Date: Thu, 07 Dec 2000 12:54:51 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
Message-ID: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>

I'm maybe about three quarters on the way with writing PEP 230 -- far
enough along to be asking for comments.  Get it from CVS or go to:

  http://python.sourceforge.net/peps/pep-0230.html

A prototype implementation in Python is included in the PEP; I think
this shows that the implementation is not too complex (Paul Prescod's
fear about my proposal).

This is pretty close to what I proposed earlier (Nov 5), except that I
have added warning category classes (inspired by Paul's proposal).
This class also serves as the exception to be raised when warnings are
turned into exceptions.
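This category-class-doubles-as-exception design is what eventually
shipped in the standard warnings module; a minimal demonstration with
today's stdlib:

```python
import warnings

# Escalate DeprecationWarning to an error: the warning category class
# itself is the exception type that gets raised.
warnings.simplefilter("error", DeprecationWarning)
try:
    warnings.warn("old API", DeprecationWarning)
    escalated = False
except DeprecationWarning as exc:
    escalated = True
    print("raised:", exc)          # raised: old API
```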

Do I need to include a discussion of Paul's counter-proposal and why I
rejected it?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From Barrett at stsci.edu  Thu Dec  7 23:49:02 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Thu,  7 Dec 2000 17:49:02 -0500 (EST)
Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays
Message-ID: <14896.1191.240597.632888@nem-srvr.stsci.edu>

What is the status of PEP 209?  I see David Ascher is the champion of
this PEP, but nothing has been written up.  Is the intention of this
PEP to make the current Numeric a built-in feature of Python or to
re-implement and replace the current Numeric module?

The reason that I ask these questions is because I'm working on a
prototype of a new N-dimensional Array module which I call Numeric 2.
This new module will be much more extensible than the current Numeric.
For example, new array types and universal functions can be loaded or
imported on demand.  We also intend to implement a record (or
C-structure) type, because 1-D arrays or lists of records are a common
data structure for storing photon events in astronomy and related
fields.

The current Numeric does not handle record types efficiently,
particularly when the data type is not aligned and is in non-native
endian format.  To handle such data, temporary arrays must be created
and alignment and byte-swapping done on them.  Numeric 2 does such
pre- and post-processing inside the inner-most loop which is more
efficient in both time and memory.  It also does type conversion at
this level which is consistent with that proposed for PEP 208.
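The byte-swapping overhead is easy to demonstrate with the standard
library's array module (this only illustrates the swap itself, not
Numeric 2's loop structure):

```python
import array

# Native-order data round-tripped through a foreign byte order: each
# swap touches every element, which is the per-element cost Numeric 2
# wants to fold into its inner loop.
a = array.array("i", [1, 256, 65536])
a.byteswap()                 # now in the opposite endianness
a.byteswap()                 # swap back before doing arithmetic
print(list(a))               # [1, 256, 65536]
```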

Since many scientific users would like direct access to the array data
via C pointers, we have investigated using the buffer object.  We have
not had much success with it, because of its implementation.  I have
scanned the python-dev mailing list for discussions of this issue and
found that it now appears to be deprecated.

My opinion on this is that a new _fundamental_ built-in type should be
created for memory allocation with features and an interface similar
to the _mmap_ object.  I'll call this a _malloc_ object.  This would
allow Numeric 2 to use either object interchangeably depending on the
circumstance.  The _string_ type could also benefit from this new
object by using a read-only version of it.  Since it's an object, its
memory area should be safe from inadvertent deletion.

Because of these and other new features in Numeric 2, I have a keen
interest in the status of PEPs 207, 208, 211, 225, and 228; and also
in the proposed buffer object.  

I'm willing to implement this new _malloc_ object if members of the
python-dev list are in agreement.  Actually I see no alternative,
given the current design of Numeric 2, since the Array class will
initially be written completely in Python and will need a mutable
memory buffer, while the _string_ type is meant to be a read-only
object.

All comments welcome.

 -- Paul

-- 
Dr. Paul Barrett       Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Group
FAX:   410-338-4767    Baltimore, MD 21218



From DavidA at ActiveState.com  Fri Dec  8 02:13:04 2000
From: DavidA at ActiveState.com (David Ascher)
Date: Thu, 7 Dec 2000 17:13:04 -0800 (Pacific Standard Time)
Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional
 Arrays
In-Reply-To: <14896.1191.240597.632888@nem-srvr.stsci.edu>
Message-ID: <Pine.WNT.4.21.0012071712410.1360-100000@loom>

On Thu, 7 Dec 2000, Paul Barrett wrote:

> What is the status of PEP 209?  I see David Ascher is the champion of
> this PEP, but nothing has been written up.  Is the intention of this

I put my name on the PEP just to make sure it wasn't forgotten.  If
someone wants to champion it, their name should go on it.

--david




From guido at python.org  Fri Dec  8 17:10:50 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 11:10:50 -0500
Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays
In-Reply-To: Your message of "Thu, 07 Dec 2000 17:49:02 EST."
             <14896.1191.240597.632888@nem-srvr.stsci.edu> 
References: <14896.1191.240597.632888@nem-srvr.stsci.edu> 
Message-ID: <200012081610.LAA30679@cj20424-a.reston1.va.home.com>

> What is the status of PEP 209?  I see David Ascher is the champion of
> this PEP, but nothing has been written up.  Is the intention of this
> PEP to make the current Numeric a built-in feature of Python or to
> re-implement and replace the current Numeric module?

David has already explained why his name is on it -- basically,
David's name is on several PEPs but he doesn't currently have any time
to work on these, so other volunteers are most welcome to join.

It is my understanding that the current Numeric is sufficiently messy
in implementation and controversial in semantics that it would not be
a good basis to start from.

However, I do think that a basic multi-dimensional array object would
be a welcome addition to core Python.

> The reason that I ask these questions is because I'm working on a
> prototype of a new N-dimensional Array module which I call Numeric 2.
> This new module will be much more extensible than the current Numeric.
> For example, new array types and universal functions can be loaded or
> imported on demand.  We also intend to implement a record (or
> C-structure) type, because 1-D arrays or lists of records are a common
> data structure for storing photon events in astronomy and related
> fields.

I'm not familiar with the use of computers in astronomy and related
fields, so I'll take your word for that! :-)

> The current Numeric does not handle record types efficiently,
> particularly when the data type is not aligned and is in non-native
> endian format.  To handle such data, temporary arrays must be created
> and alignment and byte-swapping done on them.  Numeric 2 does such
> pre- and post-processing inside the inner-most loop which is more
> efficient in both time and memory.  It also does type conversion at
> this level which is consistent with that proposed for PEP 208.
> 
> Since many scientific users would like direct access to the array data
> via C pointers, we have investigated using the buffer object.  We have
> not had much success with it, because of its implementation.  I have
> scanned the python-dev mailing list for discussions of this issue and
> found that it now appears to be deprecated.

Indeed.  I think it's best to leave the buffer object out of your
implementation plans.  There are several problems with it, and one of
the backburner projects is to redesign it to be much more to the point
(providing less, not more functionality).

> My opinion on this is that a new _fundamental_ built-in type should be
> created for memory allocation with features and an interface similar
> to the _mmap_ object.  I'll call this a _malloc_ object.  This would
> allow Numeric 2 to use either object interchangeably depending on the
> circumstance.  The _string_ type could also benefit from this new
> object by using a read-only version of it.  Since it's an object, its
> memory area should be safe from inadvertent deletion.

Interesting.  I'm actually not sufficiently familiar with mmap to
comment.  But would the existing array module's array object be at all
useful?  You can get to the raw bytes in C (using the C buffer API,
which is not deprecated) and it is extensible.

> Because of these and other new features in Numeric 2, I have a keen
> interest in the status of PEPs 207, 208, 211, 225, and 228; and also
> in the proposed buffer object.  

Here are some quick comments on the mentioned PEPs.

207: Rich Comparisons.  This will go into Python 2.1.  (I just
finished the first draft of the PEP, please read it and comment.)

208: Reworking the Coercion Model.  This will go into Python 2.1.
Neil Schemenauer has mostly finished the patches already.  Please
comment.

211: Adding New Linear Algebra Operators (Greg Wilson).  This is
unlikely to go into Python 2.1.  I don't like the idea much.  If you
disagree, please let me know!  (Also, a choice has to be made between
211 and 225; I don't want to accept both, so until 225 is rejected,
211 is in limbo.)

225: Elementwise/Objectwise Operators (Zhu, Lielens).  This will
definitely not go into Python 2.1.  It adds too many new operators.

228: Reworking Python's Numeric Model.  This is a total pie-in-the-sky
PEP, and this kind of change is not likely to happen before Python
3000.

> I'm willing to implement this new _malloc_ object if members of the
> python-dev list are in agreement.  Actually I see no alternative,
> given the current design of Numeric 2, since the Array class will
> initially be written completely in Python and will need a mutable
> memory buffer, while the _string_ type is meant to be a read-only
> object.

Would you be willing to take over authorship of PEP 209?  David Ascher
and the Numeric Python community will thank you.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Dec  8 19:43:39 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 13:43:39 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: Your message of "Thu, 30 Nov 2000 17:46:52 EST."
             <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> 
Message-ID: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>

After the last round of discussion, I was left with the idea that the
best thing we could do to help destructive iteration is to introduce a
{}.popitem() that returns an arbitrary (key, value) pair and deletes
it.  I wrote about this:

> > One more concern: if you repeatedly remove the *first* item, the hash
> > table will start looking lopsided.  Since we don't resize the hash
> > table on deletes, maybe picking an item at random (but not using an
> > expensive random generator!) would be better.

and Tim replied:

> Which is the reason SETL doesn't specify *which* set item is removed:  if
> you always start looking at "the front" of a dict that's being consumed, the
> dict fills with turds without shrinking, you skip over them again and again,
> and consuming the entire dict is still quadratic time.
> 
> Unfortunately, while using a random start point is almost always quicker
> than that, the expected time for consuming the whole dict remains quadratic.
> 
> The clearest way around that is to save a per-dict search finger, recording
> where the last search left off.  Start from its current value.  Failure if
> it wraps around.  This is linear time in non-pathological cases (a
> pathological case is one in which it isn't linear time <wink>).
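Tim's search finger can be modeled with a toy table containing deleted
slots (a sketch only, not the actual dict implementation):

```python
# A "hash table" with holes (None), consumed with a finger that
# remembers where the previous search stopped.
table = [None, "a", None, "b", "c", None, "d"]
finger = 0

def pop_with_finger():
    global finger
    n = len(table)
    for step in range(n):
        i = (finger + step) % n
        if table[i] is not None:
            finger = i + 1             # resume here next time
            item, table[i] = table[i], None
            return item
    raise KeyError("table is empty")   # wrapped around: failure

print([pop_with_finger() for _ in range(4)])   # ['a', 'b', 'c', 'd']
```

Because each call resumes where the last one stopped, consuming all N
items scans each slot a bounded number of times instead of rescanning
the same turds from the front on every call.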

I've implemented this, except I use a static variable for the finger
instead of a per-dict finger.  I'm concerned about adding 4-8 extra
bytes to each dict object for a feature that most dictionaries never
need.  So, instead, I use a single shared finger.  This works just as
well as long as it is used for a single dictionary.  For multiple
dictionaries (either used by the same thread or in different threads),
it'll work almost as well, although it's possible to make up a
pathological example that would behave quadratically.

An easy instance of such a pathological case is to call popitem()
for two identical dictionaries in lock step.
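For reference, {}.popitem() did land (in Python 2.1), so the
destructive-iteration pattern under discussion can be shown directly;
note that modern CPython pops the most recently inserted pair rather
than using a shared search finger:

```python
# Consume a dict destructively: each popitem() removes and returns
# one (key, value) pair until the dict is empty.
d = {"a": 1, "b": 2, "c": 3}
consumed = {}
while d:
    key, value = d.popitem()
    consumed[key] = value
print(sorted(consumed.items()))   # [('a', 1), ('b', 2), ('c', 3)]
print(d)                          # {}
```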

Comments please!  We could:

- Live with the pathological cases.

- Forget the whole thing; and then also forget about firstkey()
  etc. which has the same problem only worse.

- Fix the algorithm. Maybe jumping criss-cross through the hash table
  like lookdict does would improve that; but I don't understand the
  math used for that ("Cycle through GF(2^n)-{0}" ???).

I've placed a patch on SourceForge:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102733&group_id=5470

The algorithm is:

static PyObject *
dict_popitem(dictobject *mp, PyObject *args)
{
	static int finger = 0;
	int i;
	dictentry *ep;
	PyObject *res;

	if (!PyArg_NoArgs(args))
		return NULL;
	if (mp->ma_used == 0) {
		PyErr_SetString(PyExc_KeyError,
				"popitem(): dictionary is empty");
		return NULL;
	}
	i = finger;
	if (i >= mp->ma_size)
		ir = 0;
	while ((ep = &mp->ma_table[i])->me_value == NULL) {
		i++;
		if (i >= mp->ma_size)
			i = 0;
	}
	finger = i+1;
	res = PyTuple_New(2);
	if (res != NULL) {
		PyTuple_SET_ITEM(res, 0, ep->me_key);
		PyTuple_SET_ITEM(res, 1, ep->me_value);
		Py_INCREF(dummy);
		ep->me_key = dummy;
		ep->me_value = NULL;
		mp->ma_used--;
	}
	return res;
}

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Dec  8 19:51:49 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 13:51:49 -0500
Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use
Message-ID: <200012081851.NAA32254@cj20424-a.reston1.va.home.com>

Moshe proposes to add an overridable function sys.displayhook(obj)
which will be called by the interpreter for the PRINT_EXPR opcode,
instead of hardcoding the behavior.  The default implementation will
of course have the current behavior, but this makes it much simpler to
experiment with alternatives, e.g. using str() instead of repr() (or
to choose between str() and repr() based on the type).
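sys.displayhook was indeed added, so the experiment described above is
now a few lines; a sketch of a hook that prints str() instead of repr():

```python
import builtins
import sys

def str_displayhook(obj):
    # Mirror the default hook: skip None, remember the value in '_'.
    if obj is None:
        return
    builtins._ = obj
    sys.stdout.write(str(obj) + "\n")

sys.displayhook = str_displayhook
# At the interactive prompt, evaluating  'a\nb'  would now print two
# lines instead of the quoted repr.  Simulate what the REPL does:
sys.displayhook("a\nb")
```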

Moshe has asked me to pronounce on this PEP.  I've thought about it,
and I'm now all for it.  Moshe (or anyone else), please submit a patch
to SF that shows the complete implementation!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Fri Dec  8 20:06:50 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 8 Dec 2000 14:06:50 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEAJIDAA.tim.one@home.com>

[Guido, on sharing a search finger and getting worse-than-linear
 behavior in a simple test case]

See my reply on SourceForge (crossed in the mails).  I predict that fixing
this in an acceptable way (not bulletproof, but linear-time for all
predictably common cases) is a two-character change.

Surprise, although maybe I'm hallucinating (would someone please confirm?):
when I went to the SF patch manager page to look for your patch (using the
Open Patches view), I couldn't find it.  My guess is that if there are "too
many" patches to fit on one screen, then unlike the SF *bug* manager, you
don't get any indication that more patches exist or any control to go to the
next page.




From barry at digicool.com  Fri Dec  8 20:18:26 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Fri, 8 Dec 2000 14:18:26 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCIEAJIDAA.tim.one@home.com>
Message-ID: <14897.13314.469255.853298@anthem.concentric.net>

>>>>> "TP" == Tim Peters <tim.one at home.com> writes:

    TP> Surprise, although maybe I'm hallucinating (would someone
    TP> please confirm?): when I went to the SF patch manager page to
    TP> look for your patch (using the Open Patches view), I couldn't
    TP> find it.  My guess is that if there are "too many" patches to
    TP> fit on one screen, then unlike the SF *bug* manager, you don't
    TP> get any indication that more patches exist or any control to
    TP> go to the next page.

I haven't checked recently, but this was definitely true a few weeks
ago.  I think I even submitted an admin request on it, but I don't
remember for sure.

-Barry



From Barrett at stsci.edu  Fri Dec  8 22:22:39 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Fri,  8 Dec 2000 16:22:39 -0500 (EST)
Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays
In-Reply-To: <200012081610.LAA30679@cj20424-a.reston1.va.home.com>
References: <14896.1191.240597.632888@nem-srvr.stsci.edu>
	<200012081610.LAA30679@cj20424-a.reston1.va.home.com>
Message-ID: <14897.10309.686024.254701@nem-srvr.stsci.edu>

Guido van Rossum writes:
 > > What is the status of PEP 209?  I see David Ascher is the champion of
 > > this PEP, but nothing has been written up.  Is the intention of this
 > > PEP to make the current Numeric a built-in feature of Python or to
 > > re-implement and replace the current Numeric module?
 > 
 > David has already explained why his name is on it -- basically,
 > David's name is on several PEPs but he doesn't currently have any time
 > to work on these, so other volunteers are most welcome to join.
 > 
 > It is my understanding that the current Numeric is sufficiently messy
 > in implementation and controversial in semantics that it would not be
 > a good basis to start from.

That is our (Rick, Perry, and I) belief also.

 > However, I do think that a basic multi-dimensional array object would
 > be a welcome addition to core Python.

That's reassuring.

 > Indeed.  I think it's best to leave the buffer object out of your
 > implementation plans.  There are several problems with it, and one of
 > the backburner projects is to redesign it to be much more to the point
 > (providing less, not more functionality).

I agree and have already made the decision to leave it out.

 > > My opinion on this is that a new _fundamental_ built-in type should be
 > > created for memory allocation with features and an interface similar
 > > to the _mmap_ object.  I'll call this a _malloc_ object.  This would
 > > allow Numeric 2 to use either object interchangeably depending on the
 > > circumstance.  The _string_ type could also benefit from this new
 > > object by using a read-only version of it.  Since it's an object, its
 > > memory area should be safe from inadvertent deletion.
 > 
 > Interesting.  I'm actually not sufficiently familiar with mmap to
 > comment.  But would the existing array module's array object be at all
 > useful?  You can get to the raw bytes in C (using the C buffer API,
 > which is not deprecated) and it is extensible.

I tried using this but had problems.  I'll look into it again.

 > > Because of these and other new features in Numeric 2, I have a keen
 > > interest in the status of PEPs 207, 208, 211, 225, and 228; and also
 > > in the proposed buffer object.  
 > 
 > Here are some quick comments on the mentioned PEPs.

I've got these PEPs on my desk and will comment on them when I can.

 > > I'm willing to implement this new _malloc_ object if members of the
 > > python-dev list are in agreement.  Actually I see no alternative,
 > > given the current design of Numeric 2, since the Array class will
 > > initially be written completely in Python and will need a mutable
 > > memory buffer, while the _string_ type is meant to be a read-only
 > > object.
 > 
 > Would you be willing to take over authorship of PEP 209?  David Ascher
 > and the Numeric Python community will thank you.

Yes, I'd gladly wield vast and inconsiderate power over unsuspecting
pythoneers. ;-)

 -- Paul





From guido at python.org  Fri Dec  8 23:58:03 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 17:58:03 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Thu, 07 Dec 2000 12:54:51 EST."
             <200012071754.MAA26557@cj20424-a.reston1.va.home.com> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> 
Message-ID: <200012082258.RAA02389@cj20424-a.reston1.va.home.com>

Nobody seems to care much about the warnings PEP so far.  What's up?
Are you all too busy buying presents for the holidays?  Then get me
some too, please? :-)

>   http://python.sourceforge.net/peps/pep-0230.html

I've now produced a prototype implementation for the C code:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102715&group_id=5470

Issues:

- This defines a C API PyErr_Warn(category, message) instead of
  Py_Warn(message, category) as the PEP proposes.  I actually like
  this better: it's consistent with PyErr_SetString() etc. rather than
  with the Python warn(message[, category]) function.

- This calls the Python module from C.  We'll have to see if this is
  fast enough.  I wish I could postpone the import of warnings.py
  until the first call to PyErr_Warn(), but unfortunately the warning
  category classes must be initialized first (so they can be passed
  into PyErr_Warn()).  The current version of warnings.py imports
  rather a lot of other modules (e.g. re and getopt); this can be
  reduced by placing those imports inside the functions that use them.

- All the issues listed in the PEP.

Please comment!

BTW: somebody overwrote the PEP on SourceForge with an older version.
Please remember to do a "cvs update" before running "make install" in
the peps directory!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gstein at lyra.org  Sat Dec  9 00:26:51 2000
From: gstein at lyra.org (Greg Stein)
Date: Fri, 8 Dec 2000 15:26:51 -0800
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Dec 08, 2000 at 01:43:39PM -0500
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <20001208152651.H30644@lyra.org>

On Fri, Dec 08, 2000 at 01:43:39PM -0500, Guido van Rossum wrote:
>...
> Comments please!  We could:
> 
> - Live with the pathological cases.

I agree: live with it. The typical case will operate just fine.

> - Forget the whole thing; and then also forget about firstkey()
>   etc. which has the same problem only worse.

No opinion.

> - Fix the algorithm. Maybe jumping criss-cross through the hash table
>   like lookdict does would improve that; but I don't understand the
>   math used for that ("Cycle through GF(2^n)-{0}" ???).

No need. The keys were inserted randomly, so sequencing through is
effectively random. :-)

>...
> static PyObject *
> dict_popitem(dictobject *mp, PyObject *args)
> {
> 	static int finger = 0;
> 	int i;
> 	dictentry *ep;
> 	PyObject *res;
> 
> 	if (!PyArg_NoArgs(args))
> 		return NULL;
> 	if (mp->ma_used == 0) {
> 		PyErr_SetString(PyExc_KeyError,
> 				"popitem(): dictionary is empty");
> 		return NULL;
> 	}
> 	i = finger;
> 	if (i >= mp->ma_size)
> 		ir = 0;

Should be "i = 0"

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From tismer at tismer.com  Sat Dec  9 17:44:14 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sat, 09 Dec 2000 18:44:14 +0200
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <3A32615E.D39B68D2@tismer.com>


Guido van Rossum wrote:
> 
> After the last round of discussion, I was left with the idea that the
> best thing we could do to help destructive iteration is to introduce a
> {}.popitem() that returns an arbitrary (key, value) pair and deletes
> it.  I wrote about this:
> 
> > > One more concern: if you repeatedly remove the *first* item, the hash
> > > table will start looking lobsided.  Since we don't resize the hash
> > > table on deletes, maybe picking an item at random (but not using an
> > > expensive random generator!) would be better.
> 
> and Tim replied:
> 
> > Which is the reason SETL doesn't specify *which* set item is removed:  if
> > you always start looking at "the front" of a dict that's being consumed, the
> > dict fills with turds without shrinking, you skip over them again and again,
> > and consuming the entire dict is still quadratic time.
> >
> > Unfortunately, while using a random start point is almost always quicker
> > than that, the expected time for consuming the whole dict remains quadratic.
> >
> > The clearest way around that is to save a per-dict search finger, recording
> > where the last search left off.  Start from its current value.  Failure if
> > it wraps around.  This is linear time in non-pathological cases (a
> > pathological case is one in which it isn't linear time <wink>).
> 
> I've implemented this, except I use a static variable for the finger
> instead of a per-dict finger.  I'm concerned about adding 4-8 extra
> bytes to each dict object for a feature that most dictionaries never
> need.  So, instead, I use a single shared finger.  This works just as
> well as long as this is used for a single dictionary.  For multiple
> dictionaries (either used by the same thread or in different threads),
> it'll work almost as well, although it's possible to make up a
> pathological example that would work quadratically.
> 
> An easy instance of such a pathological case is to call popitem()
> for two identical dictionaries in lock step.
> 
> Comments please!  We could:
> 
> - Live with the pathological cases.
> 
> - Forget the whole thing; and then also forget about firstkey()
>   etc. which has the same problem only worse.
> 
> - Fix the algorithm. Maybe jumping criss-cross through the hash table
>   like lookdict does would improve that; but I don't understand the
>   math used for that ("Cycle through GF(2^n)-{0}" ???).

That algorithm is really a gem which you should know,
so let me try to explain it.


Intro: A little story about finite field theory (very basic).
-------------------------------------------------------------

For every prime p and every power p^n, there
exists a Galois Field ( GF(p^n) ), which is
a finite field.
The additive group is called "elementary Abelian";
it is commutative, and it looks a little like a
vector space, since addition works in cycles modulo p
in every coordinate.
The multiplicative group is cyclic, and it never
touches 0. Cyclic groups are generated by a single
primitive element. The powers of that element make up all the
other elements. For every element x of the multiplicative
group GF(p^n)* the equality x^(p^n - 1) == 1 holds.  A generator
element is therefore a primitive (p^n - 1)th root of unity.
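Christian's construction can be sketched in a few lines of Python: repeatedly
multiplying by x in GF(2^n), reducing by a primitive polynomial, visits every
non-zero element exactly once before the cycle closes.  (A sketch, not the
dict code itself; 0b1011, i.e. x^3 + x + 1, is a known primitive polynomial
for n = 3.)

```python
def gf_cycle(n, poly):
    """Enumerate GF(2**n) - {0} by repeated multiplication by x.

    Elements are bit patterns of polynomials over GF(2); `poly` must be
    a primitive polynomial of degree n for the powers of x to sweep
    every non-zero element.
    """
    x = 1
    for _ in range(2 ** n - 1):
        yield x
        x <<= 1                  # multiply by x
        if x & (1 << n):         # degree overflowed: reduce mod poly
            x ^= poly

# With the primitive polynomial x^3 + x + 1, the powers of x hit all
# seven non-zero elements of GF(2**3) before repeating:
elements = list(gf_cycle(3, 0b1011))
```

This cheap shift-and-xor is the trick the "Cycle through GF(2^n)-{0}" comment
in lookdict refers to: one step per probe, and every slot index is visited
exactly once per cycle.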


From nas at arctrix.com  Sat Dec  9 12:30:06 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Sat, 9 Dec 2000 03:30:06 -0800
Subject: [Python-Dev] PEP 208 and __coerce__
Message-ID: <20001209033006.A3737@glacier.fnational.com>

While working on the implementation of PEP 208, I discovered that
__coerce__ has some surprising properties.  Initially I
implemented __coerce__ so that the numeric operation currently
being performed was called on the values returned by __coerce__.
This caused test_class to blow up due to code like this:

    class Test:
        def __coerce__(self, other):
            return (self, other)

Python 2.0 "solves" this by not calling __coerce__ again if the
objects returned by __coerce__ are instances.  This has the
effect of making code like:

    class A:
        def __coerce__(self, other):
            return B(), other

    class B:
        def __coerce__(self, other):
            return 1, other

    A() + 1

fail to work in the expected way.  The question is: how should
__coerce__ work?  One option is to leave it working the way it does
in 2.0.  Alternatively, I could change it so that if coerce
returns (self, *) then __coerce__ is not called again.


  Neil
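Neil's alternative rule ("stop if coerce returns (self, *)") can be modeled
with a hypothetical helper; binary_op and its stopping test are illustrative
only, not the actual CPython dispatch:

```python
import operator

class A:
    def __coerce__(self, other):
        return B(), other

class B:
    def __coerce__(self, other):
        return 1, other

def binary_op(a, b, op):
    """Hypothetical dispatch: keep calling __coerce__ until it returns
    the left operand unchanged, then apply the operation."""
    while hasattr(a, '__coerce__'):
        coerced = a.__coerce__(b)
        if coerced is None:
            break
        new_a, b = coerced
        if new_a is a:          # (self, *) returned: stop coercing
            break
        a = new_a
    return op(a, b)

# Under this rule the chain A -> B -> 1 resolves, so A() + 1 works:
result = binary_op(A(), 1, operator.add)   # computes 1 + 1
```

Note that the Test class from test_class also terminates under this rule,
since its __coerce__ returns (self, other) on the first call.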



From mal at lemburg.com  Sat Dec  9 19:49:29 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 09 Dec 2000 19:49:29 +0100
Subject: [Python-Dev] PEP 208 and __coerce__
References: <20001209033006.A3737@glacier.fnational.com>
Message-ID: <3A327EB9.BD2CA3CC@lemburg.com>

Neil Schemenauer wrote:
> 
> While working on the implementation of PEP 208, I discovered that
> __coerce__ has some surprising properties.  Initially I
> implemented __coerce__ so that the numeric operation currently
> being performed was called on the values returned by __coerce__.
> This caused test_class to blow up due to code like this:
> 
>     class Test:
>         def __coerce__(self, other):
>             return (self, other)
> 
> Python 2.0 "solves" this by not calling __coerce__ again if the
> objects returned by __coerce__ are instances.  This has the
> effect of making code like:
> 
>     class A:
>         def __coerce__(self, other):
>             return B(), other
> 
>     class B:
>         def __coerce__(self, other):
>             return 1, other
> 
>     A() + 1
> 
> fail to work in the expected way.  The question is: how should
> __coerce__ work?  One option is to leave it working the way it does
> in 2.0.  Alternatively, I could change it so that if coerce
> returns (self, *) then __coerce__ is not called again.

+0 -- the idea behind PEP 208 is to get rid of the
centralized coercion mechanism, so fixing it to allow yet
more obscure variants should be carefully considered. 

I see __coerce__ et al. as old style mechanisms -- operator methods
have much more information available to do the right thing
than the single bottleneck __coerce__.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Sat Dec  9 21:49:04 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 9 Dec 2000 15:49:04 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECEIDAA.tim.one@home.com>

[Guido]
> ...
> I've implemented this, except I use a static variable for the finger
> instead of a per-dict finger.  I'm concerned about adding 4-8 extra
> bytes to each dict object for a feature that most dictionaries never
> need.

It's a bit ironic that dicts are guaranteed to be at least 1/3 wasted space
<wink>.  Let's pick on Christian's idea to reclaim a few bytes of that.

> So, instead, I use a single shared finger.  This works just as
> well as long as this is used for a single dictionary.  For multiple
> dictionaries (either used by the same thread or in different threads),
> it'll work almost as well, although it's possible to make up a
> pathological example that would work quadratically.
>
> An easy instance of such a pathological case is to call popitem()
> for two identical dictionaries in lock step.

Please see my later comments attached to the patch:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102733&group_id=5470

In short, for me (truly) identical dicts perform well with or without my
suggestion, while dicts cloned via dict.copy() perform horribly with or
without my suggestion (their internal structures differ); still curious as
to whether that's also true for you (am I looking at a Windows bug?  I don't
see how, but it's possible ...).  In any case, my suggestion turned out to
be worthless on my box.

Playing around via simulations suggests that a shared finger is going to be
disastrous when consuming more than one dict unless they have identical
internal structure (not just compare equal).  As soon as they get a little
out of synch, it just gets worse with each succeeding probe.

> Comments please!  We could:
>
> - Live with the pathological cases.

How boring <wink>.

> - Forget the whole thing; and then also forget about firstkey()
>   etc. which has the same problem only worse.

I don't know that this is an important idea for dicts in general (it is
important for sets) -- it's akin to an xrange for dicts.  But then I've had
more than one real-life program that built giant dicts then ran out of
memory trying to iterate over them!  I'd like to fix that.

> - Fix the algorithm. Maybe jumping criss-cross through the hash table
>   like lookdict does would improve that; but I don't understand the
>   math used for that ("Cycle through GF(2^n)-{0}" ???).

Christian explained that well (thanks!).

However, I still don't see any point to doing that business in .popitem():
when inserting keys, the jitterbug probe sequence has the crucial benefit of
preventing primary clustering when keys collide.  But when we consume a
dict, we just want to visit every slot as quickly as possible.

[Christian]
> Appendix, on the use of finger:
> -------------------------------
>
> Instead of using a global finger variable, you can do the
> following (involving a cast from object to int) :
>
> - if the 0'th slot of the dict is non-empty:
>   return this element and insert the dummy element
>   as key. Set the value field to the index the Dictionary
>   Algorithm would give for the removed object's hash. This is the
>   next finger.
> - else:
>   treat the value field of the 0'th slot as the last finger.
>   If it is zero, initialize it with 2^n-1.
>   Repetitively use the DA until you find an entry. Save
>   the finger in slot 0 again.
>
> This doesn't cost an extra slot, and even when the dictionary
> is written between removals, the chance to lose the finger
> is just 1:(2^n-1) on every insertion.

I like that, except:

1) As above, I don't believe the GF business buys anything over
   a straightforward search when consuming a dict.

2) Overloading the value field bristles with problems, in part
   because it breaks the invariant that a slot is unused if
   and only if the value field is NULL, in part because C
   doesn't guarantee that you can get away with casting an
   arbitrary int to a pointer and back again.

None of the problems in #2 arise if we abuse the me_hash field instead, so
the attached does that.  Here's a typical run of Guido's test case using
this (on an 866MHz machine w/ 256Mb RAM -- the early values jump all over
the place from run to run):

run = 0
log2size = 10 size = 1024
    7.4 usec per item to build (total 0.008 sec)
    3.4 usec per item to destroy twins (total 0.003 sec)
log2size = 11 size = 2048
    6.7 usec per item to build (total 0.014 sec)
    3.4 usec per item to destroy twins (total 0.007 sec)
log2size = 12 size = 4096
    7.0 usec per item to build (total 0.029 sec)
    3.7 usec per item to destroy twins (total 0.015 sec)
log2size = 13 size = 8192
    7.1 usec per item to build (total 0.058 sec)
    5.9 usec per item to destroy twins (total 0.048 sec)
log2size = 14 size = 16384
    14.7 usec per item to build (total 0.241 sec)
    6.4 usec per item to destroy twins (total 0.105 sec)
log2size = 15 size = 32768
    12.2 usec per item to build (total 0.401 sec)
    3.9 usec per item to destroy twins (total 0.128 sec)
log2size = 16 size = 65536
    7.8 usec per item to build (total 0.509 sec)
    4.0 usec per item to destroy twins (total 0.265 sec)
log2size = 17 size = 131072
    7.9 usec per item to build (total 1.031 sec)
    4.1 usec per item to destroy twins (total 0.543 sec)

The last one is over 100 usec per item using the original patch (with or
without my first suggestion).

if-i-were-a-betting-man-i'd-say-"bingo"-ly y'rs  - tim


Drop-in replacement for the popitem in the patch:

static PyObject *
dict_popitem(dictobject *mp, PyObject *args)
{
	int i = 0;
	dictentry *ep;
	PyObject *res;

	if (!PyArg_NoArgs(args))
		return NULL;
	if (mp->ma_used == 0) {
		PyErr_SetString(PyExc_KeyError,
				"popitem(): dictionary is empty");
		return NULL;
	}
	/* Set ep to "the first" dict entry with a value.  We abuse the hash
	 * field of slot 0 to hold a search finger:
	 * If slot 0 has a value, use slot 0.
	 * Else slot 0 is being used to hold a search finger,
	 * and we use its hash value as the first index to look.
	 */
	ep = &mp->ma_table[0];
	if (ep->me_value == NULL) {
		i = (int)ep->me_hash;
		/* The hash field may be uninitialized trash, or it
		 * may be a real hash value, or it may be a legit
		 * search finger, or it may be a once-legit search
		 * finger that's out of bounds now because it
		 * wrapped around or the table shrunk -- simply
		 * make sure it's in bounds now.
		 */
		if (i >= mp->ma_size || i < 1)
			i = 1;	/* skip slot 0 */
		while ((ep = &mp->ma_table[i])->me_value == NULL) {
			i++;
			if (i >= mp->ma_size)
				i = 1;
		}
	}
	res = PyTuple_New(2);
	if (res != NULL) {
		PyTuple_SET_ITEM(res, 0, ep->me_key);
		PyTuple_SET_ITEM(res, 1, ep->me_value);
		Py_INCREF(dummy);
		ep->me_key = dummy;
		ep->me_value = NULL;
		mp->ma_used--;
	}
	assert(mp->ma_table[0].me_value == NULL);
	mp->ma_table[0].me_hash = i + 1;  /* next place to start */
	return res;
}
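The finger-driven scan above can be modeled in Python.  This is a toy sketch
over a plain list of slots, not the C dict internals: None marks an empty or
dummy slot, the finger is passed around explicitly rather than stashed in
slot 0, and at least one slot is assumed to be filled.

```python
def popitem_model(slots, finger):
    """Pop "the first" filled slot at or after `finger`, wrapping
    around; return (key, value, next_finger).

    Models the search-finger idea in the patch, minus the slot-0
    trickery.  Assumes `slots` contains at least one filled slot.
    """
    n = len(slots)
    i = finger if 0 <= finger < n else 0   # stale fingers are harmless
    while slots[i] is None:
        i += 1
        if i >= n:
            i = 0
    key, value = slots[i]
    slots[i] = None             # leave a dummy behind
    return key, value, i + 1    # resume just past the popped slot

# Consuming the whole table this way is linear in the typical case:
# each pop resumes where the previous one stopped.
slots = [None, ('a', 1), None, ('b', 2)]
finger = 0
k1, v1, finger = popitem_model(slots, finger)
k2, v2, finger = popitem_model(slots, finger)
```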




From tim.one at home.com  Sat Dec  9 22:09:30 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 9 Dec 2000 16:09:30 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEECEIDAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECEIDAA.tim.one@home.com>

> 	assert(mp->ma_table[0].me_value == NULL);
> 	mp->ma_table[0].me_hash = i + 1;  /* next place to start */

Ack, those two lines should move up into the "if (res != NULL)" block.

errors-are-error-prone-ly y'rs  - tim




From gvwilson at nevex.com  Sun Dec 10 17:11:09 2000
From: gvwilson at nevex.com (Greg Wilson)
Date: Sun, 10 Dec 2000 11:11:09 -0500
Subject: [Python-Dev] re: So You Want to Write About Python?
Message-ID: <NEBBIACCCGNFLECLCLMHCEFLCBAA.gvwilson@nevex.com>

Hi, folks.  Jon Erickson (Doctor Dobb's Journal), Frank Willison (O'Reilly),
and I (professional loose cannon) are doing a workshop at IPC on writing
books and magazine articles about Python.  It would be great to have a few
articles (in various stages of their lives) and/or book proposals from
people on this list to use as examples.  So, if you think the world oughta
know about the things you're doing, and would like to use this to help get
yourself motivated to start writing, please drop me a line.  I'm
particularly interested in:

- the real-world issues involved in moving to Unicode

- non-trivial XML processing using SAX and DOM (where "non-trivial" means
  "including namespaces, entity references, error handling, and all that")

- the theory and practice of stackless, generators, and continuations

- the real-world tradeoffs between the various memory management schemes
  that are now available for Python

- feature comparisons of various Foobars that can be used with Python (where
  "Foobar" could be "GUI toolkit", "IDE", "web scripting toolkit", or just
  about anything else)

- performance analysis and tuning of Python itself (as an example of how you
  speed up real applications --- this is something that matters a lot in the
  real world, but tends to get forgotten in school)

- just about anything else that you wish someone had written for you before
  you started your last big project

Thanks,
Greg




From paul at prescod.net  Sun Dec 10 19:02:27 2000
From: paul at prescod.net (Paul Prescod)
Date: Sun, 10 Dec 2000 10:02:27 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com>
Message-ID: <3A33C533.ABA27C7C@prescod.net>

Guido van Rossum wrote:
> 
> Nobody seems to care much about the warnings PEP so far.  What's up?
> Are you all too busy buying presents for the holidays?  Then get me
> some too, please? :-)

My opinions:

 * it should be a built-in or keyword, not a function in "sys". Warning
is supposed to be as easy as possible so people will do it often.
<irrelevant_aside>sys.argv and sys.stdout annoy me as it is.</>

 * the term "level" applied to warnings typically means "warning level"
as in -W1 -W2 -Wall. Proposal: call it "stacklevel" or something.

 * this level idea gives rise to another question. What if I want to see
the full stack context of a warning? Do I have to implement a whole new
warning output hook? It seems like I should be able to specify this as a
command line option alongside the action.

 * I prefer ":*:*:" to ":::" for leaving parts of the warning spec out.

 * should there be a sys.formatwarning? What if I want to redirect
warnings to a socket -- I'd like to use the standard formatting
machinery. Or vice versa, I might want to change the formatting but not
override the destination.

 * there should be a "RuntimeWarning" -- base category for warnings
about dubious runtime behaviors (e.g. integer division truncated value)

 * it should be possible to strip warnings as an optimization step. That
may require interpreter and syntax support.

 * warnings will usually be tied to tests which the user will want to be
able to optimize out also. (e.g. if __debug__ and type(foo)==StringType:
warn "Should be Unicode!")


I propose:

	>>> warn conditional, message[, category]

to be very parallel with 

	>>> assert conditional, message

I'm not proposing the use of the assert keyword anymore, but I am trying
to reuse the syntax for familiarity. Perhaps -Wstrip would strip
warnings out of the bytecode.

 Paul Prescod



From nas at arctrix.com  Sun Dec 10 14:46:46 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 10 Dec 2000 05:46:46 -0800
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
Message-ID: <20001210054646.A5219@glacier.fnational.com>

SourceForge uploads are not working.  The latest version of the
patch for PEP 208 is here:

    http://arctrix.com/nas/python/coerce-6.0.diff

Operations on instances now call __coerce__ if it exists.  I
think the patch is now complete.  Converting other builtin types
to "new style numbers" can be done with a separate patch.

  Neil



From guido at python.org  Sun Dec 10 23:17:08 2000
From: guido at python.org (Guido van Rossum)
Date: Sun, 10 Dec 2000 17:17:08 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Sun, 10 Dec 2000 10:02:27 PST."
             <3A33C533.ABA27C7C@prescod.net> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com>  
            <3A33C533.ABA27C7C@prescod.net> 
Message-ID: <200012102217.RAA12550@cj20424-a.reston1.va.home.com>

> My opinions:
> 
>  * it should be a built-in or keyword, not a function in "sys". Warning
> is supposed to be as easy as possible so people will do it often.

Disagree.  Warnings are there mostly for the Python system to warn the
Python programmer.  The most heavy use will come from the standard
library, not from user code.

> <irrelevant_aside>sys.argv and sys.stdout annoy me as it is.</>

Too bad.

>  * the term "level" applied to warnings typically means "warning level"
> as in -W1 -W2 -Wall. Proposal: call it "stacklevel" or something.

Good point.

>  * this level idea gives rise to another question. What if I want to see
> the full stack context of a warning? Do I have to implement a whole new
> warning output hook? It seems like I should be able to specify this as a
> command line option alongside the action.

Turn warnings into errors and you'll get a full traceback.  If you
really want a full traceback without exiting, some creative use of
sys._getframe() and the traceback module will probably suit you well.

>  * I prefer ":*:*:" to ":::" for leaving parts of the warning spec out.

I don't.

>  * should there be a sys.formatwarning? What if I want to redirect
> warnings to a socket -- I'd like to use the standard formatting
> machinery. Or vice versa, I might want to change the formatting but not
> override the destination.

Good point.  I'm changing this to:

def showwarning(message, category, filename, lineno, file=None):
    """Hook to frite a warning to a file; replace if you like."""

and

def formatwarning(message, category, filename, lineno):
    """Hook to format a warning the standard way."""

>  * there should be a "RuntimeWarning" -- base category for warnings
> about dubious runtime behaviors (e.g. integer division truncated value)

OK.

>  * it should be possible to strip warnings as an optimization step. That
> may require interpreter and syntax support.

I don't see the point of this.  I think this comes from our different
views on who should issue warnings.

>  * warnings will usually be tied to tests which the user will want to be
> able to optimize out also. (e.g. if __debug__ and type(foo)==StringType:
> warn "Should be Unicode!")
> 
> I propose:
> 
> 	>>> warn conditional, message[, category]

Sorry, this is not worth a new keyword.

> to be very parallel with 
> 
> 	>>> assert conditional, message
> 
> I'm not proposing the use of the assert keyword anymore, but I am trying
> to reuse the syntax for familiarity. Perhaps -Wstrip would strip
> warnings out of the bytecode.

Why?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Mon Dec 11 01:16:25 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 11 Dec 2000 01:16:25 +0100
Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use
References: <200012081851.NAA32254@cj20424-a.reston1.va.home.com>
Message-ID: <000901c06307$9a814d60$3c6340d5@hagrid>

Guido wrote:
> Moshe proposes to add an overridable function sys.displayhook(obj)
> which will be called by the interpreter for the PRINT_EXPR opcode,
> instead of hardcoding the behavior.  The default implementation will
> of course have the current behavior, but this makes it much simpler to
> experiment with alternatives, e.g. using str() instead of repr() (or
> to choose between str() and repr() based on the type).

hmm.  instead of patching here and there, what's stopping us
from doing it the right way?  I'd prefer something like:

    import code

    class myCLI(code.InteractiveConsole):
        def displayhook(self, data):
            # non-standard display hook
            print str(data)

    sys.setcli(myCLI())

(in other words, why not move the *entire* command line interface
over to Python code)

</F>




From guido at python.org  Mon Dec 11 03:24:20 2000
From: guido at python.org (Guido van Rossum)
Date: Sun, 10 Dec 2000 21:24:20 -0500
Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use
In-Reply-To: Your message of "Mon, 11 Dec 2000 01:16:25 +0100."
             <000901c06307$9a814d60$3c6340d5@hagrid> 
References: <200012081851.NAA32254@cj20424-a.reston1.va.home.com>  
            <000901c06307$9a814d60$3c6340d5@hagrid> 
Message-ID: <200012110224.VAA12844@cj20424-a.reston1.va.home.com>

> Guido wrote:
> > Moshe proposes to add an overridable function sys.displayhook(obj)
> > which will be called by the interpreter for the PRINT_EXPR opcode,
> > instead of hardcoding the behavior.  The default implementation will
> > of course have the current behavior, but this makes it much simpler to
> > experiment with alternatives, e.g. using str() instead of repr() (or
> > to choose between str() and repr() based on the type).

Effbot regurgitates:
> hmm.  instead of patching here and there, what's stopping us
> from doing it the right way?  I'd prefer something like:
> 
>     import code
> 
>     class myCLI(code.InteractiveConsole):
>         def displayhook(self, data):
>             # non-standard display hook
>             print str(data)
> 
>     sys.setcli(myCLI())
> 
> (in other words, why not move the *entire* command line interface
> over to Python code)

Indeed, this is why I've been hesitant to bless Moshe's hack.  I
finally decided to go for it because I don't see this redesign of the
CLI happening anytime soon.  In order to do it right, it would require
a redesign of the parser input handling, which is probably the oldest
code in Python (short of the long integer math, which predates Python
by several years).  The current code module is a hack, alas, and
doesn't always get it right the same way as the *real* CLI does
things.

So, rather than wait forever for the perfect solution, I think it's
okay to settle for less sooner.  "Now is better than never."

--Guido van Rossum (home page: http://www.python.org/~guido/)
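For reference, overriding Moshe's hook looks like this once it exists (a
sketch against the sys.displayhook name the PEP proposes; the default hook
also binds the result to _, which this version preserves):

```python
import builtins
import sys

def display_str(value):
    """Echo interactive results with str() instead of repr()."""
    if value is None:           # mirror the default hook: ignore None
        return
    builtins._ = value          # keep the _ convention
    print(str(value))

sys.displayhook = display_str
# At the interactive prompt, evaluating "a\nb" would now print the two
# raw lines rather than the quoted repr 'a\nb'.
```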



From paulp at ActiveState.com  Mon Dec 11 07:59:29 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 10 Dec 2000 22:59:29 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com>  
	            <3A33C533.ABA27C7C@prescod.net> <200012102217.RAA12550@cj20424-a.reston1.va.home.com>
Message-ID: <3A347B51.ADB3F12C@ActiveState.com>

Guido van Rossum wrote:
> 
>...
> 
> Disagree.  Warnings are there mostly for the Python system to warn the
> Python programmer.  The most heavy use will come from the standard
> library, not from user code.

Most Python code is part of some library or another. It may not be the
standard library but it's still a library. Perl and Java both make
warnings (especially about deprecation) very easy *for user code*.

> >  * it should be possible to strip warnings as an optimization step. That
> > may require interpreter and syntax support.
> 
> I don't see the point of this.  I think this comes from our different
> views on who should issue warnings.

Everyone who creates a reusable library will want to issue warnings.
That is to say, most serious Python programmers.

Anyhow, let's presume that it is only the standard library that issues
warnings (for argument's sake). What if I have a speed-critical module
that triggers warnings in an inner loop? Turning off the warning doesn't
turn off the overhead of the warning infrastructure. I should be able to
turn off the overhead easily -- ideally from the Python command line.
And I still feel that part of that "overhead" is in the code that tests
to determine whether to issue the warnings. There should be a way to
turn off that overhead also.

 Paul



From paulp at ActiveState.com  Mon Dec 11 08:23:17 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 10 Dec 2000 23:23:17 -0800
Subject: [Python-Dev] Online help PEP
Message-ID: <3A3480E5.C2577AE6@ActiveState.com>

PEP: ???
Title: Python Online Help
Version: $Revision: 1.0 $
Author: paul at prescod.net, paulp at activestate.com (Paul Prescod)
Status: Draft
Type: Standards Track
Python-Version: 2.1
Status: Incomplete

Abstract

    This PEP describes a command-line driven online help facility
    for Python. The facility should be able to build on existing
    documentation facilities such as the Python documentation 
    and docstrings. It should also be extensible for new types and
    modules.

Interactive use:

    Simply typing "help" describes the help function (through repr 
    overloading).

    "help" can also be used as a function:

    The function takes the following forms of input:

        help( "string" ) -- built-in topic or global
        help( <ob> ) -- docstring from object or type
        help( "doc:filename" ) -- filename from Python documentation

    If you ask for a global, it can be a fully-qualified name such as 
    help("xml.dom").

    You can also use the facility from a command-line

    python --help if

    In either situation, the output does paging similar to the "more"
    command. 

Implementation

    The help function is implemented in an onlinehelp module which is
    demand-loaded.

    There should be options for fetching help information from
    environments other than the command line through the onlinehelp
    module:

        onlinehelp.gethelp(object_or_string) -> string

    It should also be possible to override the help display function by
    assigning to onlinehelp.displayhelp(object_or_string).

    The module should be able to extract module information from either 
    the HTML or LaTeX versions of the Python documentation. Links should
    be accommodated in a "lynx-like" manner. 

    Over time, it should also be able to recognize when docstrings are
    in "special" syntaxes like structured text, HTML and LaTeX, and
    decode them appropriately.

    A prototype implementation is available with the Python source 
    distribution as nondist/sandbox/doctools/onlinehelp.py.

Built-in Topics

    help( "intro" )  - What is Python? Read this first!
    help( "keywords" )  - What are the keywords?
    help( "syntax" )  - What is the overall syntax?
    help( "operators" )  - What operators are available?
    help( "builtins" )  - What functions, types, etc. are built-in?
    help( "modules" )  - What modules are in the standard library?
    help( "copyright" )  - Who owns Python?
    help( "moreinfo" )  - Where is there more information?
    help( "changes" )  - What changed in Python 2.0?
    help( "extensions" )  - What extensions are installed?
    help( "faq" )  - What questions are frequently asked?
    help( "ack" )  - Who has done work on Python lately?

Security Issues

    This module will attempt to import modules with the same names as
    requested topics. Don't use the modules if you are not confident
    that everything in your pythonpath is from a trusted source.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:



From tim.one at home.com  Mon Dec 11 08:36:57 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 11 Dec 2000 02:36:57 -0500
Subject: [Python-Dev] FW: [Python-Help] indentation
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDPIDAA.tim.one@home.com>

While we're talking about pluggable CLIs, I share this fellow's confusion
over IDLE's CLI variant:  block code doesn't "look right" under IDLE because
sys.ps2 doesn't exist under IDLE.  Some days you can't make *anybody* happy
<wink>.

-----Original Message-----
...

Subject: [Python-Help] indentation
Sent: Sunday, December 10, 2000 7:32 AM

...

My problem has to do with indentation:

I typed the following into IDLE:

>>> if not 1:
	print 'Hallo'
	else:

SyntaxError: invalid syntax

I get the Message above.

I know that else must be 4 spaces to the left, but idle doesn't let me do
this.

My only alternative is to move it to the leftmost column.  But then I
disturb the block structure and get the error message again.

I want to have it like this:

>>> if not 1:
	print 'Hallo'
    else:

Can you help me?

...




From fredrik at pythonware.com  Mon Dec 11 12:36:53 2000
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 11 Dec 2000 12:36:53 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com>
Message-ID: <033701c06366$ab746580$0900a8c0@SPIFF>

christian wrote:
> That algorithm is really a gem which you should know,
> so let me try to explain it.

I think someone just won the "brain exploder 2000" award ;-)

to paraphrase Bertrand Russell,

    "Mathematics may be defined as the subject where I never
    know what you are talking about, nor whether what you are
    saying is true"

cheers /F




From thomas at xs4all.net  Mon Dec 11 13:12:09 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 11 Dec 2000 13:12:09 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <033701c06366$ab746580$0900a8c0@SPIFF>; from fredrik@pythonware.com on Mon, Dec 11, 2000 at 12:36:53PM +0100
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <20001211131208.G4396@xs4all.nl>

On Mon, Dec 11, 2000 at 12:36:53PM +0100, Fredrik Lundh wrote:
> christian wrote:
> > That algorithm is really a gem which you should know,
> > so let me try to explain it.

> I think someone just won the "brain exploder 2000" award ;-)

By acclamation, I'd expect. I know it was the best laugh I had since last
week's Have I Got News For You, even though trying to understand it made me
glad I had boring meetings to recuperate in ;)

Highschool-dropout-ly y'rs,

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mal at lemburg.com  Mon Dec 11 13:33:18 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Dec 2000 13:33:18 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <3A34C98E.7C42FD24@lemburg.com>

Fredrik Lundh wrote:
> 
> christian wrote:
> > That algorithm is really a gem which you should know,
> > so let me try to explain it.
> 
> I think someone just won the "brain exploder 2000" award ;-)
> 
> to paraphrase Bertrand Russell,
> 
>     "Mathematics may be defined as the subject where I never
>     know what you are talking about, nor whether what you are
>     saying is true"

Hmm, I must have missed that one... care to repost ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tismer at tismer.com  Mon Dec 11 14:49:48 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 11 Dec 2000 15:49:48 +0200
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <3A34DB7C.FF7E82CE@tismer.com>


Fredrik Lundh wrote:
> 
> christian wrote:
> > That algorithm is really a gem which you should know,
> > so let me try to explain it.
> 
> I think someone just won the "brain exploder 2000" award ;-)
> 
> to paraphrase Bertrand Russell,
> 
>     "Mathematics may be defined as the subject where I never
>     know what you are talking about, nor whether what you are
>     saying is true"

:-))

Well, I was primarily targeting Guido, who said that he
came from math, and one cannot study math without passing
a basic algebra course, I think. I tried my best to explain
it for those who know at least how groups, fields, rings
and automorphisms work. Going into more details of the
theory would be off-topic for python-dev, but I will try
it in an upcoming DDJ article.

As you might have guessed, I didn't do this just for fun.
It is the old game of explaining what is there, convincing
everybody that you at least know what you are talking about,
and then three days later coming up with an improved
application of the theory.

Today is Monday, 2 days left. :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From guido at python.org  Mon Dec 11 16:12:24 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 10:12:24 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: Your message of "Mon, 11 Dec 2000 15:49:48 +0200."
             <3A34DB7C.FF7E82CE@tismer.com> 
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>  
            <3A34DB7C.FF7E82CE@tismer.com> 
Message-ID: <200012111512.KAA23622@cj20424-a.reston1.va.home.com>

> Fredrik Lundh wrote:
> > 
> > christian wrote:
> > > That algorithm is really a gem which you should know,
> > > so let me try to explain it.
> > 
> > I think someone just won the "brain exploder 2000" award ;-)
> > 
> > to paraphrase Bertrand Russell,
> > 
> >     "Mathematics may be defined as the subject where I never
> >     know what you are talking about, nor whether what you are
> >     saying is true"
> 
> :-))
> 
> Well, I was primarily targeting Guido, who said that he
> came from math, and one cannot study math without passing
> a basic algebra course, I think. I tried my best to explain
> it for those who know at least how groups, fields, rings
> and automorphisms work. Going into more details of the
> theory would be off-topic for python-dev, but I will try
> it in an upcoming DDJ article.

I do have a math degree, but it is 18 years old and I had to give up
after the first paragraph of your explanation.  It made me vividly
recall the first and only class on Galois Theory that I ever took --
after one hour I realized that this was not for me and I didn't have a
math brain after all.  I went back to the basement where the software
development lab was (i.e. a row of card punches :-).

> As you might have guessed, I didn't do this just for fun.
> It is the old game of explaining what is there, convincing
> everybody that you at least know what you are talking about,
> and then three days later coming up with an improved
> application of the theory.
> 
> Today is Monday, 2 days left. :-)

I'm very impressed.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Dec 11 16:15:02 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 10:15:02 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Sun, 10 Dec 2000 22:59:29 PST."
             <3A347B51.ADB3F12C@ActiveState.com> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <200012102217.RAA12550@cj20424-a.reston1.va.home.com>  
            <3A347B51.ADB3F12C@ActiveState.com> 
Message-ID: <200012111515.KAA23764@cj20424-a.reston1.va.home.com>

[me]
> > Disagree.  Warnings are there mostly for the Python system to warn the
> > Python programmer.  The most heavy use will come from the standard
> > library, not from user code.

[Paul Prescod]
> Most Python code is part of some library or another. It may not be the
> standard library but its still a library. Perl and Java both make
> warnings (especially about deprecation) very easy *for user code*.

Hey.  I'm not making it impossible to use warnings.  I'm making it
very easy.  All you have to do is put "from warnings import warn" at
the top of your library module.  Requesting a built-in or even a new
statement is simply excessive.
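Guido's point is easy to make concrete. A minimal sketch of a library module warning as he describes; `old_api` and `new_api` are invented names, not anything from the standard library:

```python
import warnings

def new_api(x):
    return x * 2

def old_api(x):
    # Deprecated entry point kept for backwards compatibility.
    # stacklevel=2 makes the warning point at the *caller's* line.
    warnings.warn("old_api() is deprecated; use new_api() instead",
                  DeprecationWarning, stacklevel=2)
    return new_api(x)
```

The library only declares the warning; whether it prints, raises, or is silently dropped is decided by the user's warning filters.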

> > >  * it should be possible to strip warnings as an optimization step. That
> > > may require interpreter and syntax support.
> > 
> > I don't see the point of this.  I think this comes from our different
> > views on who should issue warnings.
> 
> Everyone who creates a reusable library will want to issue warnings.
> That is to say, most serious Python programmers.
> 
> Anyhow, let's presume that it is only the standard library that issues
> warnings (for arguments sake). What if I have a speed-critical module
> that triggers warnings in an inner loop. Turning off the warning doesn't
> turn off the overhead of the warning infrastructure. I should be able to
> turn off the overhead easily -- ideally from the Python command line.
> And I still feel that part of that "overhead" is in the code that tests
> to determine whether to issue the warnings. There should be a way to
> turn off that overhead also.

So rewrite your code so that it doesn't trigger the warning.  When you
get a warning, you're doing something that could be done in a better
way.  So don't whine about the performance.

It's a quality of implementation issue whether C code that tests for
issues that deserve warnings can do the test without slowing down code
that doesn't deserve a warning.  Ditto for standard library code.

Here's an example.  I expect there will eventually (not in 2.1 yet!)
warnings in the deprecated string module.  If you get such a warning
in a time-critical piece of code, the solution is to use string
methods -- not to whine about the performance of the backwards
compatibility code.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Mon Dec 11 17:02:29 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 11 Dec 2000 11:02:29 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
	<200012082258.RAA02389@cj20424-a.reston1.va.home.com>
	<3A33C533.ABA27C7C@prescod.net>
Message-ID: <14900.64149.910989.998348@anthem.concentric.net>

Some of my thoughts after reading the PEP and Paul/Guido's exchange.

- A function in the warn module is better than one in the sys module.
  "from warnings import warn" is good enough to not warrant a
  built-in.  I get the sense that the PEP description is behind
  Guido's current implementation here.

- When PyErr_Warn() returns 1, does that mean a warning has been
  transmuted into an exception, or some other exception occurred
  during the setting of the warning?  (I think I know, but the PEP
  could be clearer here).

- It would be nice if lineno can be a range specification.  Other
  matches are based on regexps -- think of this as a line number
  regexp.

- Why not do setupwarnings() in site.py?

- Regexp matching on messages should be case insensitive.

- The second argument to sys.warn() or PyErr_Warn() can be any class,
  right?  If so, it's easy for me to have my own warning classes.
  What if I want to set up my own warnings filters?  Maybe if `action'
  could be a callable as well as a string.  Then in my IDE, I could
  set that to "mygui.popupWarningsDialog".

-Barry



From guido at python.org  Mon Dec 11 16:57:33 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 10:57:33 -0500
Subject: [Python-Dev] Online help PEP
In-Reply-To: Your message of "Sun, 10 Dec 2000 23:23:17 PST."
             <3A3480E5.C2577AE6@ActiveState.com> 
References: <3A3480E5.C2577AE6@ActiveState.com> 
Message-ID: <200012111557.KAA24266@cj20424-a.reston1.va.home.com>

I approve of the general idea.  Barry, please assign a PEP number.

> PEP: ???
> Title: Python Online Help
> Version: $Revision: 1.0 $
> Author: paul at prescod.net, paulp at activestate.com (Paul Prescod)
> Status: Draft
> Type: Standards Track
> Python-Version: 2.1
> Status: Incomplete
> 
> Abstract
> 
>     This PEP describes a command-line driven online help facility
>     for Python. The facility should be able to build on existing
>     documentation facilities such as the Python documentation 
>     and docstrings. It should also be extensible for new types and
>     modules.
> 
> Interactive use:
> 
>     Simply typing "help" describes the help function (through repr 
>     overloading).

Cute -- like license, copyright, credits I suppose.

>     "help" can also be used as a function:
> 
>     The function takes the following forms of input:
> 
>         help( "string" ) -- built-in topic or global

Why does a global require string quotes?

>         help( <ob> ) -- docstring from object or type
>         help( "doc:filename" ) -- filename from Python documentation

I'm missing

          help() -- table of contents

I'm not sure if the table of contents should be printed by the repr
output.

>     If you ask for a global, it can be a fully-qualified name such as 
>     help("xml.dom").

Why are the string quotes needed?  When are they useful?

>     You can also use the facility from a command-line
> 
>     python --help if

Is this really useful?  Sounds like Perlism to me.

>     In either situation, the output does paging similar to the "more"
>     command. 

Agreed.  But how to implement paging in a platform-dependent manner?
On Unix, os.system("more") or "$PAGER" is likely to work.  On Windows,
I suppose we could use its MORE, although that's pretty braindead.  On
the Mac?  Also, inside IDLE or Pythonwin, invoking the system pager
isn't a good idea.
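One portable (if primitive) answer is to do the paging in pure Python rather than shelling out to "more". A sketch with invented names and injectable I/O hooks, so an environment like IDLE could substitute its own; this is not what the prototype onlinehelp.py does:

```python
import sys

def page(text, pagesize=21, prompt=input, write=None):
    """Print text one screenful at a time; 'q' at the prompt stops."""
    if write is None:
        write = sys.stdout.write
    lines = text.splitlines()
    for start in range(0, len(lines), pagesize):
        for line in lines[start:start + pagesize]:
            write(line + "\n")
        # Only prompt if there is more text left to show.
        if start + pagesize < len(lines):
            if prompt("-- more --").strip().lower() == "q":
                break
```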

> Implementation
> 
>     The help function is implemented in an onlinehelp module which is
>     demand-loaded.

What does "demand-loaded" mean in a Python context?

>     There should be options for fetching help information from
>     environments other than the command line through the onlinehelp
>     module:
> 
>         onlinehelp.gethelp(object_or_string) -> string

Good idea.

>     It should also be possible to override the help display function by
>     assigning to onlinehelp.displayhelp(object_or_string).

Good idea.  Pythonwin and IDLE could use this.  But I'd like it to
work at least "okay" if they don't.

>     The module should be able to extract module information from either 
>     the HTML or LaTeX versions of the Python documentation. Links should
>     be accommodated in a "lynx-like" manner. 

I think this is beyond the scope.  The LaTeX isn't installed anywhere
(and processing would be too much work).  The HTML is installed only
on Windows, where there already is a way to get it to pop up in your
browser (actually two: it's in the Start menu, and also in IDLE's Help
menu).

>     Over time, it should also be able to recognize when docstrings are 
>     in "special" syntaxes like structured text, HTML and LaTeX and
>     decode them appropriately.

A standard syntax for docstrings is under development, PEP 216.  I
don't agree with the proposal there, but in any case the help PEP
should not attempt to legalize a different format than PEP 216.

>     A prototype implementation is available with the Python source 
>     distribution as nondist/sandbox/doctools/onlinehelp.py.

Neat.  I noticed that in a 24-line screen, the pagesize must be set to
21 to avoid stuff scrolling off the screen.  Maybe there's an off-by-3
error somewhere?

I also noticed that it always prints '1' when invoked as a function.
The new license pager in site.py avoids this problem.

help("operators") and several others raise an
AttributeError('handledocrl').

The "lynx-line links" don't work.

> Built-in Topics
> 
>     help( "intro" )  - What is Python? Read this first!
>     help( "keywords" )  - What are the keywords?
>     help( "syntax" )  - What is the overall syntax?
>     help( "operators" )  - What operators are available?
>     help( "builtins" )  - What functions, types, etc. are built-in?
>     help( "modules" )  - What modules are in the standard library?
>     help( "copyright" )  - Who owns Python?
>     help( "moreinfo" )  - Where is there more information?
>     help( "changes" )  - What changed in Python 2.0?
>     help( "extensions" )  - What extensions are installed?
>     help( "faq" )  - What questions are frequently asked?
>     help( "ack" )  - Who has done work on Python lately?

I think it's naive to expect this help facility to replace browsing
the website or the full documentation package.  There should be one
entry that says to point your browser there (giving the local
filesystem URL if available), and that's it.  The rest of the online
help facility should be concerned with exposing doc strings.

> Security Issues
> 
>     This module will attempt to import modules with the same names as
>     requested topics. Don't use the modules if you are not confident
>     that everything in your pythonpath is from a trusted source.

Yikes!  Another reason to avoid the "string" -> global variable
option.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Dec 11 17:53:37 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 11:53:37 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 11:02:29 EST."
             <14900.64149.910989.998348@anthem.concentric.net> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net>  
            <14900.64149.910989.998348@anthem.concentric.net> 
Message-ID: <200012111653.LAA24545@cj20424-a.reston1.va.home.com>

> Some of my thoughts after reading the PEP and Paul/Guido's exchange.
> 
> - A function in the warn module is better than one in the sys module.
>   "from warnings import warn" is good enough to not warrant a
>   built-in.  I get the sense that the PEP description is behind
>   Guido's current implementation here.

Yes.  I've updated the PEP to match the (2nd) implementation.

> - When PyErr_Warn() returns 1, does that mean a warning has been
>   transmuted into an exception, or some other exception occurred
>   during the setting of the warning?  (I think I know, but the PEP
>   could be clearer here).

I've clarified this now: it returns 1 in either case.  You have to do
exception handling in either case.  I'm not telling why -- you don't
need to know.  The caller of PyErr_Warn() should not attempt to catch
the exception -- if that's your intent, you shouldn't be calling
PyErr_Warn().  And PyErr_Warn() is complicated enough that it has to
allow raising an exception.

> - It would be nice if lineno can be a range specification.  Other
>   matches are based on regexps -- think of this as a line number
>   regexp.

Too much complexity already.

> - Why not do setupwarnings() in site.py?

See the PEP and the current implementation.  The delayed-loading of
the warnings module means that we have to save the -W options as
sys.warnoptions.  (This also makes them work when multiple
interpreters are used -- they all get the -W options.)

> - Regexp matching on messages should be case insensitive.

Good point!  Done in my version of the code.

> - The second argument to sys.warn() or PyErr_Warn() can be any class,
>   right?

Almost.  It must be derived from __builtin__.Warning.

>   If so, it's easy for me to have my own warning classes.
>   What if I want to set up my own warnings filters?  Maybe if `action'
>   could be a callable as well as a string.  Then in my IDE, I could
>   set that to "mygui.popupWarningsDialog".

No, for that purpose you would override warnings.showwarning().

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Mon Dec 11 17:58:39 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 11 Dec 2000 17:58:39 +0100
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <14900.64149.910989.998348@anthem.concentric.net>; from barry@digicool.com on Mon, Dec 11, 2000 at 11:02:29AM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net>
Message-ID: <20001211175839.H4396@xs4all.nl>

On Mon, Dec 11, 2000 at 11:02:29AM -0500, Barry A. Warsaw wrote:

> - A function in the warn module is better than one in the sys module.
>   "from warnings import warn" is good enough to not warrant a
>   built-in.  I get the sense that the PEP description is behind
>   Guido's current implementation here.

+1 on this. I have a response to Guido's first posted PEP on my laptop, but
due to a weekend in Germany wasn't able to post it before he updated the
PEP. I guess I can delete the arguments for this, now ;) but lets just say I
think 'sys' is being a bit overused, and the case of a function in sys and
its data in another module is just plain silly.

> - When PyErr_Warn() returns 1, does that mean a warning has been
>   transmuted into an exception, or some other exception occurred
>   during the setting of the warning?  (I think I know, but the PEP
>   could be clearer here).

How about returning 1 for 'warning turned into exception' and -1 for 'normal
exception' ? It would be slightly more similar to other functions if '-1'
meant 'exception', and it would be easy to put in an if statement -- and
still allow C code to ignore the produced error, if it wanted to.

> - It would be nice if lineno can be a range specification.  Other
>   matches are based on regexps -- think of this as a line number
>   regexp.

+0 on this... I'm not sure if such fine-grained control is really necessary.
I liked the hint at 'per function' granularity, but I realise it's tricky to
do right, what with naming issues and all that. 

> - Regexp matching on messages should be case insensitive.

How about being able to pass in compiled regexp objects as well as strings ?
I haven't looked at the implementation at all, so I'm not sure how expensive
it would be, but it might also be nice to have users (= programmers) pass in
an object with its own 'match' method, so you can 'interactively' decide
whether or not to raise an exception, popup a window, and what not. Sort of
like letting 'action' be a callable, which I think is a good idea as well.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Mon Dec 11 18:11:02 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 12:11:02 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 17:58:39 +0100."
             <20001211175839.H4396@xs4all.nl> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net>  
            <20001211175839.H4396@xs4all.nl> 
Message-ID: <200012111711.MAA24818@cj20424-a.reston1.va.home.com>

> > - When PyErr_Warn() returns 1, does that mean a warning has been
> >   transmuted into an exception, or some other exception occurred
> >   during the setting of the warning?  (I think I know, but the PEP
> >   could be clearer here).
> 
> How about returning 1 for 'warning turned into exception' and -1 for 'normal
> exception' ? It would be slightly more similar to other functions if '-1'
> meant 'exception', and it would be easy to put in an if statement -- and
> still allow C code to ignore the produced error, if it wanted to.

Why would you want this?  The user clearly said that they wanted the
exception!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Mon Dec 11 18:13:10 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 11 Dec 2000 18:13:10 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34C98E.7C42FD24@lemburg.com>
Message-ID: <009a01c06395$a9da3220$3c6340d5@hagrid>

> Hmm, I must have missed that one... care to repost ?

doesn't everyone here read the daily URL?

here's a link:
http://mail.python.org/pipermail/python-dev/2000-December/010913.html

</F>




From barry at digicool.com  Mon Dec 11 18:18:04 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 11 Dec 2000 12:18:04 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
	<200012082258.RAA02389@cj20424-a.reston1.va.home.com>
	<3A33C533.ABA27C7C@prescod.net>
	<14900.64149.910989.998348@anthem.concentric.net>
	<200012111653.LAA24545@cj20424-a.reston1.va.home.com>
Message-ID: <14901.3149.109401.151742@anthem.concentric.net>

>>>>> "GvR" == Guido van Rossum <guido at python.org> writes:

    GvR> I've clarified this now: it returns 1 in either case.  You
    GvR> have to do exception handling in either case.  I'm not
    GvR> telling why -- you don't need to know.  The caller of
    GvR> PyErr_Warn() should not attempt to catch the exception -- if
    GvR> that's your intent, you shouldn't be calling PyErr_Warn().
    GvR> And PyErr_Warn() is complicated enough that it has to allow
    GvR> raising an exception.

Makes sense.

    >> - It would be nice if lineno can be a range specification.
    >> Other matches are based on regexps -- think of this as a line
    >> number regexp.

    GvR> Too much complexity already.

Okay, no biggie I think.

    >> - Why not do setupwarnings() in site.py?

    GvR> See the PEP and the current implementation.  The
    GvR> delayed-loading of the warnings module means that we have to
    GvR> save the -W options as sys.warnoptions.  (This also makes
    GvR> them work when multiple interpreters are used -- they all get
    GvR> the -W options.)

Cool.

    >> - Regexp matching on messages should be case insensitive.

    GvR> Good point!  Done in my version of the code.

Cool.

    >> - The second argument to sys.warn() or PyErr_Warn() can be any
    >> class, right?

    GvR> Almost.  It must be derived from __builtin__.Warning.

__builtin__.Warning == exceptions.Warning, right?

    >> If so, it's easy for me to have my own warning classes.  What
    >> if I want to set up my own warnings filters?  Maybe if `action'
    >> could be a callable as well as a string.  Then in my IDE, I
    >> could set that to "mygui.popupWarningsDialog".

    GvR> No, for that purpose you would override
    GvR> warnings.showwarning().

Cool.

Looks good.
-Barry



From thomas at xs4all.net  Mon Dec 11 19:04:56 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 11 Dec 2000 19:04:56 +0100
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012111711.MAA24818@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 12:11:02PM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> <200012111711.MAA24818@cj20424-a.reston1.va.home.com>
Message-ID: <20001211190455.I4396@xs4all.nl>

On Mon, Dec 11, 2000 at 12:11:02PM -0500, Guido van Rossum wrote:

> > How about returning 1 for 'warning turned into exception' and -1 for 'normal
> > exception' ? It would be slightly more similar to other functions if '-1'
> > meant 'exception', and it would be easy to put in an if statement -- and
> > still allow C code to ignore the produced error, if it wanted to.

> Why would you want this?  The user clearly said that they wanted the
> exception!

The difference is that in one case, the user will see the original
warning-turned-exception, and in the other she won't -- the warning will be
lost. At best she'll see (by looking at the traceback) the code intended to
give a warning (that might or might not have been turned into an exception)
and failed. The warning code might decide to do something additional to
notify the user of the thing it intended to warn about, which ended up as a
'real' exception because of something else.

It's no biggy, obviously, except that if you change your mind it will be
hard to add it without breaking code. Even if you explicitly state the
return value should be tested for boolean value, not greater-than-zero
value.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Mon Dec 11 19:16:58 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 13:16:58 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 19:04:56 +0100."
             <20001211190455.I4396@xs4all.nl> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> <200012111711.MAA24818@cj20424-a.reston1.va.home.com>  
            <20001211190455.I4396@xs4all.nl> 
Message-ID: <200012111816.NAA25214@cj20424-a.reston1.va.home.com>

> > > How about returning 1 for 'warning turned into exception' and -1 for 'normal
> > > exception' ? It would be slightly more similar to other functions if '-1'
> > > meant 'exception', and it would be easy to put in an if statement -- and
> > > still allow C code to ignore the produced error, if it wanted to.
> 
> > Why would you want this?  The user clearly said that they wanted the
> > exception!
> 
> The difference is that in one case, the user will see the original
> warning-turned-exception, and in the other she won't -- the warning will be
> lost. At best she'll see (by looking at the traceback) the code intended to
> give a warning (that might or might not have been turned into an exception)
> and failed.

Yes -- this is a standard convention in Python.  if there's a bug in
code that is used to raise or handle an exception, you get a traceback
from that bug.

> The warning code might decide to do something additional to
> notify the user of the thing it intended to warn about, which ended up as a
> 'real' exception because of something else.

Nah.  The warning code shouldn't worry about that.  If there's a bug
in PyErr_Warn(), that should get top priority until it's fixed.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Mon Dec 11 19:12:56 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 11 Dec 2000 19:12:56 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34C98E.7C42FD24@lemburg.com> <009a01c06395$a9da3220$3c6340d5@hagrid>
Message-ID: <3A351928.3A41C970@lemburg.com>

Fredrik Lundh wrote:
> 
> > Hmm, I must have missed that one... care to repost ?
> 
> doesn't everyone here read the daily URL?

No time for pull logic... only push logic ;-)

> here's a link:
> http://mail.python.org/pipermail/python-dev/2000-December/010913.html

Thanks.

A very nice introduction indeed. The only thing which
didn't come through in the first reading: why do we need
GF(p^n)'s in the first place ? The second reading then made this
clear: we need to assure that by iterating through the set of
possible coefficients we can actually reach all slots in the
dictionary... a gem indeed.

Now if we could only figure out an equally simple way of
producing perfect hash functions on-the-fly we could eliminate
the need for the PyObject_Compare()s... ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Mon Dec 11 21:22:55 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 11 Dec 2000 15:22:55 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFJIDAA.tim.one@home.com>

[/F, on Christian's GF tutorial]
> I think someone just won the "brain exploder 2000" award ;-)

Well, anyone can play.  When keys collide, what we need is a function f(i)
such that repeating
    i = f(i)
visits every int in (0, 2**N) exactly once before setting i back to its
initial value, for a fixed N and where the first i is in (0, 2**N).  This is
the quickest:

def f(i):
    i -= 1
    if i == 0:
        i = 2**N-1
    return i

Unfortunately, this leads to performance-destroying "primary collisions"
(see Knuth, or any other text w/ a section on hashing).

Other *good* possibilities include a pseudo-random number generator of
maximal period, or viewing the ints in (0, 2**N) as bit vectors indicating
set membership and generating all subsets of an N-element set in a Gray code
order.

The *form* of the function dictobject.c actually uses is:

def f(i):
    i <<= 1
    if i >= 2**N:
       i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

which is suitably non-linear and as fast as the naive method.  Given the
form of the function, you don't need any theory at all to find a value for
MAGIC_CONSTANT_DEPENDING_ON_N that simply works.  In fact, I verified all
the magic values in dictobject.c via brute force, because the person who
contributed the original code botched the theory slightly and gave us some
values that didn't work.  I'll rely on the theory if and only if we have to
extend this to 64-bit machines someday:  I'm too old to wait for a brute
search of a space with 2**64 elements <wink>.
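In Python terms, the brute-force verification Tim describes amounts to something like the following sketch (the function and variable names are illustrative, not from dictobject.c):

```python
# For a given table size 2**N, find every MAGIC constant for which the
# recurrence
#     i <<= 1; if i >= 2**N: i ^= MAGIC
# cycles through all of (0, 2**N) exactly once before repeating.

def full_cycle(magic, n):
    """True if the recurrence, started at i=1, visits every int in (0, 2**n)."""
    i, seen = 1, set()
    while i not in seen:
        seen.add(i)
        i <<= 1
        if i >= 2 ** n:
            i ^= magic
    return seen == set(range(1, 2 ** n))

def find_magics(n):
    """All workable magic constants; each must have bit n set so that
    the XOR brings i back below 2**n."""
    return [m for m in range(2 ** n, 2 ** (n + 1)) if full_cycle(m, n)]

print(find_magics(3))  # [11, 13] -- the two degree-3 primitive polynomials
```

For N=3 the survivors, 11 (binary 1011) and 13 (binary 1101), correspond to the primitive polynomials x^3+x+1 and x^3+x^2+1, which is exactly the GF(2^N) theory the tutorial covers.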

mathematics-is-a-battle-against-mortality-ly y'rs  - tim




From greg at cosc.canterbury.ac.nz  Mon Dec 11 22:46:11 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Dec 2000 10:46:11 +1300 (NZDT)
Subject: [Python-Dev] Online help PEP
In-Reply-To: <200012111557.KAA24266@cj20424-a.reston1.va.home.com>
Message-ID: <200012112146.KAA01771@s454.cosc.canterbury.ac.nz>

Guido:
> Paul Prescod:
> > In either situation, the output does paging similar to the "more"
> > command. 
> Agreed.

Only if it can be turned off! I usually prefer to use the
scrolling capabilities of whatever shell window I'm using
rather than having some program's own idea of how to do
paging forced upon me when I don't want it.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From moshez at zadka.site.co.il  Tue Dec 12 07:33:02 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue, 12 Dec 2000 08:33:02 +0200 (IST)
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
Message-ID: <20001212063302.05E0BA82E@darjeeling.zadka.site.co.il>

On Mon, 11 Dec 2000 15:22:55 -0500, "Tim Peters" <tim.one at home.com> wrote:

> Well, anyone can play.  When keys collide, what we need is a function f(i)
> such that repeating
>     i = f(i)
> visits every int in (0, 2**N) exactly once before setting i back to its
> initial value, for a fixed N and where the first i is in (0, 2**N).  

OK, maybe this is me being *real* stupid, but why? Why not [0, 2**n)?
Did 0 harm you in your childhood, and you're trying to get back? <0 wink>.

If we had an affine operation, instead of a linear one, we could have 
[0, 2**n). I won't repeat the proof here but changing

> def f(i):
>     i <<= 1
      i^=1 # This is the line I added
>     if i >= 2**N:
>        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i

Makes you waltz all over [0, 2**n) if the original made you cover
(0, 2**n). 

if-i'm-wrong-then-someone-should-shoot-me-to-save-me-the-embarrassment-ly y'rs,
Z.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tim.one at home.com  Mon Dec 11 23:38:56 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 11 Dec 2000 17:38:56 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <20001212063302.05E0BA82E@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEGBIDAA.tim.one@home.com>

[Tim]
> Well, anyone can play.  When keys collide, what we need is a
> function f(i) such that repeating
>     i = f(i)
> visits every int in (0, 2**N) exactly once before setting i back to its
> initial value, for a fixed N and where the first i is in (0, 2**N).

[Moshe Zadka]
> OK, maybe this is me being *real* stupid, but why? Why not [0, 2**n)?
> Did 0 harm you in your childhood, and you're trying to get
> back? <0 wink>.

We don't need f at all unless we've already determined there's a collision
at some index h.  The i sequence is used to offset h (mod 2**N).  An
increment of 0 would waste time (h+0 == h, but we've already done a full
compare on the h'th table entry and already determined it wasn't equal to
what we're looking for).

IOW, there are only 2**N-1 slots still of interest by the time f is needed.

> If we had an affine operation, instead of a linear one, we could have
> [0, 2**n). I won't repeat the proof here but changing
>
> def f(i):
>     i <<= 1
>     i^=1 # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i
>
> Makes you waltz all over [0, 2**n) if the original made you cover
> (0, 2**n).

But, Moshe!  The proof would have been the most interesting part <wink>.




From gstein at lyra.org  Tue Dec 12 01:15:50 2000
From: gstein at lyra.org (Greg Stein)
Date: Mon, 11 Dec 2000 16:15:50 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012111653.LAA24545@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 11:53:37AM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com>
Message-ID: <20001211161550.Y7732@lyra.org>

On Mon, Dec 11, 2000 at 11:53:37AM -0500, Guido van Rossum wrote:
>...
> > - The second argument to sys.warn() or PyErr_Warn() can be any class,
> >   right?
> 
> Almost.  It must be derived from __builtin__.Warning.

Since you must do "from warnings import warn" before using the warnings,
then I think it makes sense to put the Warning classes into the warnings
module. (e.g. why increase the size of the builtins?)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From guido at python.org  Tue Dec 12 01:39:31 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 19:39:31 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 16:15:50 PST."
             <20001211161550.Y7732@lyra.org> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com>  
            <20001211161550.Y7732@lyra.org> 
Message-ID: <200012120039.TAA02983@cj20424-a.reston1.va.home.com>

> Since you must do "from warnings import warn" before using the warnings,
> then I think it makes sense to put the Warning classes into the warnings
> module. (e.g. why increase the size of the builtins?)

I don't particularly care whether the Warning category classes are
builtins, but I can't declare them in the warnings module.  Typical
use from C is:

    if (PyErr_Warn(PyExc_DeprecationWarning,
		   "the strop module is deprecated"))
            return NULL;

PyErr_Warn() imports the warnings module on its first call.  But the
value of PyExc_DeprecationWarning c.s. must be available *before* the
first call, so they can't be imported from the warnings module!

My first version imported warnings at the start of the program, but
this almost doubled the start-up time, hence the design where the
module is imported only when needed.

The most convenient place to create the Warning category classes is in
the _exceptions module; doing it the easiest way there means that they
are automatically exported to __builtin__.  This doesn't bother me
enough to try and hide them.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gstein at lyra.org  Tue Dec 12 02:11:02 2000
From: gstein at lyra.org (Greg Stein)
Date: Mon, 11 Dec 2000 17:11:02 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012120039.TAA02983@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 07:39:31PM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com>
Message-ID: <20001211171102.C7732@lyra.org>

On Mon, Dec 11, 2000 at 07:39:31PM -0500, Guido van Rossum wrote:
> > Since you must do "from warnings import warn" before using the warnings,
> > then I think it makes sense to put the Warning classes into the warnings
> > module. (e.g. why increase the size of the builtins?)
> 
> I don't particularly care whether the Warning category classes are
> builtins, but I can't declare them in the warnings module.  Typical
> use from C is:
> 
>     if (PyErr_Warn(PyExc_DeprecationWarning,
> 		   "the strop module is deprecated"))
>             return NULL;
> 
> PyErr_Warn() imports the warnings module on its first call.  But the
> value of PyExc_DeprecationWarning c.s. must be available *before* the
> first call, so they can't be imported from the warnings module!

Do the following:

pywarn.h or pyerrors.h:

#define PyWARN_DEPRECATION "DeprecationWarning"

     ...
     if (PyErr_Warn(PyWARN_DEPRECATION,
 		   "the strop module is deprecated"))
             return NULL;

The PyErr_Warn would then use the string to dynamically look up / bind to
the correct value from the warnings module. By using the symbolic constant,
you will catch typos in the C code (e.g. if people passed raw strings, then
a typo won't be found until runtime; using symbols will catch the problem at
compile time).

The above strategy will allow for fully-delayed loading, and for all the
warnings to be located in the "warnings" module.
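In Python terms, the lazy name-based binding Greg proposes might look like this sketch (the cache, the function name, and the use of builtins as the lookup namespace are all illustrative assumptions, not the PEP 230 implementation):

```python
import builtins

_category_cache = {}  # symbolic name -> warning class, bound on first use

def warn_by_name(category_name, message):
    """Resolve the warning category from its name only when it is first
    needed, mirroring the delayed binding proposed for PyErr_Warn."""
    cls = _category_cache.get(category_name)
    if cls is None:
        cls = getattr(builtins, category_name)  # e.g. DeprecationWarning
        _category_cache[category_name] = cls
    import warnings  # deferred, so nothing extra is loaded at start-up
    warnings.warn(message, cls, stacklevel=2)
```

A typo in the name then fails at the first call with an AttributeError rather than silently; the C `#define` trick moves that failure to compile time.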

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From guido at python.org  Tue Dec 12 02:21:41 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 20:21:41 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Mon, 11 Dec 2000 17:11:02 PST."
             <20001211171102.C7732@lyra.org> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com>  
            <20001211171102.C7732@lyra.org> 
Message-ID: <200012120121.UAA04576@cj20424-a.reston1.va.home.com>

> > PyErr_Warn() imports the warnings module on its first call.  But the
> > value of PyExc_DeprecationWarning c.s. must be available *before* the
> > first call, so they can't be imported from the warnings module!
> 
> Do the following:
> 
> pywarn.h or pyerrors.h:
> 
> #define PyWARN_DEPRECATION "DeprecationWarning"
> 
>      ...
>      if (PyErr_Warn(PyWARN_DEPRECATION,
>  		   "the strop module is deprecated"))
>              return NULL;
> 
> The PyErr_Warn would then use the string to dynamically look up / bind to
> the correct value from the warnings module. By using the symbolic constant,
> you will catch typos in the C code (e.g. if people passed raw strings, then
> a typo won't be found until runtime; using symbols will catch the problem at
> compile time).
> 
> The above strategy will allow for fully-delayed loading, and for all the
> warnings to be located in the "warnings" module.

Yeah, that would be a possibility, if it was deemed evil that the
warnings appear in __builtin__.  I don't see what's so evil about
that.

(There's also the problem that the C code must be able to create new
warning categories, as long as they are derived from the Warning base
class.  Your approach above doesn't support this.  I'm sure you can
figure a way around that too.  But I prefer to hear why you think it's
needed first.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gstein at lyra.org  Tue Dec 12 02:26:00 2000
From: gstein at lyra.org (Greg Stein)
Date: Mon, 11 Dec 2000 17:26:00 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012120121.UAA04576@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 08:21:41PM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com> <20001211171102.C7732@lyra.org> <200012120121.UAA04576@cj20424-a.reston1.va.home.com>
Message-ID: <20001211172600.E7732@lyra.org>

On Mon, Dec 11, 2000 at 08:21:41PM -0500, Guido van Rossum wrote:
>...
> > The above strategy will allow for fully-delayed loading, and for all the
> > warnings to be located in the "warnings" module.
> 
> Yeah, that would be a possibility, if it was deemed evil that the
> warnings appear in __builtin__.  I don't see what's so evil about
> that.
> 
> (There's also the problem that the C code must be able to create new
> warning categories, as long as they are derived from the Warning base
> class.  Your approach above doesn't support this.  I'm sure you can
> figure a way around that too.  But I prefer to hear why you think it's
> needed first.)

I'm just attempting to avoid dumping more names into __builtins__ is all. I
don't believe there is anything intrinsically bad about putting more names
in there, but avoiding the kitchen-sink metaphor for __builtins__ has got to
be a Good Thing :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From guido at python.org  Tue Dec 12 14:43:59 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 12 Dec 2000 08:43:59 -0500
Subject: [Python-Dev] Request review of gdbm patch
Message-ID: <200012121343.IAA06713@cj20424-a.reston1.va.home.com>

I'm asking for a review of the patch to gdbm at 

http://sourceforge.net/patch/?func=detailpatch&patch_id=102638&group_id=5470

I asked the author for clarification and this is what I got.

Can anybody suggest what to do?  His mail doesn't give me much
confidence in the patch. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Tue, 12 Dec 2000 13:24:13 +0100
From:    Damjan <arhiv at freemail.org.mk>
To:      Guido van Rossum <guido at python.org>
Subject: Re: your gdbm patch for Python

On Mon, Dec 11, 2000 at 03:51:03PM -0500, Guido van Rossum wrote:
> I'm looking at your patch at SourceForge:

First, I'm sorry it was such a mess of a patch, but I couldn't figure out how
to send a more elaborate comment. (But then again, I wouldn't have an email from
Guido van Rossum in my mailbox to show off to my friends :)

> and wondering two things:
> 
> (1) what does the patch do?
> 
> (2) why does the patch remove the 'f' / GDBM_FAST option?

 From the gdbm info page:
     ...The following may also be
     logically or'd into the database flags: GDBM_SYNC, which causes
     all database operations to be synchronized to the disk, and
     GDBM_NOLOCK, which prevents the library from performing any
     locking on the database file.  The option GDBM_FAST is now
     obsolete, since `gdbm' defaults to no-sync mode...
     ^^^^^^^^
(1) My patch adds two options to the gdbm.open(..) function. These are 'u' for
GDBM_NOLOCK, and 's' for GDBM_SYNC.

(2) GDBM_FAST is obsolete because gdbm defaults to GDBM_FAST, so it's removed.

I'm also thinking about adding lock and unlock methods to the gdbm object,
but it seems that a gdbm database can only be locked and not unlocked.


- -- 
Damjan Georgievski
Skopje, Macedonia

------- End of Forwarded Message




From mal at lemburg.com  Tue Dec 12 14:49:40 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Dec 2000 14:49:40 +0100
Subject: [Python-Dev] Codec aliasing and naming
Message-ID: <3A362CF4.2082A606@lemburg.com>

I just wanted to inform you of a change I plan for the standard
encodings search function to enable better support for aliasing
of encoding names.

The current implementation caches the aliases returned from the
codecs .getaliases() function in the encodings lookup cache
rather than in the alias cache. As a consequence, the hyphen to
underscore mapping is not applied to the aliases. A codec would
have to return a list of all combinations of names with hyphens
and underscores in order to emulate the standard lookup 
behaviour.
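The normalization in question can be sketched roughly as follows (a simplified stand-in for the real encodings search function, which does a bit more; note that dots are preserved so package-qualified names keep working):

```python
def normalize_encoding(name):
    """Rough sketch of the standard lookup normalization: lower-case the
    name and collapse runs of non-alphanumeric characters (hyphens,
    spaces) into single underscores.  With the patch, a codec's aliases
    get the same treatment, so one alias spelling suffices."""
    chunks, current = [], []
    for ch in name.lower():
        if ch.isalnum() or ch == ".":
            current.append(ch)
        elif current:
            chunks.append("".join(current))
            current = []
    if current:
        chunks.append("".join(current))
    return "_".join(chunks)

print(normalize_encoding("ISO-8859-1"))  # iso_8859_1
```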

I have a patch which fixes this and also ensures that aliases
cannot be overwritten by codecs which register at some later
point in time. This ensures that we won't run into situations
where a codec import suddenly overrides the behaviour of
previously active codecs.

I would also like to propose the use of a new naming scheme
for codecs which enables drop-in installation. As discussed
on the i18n-sig list, people would like to install codecs
without requiring users to call a codec registration function
or to touch site.py.

The standard search function in the encodings package has a
nice property (which I only noticed after the fact ;) which
allows using Python package names in the encoding names,
e.g. you can install a package 'japanese' and then access the
codecs in that package using 'japanese.shiftjis' without
having to bother registering a new codec search function
for the package -- the encodings package search function
will redirect the lookup to the 'japanese' package.

Using package names in the encoding name has several
advantages:
* you know where the codec comes from
* you can have multiple codecs for the same encoding
* drop-in installation without registration is possible
* the need for a non-default encoding package is visible in the
  source code
* you no longer need to drop new codecs into the Python
  standard lib

Perhaps someone could add a note about this possibility
to the codec docs ?!

If no one objects, I'll apply the patch for the enhanced alias
support later today.

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at python.org  Tue Dec 12 14:57:01 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 12 Dec 2000 08:57:01 -0500
Subject: [Python-Dev] Codec aliasing and naming
In-Reply-To: Your message of "Tue, 12 Dec 2000 14:49:40 +0100."
             <3A362CF4.2082A606@lemburg.com> 
References: <3A362CF4.2082A606@lemburg.com> 
Message-ID: <200012121357.IAA06846@cj20424-a.reston1.va.home.com>

> Perhaps someone could add a note about this possibility
> to the codec docs ?!

You can check it in yourself or mail it to Fred or submit it to SF...
I don't expect anyone else will jump in and document this properly.

> If no one objects, I'll apply the patch for the enhanced alias
> support later today.

Fine with me (but I don't use codecs -- where's the Dutch language
support? :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Tue Dec 12 15:38:20 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 12 Dec 2000 15:38:20 +0100
Subject: [Python-Dev] Codec aliasing and naming
References: <3A362CF4.2082A606@lemburg.com> <200012121357.IAA06846@cj20424-a.reston1.va.home.com>
Message-ID: <3A36385C.60C7F2B@lemburg.com>

Guido van Rossum wrote:
> 
> > Perhaps someone could add a note about this possibility
> > to the codec docs ?!
> 
> You can check it in yourself or mail it to Fred or submit it to SF...
> I don't expect anyone else will jump in and document this properly.

I'll submit a bug report so that this doesn't get lost in
the archives. Don't have time for it myself... alas, no one
really seems to have time these days ;-)
 
> > If no one objects, I'll apply the patch for the enhanced alias
> > support later today.
> 
> Fine with me (but I don't use codecs -- where's the Dutch language
> support? :-).

OK. 

About the Dutch language support: this would make a nice
Christmas fun-project... a new standard module which interfaces
to babel.altavista.com (hmm, they don't list Dutch as a supported
language yet, but maybe if we bug them enough... ;).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From paulp at ActiveState.com  Tue Dec 12 19:11:13 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 12 Dec 2000 10:11:13 -0800
Subject: [Python-Dev] Online help PEP
References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com>
Message-ID: <3A366A41.1A14EFD4@ActiveState.com>

Guido van Rossum wrote:
> 
>...
> >         help( "string" ) -- built-in topic or global
> 
> Why does a global require string quotes?

It doesn't, but if you happen to say help( "dir" ) instead of
help( dir ), I think it should do the right thing.

> I'm missing
> 
>           help() -- table of contents
> 
> I'm not sure if the table of contents should be printed by the repr
> output.

I don't see any benefit in having different behaviors for help and
help().

> >     If you ask for a global, it can be a fully-qualified name such as
> >     help("xml.dom").
> 
> Why are the string quotes needed?  When are they useful?

When you haven't imported the thing you are asking about. Or when the
string comes from another UI like an editor window, command line or web
form.

> >     You can also use the facility from a command-line
> >
> >     python --help if
> 
> Is this really useful?  Sounds like Perlism to me.

I'm just trying to make it easy to quickly get answers to Python
questions. I could totally see someone writing code in VIM switching to
a bash window to type:

python --help os.path.dirname

That's a lot easier than:

$ python
Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import os
>>> help(os.path.dirname)

And what does it hurt?

> >     In either situation, the output does paging similar to the "more"
> >     command.
> 
> Agreed.  But how to implement paging in a platform-dependent manner?
> On Unix, os.system("more") or "$PAGER" is likely to work.  On Windows,
> I suppose we could use its MORE, although that's pretty braindead.  On
> the Mac?  Also, inside IDLE or Pythonwin, invoking the system pager
> isn't a good idea.

The current implementation does paging internally. You could override it
to use the system pager (or no pager).
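Stripped of display details, an internal pager of that kind reduces to splitting the output into screenfuls and letting the caller decide how (or whether) to prompt between them. A minimal sketch, with names of my own choosing rather than the PEP 233 implementation:

```python
def paginate(lines, pagesize=21):
    """Yield successive screenfuls of the given lines.  The caller can
    print each chunk and prompt, hand the text to $PAGER, or ignore
    paging entirely -- which also answers Greg Ewing's request that it
    be possible to turn paging off."""
    for start in range(0, len(lines), pagesize):
        yield lines[start:start + pagesize]
```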

> What does "demand-loaded" mean in a Python context?

When you "touch" the help object, it loads the onlinehelp module which
has the real implementation. The thing in __builtins__ is just a
lightweight proxy.
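A demand-loaded proxy along those lines might look like this sketch (the class name is hypothetical; in the PEP, the wrapped target would be the onlinehelp module):

```python
import importlib

class LazyModuleProxy:
    """Stand-in object that imports its target module only when an
    attribute is first touched, keeping interpreter start-up cheap."""

    def __init__(self, module_name):
        self._name = module_name
        self._module = None

    def __getattr__(self, attr):
        # Only reached for attributes not on the proxy itself.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)
```

The builtin `help` would then be such a proxy plus small `__call__`/`__repr__` wrappers that delegate to the real implementation.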

> >     It should also be possible to override the help display function by
> >     assigning to onlinehelp.displayhelp(object_or_string).
> 
> Good idea.  Pythonwin and IDLE could use this.  But I'd like it to
> work at least "okay" if they don't.

Agreed. 

> >     The module should be able to extract module information from either
> >     the HTML or LaTeX versions of the Python documentation. Links should
> >     be accommodated in a "lynx-like" manner.
> 
> I think this is beyond the scope.  

Well, we have to do one of:

 * re-write a subset of the docs in a form that can be accessed from the
command line
 * access the existing docs in a form that's installed
 * auto-convert the docs into a form that's compatible

I've already implemented HTML parsing and LaTeX parsing is actually not
that far off. I just need impetus to finish a LaTeX-parsing project I
started on my last vacation.

The reason that LaTeX is interesting is because it would be nice to be
able to move documentation from existing LaTeX files into docstrings.

> The LaTeX isn't installed anywhere
> (and processing would be too much work).  
> The HTML is installed only
> on Windows, where there already is a way to get it to pop up in your
> browser (actually two: it's in the Start menu, and also in IDLE's Help
> menu).

If the documentation becomes an integral part of the Python code, then
it will be installed. It's ridiculous that it isn't already.
ActivePython does install the docs on all platforms.

> A standard syntax for docstrings is under development, PEP 216.  I
> don't agree with the proposal there, but in any case the help PEP
> should not attempt to legalize a different format than PEP 216.

I won't hold my breath for a standard Python docstring format. I've gone
out of my way to make the code format-independent.

> Neat.  I noticed that in a 24-line screen, the pagesize must be set to
> 21 to avoid stuff scrolling off the screen.  Maybe there's an off-by-3
> error somewhere?

Yes.

> I also noticed that it always prints '1' when invoked as a function.
> The new license pager in site.py avoids this problem.

Okay.

> help("operators") and several others raise an
> AttributeError('handledocrl').

Fixed.

> The "lynx-line links" don't work.

I don't think that's implemented yet.

> I think it's naive to expect this help facility to replace browsing
> the website or the full documentation package.  There should be one
> entry that says to point your browser there (giving the local
> filesystem URL if available), and that's it.  The rest of the online
> help facility should be concerned with exposing doc strings.

I don't want to replace the documentation. But there is no reason we
should set out to make it incomplete. If it's integrated with the HTML,
then people can choose whatever access mechanism is easiest for them
right now.

I'm trying hard not to be "naive". Realistically, nobody is going to
write a million docstrings between now and Python 2.1. It is much more
feasible to leverage the existing documentation that Fred and others
have spent months on.

> > Security Issues
> > 
> >     This module will attempt to import modules with the same names as
> >     requested topics. Don't use the modules if you are not confident
> >     that everything in your pythonpath is from a trusted source.
> Yikes!  Another reason to avoid the "string" -> global variable
> option.

I don't think we should lose that option. People will want to look up
information from non-executable environments like command lines, GUIs
and web pages. Perhaps you can point me to techniques for extracting
information from Python modules and packages without executing them.

 Paul



From guido at python.org  Tue Dec 12 21:46:09 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 12 Dec 2000 15:46:09 -0500
Subject: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE
Message-ID: <200012122046.PAA16915@cj20424-a.reston1.va.home.com>

------- Forwarded Message

Date:    Tue, 12 Dec 2000 12:38:20 -0800
From:    noreply at sourceforge.net
To:      noreply at sourceforge.net
Subject: SourceForge: PROJECT DOWNTIME NOTICE

ATTENTION SOURCEFORGE PROJECT ADMINISTRATORS

This update is being sent to project administrators only and contains
important information regarding your project. Please read it in its
entirety.


INFRASTRUCTURE UPGRADE, EXPANSION AND RELOCATION

As noted in the sitewide email sent this week, the SourceForge.net
infrastructure is being upgraded (and relocated). As part of this
project, plans are underway to further increase capacity and
responsiveness.

We are scheduling the relocation of the systems serving project
subdomain web pages.


IMPORTANT:

This move will affect you in the following ways:

1. Service and availability of SourceForge.net and the development
tools provided will continue uninterrupted.

2. Project page webservers hosting subdomains
(yourprojectname.sourceforge.net) will be down Friday December 15 from
9PM PST (12AM EST) until 3AM PST.

3. CVS will be unavailable (read only part of the time) from 7PM
until 3AM PST

4. Mailing lists and mail aliases will be unavailable until 3AM PST


- ---------------------
This email was sent from sourceforge.net. To change your email receipt
preferences, please visit the site and edit your account via the
"Account Maintenance" link.

Direct any questions to admin at sourceforge.net, or reply to this email.

------- End of Forwarded Message




From greg at cosc.canterbury.ac.nz  Tue Dec 12 23:42:01 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 13 Dec 2000 11:42:01 +1300 (NZDT)
Subject: [Python-Dev] Online help PEP
In-Reply-To: <3A366A41.1A14EFD4@ActiveState.com>
Message-ID: <200012122242.LAA01902@s454.cosc.canterbury.ac.nz>

Paul Prescod:
> Guido:
> > Why are the string quotes needed?  When are they useful?
> When you haven't imported the thing you are asking about.

It would be interesting if the quoted form allowed you to
extract doc info from a module *without* having the side
effect of importing it.

This could no doubt be done for pure Python modules.
Would be rather tricky for extension modules, though,
I expect.
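For pure Python modules this is indeed doable by parsing the source instead of executing it; with a modern Python, the standard ast module makes it a one-liner (the function name here is made up for illustration):

```python
import ast

def docstring_without_import(source):
    """Extract a module's docstring from its source text, with none of
    the side effects of importing the module.  As noted, extension
    modules would need a different approach entirely."""
    return ast.get_docstring(ast.parse(source))
```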

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+




From barry at digicool.com  Wed Dec 13 03:21:36 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 12 Dec 2000 21:21:36 -0500
Subject: [Python-Dev] Two new PEPs, 232 & 233
Message-ID: <14902.56624.20961.768525@anthem.concentric.net>

I've just uploaded two new PEPs.  232 is a revision of my pre-PEP era
function attribute proposal.  233 is Paul Prescod's proposal for an
on-line help facility.

http://python.sourceforge.net/peps/pep-0232.html
http://python.sourceforge.net/peps/pep-0233.html

Let the games begin,
-Barry



From tim.one at home.com  Wed Dec 13 04:34:35 2000
From: tim.one at home.com (Tim Peters)
Date: Tue, 12 Dec 2000 22:34:35 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEGBIDAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEJHIDAA.tim.one@home.com>

[Moshe Zadka]
> If we had an affine operation, instead of a linear one, we could have
> [0, 2**n). I won't repeat the proof here but changing
>
> def f(i):
>     i <<= 1
>     i^=1 # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i
>
> Makes you waltz all over [0, 2**n) if the original made you comple
> (0, 2**n).

[Tim]
> But, Moshe!  The proof would have been the most interesting part <wink>.

Turns out the proof would have been intensely interesting, as you can see by
running the attached with and without the new line commented out.

don't-ever-trust-a-theoretician<wink>-ly y'rs  - tim


N = 2
MAGIC_CONSTANT_DEPENDING_ON_N = 7

def f(i):
    i <<= 1
    # i^=1 # This is the line I added
    if i >= 2**N:
        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

i = 1
for nothing in range(4):
    print i,
    i = f(i)
print i
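[Editorial sketch] Collecting the orbit of each variant (same N and magic constant as Tim's script, restructured slightly) makes the failure concrete:

```python
N = 2
MAGIC = 7  # the magic constant for N == 2

def step(i, affine):
    i <<= 1
    if affine:        # Moshe's extra "i ^= 1" line
        i ^= 1
    if i >= 2 ** N:
        i ^= MAGIC
    return i

def orbit(start, affine):
    """Iterate step() from start until a value repeats."""
    seen, i = [], start
    while i not in seen:
        seen.append(i)
        i = step(i, affine)
    return seen

# The linear version really does walk all of (0, 2**N) ...
assert sorted(orbit(1, affine=False)) == [1, 2, 3]
# ... but the affine version visits 0 at the cost of skipping 2,
# so it does NOT cover [0, 2**N) after all.
assert sorted(orbit(1, affine=True)) == [0, 1, 3]
```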




From amk at mira.erols.com  Wed Dec 13 04:55:33 2000
From: amk at mira.erols.com (A.M. Kuchling)
Date: Tue, 12 Dec 2000 22:55:33 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
Message-ID: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>

At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
received a patch from Thomas Gellekum that adds support for the panel
library that will add another 500 lines.  I'd like to split the C file
into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
get #included from the master _cursesmodule.c file.  

Do the powers that be approve of this idea?

--amk



From tim.one at home.com  Wed Dec 13 04:54:20 2000
From: tim.one at home.com (Tim Peters)
Date: Tue, 12 Dec 2000 22:54:20 -0500
Subject: FW: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE
Message-ID: <LNBBLJKPBEHFEDALKOLCKEJKIDAA.tim.one@home.com>

FYI, looks like SourceForge is scheduled to be unusable in a span covering
late Friday thru early Saturday (OTT -- One True Time, defined by the clocks
in Guido's house).

-----Original Message-----
From: python-dev-admin at python.org [mailto:python-dev-admin at python.org]On
Behalf Of Guido van Rossum
Sent: Tuesday, December 12, 2000 3:46 PM
To: python-dev at python.org
Subject: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE



------- Forwarded Message

Date:    Tue, 12 Dec 2000 12:38:20 -0800
From:    noreply at sourceforge.net
To:      noreply at sourceforge.net
Subject: SourceForge: PROJECT DOWNTIME NOTICE

ATTENTION SOURCEFORGE PROJECT ADMINISTRATORS

This update is being sent to project administrators only and contains
important information regarding your project. Please read it in its
entirety.


INFRASTRUCTURE UPGRADE, EXPANSION AND RELOCATION

As noted in the sitewide email sent this week, the SourceForge.net
infrastructure is being upgraded (and relocated). As part of this
project, plans are underway to further increase capacity and
responsiveness.

We are scheduling the relocation of the systems serving project
subdomain web pages.


IMPORTANT:

This move will affect you in the following ways:

1. Service and availability of SourceForge.net and the development
tools provided will continue uninterrupted.

2. Project page webservers hosting subdomains
(yourprojectname.sourceforge.net) will be down Friday December 15 from
9PM PST (12AM EST) until 3AM PST.

3. CVS will be unavailable (read only part of the time) from 7PM
until 3AM PST

4. Mailing lists and mail aliases will be unavailable until 3AM PST


---------------------
This email was sent from sourceforge.net. To change your email receipt
preferences, please visit the site and edit your account via the
"Account Maintenance" link.

Direct any questions to admin at sourceforge.net, or reply to this email.

------- End of Forwarded Message


_______________________________________________
Python-Dev mailing list
Python-Dev at python.org
http://www.python.org/mailman/listinfo/python-dev




From esr at thyrsus.com  Wed Dec 13 05:29:17 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 12 Dec 2000 23:29:17 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>; from amk@mira.erols.com on Tue, Dec 12, 2000 at 10:55:33PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
Message-ID: <20001212232917.A22839@thyrsus.com>

A.M. Kuchling <amk at mira.erols.com>:
> At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
> received a patch from Thomas Gellekum that adds support for the panel
> library that will add another 500 lines.  I'd like to split the C file
> into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
> get #included from the master _cursesmodule.c file.  
> 
> Do the powers that be approve of this idea?

I doubt I qualify as a power that be, but I'm certainly +1 on panel support.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The biggest hypocrites on gun control are those who live in upscale
developments with armed security guards -- and who want to keep other
people from having guns to defend themselves.  But what about
lower-income people living in high-crime, inner city neighborhoods?
Should such people be kept unarmed and helpless, so that limousine
liberals can 'make a statement' by adding to the thousands of gun laws
already on the books?"
	--Thomas Sowell



From fdrake at acm.org  Wed Dec 13 07:24:01 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 01:24:01 -0500 (EST)
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
Message-ID: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>

A.M. Kuchling writes:
 > At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
 > received a patch from Thomas Gellekum that adds support for the panel
 > library that will add another 500 lines.  I'd like to split the C file
 > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
 > get #included from the master _cursesmodule.c file.  

  Would it be reasonable to add panel support as a second extension
module?  Is there really a need for them to be in the same module,
since the panel library is a separate library?


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From gstein at lyra.org  Wed Dec 13 08:58:38 2000
From: gstein at lyra.org (Greg Stein)
Date: Tue, 12 Dec 2000 23:58:38 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>; from amk@mira.erols.com on Tue, Dec 12, 2000 at 10:55:33PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
Message-ID: <20001212235838.T8951@lyra.org>

On Tue, Dec 12, 2000 at 10:55:33PM -0500, A.M. Kuchling wrote:
> At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
> received a patch from Thomas Gellekum that adds support for the panel
> library that will add another 500 lines.  I'd like to split the C file
> into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
> get #included from the master _cursesmodule.c file.  

Why should they be #included? I thought that we can build multiple .c files
into a module...
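[Editorial sketch] For what it's worth, the Modules/Setup mechanism of the day did let one extension list several source files, so a split need not use #include at all (library names assumed):

```
_curses _cursesmodule.c _curses_window.c _curses_panel.c -lpanel -lncurses
```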

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From gstein at lyra.org  Wed Dec 13 09:05:05 2000
From: gstein at lyra.org (Greg Stein)
Date: Wed, 13 Dec 2000 00:05:05 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects dictobject.c,2.68,2.69
In-Reply-To: <200012130102.RAA31828@slayer.i.sourceforge.net>; from tim_one@users.sourceforge.net on Tue, Dec 12, 2000 at 05:02:49PM -0800
References: <200012130102.RAA31828@slayer.i.sourceforge.net>
Message-ID: <20001213000505.U8951@lyra.org>

On Tue, Dec 12, 2000 at 05:02:49PM -0800, Tim Peters wrote:
> Update of /cvsroot/python/python/dist/src/Objects
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv31776/python/dist/src/objects
> 
> Modified Files:
> 	dictobject.c 
> Log Message:
> Bring comments up to date (e.g., they still said the table had to be
> a prime size, which is in fact never true anymore ...).
>...
> --- 55,78 ----
>   
>   /*
> ! There are three kinds of slots in the table:
> ! 
> ! 1. Unused.  me_key == me_value == NULL
> !    Does not hold an active (key, value) pair now and never did.  Unused can
> !    transition to Active upon key insertion.  This is the only case in which
> !    me_key is NULL, and is each slot's initial state.
> ! 
> ! 2. Active.  me_key != NULL and me_key != dummy and me_value != NULL
> !    Holds an active (key, value) pair.  Active can transition to Dummy upon
> !    key deletion.  This is the only case in which me_value != NULL.
> ! 
> ! 3. Dummy.  me_key == dummy && me_value == NULL
> !    Previously held an active (key, value) pair, but that was deleted and an
> !    active pair has not yet overwritten the slot.  Dummy can transition to
> !    Active upon key insertion.  Dummy slots cannot be made Unused again
> !    (cannot have me_key set to NULL), else the probe sequence in case of
> !    collision would have no way to know they were once active.

4. The popitem finger.


:-)
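[Editorial sketch] The three slot states translate into a toy open-addressing table roughly like this (an illustration of the idea, not CPython's code; note a delete leaves a Dummy slot, never an Unused one):

```python
UNUSED, DUMMY = object(), object()   # stand-ins for NULL / the dummy key

class ToySlots:
    """Open addressing with Unused/Active/Dummy slot states."""
    def __init__(self, size=8):
        self.slots = [(UNUSED, None)] * size

    def _find(self, key, for_insert=False):
        size = len(self.slots)
        i = hash(key) % size
        free = None
        for _ in range(size):
            k, _v = self.slots[i]
            if k is UNUSED:
                # end of the probe chain; reuse an earlier Dummy on insert
                return free if (for_insert and free is not None) else i
            if k is DUMMY:
                if free is None:
                    free = i          # remember it, but keep probing
            elif k == key:
                return i
            i = (i + 1) % size
        return free

    def put(self, key, val):
        self.slots[self._find(key, for_insert=True)] = (key, val)

    def get(self, key):
        k, v = self.slots[self._find(key)]
        return v if k == key else None

    def delete(self, key):
        i = self._find(key)
        if self.slots[i][0] == key:
            # Dummy, not Unused: later probes must walk past this slot
            self.slots[i] = (DUMMY, None)

d = ToySlots()
d.put("a", 1)
d.put("b", 2)
d.delete("a")
assert d.get("a") is None and d.get("b") == 2   # chain survives the delete
```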

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From moshez at zadka.site.co.il  Wed Dec 13 20:19:53 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Wed, 13 Dec 2000 21:19:53 +0200 (IST)
Subject: [Python-Dev] Splitting up _cursesmodule
Message-ID: <20001213191953.7208DA82E@darjeeling.zadka.site.co.il>

On Tue, 12 Dec 2000 23:29:17 -0500, "Eric S. Raymond" <esr at thyrsus.com> wrote:
> A.M. Kuchling <amk at mira.erols.com>:
> > At 2502 lines, _cursesmodule.c is cumbersomely large.  I've just
> > received a patch from Thomas Gellekum that adds support for the panel
> > library that will add another 500 lines.  I'd like to split the C file
> > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that
> > get #included from the master _cursesmodule.c file.  
> > 
> > Do the powers that be approve of this idea?
> 
> I doubt I qualify as a power that be, but I'm certainly +1 on panel support.
 
I'm +1 on panel support, but that seems the wrong solution. Why not
have several C modules (_curses_panel,...) and manage a more unified
namespace with the Python wrapper modules?

/curses/panel.py -- from _curses_panel import *
etc.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From akuchlin at mems-exchange.org  Wed Dec 13 13:44:23 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 13 Dec 2000 07:44:23 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 01:24:01AM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
Message-ID: <20001213074423.A30348@kronos.cnri.reston.va.us>

[CC'ing Thomas Gellekum <tg at melaten.rwth-aachen.de>]

On Wed, Dec 13, 2000 at 01:24:01AM -0500, Fred L. Drake, Jr. wrote:
>  Would it be reasonable to add panel support as a second extension
>module?  Is there really a need for them to be in the same module,
>since the panel library is a separate library?

Quite possibly, though the patch isn't structured that way.  The panel
module will need access to the type object for the curses window
object, so it'll have to ensure that _curses is already imported, but
that's no problem.

Thomas, do you feel capable of implementing it as a separate module,
or should I work on it?  Probably a _cursesmodule.h header will have
to be created to make various definitions available to external users
of the basic objects in _curses.  (Bonus: this means that the menu and
form libraries, if they ever get wrapped, can be separate modules, too.)

--amk




From tg at melaten.rwth-aachen.de  Wed Dec 13 15:00:46 2000
From: tg at melaten.rwth-aachen.de (Thomas Gellekum)
Date: 13 Dec 2000 15:00:46 +0100
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: Andrew Kuchling's message of "Wed, 13 Dec 2000 07:44:23 -0500"
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
	<14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
	<20001213074423.A30348@kronos.cnri.reston.va.us>
Message-ID: <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>

Andrew Kuchling <akuchlin at mems-exchange.org> writes:

> [CC'ing Thomas Gellekum <tg at melaten.rwth-aachen.de>]
> 
> On Wed, Dec 13, 2000 at 01:24:01AM -0500, Fred L. Drake, Jr. wrote:
> >  Would it be reasonable to add panel support as a second extension
> >module?  Is there really a need for them to be in the same module,
> >since the panel library is a separate library?
> 
> Quite possibly, though the patch isn't structured that way.  The panel
> module will need access to the type object for the curses window
> object, so it'll have to ensure that _curses is already imported, but
> that's no problem.

You mean as separate modules like

import curses
import panel

? Hm. A panel object is associated with a window object, so it's
created from a window method. This means you'd need to add
window.new_panel() to PyCursesWindow_Methods[] and
curses.update_panels(), curses.panel_above() and curses.panel_below()
(or whatever they're called after we're through discussing this ;-))
to PyCurses_Methods[].

Also, the curses.panel_{above,below}() wrappers need access to the
list_of_panels via find_po().

> Thomas, do you feel capable of implementing it as a separate module,
> or should I work on it?

It's probably finished a lot sooner when you do it; OTOH, it would be
fun to try it. Let's carry this discussion a bit further.

>  Probably a _cursesmodule.h header will have
> to be created to make various definitions available to external
> users of the basic objects in _curses.

That's easy. The problem is that we want to extend those basic objects
in _curses.

>  (Bonus: this means that the
> menu and form libraries, if they ever get wrapped, can be separate
> modules, too.)

Sure, if we solve this for panel, the others are a SMOP. :-)

tg



From guido at python.org  Wed Dec 13 15:31:52 2000
From: guido at python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 09:31:52 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src README,1.106,1.107
In-Reply-To: Your message of "Wed, 13 Dec 2000 06:14:35 PST."
             <200012131414.GAA20849@slayer.i.sourceforge.net> 
References: <200012131414.GAA20849@slayer.i.sourceforge.net> 
Message-ID: <200012131431.JAA21243@cj20424-a.reston1.va.home.com>

> + --with-cxx=<compiler>: Some C++ compilers require that main() is
> +         compiled with the C++ if there is any C++ code in the application.
> +         Specifically, g++ on a.out systems may require that to support
> +         construction of global objects. With this option, the main() function
> +         of Python will be compiled with <compiler>; use that only if you
> +         plan to use C++ extension modules, and if your compiler requires
> +         compilation of main() as a C++ program.

Thanks for documenting this; see my continued reservation in the
(reopened) bug report.

Another question remains regarding the docs though: why is it bad to
always compile main.c with a C++ compiler?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Wed Dec 13 16:19:01 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 10:19:01 -0500 (EST)
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
	<14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
	<20001213074423.A30348@kronos.cnri.reston.va.us>
	<kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>
Message-ID: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>

Thomas Gellekum writes:
 > You mean as separate modules like
 > 
 > import curses
 > import panel

  Or better yet:

	import curses
	import curses.panel

 > ? Hm. A panel object is associated with a window object, so it's
 > created from a window method. This means you'd need to add
 > window.new_panel() to PyCursesWindow_Methods[] and
 > curses.update_panels(), curses.panel_above() and curses.panel_below()
 > (or whatever they're called after we're through discussing this ;-))
 > to PyCurses_Methods[].

  Do these new functions have to be methods on the window objects, or
can they be functions in the new module that take a window as a
parameter?  The underlying window object can certainly provide slots
for the use of the panel (forms, ..., etc.) bindings, and simply
initialize them to NULL (or whatever) for newly created windows.

 > Also, the curses.panel_{above,below}() wrappers need access to the
 > list_of_panels via find_po().

  There's no reason that underlying utilities can't be provided by
_curses using a CObject.  The Extending & Embedding manual has a
section on using CObjects to provide a C API to a module without
having to link to it directly.

 > That's easy. The problem is that we want to extend those basic objects
 > in _curses.

  Again, I'm curious about the necessity of this.  I suspect it can be
avoided.  I think the approach I've hinted at above will allow you to
avoid this, and will allow the panel (forms, ...) support to be added
simply by adding additional modules as they are written and the
underlying libraries are installed on the host.
  I know the question of including these modules in the core
distribution has come up before, but the resurgence in interest in
these makes me want to bring it up again:  Does the curses package
(and the associated C extension(s)) belong in the standard library, or
does it make sense to spin out a distutils-based package?  I've no
objection to them being in the core, but it seems that the release
cycle may want to diverge from Python's.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From guido at python.org  Wed Dec 13 16:48:50 2000
From: guido at python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 10:48:50 -0500
Subject: [Python-Dev] Online help PEP
In-Reply-To: Your message of "Tue, 12 Dec 2000 10:11:13 PST."
             <3A366A41.1A14EFD4@ActiveState.com> 
References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com>  
            <3A366A41.1A14EFD4@ActiveState.com> 
Message-ID: <200012131548.KAA21344@cj20424-a.reston1.va.home.com>

[Paul's PEP]
> > >         help( "string" ) -- built-in topic or global

[me]
> > Why does a global require string quotes?

[Paul]
> It doesn't, but if you happen to say 
> 
> help( "dir" ) instead of help( dir ), I think it should do the right
> thing.

Fair enough.

> > I'm missing
> > 
> >           help() -- table of contents
> > 
> > I'm not sure if the table of contents should be printed by the repr
> > output.
> 
> I don't see any benefit in having different behaviors for help and
> help().

Having the repr() overloading invoke the pager is dangerous.  The beta
version of the license command did this, and it caused some strange
side effects, e.g. vars(__builtins__) would start reading from input
and confuse the users.  The new version's repr() returns the desired
string if it's less than a page, and 'Type license() to see the full
license text' if the pager would need to be invoked.

> > >     If you ask for a global, it can be a fully-qualfied name such as
> > >     help("xml.dom").
> > 
> > Why are the string quotes needed?  When are they useful?
> 
> When you haven't imported the thing you are asking about. Or when the
> string comes from another UI like an editor window, command line or web
> form.

The implied import is a major liability.  If you can do this without
importing (e.g. by source code inspection), fine.  Otherwise, you
might issue some kind of message like "you must first import XXX.YYY".

> > >     You can also use the facility from a command-line
> > >
> > >     python --help if
> > 
> > Is this really useful?  Sounds like Perlism to me.
> 
> I'm just trying to make it easy to quickly get answers to Python
> questions. I could totally see someone writing code in VIM switching to
> a bash window to type:
> 
> python --help os.path.dirname
> 
> That's a lot easier than:
> 
> $ python
> Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
> Type "copyright", "credits" or "license" for more information.
> >>> import os
> >>> help(os.path.dirname)
> 
> And what does it hurt?

The hurt is code bloat in the interpreter and creeping featurism.  If
you need command line access to the docs (which may be a reasonable
thing to ask for, although to me it sounds backwards :-), it's better
to provide a separate command, e.g. pythondoc.  (Analog to perldoc.)

> > >     In either situation, the output does paging similar to the "more"
> > >     command.
> > 
> > Agreed.  But how to implement paging in a platform-dependent manner?
> > On Unix, os.system("more") or "$PAGER" is likely to work.  On Windows,
> > I suppose we could use its MORE, although that's pretty braindead.  On
> > the Mac?  Also, inside IDLE or Pythonwin, invoking the system pager
> > isn't a good idea.
> 
> The current implementation does paging internally. You could override it
> to use the system pager (or no pager).

Yes.  Please add that option to the PEP.
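[Editorial sketch] An overridable internal pager along those lines might look like (names hypothetical):

```python
import sys

def page(text, pagesize=22, out=sys.stdout, ask=input):
    """Print a screenful at a time; callers override out/ask to plug in
    the system pager, a GUI, or no paging at all."""
    lines = text.splitlines()
    for start in range(0, len(lines), pagesize):
        out.write("\n".join(lines[start:start + pagesize]) + "\n")
        if start + pagesize < len(lines):
            ask("-- more --")
```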

> > What does "demand-loaded" mean in a Python context?
> 
> When you "touch" the help object, it loads the onlinehelp module which
> has the real implementation. The thing in __builtins__ is just a
> lightweight proxy.

Please suggest an implementation.

> > >     It should also be possible to override the help display function by
> > >     assigning to onlinehelp.displayhelp(object_or_string).
> > 
> > Good idea.  Pythonwin and IDLE could use this.  But I'd like it to
> > work at least "okay" if they don't.
> 
> Agreed.

Glad you're so agreeable. :)

> > >     The module should be able to extract module information from either
> > >     the HTML or LaTeX versions of the Python documentation. Links should
> > >     be accommodated in a "lynx-like" manner.
> > 
> > I think this is beyond the scope.
> 
> Well, we have to do one of:
> 
>  * Re-write a subset of the docs in a form that can be accessed from the
> command line
>  * Access the existing docs in a form that's installed
>  * Auto-convert the docs into a form that's compatible

I really don't think that this tool should attempt to do everything.

If someone *really* wants to browse the existing (large) doc set in a
terminal emulation window, let them use lynx and point it to the
documentation set.  (I agree that the HTML docs should be installed,
by the way.)

> I've already implemented HTML parsing and LaTeX parsing is actually not
> that far off. I just need impetus to finish a LaTeX-parsing project I
> started on my last vacation.

A LaTeX parser would be most welcome -- if it could replace
latex2html!  That old Perl program is really ready for retirement.
(Ask Fred.)

> The reason that LaTeX is interesting is because it would be nice to be
> able to move documentation from existing LaTeX files into docstrings.

That's what some people think.  I disagree that it would be either
feasible or a good idea to put all documentation for a typical module
in its doc strings.

> > The LaTeX isn't installed anywhere
> > (and processing would be too much work).
> > The HTML is installed only
> > on Windows, where there already is a way to get it to pop up in your
> > browser (actually two: it's in the Start menu, and also in IDLE's Help
> > menu).
> 
> If the documentation becomes an integral part of the Python code, then
> it will be installed. It's ridiculous that it isn't already.

Why is that ridiculous?  It's just as easy to access them through the
web for most people.  If it's not, they are available in easily
downloadable tarballs supporting a variety of formats.  That's just
too much to be included in the standard RPMs.  (Also, latex2html
requires so much hand-holding, and is so slow, that it's really not a
good idea to let "make install" install the HTML by default.)

> ActivePython does install the docs on all platforms.

Great.  More power to you.

> > A standard syntax for docstrings is under development, PEP 216.  I
> > don't agree with the proposal there, but in any case the help PEP
> > should not attempt to legalize a different format than PEP 216.
> 
> I won't hold my breath for a standard Python docstring format. I've gone
> out of my way to make the code format-independent.

To tell you the truth, I'm not holding my breath either. :-)  So your
code should just dump the doc string on stdout without interpreting it
in any way (except for paging).

> > Neat.  I noticed that in a 24-line screen, the pagesize must be set to
> > 21 to avoid stuff scrolling off the screen.  Maybe there's an off-by-3
> > error somewhere?
> 
> Yes.

It's buggier than just that.  The output of the pager prints an extra
"| " at the start of each page except for the first, and the first
page is a line longer than subsequent pages.

BTW, another bug: try help(cgi).  It's nice that it gives the default
value for arguments, but the defaults for FieldStorage.__init__ happen
to include os.environ.  Its entire value is dumped -- which causes the
pager to be off (it wraps over about 20 lines for me).  I think you
may have to truncate long values a bit, e.g. by using the repr module.
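[Editorial sketch] The repr module Guido mentions (reprlib in modern Python) does bounded-length elision out of the box:

```python
import reprlib

value = "x" * 10000            # stand-in for a huge default like os.environ
short = reprlib.repr(value)    # truncated, with "..." marking the cut
assert "..." in short and len(short) < 50
```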

> > I also noticed that it always prints '1' when invoked as a function.
> > The new license pager in site.py avoids this problem.
> 
> Okay.

Where's the check-in? :-)

> > help("operators") and several others raise an
> > AttributeError('handledocrl').
> 
> Fixed.
> 
> > The "lynx-line links" don't work.
> 
> I don't think that's implemented yet.

I'm not sure what you intended to implement there.  I prefer to see
the raw URLs, then I can do whatever I normally do to paste them into
my preferred webbrowser (which is *not* lynx :-).

> > I think it's naive to expect this help facility to replace browsing
> > the website or the full documentation package.  There should be one
> > entry that says to point your browser there (giving the local
> > filesystem URL if available), and that's it.  The rest of the online
> > help facility should be concerned with exposing doc strings.
> 
> I don't want to replace the documentation. But there is no reason we
> should set out to make it incomplete. If its integrated with the HTML
> then people can choose whatever access mechanism is easiest for them
> right now
> 
> I'm trying hard not to be "naive". Realistically, nobody is going to
> write a million docstrings between now and Python 2.1. It is much more
> feasible to leverage the existing documentation that Fred and others
> have spent months on.

I said above, and I'll say it again: I think the majority of people
would prefer to use their standard web browser to read the standard
docs.  It's not worth the effort to try to make those accessible
through help().  In fact, I'd encourage the development of a
command-line-invoked help facility that shows doc strings in the
user's preferred web browser -- the webbrowser module makes this
trivial.
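[Editorial sketch] A minimal version of that suggestion (function names hypothetical; webbrowser.open is the standard call Guido alludes to):

```python
import html
import inspect
import os
import tempfile
import webbrowser

def render_doc_html(obj):
    """Turn an object's doc string into a bare HTML page."""
    doc = inspect.getdoc(obj) or "(no doc string)"
    return "<html><body><pre>%s</pre></body></html>" % html.escape(doc)

def browse_doc(obj):
    """Write the page to a temp file and hand it to the user's browser."""
    fd, path = tempfile.mkstemp(suffix=".html")
    with os.fdopen(fd, "w") as f:
        f.write(render_doc_html(obj))
    webbrowser.open("file://" + path)
    return path
```

A command-line wrapper would then just resolve a dotted name and call browse_doc().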

> > > Security Issues
> > > 
> > >     This module will attempt to import modules with the same names as
> > >     requested topics. Don't use the modules if you are not confident
> > >     that everything in your pythonpath is from a trusted source.
> > Yikes!  Another reason to avoid the "string" -> global variable
> > option.
> 
> I don't think we should lose that option. People will want to look up
> information from non-executable environments like command lines, GUIs
> and web pages. Perhaps you can point me to techniques for extracting
> information from Python modules and packages without executing them.

I don't know specific tools, but any serious docstring processing tool
ends up parsing the source code for this very reason, so there's
probably plenty of prior art.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Wed Dec 13 17:07:22 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 11:07:22 -0500 (EST)
Subject: [Python-Dev] Online help PEP
In-Reply-To: <200012131548.KAA21344@cj20424-a.reston1.va.home.com>
References: <3A3480E5.C2577AE6@ActiveState.com>
	<200012111557.KAA24266@cj20424-a.reston1.va.home.com>
	<3A366A41.1A14EFD4@ActiveState.com>
	<200012131548.KAA21344@cj20424-a.reston1.va.home.com>
Message-ID: <14903.40634.569192.704368@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > A LaTeX parser would be most welcome -- if it could replace
 > latex2html!  That old Perl program is really ready for retirement.
 > (Ask Fred.)

  Note that Doc/tools/sgmlconv/latex2esis.py already includes a
moderate start at a LaTeX parser.  Paragraph marking is done as a
separate step in Doc/tools/sgmlconv/docfixer.py, but I'd like to push
that down into the LaTeX handler.
  (Note that these tools are mostly broken at the moment, except for
latex2esis.py, which does most of what I need other than paragraph
marking.)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From Barrett at stsci.edu  Wed Dec 13 17:34:40 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Wed, 13 Dec 2000 11:34:40 -0500 (EST)
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
In-Reply-To: <20001210054646.A5219@glacier.fnational.com>
References: <20001210054646.A5219@glacier.fnational.com>
Message-ID: <14903.41669.883591.420446@nem-srvr.stsci.edu>

Neil Schemenauer writes:
 > Sourceforge uploads are not working.  The latest version of the
 > patch for PEP 208 is here:
 > 
 >     http://arctrix.com/nas/python/coerce-6.0.diff
 > 
 > Operations on instances now call __coerce__ if it exists.  I
 > think the patch is now complete.  Converting other builtin types
 > to "new style numbers" can be done with a separate patch.

My one concern about this patch is whether the non-commutativity of
operators is preserved.  This issue being important for matrix
operations (not to be confused with element-wise array operations).

 -- Paul





From guido at python.org  Wed Dec 13 17:45:12 2000
From: guido at python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 11:45:12 -0500
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
In-Reply-To: Your message of "Wed, 13 Dec 2000 11:34:40 EST."
             <14903.41669.883591.420446@nem-srvr.stsci.edu> 
References: <20001210054646.A5219@glacier.fnational.com>  
            <14903.41669.883591.420446@nem-srvr.stsci.edu> 
Message-ID: <200012131645.LAA21719@cj20424-a.reston1.va.home.com>

> Neil Schemenauer writes:
>  > Sourceforge uploads are not working.  The latest version of the
>  > patch for PEP 208 is here:
>  > 
>  >     http://arctrix.com/nas/python/coerce-6.0.diff
>  > 
>  > Operations on instances now call __coerce__ if it exists.  I
>  > think the patch is now complete.  Converting other builtin types
>  > to "new style numbers" can be done with a separate patch.
> 
> My one concern about this patch is whether the non-commutativity of
> operators is preserved.  This issue being important for matrix
> operations (not to be confused with element-wise array operations).

Yes, this is preserved.  (I'm spending most of my waking hours
understanding this patch -- it is a true piece of wizardry.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Wed Dec 13 18:38:00 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 13 Dec 2000 18:38:00 +0100
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
References: <20001210054646.A5219@glacier.fnational.com>  
	            <14903.41669.883591.420446@nem-srvr.stsci.edu> <200012131645.LAA21719@cj20424-a.reston1.va.home.com>
Message-ID: <3A37B3F7.5640FAFC@lemburg.com>

Guido van Rossum wrote:
> 
> > Neil Schemenauer writes:
> >  > Sourceforge uploads are not working.  The latest version of the
> >  > patch for PEP 208 is here:
> >  >
> >  >     http://arctrix.com/nas/python/coerce-6.0.diff
> >  >
> >  > Operations on instances now call __coerce__ if it exists.  I
> >  > think the patch is now complete.  Converting other builtin types
> >  > to "new style numbers" can be done with a separate patch.
> >
> > My one concern about this patch is whether the non-commutativity of
> > operators is preserved.  This issue is important for matrix
> > operations (not to be confused with element-wise array operations).
> 
> Yes, this is preserved.  (I'm spending most of my waking hours
> understanding this patch -- it is a true piece of wizardry.)

The fact that coercion didn't allow detection of parameter
order was the initial cause for my try at fixing it back then.
I was confronted with the fact that at C level there was no way
to tell whether the operands were in the order left, right or
right, left -- as a result I used a gross hack in mxDateTime
to still make this work...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Wed Dec 13 22:01:46 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Dec 2000 16:01:46 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 10:19:01AM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
Message-ID: <20001213160146.A24753@thyrsus.com>

Fred L. Drake, Jr. <fdrake at acm.org>:
>   I know the question of including these modules in the core
> distribution has come up before, but the resurgence in interest in
> these makes me want to bring it up again:  Does the curses package
> (and the associated C extension(s)) belong in the standard library, or
> does it make sense to spin out a distutils-based package?  I've no
> objection to them being in the core, but it seems that the release
> cycle may want to diverge from Python's.

Curses needs to be in the core for political reasons.  Specifically, 
to support CML2 without requiring any extra packages or downloads
beyond the stock Python interpreter.

And what makes CML2 so constrained and so important?  It's my bid to
replace the Linux kernel's configuration machinery.  It has many
advantages over the existing config system, but the linux developers
are *very* resistant to adding things to the kernel's minimum build
kit.  Python alone may prove too much for them to swallow (though
there are hopeful signs they will); Python plus a separately
downloadable curses module would definitely be too much.

Guido attaches sufficient importance to getting Python into the kernel
build machinery that he approved adding ncurses to the standard modules
on that basis.  This would be a huge design win for us, raising Python's
visibility considerably.

So curses must stay in the core.  I don't have a requirement for
panels; my present curses front end simulates them. But if panels were
integrated into the core I could simplify the front-end code
significantly.  Every line I can remove from my stuff (even if it, in
effect, is just migrating into the Python core) makes it easier to
sell CML2 into the kernel.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Experience should teach us to be most on our guard to protect liberty
when the government's purposes are beneficient...  The greatest dangers
to liberty lurk in insidious encroachment by men of zeal, well meaning
but without understanding."
	-- Supreme Court Justice Louis Brandeis



From jheintz at isogen.com  Wed Dec 13 22:10:32 2000
From: jheintz at isogen.com (John D. Heintz)
Date: Wed, 13 Dec 2000 15:10:32 -0600
Subject: [Python-Dev] Announcing ZODB-Corba code release
Message-ID: <3A37E5C8.7000800@isogen.com>

Here is the first release of code that exposes a ZODB database through 
CORBA (omniORB).

The code is functioning, the docs are sparse, and it should work on your 
machines.  ;-)

I am only going to be in town for the next two days, then I will be 
unavailable until Jan 1.

See http://www.zope.org/Members/jheintz/ZODB_CORBA_Connection to 
download the code.

It's not perfect, but it works for me.

Enjoy,
John


-- 
. . . . . . . . . . . . . . . . . . . . . . . .

John D. Heintz | Senior Engineer

1016 La Posada Dr. | Suite 240 | Austin TX 78752
T 512.633.1198 | jheintz at isogen.com

w w w . d a t a c h a n n e l . c o m




From guido at python.org  Wed Dec 13 22:19:01 2000
From: guido at python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 16:19:01 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: Your message of "Wed, 13 Dec 2000 16:01:46 EST."
             <20001213160146.A24753@thyrsus.com> 
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>  
            <20001213160146.A24753@thyrsus.com> 
Message-ID: <200012132119.QAA11060@cj20424-a.reston1.va.home.com>

> So curses must stay in the core.  I don't have a requirement for
> panels; my present curses front end simulates them. But if panels were
> integrated into the core I could simplify the front-end code
> significantly.  Every line I can remove from my stuff (even if it, in
> effect, is just migrating into the Python core) makes it easier to
> sell CML2 into the kernel.

On the other hand you may want to be conservative.  You already have
to require Python 2.0 (I presume).  The panel stuff will be available
in 2.1 at the earliest.  You probably shouldn't throw out your panel
emulation until your code has already been accepted...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin at loewis.home.cs.tu-berlin.de  Wed Dec 13 22:56:27 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 13 Dec 2000 22:56:27 +0100
Subject: [Python-Dev] CVS: python/dist/src README,1.106,1.107
Message-ID: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de>

> Another question remains regarding the docs though: why is it bad to
> always compile main.c with a C++ compiler?

For the whole thing to work, it may also be necessary to link the
entire application with a C++ compiler; that in turn may bind to the
C++ library. Linking with the system's C++ library means that the
Python executable cannot be as easily exchanged between installations
of the operating system - you'd also need to have the right version of
the C++ library to run it. If the C++ library is static, that may also
increase the size of the executable.

I can't really point to a specific problem that would occur on a
specific system I use if main() was compiled with a C++
compiler. However, on the systems I use (Windows, Solaris, Linux), you
can build C++ extension modules even if Python was not compiled as a
C++ application.

On Solaris and Windows, you'd also have to choose the C++ compiler you
want to use (MSVC++, SunPro CC, or g++); in turn, different C++
runtime systems would be linked into the application.

Regards,
Martin



From esr at thyrsus.com  Wed Dec 13 23:03:59 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Dec 2000 17:03:59 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012132119.QAA11060@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 04:19:01PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com>
Message-ID: <20001213170359.A24915@thyrsus.com>

Guido van Rossum <guido at python.org>:
> > So curses must stay in the core.  I don't have a requirement for
> > panels; my present curses front end simulates them. But if panels were
> > integrated into the core I could simplify the front-end code
> > significantly.  Every line I can remove from my stuff (even if it, in
> > effect, is just migrating into the Python core) makes it easier to
> > sell CML2 into the kernel.
> 
> On the other hand you may want to be conservative.  You already have
> to require Python 2.0 (I presume).  The panel stuff will be available
> in 2.1 at the earliest.  You probably shouldn't throw out your panel
> emulation until your code has already been accepted...

Yes, that's how I am currently expecting it to play out -- but if the 2.4.0
kernel is delayed another six months, I'd change my mind.  I'll explain this,
because python-dev people should grok what the surrounding politics and timing 
are.

I actually debated staying with 1.5.2 as a base version.  What changed
my mind was two things.  One: by going to 2.0 I could drop close to 600
lines and three entire support modules from CML2, slimming down its 
footprint in the kernel tree significantly (by more than 10% of the 
entire code volume, actually).

Second: CML2 is not going to be seriously evaluated until 2.4.0 final is out.
Linus made this clear when I demoed it for him at LWE.  My best guess about 
when that will happen is late January into February.  By the time Red Hat
issues its next distro after that (probably May or thereabouts) it's a safe
bet 2.0 will be on it, and everywhere else.

But if the 2.4.0 kernel slips another six months yet again, and our
2.1 comes out relatively quickly (like, just before the 9th Python
Conference :-)) then we *might* have time to get 2.1 into the distros
before CML2 gets the imprimatur.

So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel
will be delayed yet again :-).
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Ideology, politics and journalism, which luxuriate in failure, are
impotent in the face of hope and joy.
	-- P. J. O'Rourke



From nas at arctrix.com  Wed Dec 13 16:37:45 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 07:37:45 -0800
Subject: [Python-Dev] CVS: python/dist/src README,1.106,1.107
In-Reply-To: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Wed, Dec 13, 2000 at 10:56:27PM +0100
References: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de>
Message-ID: <20001213073745.C17148@glacier.fnational.com>

These are issues to consider for Python 3000 as well.  AFAIK, C++
ABIs are a nightmare.

  Neil



From fdrake at acm.org  Wed Dec 13 23:29:25 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 17:29:25 -0500 (EST)
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001213170359.A24915@thyrsus.com>
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
	<14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
	<20001213074423.A30348@kronos.cnri.reston.va.us>
	<kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>
	<14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
	<20001213160146.A24753@thyrsus.com>
	<200012132119.QAA11060@cj20424-a.reston1.va.home.com>
	<20001213170359.A24915@thyrsus.com>
Message-ID: <14903.63557.282592.796169@cj42289-a.reston1.va.home.com>

Eric S. Raymond writes:
 > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel
 > will be delayed yet again :-).

  Politics aside, I think development of curses-related extensions
like panels and forms doesn't need to be delayed.  I've posted what I
think are relevant technical comments already, and leave it up to the
developers of any new modules to get them written -- I don't know
enough curses to offer any help there.
  Regardless of how the curses package is distributed and deployed, I
don't see any reason to delay development in its existing location in
the Python CVS repository.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From nas at arctrix.com  Wed Dec 13 16:41:54 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 07:41:54 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001213170359.A24915@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 13, 2000 at 05:03:59PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com>
Message-ID: <20001213074154.D17148@glacier.fnational.com>

On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote:
> CML2 is not going to be seriously evaluated until 2.4.0 final
> is out.  Linus made this clear when I demoed it for him at LWE.
> My best guess about when that will happen is late January into
> February.  By the time Red Hat issues its next distro after
> that (probably May or thenabouts) it's a safe bet 2.0 will be
> on it, and everywhere else.

I don't think that is a very safe bet.  Python 2.0 missed the
Debian Potato boat.  I have no idea when Woody is expected to be
released but I expect it may take longer than that if history is
any indication.

  Neil



From guido at python.org  Thu Dec 14 00:03:31 2000
From: guido at python.org (Guido van Rossum)
Date: Wed, 13 Dec 2000 18:03:31 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: Your message of "Wed, 13 Dec 2000 07:41:54 PST."
             <20001213074154.D17148@glacier.fnational.com> 
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com>  
            <20001213074154.D17148@glacier.fnational.com> 
Message-ID: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>

> I don't think that is a very safe bet.  Python 2.0 missed the
> Debian Potato boat.

This may have had more to do with the unresolved GPL issues.  I
recently received a mail from Stallman indicating that an agreement
with CNRI has been reached; they have agreed (in principle, at least)
to specific changes to the CNRI license that will defuse the
choice-of-law clause when it is combined with GPL-licensed code "in a
non-separable way".  A glitch here is that the BeOpen license probably
has to be changed too, but I believe that that's all doable.

> I have no idea when Woody is expected to be
> released but I expect it may take longer than that if history is
> any indication.

And who or what is Woody?

Feeling-left-out,

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gstein at lyra.org  Thu Dec 14 00:16:09 2000
From: gstein at lyra.org (Greg Stein)
Date: Wed, 13 Dec 2000 15:16:09 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com>
Message-ID: <20001213151609.E8951@lyra.org>

On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote:
>...
> > I have no idea when Woody is expected to be
> > released but I expect it may take longer than that if history is
> > any indication.
> 
> And who or what is Woody?

One of the Debian releases. Dunno if it is the "next" release, but there ya
go.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From gstein at lyra.org  Thu Dec 14 00:18:34 2000
From: gstein at lyra.org (Greg Stein)
Date: Wed, 13 Dec 2000 15:18:34 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001213170359.A24915@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 13, 2000 at 05:03:59PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com>
Message-ID: <20001213151834.F8951@lyra.org>

On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote:
>...
> So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel
> will be delayed yet again :-).

The kernel is not going to be delayed that much. Linus wants it to go out
this month. Worst case, I could see January. But no way on six months.

But as Fred said: that should not change panels going into the curses
support at all. You can always have a "compat.py" module in CML2 that
provides functionality for prior-to-2.1 releases of Python.

I'd also be up for a separate _curses_panels module, loaded into the curses
package.
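The compat-module idea Greg describes could look something like the
sketch below (a hypothetical compat.py; the emulated class is a stand-in
for whatever fallback code CML2 already carries, not a real API):

```python
# Hypothetical "compat.py": use the native curses.panel module when the
# running Python provides it, otherwise fall back to a local emulation.
try:
    from curses import panel as _panel  # present in 2.1+ in this scenario
    HAVE_PANEL = True
except ImportError:
    HAVE_PANEL = False

if HAVE_PANEL:
    def new_panel(win):
        # delegate to the real extension module
        return _panel.new_panel(win)
else:
    class _EmulatedPanel:  # stand-in for CML2's own panel emulation
        def __init__(self, win):
            self.win = win

    def new_panel(win):
        return _EmulatedPanel(win)
```

Callers import new_panel from compat.py and never care which branch
they got, so the emulation can be dropped the moment 2.1 is a safe
requirement.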

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From esr at thyrsus.com  Thu Dec 14 00:33:02 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Dec 2000 18:33:02 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001213151834.F8951@lyra.org>; from gstein@lyra.org on Wed, Dec 13, 2000 at 03:18:34PM -0800
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213151834.F8951@lyra.org>
Message-ID: <20001213183302.A25160@thyrsus.com>

Greg Stein <gstein at lyra.org>:
> On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote:
> >...
> > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel
> > will be delayed yet again :-).
> 
> The kernel is not going to be delayed that much. Linus wants it to go out
> this month. Worst case, I could see January. But no way on six months.

I know what Linus wants.  That's why I'm estimating end of January or
early February -- the man's error curve on these estimates has a
certain, er, *consistency* about it.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Alcohol still kills more people every year than all `illegal' drugs put
together, and Prohibition only made it worse.  Oppose the War On Some Drugs!



From nas at arctrix.com  Wed Dec 13 18:18:48 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 09:18:48 -0800
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com>
Message-ID: <20001213091848.A17326@glacier.fnational.com>

On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote:
> > I don't think that is a very safe bet.  Python 2.0 missed the
> > Debian Potato boat.
> 
> This may have had to do more with the unresolved GPL issues.

I can't remember the exact dates but I think Debian Potato was
frozen before Python 2.0 was released.  Once a Debian release is
frozen packages are not upgraded except under unusual
circumstances.

> I recently received a mail from Stallman indicating that an
> agreement with CNRI has been reached; they have agreed (in
> principle, at least) to specific changes to the CNRI license
> that will defuse the choice-of-law clause when it is combined
> with GPL-licensed code "in a non-separable way".  A glitch here
> is that the BeOpen license probably has to be changed too, but
> I believe that that's all doable.

This is great news.

> > I have no idea when Woody is expected to be
> > released but I expect it may take longer than that if history is
> > any indication.
> 
> And who or what is Woody?

Woody would be another character from the Pixar movie "Toy Story"
(just like Rex, Bo, Potato, Slink, and Hamm).  I believe Bruce
Perens used to work at Pixar.  Debian uses a code name for the
development release until a release number is assigned.  This
avoids some problems but has the disadvantage of confusing people
who are not familiar with Debian.  I should have said "the next
stable release of Debian".


  Neil (aka nas at debian.org)



From akuchlin at mems-exchange.org  Thu Dec 14 01:26:32 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 13 Dec 2000 19:26:32 -0500
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 10:19:01AM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
Message-ID: <20001213192632.A30585@kronos.cnri.reston.va.us>

On Wed, Dec 13, 2000 at 10:19:01AM -0500, Fred L. Drake, Jr. wrote:
>  Do these new functions have to be methods on the window objects, or
>can they be functions in the new module that take a window as a
>parameter?  The underlying window object can certainly provide slots

Panels and windows have a 1-1 association, but they're separate
objects.  The window.new_panel function could become just a method
which takes a window as its first argument; it would only need the
TypeObject for PyCursesWindow, in order to do typechecking.

> > Also, the curses.panel_{above,below}() wrappers need access to the
> > list_of_panels via find_po().

The list_of_panels is used only in the curses.panel module, so it
could be private to that module, since only panel-related functions
care about it.  

I'm ambivalent about the list_of_panels.  It's a linked list storing
(PyWindow, PyPanel) pairs.  Probably it should use a dictionary
instead of implementing a little list, just to reduce the amount of
code.
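The dictionary approach could be sketched like this (in Python, purely
for illustration -- the real bookkeeping lives in C, and Window/Panel
here are stand-ins for the actual PyCursesWindow/panel objects):

```python
# Sketch of the window->panel bookkeeping using a dict instead of a
# hand-rolled linked list of (window, panel) pairs.
class Window:
    pass

class Panel:
    def __init__(self, window):
        self.window = window  # the 1-1 associated window

_panels = {}  # the "list_of_panels": maps each Window to its Panel

def new_panel(window):
    """Create and register the panel associated with a window."""
    panel = Panel(window)
    _panels[window] = panel
    return panel

def find_po(window):
    """Look up a window's panel, or None if it has no panel."""
    return _panels.get(window)
```

The dict makes the 1-1 association explicit and eliminates the
list-walking code entirely, which is the code-size reduction amk is
after.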

>does it make sense to spin out a distutils-based package?  I've no
>objection to them being in the core, but it seems that the release
>cycle may want to diverge from Python's.

Consensus seemed to be to leave it in; I'd have no objection to
removing it, but either course is fine with me.

So, I suggest we create _curses_panel.c, which would be available as
curses.panel.  (A panel.py module could then add any convenience
functions that are required.)

Thomas, do you want to work on this, or should I?

--amk



From nas at arctrix.com  Wed Dec 13 18:43:06 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 09:43:06 -0800
Subject: [Python-Dev] OT: Debian and Python
In-Reply-To: <20001214010534.M4396@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 14, 2000 at 01:05:34AM +0100
References: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl>
Message-ID: <20001213094306.C17326@glacier.fnational.com>

On Thu, Dec 14, 2000 at 01:05:34AM +0100, Thomas Wouters wrote:
> Note to the debian-pythoneers: woody still carries Python 1.5.2, not 2.0.
> Someone created a separate set of 2.0-packages, but they didn't include
> readline and gdbm support because of the licensing issues. (Posted on c.l.py
> sometime this week.)

I've had Python packages for Debian stable for a while.  I guess
I should have posted a link:

    http://arctrix.com/nas/python/debian/

Most useful modules are enabled.

> I'm *almost* tempted enough to learn enough about
> dpkg/.deb files to build my own licence-be-damned set

It's quite easy.  Debian source packages are basically a diff.
Applying the diff will create a "debian" directory and in that
directory will be a makefile called "rules".  Use the target
"binary" to create new binary packages.  Good things to know are
that you must be in the source directory when you run the
makefile (ie. ./debian/rules binary).  You should be running a
shell under fakeroot to get the install permissions right
(running "fakeroot" will do).  You need to have the Debian
developer tools installed.  There is a list somewhere on
debian.org.  "apt-get source <packagename>" will get, extract and
patch a package ready for tweaking and building (handy for
getting stuff from unstable to run on stable).  

This is too off topic for python-dev.  If anyone needs more info
they can email me directly.

  Neil



From thomas at xs4all.net  Thu Dec 14 01:05:34 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 14 Dec 2000 01:05:34 +0100
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com>
Message-ID: <20001214010534.M4396@xs4all.nl>

On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote:
> > I don't think that is a very safe bet.  Python 2.0 missed the Debian
> > Potato boat.
> 
> This may have had to do more with the unresolved GPL issues.

This is very likely. Debian is very licence -- or at least GPL -- aware.
Which is a pity, really, because I already prefer it over RedHat in all
other cases (and RedHat is also pretty licence aware, just less piously,
devoutly, beyond-practicality-IMHO dedicated to the GPL.)

> > I have no idea when Woody is expected to be released but I expect it may
> > take longer than that if history is any indication.

BTW, I believe Debian uses a fairly steady release schedule, something like
an unstable->stable switch every year or 6 months or so? I seem to recall
seeing something like that on the debian website, but can't check right now.

> And who or what is Woody?

Woody is Debian's current development branch, the current bearer of the
alias 'unstable'. It'll become Debian 2.3 (I believe, I don't pay attention
to version numbers, I just run unstable :) once it's stabilized. 'potato' is
the previous development branch, and currently the 'stable' branch. You can
compare them with 'rawhide' and 'redhat-7.0', respectively :)

(With the enormous difference that you can upgrade your debian install to a
new version (even the devel version, or update your machine to the latest
devel snapshot) while you are using it, without having to reboot ;)

Note to the debian-pythoneers: woody still carries Python 1.5.2, not 2.0.
Someone created a separate set of 2.0-packages, but they didn't include
readline and gdbm support because of the licensing issues. (Posted on c.l.py
sometime this week.) I'm *almost* tempted enough to learn enough about
dpkg/.deb files to build my own licence-be-damned set, but it'd be a lot of
work to mirror the current debian 1.5.2 set of packages (which include
numeric, imaging, mxTools, GTK/GNOME, and a shitload of 3rd party modules)
in 2.0. Ponder, maybe it could be done semi-automatically, from the
src-deb's of those packages.

By the way, in woody, there are 52 packages with 'python' in the name, and
32 with 'perl' in the name... Pity all of my perl-hugging hippy-friends are
still blindly using RedHat, and refuse to listen to my calls from the
Debian/Python-dark-side :-)

Oh, and the names 'woody' and 'potato' came from the movie Toy Story, in
case you wondered ;)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From esr at snark.thyrsus.com  Thu Dec 14 01:46:37 2000
From: esr at snark.thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Dec 2000 19:46:37 -0500
Subject: [Python-Dev] Business related to the upcoming Python conference
Message-ID: <200012140046.TAA25289@snark.thyrsus.com>

I'm sending this to python-dev because I believe most or all of the
reviewers for my PC9 paper are on this list.  Paul, would you please
forward to any who were not?

First, my humble apologies for not having got my PC9 reviews in on time.
I diligently read my assigned papers early, but I couldn't do the
reviews early because of technical problems with my Foretec account --
and then I couldn't do them late because the pre-deadline crunch
happened while I was on a ten-day speaking and business trip in Japan
and California, with mostly poor or nonexistent Internet access.

Matters were not helped by a nasty four-month-old problem in my
personal life coming to a head right in the middle of the trip.  Nor
by the fact that the trip included the VA Linux Systems annual
stockholders' meeting and the toughest Board of Directors' meeting in
my tenure.  We had to hammer out a strategic theory of what to do now
that the dot-com companies who used to be our best companies aren't
getting funded any more.  Unfortunately, it's at times like this that
Board members earn their stock options.  Management oversight.
Fiduciary responsibility.  Mumble...

Second, the feedback I received on the paper was *excellent*, and I
will be making many of the recommended changes.  I've already extended
the discussion of "Why Python?" including addressing the weaknesses of
Scheme and Prolog for this application.  I have said more about uses
of CML2 beyond the Linux kernel.  I am working on a discussion of the
politics of CML2 adoption, but may save that for the stand-up talk
rather than the written paper.  I will try to trim the CML2 language
reference for the final version.

(The reviewer who complained about the lack of references on the SAT 
problem should be pleased to hear that URLs to relevant papers are in
fact included in the masters.  I hope they show in the final version
as rendered for publication.)
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The Constitution is not neutral. It was designed to take the
government off the backs of the people.
	-- Justice William O. Douglas 



From moshez at zadka.site.co.il  Thu Dec 14 13:22:24 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Thu, 14 Dec 2000 14:22:24 +0200 (IST)
Subject: [Python-Dev] Splitting up _cursesmodule
Message-ID: <20001214122224.739EEA82E@darjeeling.zadka.site.co.il>

On Wed, 13 Dec 2000 07:41:54 -0800, Neil Schemenauer <nas at arctrix.com> wrote:

> I don't think that is a very safe bet.  Python 2.0 missed the
> Debian Potato boat.

By a long time -- potato was frozen for a few months when 2.0 came out.

>  I have no idea when Woody is expected to be
> released but I expect it may take longer than that if history is
> any indication.

My bet is that woody starts freezing as soon as 2.4.0 is out. 
Note that once it starts freezing, 2.1 doesn't have a shot
of getting in, regardless of how long it takes to freeze.
OTOH, since in woody time there's a good chance for the "testing"
distribution, a lot more people would be running something
that *can* and *will* upgrade to 2.1 almost as soon as it is
out.
(For the record, most of the Debian users I know run woody on 
their server)
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From jeremy at alum.mit.edu  Thu Dec 14 06:04:43 2000
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 14 Dec 2000 00:04:43 -0500 (EST)
Subject: [Python-Dev] new draft of PEP 227
Message-ID: <14904.21739.804346.650062@bitdiddle.concentric.net>

I've got a new draft of PEP 227.  The terminology and wording are more
convoluted than they need to be.  I'll do at least one revision just
to say things more clearly, but I'd appreciate comments on the
proposed spec if you can read the current draft.

Jeremy




From cgw at fnal.gov  Thu Dec 14 07:03:01 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Thu, 14 Dec 2000 00:03:01 -0600 (CST)
Subject: [Python-Dev] Memory leaks in tupleobject.c
Message-ID: <14904.25237.654143.861733@buffalo.fnal.gov>

I've been running a set of memory-leak tests against the latest Python
and have found that running "test_extcall" leaks memory.  This gave me
a strange sense of deja vu, having fixed this once before...


From nas at arctrix.com  Thu Dec 14 00:43:43 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 13 Dec 2000 15:43:43 -0800
Subject: [Python-Dev] Memory leaks in tupleobject.c
In-Reply-To: <14904.25237.654143.861733@buffalo.fnal.gov>; from cgw@fnal.gov on Thu, Dec 14, 2000 at 12:03:01AM -0600
References: <14904.25237.654143.861733@buffalo.fnal.gov>
Message-ID: <20001213154343.A18303@glacier.fnational.com>

On Thu, Dec 14, 2000 at 12:03:01AM -0600, Charles G Waldman wrote:
>  date: 2000/10/05 19:36:49;  author: nascheme;  state: Exp;  lines: +24 -86
>  Simplify _PyTuple_Resize by not using the tuple free list and dropping
>  support for the last_is_sticky flag.  A few hard to find bugs may be
>  fixed by this patch since the old code was buggy.
> 
> The 2.47 patch seems to have re-introduced the memory leak which was
> fixed in 2.31.  Maybe the old code was buggy, but the "right thing"
> would have been to fix it, not to throw it away.... if _PyTuple_Resize
> simply ignores the tuple free list, memory will be leaked.

Guilty as charged.  Can you explain how the current code is
leaking memory?  I can see one problem with deallocating size=0
tuples.  Are there any more leaks?

  Neil



From cgw at fnal.gov  Thu Dec 14 07:57:05 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Thu, 14 Dec 2000 00:57:05 -0600 (CST)
Subject: [Python-Dev] Memory leaks in tupleobject.c
In-Reply-To: <20001213154343.A18303@glacier.fnational.com>
References: <14904.25237.654143.861733@buffalo.fnal.gov>
	<20001213154343.A18303@glacier.fnational.com>
Message-ID: <14904.28481.292539.354303@buffalo.fnal.gov>

Neil Schemenauer writes:

 > Guilty as charged.  Can you explain how the current code is
 > leaking memory?  I can see one problem with deallocating size=0
 > tuples.  Are there any more leaks?

Actually, I think I may have spoken too hastily - it's late and I'm
tired and I should be sleeping rather than staring at the screen 
(like I've been doing since 8:30 this morning) - I jumped to
conclusions - I'm not really sure that it was your patch that caused
the leak; all I can say with 100% certainty is that if you run
"test_extcall" in a loop, memory usage goes through the ceiling....
It's not just the cyclic garbage caused by the "saboteur" function
because even with this commented out, the memory leak persists.

I'm actually trying to track down a different memory leak, something
which is currently causing trouble in one of our production servers
(more about this some other time) and just as a sanity check I ran my
little "leaktest.py" script over all the test_*.py modules in the
distribution, and found that test_extcall triggers leaks... having
analyzed and fixed this once before (see the CVS logs for
tupleobject.c), I jumped to conclusions about the reason for its
return.  I'll take a more clear-headed and careful look tomorrow and
post something (hopefully) a little more conclusive.  It may have been
some other change that caused this memory leak to re-appear.  If you
feel inclined to investigate, just do "reload(test.test_extcall)" in a
loop and watch the memory usage with ps or top or what-have-you...
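A minimal sketch of the kind of check described above (the actual leaktest.py is not shown here): reload a module in a loop and report the growth in peak resident set size. Note that `resource` is Unix-only, and modern Python spells the old `reload()` builtin as `importlib.reload()`.

```python
import importlib
import resource  # Unix-only

def leak_check(module, iterations=100):
    # Peak RSS before and after the reload loop; a leaking module
    # shows steady growth, a healthy one levels off at zero growth.
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for _ in range(iterations):
        importlib.reload(module)
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before
```

ru_maxrss only ever grows, so repeated runs that keep returning nonzero growth are the signal to investigate.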

	 -C




From paulp at ActiveState.com  Thu Dec 14 08:00:21 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Wed, 13 Dec 2000 23:00:21 -0800
Subject: [Python-Dev] new draft of PEP 227
References: <14904.21739.804346.650062@bitdiddle.concentric.net>
Message-ID: <3A387005.6725DAAE@ActiveState.com>

Jeremy Hylton wrote:
> 
> I've got a new draft of PEP 227.  The terminology and wording are more
> convoluted than they need to be.  I'll do at least one revision just
> to say things more clearly, but I'd appreciate comments on the
> proposed spec if you can read the current draft.

It set me to thinking:

Python should never require declarations. But would it necessarily be a
problem for Python to have a variable declaration syntax? Might not the
existence of declarations simplify some aspects of the proposal and of
backwards compatibility?

Along the same lines, might a new rule make Python code more robust? We
could say that a local can only shadow a global if the local is formally
declared. It's pretty rare that there is a good reason to shadow a
global and Python makes it too easy to do accidentally.

 Paul Prescod



From paulp at ActiveState.com  Thu Dec 14 08:29:35 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Wed, 13 Dec 2000 23:29:35 -0800
Subject: [Python-Dev] Online help scope
Message-ID: <3A3876DF.5554080C@ActiveState.com>

I think Guido and I are pretty far apart on the scope and requirements
of this online help thing so I'd like some clarification and opinions
from the peanut gallery.

Consider these scenarios

a) Signature

>>> help( dir )
dir([object]) -> list of strings

b) Usage hint

>>> help( dir )
dir([object]) -> list of strings

Return an alphabetized list of names comprising (some of) the attributes
of the given object.  Without an argument, the names in the current
scope
are listed.  With an instance argument, only the instance attributes are
returned.  With a class argument, attributes of the base class are not
returned.  For other types or arguments, this may list members or
methods.

c) Complete documentation, paged(man-style)

>>> help( dir )
dir([object]) -> list of strings

Without arguments, return the list of names in the current local symbol
table. With an argument, attempts to return a list of valid attributes
for that object. This information is gleaned from the object's __dict__,
__methods__ and __members__ attributes, if defined. The list is not
necessarily complete; e.g., for classes, attributes defined in base
classes are not included, and for class instances, methods are not
included. The resulting list is sorted alphabetically. For example: 

  >>> import sys
  >>> dir()
  ['sys']
  >>> dir(sys)
  ['argv', 'exit', 'modules', 'path', 'stderr', 'stdin', 'stdout']

d) Complete documentation in a user-chosen hypertext window

>>> help( dir )
(Netscape or lynx pops up)

I'm thinking that maybe we need two functions:

 * help
 * pythondoc

pythondoc("dir") would launch the Python documentation for the "dir"
command.

> That'S What Some People Think.  I Disagree That It Would Be Either
> Feasible Or A Good Idea To Put All Documentation For A Typical Module
> In Its Doc Strings.

Java and Perl people do it regularly. I think that in the greater world
of software development, the inline model has won (or is winning) and I
don't see a compelling reason to fight the tide. There will always be
out-of-line tutorials, discussions, books etc. 

The canonical module documentation could be inline. That improves the
likelihood of it being maintained. The LaTeX documentation is a major
bottleneck and moving to XML or SGML will not help. Programmers do not
want to learn documentation systems or syntaxes. They want to write code
and comments.

> I said above, and I'll say it again: I think the majority of people
> would prefer to use their standard web browser to read the standard
> docs.  It's not worth the effort to try to make those accessible
> through help().  

No matter what we decide on the issue above, reusing the standard
documentation is the only practical way of populating the help system in
the short-term. Right now, today, there is a ton of documentation that
exists only in LaTeX and HTML. Tons of modules have no docstrings.
Keywords have no docstrings. Compare the docstring for
urllib.urlretrieve to the HTML documentation.

In fact, you've given me a good idea: if the HTML is not available
locally, I can access it over the web.

 Paul Prescod



From paulp at ActiveState.com  Thu Dec 14 08:29:53 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Wed, 13 Dec 2000 23:29:53 -0800
Subject: [Python-Dev] Online help PEP
References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com>  
		            <3A366A41.1A14EFD4@ActiveState.com> <200012131548.KAA21344@cj20424-a.reston1.va.home.com>
Message-ID: <3A3876F1.D3E65E90@ActiveState.com>

Guido van Rossum wrote:
> 
> Having the repr() overloading invoke the pager is dangerous.  The beta
> version of the license command did this, and it caused some strange
> side effects, e.g. vars(__builtins__) would start reading from input
> and confuse the users.  The new version's repr() returns the desired
> string if it's less than a page, and 'Type license() to see the full
> license text' if the pager would need to be invoked.

I'll add this to the PEP.

> The implied import is a major liability.  If you can do this without
> importing (e.g. by source code inspection), fine.  Otherwise, you
> might issue some kind of message like "you must first import XXX.YYY".

Okay, I'll add to the PEP that an open issue is what strategy to use,
but that we want to avoid implicit import.

> The hurt is code bloat in the interpreter and creeping featurism.  If
> you need command line access to the docs (which may be a reasonable
> thing to ask for, although to me it sounds backwards :-), it's better
to provide a separate command, e.g. pythondoc.  (Analogous to perldoc.)

Okay, I'll add a pythondoc proposal to the PEP.

> Yes.  Please add that option to the PEP.

Done.

> > > What does "demand-loaded" mean in a Python context?
> >
> > When you "touch" the help object, it loads the onlinehelp module which
> > has the real implementation. The thing in __builtins__ is just a
> > lightweight proxy.
> 
> Please suggest an implementation.

In the PEP.

> Glad You'Re So Agreeable. :)

What happened to your capitalization? elisp gone awry? 

> ...
> To Tell You The Truth, I'M Not Holding My Breath Either. :-)  So your
> code should just dump the doc string on stdout without interpreting it
> in any way (except for paging).

I'll do this for the first version.

> It's buggier than just that.  The output of the pager prints an extra
> "| " at the start of each page except for the first, and the first
> page is a line longer than subsequent pages.

For some reason that I now forget, that code is pretty hairy.

> BTW, another bug: try help(cgi).  It's nice that it gives the default
> value for arguments, but the defaults for FieldStorage.__init__ happen
> to include os.environ.  Its entire value is dumped -- which causes the
> pager to be off (it wraps over about 20 lines for me).  I think you
> may have to truncate long values a bit, e.g. by using the repr module.

Okay. There are a lot of little things we need to figure out. Such as
whether we should print out docstrings for private methods etc.

>...
> I don't know specific tools, but any serious docstring processing tool
> ends up parsing the source code for this very reason, so there's
> probably plenty of prior art.

Okay, I'll look into it.

 Paul



From tim.one at home.com  Thu Dec 14 08:35:00 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 02:35:00 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <3A387005.6725DAAE@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENPIDAA.tim.one@home.com>

[Paul Prescod]
> ...
> Along the same lines, might a new rule make Python code more robust?
> We could say that a local can only shadow a global if the local is
> formally declared. It's pretty rare that there is a good reason to
> shadow a global and Python makes it too easy to do accidentally.

I've rarely seen problems due to shadowing a global, but have often seen
problems due to shadowing a builtin.  Alas, if this rule were extended to
builtins too-- where it would do the most good --then the names of builtins
would effectively become reserved words (any code shadowing them today would
be broken until declarations were added, and any code working today may
break tomorrow if a new builtin were introduced that happened to have the
same name as a local).




From pf at artcom-gmbh.de  Thu Dec 14 08:42:59 2000
From: pf at artcom-gmbh.de (Peter Funk)
Date: Thu, 14 Dec 2000 08:42:59 +0100 (MET)
Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py)
In-Reply-To: <200012132039.MAA07496@slayer.i.sourceforge.net> from Moshe Zadka at "Dec 13, 2000 12:39:24 pm"
Message-ID: <m146T2Z-000DmFC@artcom0.artcom-gmbh.de>

Hi,

I think the following change is incompatible and will break applications.

At least I have some server type applications that rely on 
'allow_reuse_address' defaulting to 0, because they use
the 'address already in use' exception, to make sure, that exactly one
server process is running on this port.  One of these applications, 
which is BTW build on top of Fredrik Lundhs 'xmlrpclib' fails to work,
if I change this default in SocketServer.py.  
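The single-instance check being relied on here can be reduced to a few lines of raw socket code (a hypothetical reduction, not the actual application): without SO_REUSEADDR, a second bind to the same port raises "address already in use", which is exactly what the old `allow_reuse_address = 0` default provided.

```python
import socket

def bind_exclusive(host, port):
    # No setsockopt(SOL_SOCKET, SO_REUSEADDR, 1) call here -- this is
    # what allow_reuse_address = 0 amounted to in SocketServer.py.
    # A second bind_exclusive() on the same port fails with EADDRINUSE,
    # guaranteeing that at most one server instance holds the port.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, port))
    s.listen(1)
    return s
```

With the new default of 1, the second bind can succeed as soon as the first socket is closed (or, with SO_REUSEADDR semantics, while the old socket lingers in TIME_WAIT), so the exception can no longer be used as a lock.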

Would you please explain the reasoning behind this change?

Moshe Zadka:
> *** SocketServer.py	2000/09/01 03:25:14	1.19
> --- SocketServer.py	2000/12/13 20:39:17	1.20
> ***************
> *** 158,162 ****
>       request_queue_size = 5
>   
> !     allow_reuse_address = 0
>   
>       def __init__(self, server_address, RequestHandlerClass):
> --- 158,162 ----
>       request_queue_size = 5
>   
> !     allow_reuse_address = 1
>   
>       def __init__(self, server_address, RequestHandlerClass):

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen)



From paul at prescod.net  Thu Dec 14 08:57:30 2000
From: paul at prescod.net (Paul Prescod)
Date: Wed, 13 Dec 2000 23:57:30 -0800
Subject: [Python-Dev] new draft of PEP 227
References: <LNBBLJKPBEHFEDALKOLCEENPIDAA.tim.one@home.com>
Message-ID: <3A387D6A.782E6A3B@prescod.net>

Tim Peters wrote:
> 
> ...
> 
> I've rarely seen problems due to shadowing a global, but have often seen
> problems due to shadowing a builtin.  

Really?

I think that there are two different issues here. One is consciously
choosing to create a new variable but not understanding that there
already exists a variable by that name. (i.e. str, list).

Another is trying to assign to a global but actually shadowing it. There
is no way that anyone coming from another language is going to consider
this transcript reasonable:

>>> a=5
>>> def show():
...    print a
...
>>> def set(val):
...     a=val
...
>>> a
5
>>> show()
5
>>> set(10)
>>> show()
5

It doesn't seem to make any sense. My solution is to make the assignment
in "set" illegal unless you add a declaration that says: "No, really. I
mean it. Override that sucker." As the PEP points out, overriding is
seldom a good idea so the requirement to declare would be rarely
invoked.

Actually, one could argue that there is no good reason to even *allow*
the shadowing of globals. You can always add an underscore to the end of
the variable name to disambiguate.

> Alas, if this rule were extended to
> builtins too-- where it would do the most good --then the names of builtins
> would effectively become reserved words (any code shadowing them today would
> be broken until declarations were added, and any code working today may
> break tomorrow if a new builtin were introduced that happened to have the
> same name as a local).

I have no good solutions to the shadowing-builtins accidently problem.
But I will say that those sorts of problems are typically less subtle:

str = "abcdef"
...
str(5) # You'll get a pretty good error message here!

The "right answer" in terms of namespace theory is to consistently refer
to builtins with a prefix (whether "__builtins__" or "$") but that's
pretty unpalatable from an aesthetic point of view.

 Paul Prescod



From tim.one at home.com  Thu Dec 14 09:41:19 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 03:41:19 -0500
Subject: [Python-Dev] Online help scope
In-Reply-To: <3A3876DF.5554080C@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEOBIDAA.tim.one@home.com>

[Paul Prescod]
> I think Guido and I are pretty far apart on the scope and requirements
> of this online help thing so I'd like some clarification and opinions
> from the peanut gallery.
>
> Consider these scenarios
>
> a) Signature
> ...
> b) Usage hint
> ...
> c) Complete documentation, paged(man-style)
> ...
> d) Complete documentation in a user-chosen hypertext window
> ...

Guido's style guide has a lot to say about docstrings, suggesting that they
were intended to support two scenarios:  #a+#b together (the first line of a
multi-line docstring), and #c+#d together (the entire docstring).  In this
respect I think Guido was (consciously or not) aping elisp's conventions, up
to but not including the elisp convention for naming the arguments in the
first line of a docstring.  The elisp conventions were very successful
(simple, and useful in practice), so aping them is a good thing.

We've had stalemate ever since:  there isn't a single style of writing
docstrings in practice because no single docstring processor has been
blessed, while no docstring processor can gain momentum before being
blessed.  Every attempt to date has erred by trying to do too much, thus
attracting so much complaint that it can't ever become blessed.  The current
argument over PEP 233 appears to be more of the same.

The way to break the stalemate is to err on the side of simplicity:  just
cater to the two obvious (first-line vs whole-string) cases, and for
existing docstrings only.  HTML vs plain text is fluff.  Paging vs
non-paging is fluff.  Dumping to stdout vs displaying in a browser is fluff.
Jumping through hoops for functions and modules whose authors didn't bother
to write docstrings is fluff.  Etc.  People fight over fluff until it fills
the air and everyone chokes to death on it <0.9 wink>.

Something dirt simple can get blessed, and once *anything* is blessed, a
million docstrings will bloom.
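The two cases distinguished above can be sketched in a few lines (hypothetical helper names, not the proposed implementation):

```python
def brief(obj):
    # Scenarios a/b: the first line of the docstring serves as the
    # signature / usage hint, following the elisp-style convention.
    doc = obj.__doc__ or ""
    return doc.strip().split("\n")[0]

def full(obj):
    # Scenarios c/d: the whole docstring is the complete documentation,
    # dumped as-is without any markup interpretation.
    return (obj.__doc__ or "").strip()

def example():
    """example() -> None

    Longer description on the following lines.
    """

assert brief(example) == "example() -> None"
assert "Longer description" in full(example)
```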

[Guido]
> That'S What Some People Think.  I Disagree That It Would Be Either
> Feasible Or A Good Idea To Put All Documentation For A Typical Module
> In Its Doc Strings.

I'm with Paul on this one:  that's what module.__doc__ is for, IMO (Javadoc
is great, Eiffel's embedded doc tools are great, Perl POD is great, even
REBOL's interactive help is great).  All Java, Eiffel, Perl and REBOL have
in common that Python lacks is *a* blessed system, no matter how crude.

[back to Paul]
> ...
> No matter what we decide on the issue above, reusing the standard
> documentation is the only practical way of populating the help system
> in the short-term. Right now, today, there is a ton of documentation
> that exists only in LaTeX and HTML. Tons of modules have no docstrings.

Then write tools to automatically create docstrings from the LaTeX and HTML,
but *check in* the results (i.e., add the docstrings so created to the
codebase), and keep the help system simple.

> Keywords have no docstrings.

Neither do integers, but they're obvious too <wink>.




From thomas at xs4all.net  Thu Dec 14 10:13:49 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 14 Dec 2000 10:13:49 +0100
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: <20001214010534.M4396@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 14, 2000 at 01:05:34AM +0100
References: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl>
Message-ID: <20001214101348.N4396@xs4all.nl>

On Thu, Dec 14, 2000 at 01:05:34AM +0100, Thomas Wouters wrote:

> By the way, in woody, there are 52 packages with 'python' in the name, and
> 32 with 'perl' in the name...

Ah, not true, sorry. I shouldn't have posted off-topic stuff after being
awoken by machine-down-alarms ;) That was just what my reasonably-default
install had installed. Debian has what looks like most CPAN modules as
packages, too, so it's closer to a 110/410 spread (python/perl.) Still, not
a bad number :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mal at lemburg.com  Thu Dec 14 11:32:58 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 11:32:58 +0100
Subject: [Python-Dev] new draft of PEP 227
References: <14904.21739.804346.650062@bitdiddle.concentric.net>
Message-ID: <3A38A1DA.7EC49149@lemburg.com>

Jeremy Hylton wrote:
> 
> I've got a new draft of PEP 227.  The terminology and wording are more
> convoluted than they need to be.  I'll do at least one revision just
> to say things more clearly, but I'd appreciate comments on the
> proposed spec if you can read the current draft.

The PEP doesn't mention the problems I pointed out about 
breaking the lookup schemes w/r to symbols in methods, classes
and globals.

Please add a comment about this to the PEP + maybe the example
I gave in one the posts to python-dev about it. I consider
the problem serious enough to limit the nested scoping
to lambda functions (or functions in general) only if that's
possible.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Thu Dec 14 11:55:38 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 11:55:38 +0100
Subject: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule)
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl>
Message-ID: <3A38A72A.4011B5BD@lemburg.com>

Thomas Wouters wrote:
> 
> On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote:
> > > I don't think that is a very safe bet.  Python 2.0 missed the Debian
> > > Potato boat.
> >
> > This may have had to do more with the unresolved GPL issues.
> 
> This is very likely. Debian is very licence -- or at least GPL -- aware.
> Which is a pity, really, because I already prefer it over RedHat in all
> other cases (and RedHat is also pretty licence aware, just less piously,
> devoutly, beyond-practicality-IMHO dedicated to the GPL.)
 
About the GPL issue: as I understood Guido's post, RMS still regards
the choice of law clause as being incompatible to the GPL (heck,
doesn't this guy ever think about international trade terms,
the United Nations Convention on International Sale of Goods
or local law in one of the 200+ countries where you could deploy
GPLed software... is the GPL only meant for US programmers ?).

I am currently rewriting my open source licenses as well and among
other things I chose to integrate a choice of law clause as well.
Seeing RMS' view of things, I guess that my license will be regarded
as incompatible to the GPL which is sad even though I'm in good
company... e.g. the Apache license, the Zope license, etc. Dual
licensing is not possible as it would reopen the loopholes in the
GPL I tried to fix in my license. Any idea on how to proceed ?

Another issue: since Python doesn't link Python scripts, is it
still true that if one (pure) Python package is covered by the GPL, 
then all other packages needed by that application will also fall
under GPL ?

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From gstein at lyra.org  Thu Dec 14 12:57:43 2000
From: gstein at lyra.org (Greg Stein)
Date: Thu, 14 Dec 2000 03:57:43 -0800
Subject: (offtopic) Re: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A38A72A.4011B5BD@lemburg.com>; from mal@lemburg.com on Thu, Dec 14, 2000 at 11:55:38AM +0100
References: <20001213074423.A30348@kronos.cnri.reston.va.us> <kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> <3A38A72A.4011B5BD@lemburg.com>
Message-ID: <20001214035742.Z8951@lyra.org>

On Thu, Dec 14, 2000 at 11:55:38AM +0100, M.-A. Lemburg wrote:
>...
> I am currently rewriting my open source licenses as well and among
> other things I chose to integrate a choice of law clause as well.
> Seeing RMS' view of things, I guess that my license will be regarded
> as incompatible to the GPL which is sad even though I'm in good
> company... e.g. the Apache license, the Zope license, etc. Dual
> licensing is not possible as it would reopen the loopholes in the
> GPL I tried to fix in my license. Any idea on how to proceed ?

Only RMS is under the belief that the Apache license is incompatible. It is
either clause 4 or 5 (I forget which) where we state that certain names
(e.g. "Apache") cannot be used in derived products' names and promo
materials. RMS views this as an "additional restriction on redistribution",
which is apparently not allowed by the GPL.

We (the ASF) generally feel he is being a royal pain in the ass with this.
We've sent him a big, long email asking for clarification / resolution, but
haven't heard back (we sent it a month or so ago). Basically, his FUD
creates views such as yours ("the Apache license is incompatible with the
GPL") because people just take his word for it. We plan to put together a
web page to outline our own thoughts and licensing beliefs/philosophy.

We're also planning to rev our license to rephrase/alter the particular
clause, but for logistic purposes (putting the project name in there ties it
to the particular project; we want a generic ASF license that can be applied
to all of the projects without a search/replace).

At this point, the ASF is taking the position of ignoring him and his
controlling attitude(*) and beliefs. There is the outstanding letter to him,
but that doesn't really change our point of view.

Cheers,
-g

(*) for a person espousing freedom, it is rather ironic just how much of a
control freak he is (stemming from a no-compromise position to guarantee
peoples' freedoms, he always wants things done his way)

-- 
Greg Stein, http://www.lyra.org/



From tg at melaten.rwth-aachen.de  Thu Dec 14 14:07:12 2000
From: tg at melaten.rwth-aachen.de (Thomas Gellekum)
Date: 14 Dec 2000 14:07:12 +0100
Subject: [Python-Dev] Splitting up _cursesmodule
In-Reply-To: Andrew Kuchling's message of "Wed, 13 Dec 2000 19:26:32 -0500"
References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>
	<14903.5633.941563.568690@cj42289-a.reston1.va.home.com>
	<20001213074423.A30348@kronos.cnri.reston.va.us>
	<kqpuiwwhdd.fsf@cip12.melaten.rwth-aachen.de>
	<14903.37733.339921.131872@cj42289-a.reston1.va.home.com>
	<20001213192632.A30585@kronos.cnri.reston.va.us>
Message-ID: <kq3dfrkv7j.fsf@cip12.melaten.rwth-aachen.de>

Andrew Kuchling <akuchlin at mems-exchange.org> writes:

> I'm ambivalent about the list_of_panels.  It's a linked list storing
> (PyWindow, PyPanel) pairs.  Probably it should use a dictionary
> instead of implementing a little list, just to reduce the amount of
> code.

I don't like it either, so feel free to shred it. As I said, this is
the first (piece of an) extension module I've written and I thought it
would be easier to implement a little list than to manage a Python
list or such in C.

> So, I suggest we create _curses_panel.c, which would be available as
> curses.panel.  (A panel.py module could then add any convenience
> functions that are required.)
> 
> Thomas, do you want to work on this, or should I?

Just do it. I'll try to add more examples in the meantime.

tg



From fredrik at pythonware.com  Thu Dec 14 14:19:08 2000
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 14 Dec 2000 14:19:08 +0100
Subject: [Python-Dev] fuzzy logic?
Message-ID: <015101c065d0$717d1680$0900a8c0@SPIFF>

here's a simple (but somewhat strange) test program:

def spam():
    a = 1
    if (0):
        global a
        print "global a"
    a = 2

def egg():
    b = 1
    if 0:
        global b
        print "global b"
    b = 2

egg()
spam()

print a
print b

if I run this under 1.5.2, I get:

    2
    Traceback (innermost last):
        File "<stdin>", line 19, in ?
    NameError: b

</F>




From gstein at lyra.org  Thu Dec 14 14:42:11 2000
From: gstein at lyra.org (Greg Stein)
Date: Thu, 14 Dec 2000 05:42:11 -0800
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: <015101c065d0$717d1680$0900a8c0@SPIFF>; from fredrik@pythonware.com on Thu, Dec 14, 2000 at 02:19:08PM +0100
References: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID: <20001214054210.G8951@lyra.org>

I would take a guess that the "if 0:" is optimized away *before* the
inspection for a "global" statement. But the compiler doesn't know how to
optimize away "if (0):", so the global statement remains.

Ah. Just checked. Look at compile.c::com_if_stmt(). There is a call to
"is_constant_false()" in there.

Heh. Looks like is_constant_false() could be made a bit smarter. But the
point is valid: you can make is_constant_false() as smart as you want, and
you'll still end up with "funny" global behavior.
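You can watch the elimination happen from Python itself with the `dis` module (a minimal sketch; the exact opcodes vary across CPython versions, but the dead store never makes it into the bytecode):

```python
import dis

def spam():
    if 0:                     # constant-false branch: body is compiled away
        x = "dead"
    return "alive"

# No STORE_FAST for the dead assignment survives in the compiled code.
opnames = [ins.opname for ins in dis.get_instructions(spam)]
assert "STORE_FAST" not in opnames
assert spam() == "alive"
```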

Cheers,
-g

On Thu, Dec 14, 2000 at 02:19:08PM +0100, Fredrik Lundh wrote:
> here's a simple (but somewhat strange) test program:
> 
> def spam():
>     a = 1
>     if (0):
>         global a
>         print "global a"
>     a = 2
> 
> def egg():
>     b = 1
>     if 0:
>         global b
>         print "global b"
>     b = 2
> 
> egg()
> spam()
> 
> print a
> print b
> 
> if I run this under 1.5.2, I get:
> 
>     2
>     Traceback (innermost last):
>         File "<stdin>", line 19, in ?
>     NameError: b
> 
> </F>
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://www.python.org/mailman/listinfo/python-dev

-- 
Greg Stein, http://www.lyra.org/



From mwh21 at cam.ac.uk  Thu Dec 14 14:58:24 2000
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 14 Dec 2000 13:58:24 +0000
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: "Fredrik Lundh"'s message of "Thu, 14 Dec 2000 14:19:08 +0100"
References: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID: <m3snnr14vz.fsf@atrus.jesus.cam.ac.uk>

1) Is there anything in the standard library that does the equivalent
   of

import symbol,token

def decode_ast(ast):
    if token.ISTERMINAL(ast[0]):
        return (token.tok_name[ast[0]], ast[1])
    else:
        return (symbol.sym_name[ast[0]],)+tuple(map(decode_ast,ast[1:]))

  so that, eg:

>>> pprint.pprint(decode.decode_ast(parser.expr("0").totuple()))
('eval_input',
 ('testlist',
  ('test',
   ('and_test',
    ('not_test',
     ('comparison',
      ('expr',
       ('xor_expr',
        ('and_expr',
         ('shift_expr',
          ('arith_expr',
           ('term',
            ('factor', ('power', ('atom', ('NUMBER', '0'))))))))))))))),
 ('NEWLINE', ''),
 ('ENDMARKER', ''))

  ?  Should there be?  (Especially if it was a bit better written).
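(A rough modern equivalent, for comparison: the `ast` module, which eventually replaced the old C `parser` module, produces a readable dump directly; in Python 3.8+ literals show up as `Constant` nodes.)

```python
import ast

# ast.parse + ast.dump give a readable tree much like decode_ast's output.
tree = ast.parse("0", mode="eval")
print(ast.dump(tree))   # shows an Expression wrapping a Constant node
```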

... and Greg's just said everything else I wanted to!

Cheers,
M.

-- 
  please realize that the Common  Lisp community is more than 40 
  years old.  collectively, the community has already been where 
  every clueless newbie  will be going for the next three years.  
  so relax, please.                     -- Erik Naggum, comp.lang.lisp




From guido at python.org  Thu Dec 14 15:51:26 2000
From: guido at python.org (Guido van Rossum)
Date: Thu, 14 Dec 2000 09:51:26 -0500
Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py)
In-Reply-To: Your message of "Thu, 14 Dec 2000 08:42:59 +0100."
             <m146T2Z-000DmFC@artcom0.artcom-gmbh.de> 
References: <m146T2Z-000DmFC@artcom0.artcom-gmbh.de> 
Message-ID: <200012141451.JAA15637@cj20424-a.reston1.va.home.com>

> I think the following change is incompatible and will break applications.
> 
> At least I have some server type applications that rely on 
> 'allow_reuse_address' defaulting to 0, because they use
> the 'address already in use' exception, to make sure, that exactly one
> server process is running on this port.  One of these applications, 
> which is BTW built on top of Fredrik Lundh's 'xmlrpclib' fails to work,
> if I change this default in SocketServer.py.  
> 
> Would you please explain the reasoning behind this change?

The reason for the patch is that without this, if you kill a TCP server
and restart it right away, you'll get a 'port in use' error -- TCP has
some kind of strange wait period after a connection is closed before
it can be reused.  The patch avoids this error.

As far as I know, with TCP, code using SO_REUSEADDR still cannot bind
to the port when another process is already using it, but for UDP, the
semantics may be different.

Is your server using UDP?

Try this patch if your problem is indeed related to UDP:

*** SocketServer.py	2000/12/13 20:39:17	1.20
--- SocketServer.py	2000/12/14 14:48:16
***************
*** 268,273 ****
--- 268,275 ----
  
      """UDP server class."""
  
+     allow_reuse_address = 0
+ 
      socket_type = socket.SOCK_DGRAM
  
      max_packet_size = 8192

If this works for you, I'll check it in, of course.
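(For reference, the option involved is plain SO_REUSEADDR, which lets you re-bind an address while the old socket lingers in TIME_WAIT; setting it by hand looks like this minimal sketch, binding to port 0 so the OS picks a free port:)

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Allow re-binding the address while a closed socket is still in TIME_WAIT.
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 0))          # port 0: let the OS choose a free port
host, port = s.getsockname()
s.close()
assert port != 0
```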

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Thu Dec 14 15:52:37 2000
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 14 Dec 2000 09:52:37 -0500 (EST)
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <3A38A1DA.7EC49149@lemburg.com>
References: <14904.21739.804346.650062@bitdiddle.concentric.net>
	<3A38A1DA.7EC49149@lemburg.com>
Message-ID: <14904.57013.371474.691948@bitdiddle.concentric.net>

>>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:

  MAL> Jeremy Hylton wrote:
  >>
  >> I've got a new draft of PEP 227.  The terminology and wording are
  >> more convoluted than they need to be.  I'll do at least one
  >> revision just to say things more clearly, but I'd appreciate
  >> comments on the proposed spec if you can read the current draft.

  MAL> The PEP doesn't mention the problems I pointed out about
  MAL> breaking the lookup schemes w/r to symbols in methods, classes
  MAL> and globals.

I believe it does.  There was some discussion on python-dev and
with others in private email about how classes should be handled.

The relevant section of the specification is:

    If a name is used within a code block, but it is not bound there
    and is not declared global, the use is treated as a reference to
    the nearest enclosing function region.  (Note: If a region is
    contained within a class definition, the name bindings that occur
    in the class block are not visible to enclosed functions.)

  MAL> Please add a comment about this to the PEP + maybe the example
  MAL> I gave in one the posts to python-dev about it. I consider the
  MAL> problem serious enough to limit the nested scoping to lambda
  MAL> functions (or functions in general) only if that's possible.

If there was some other concern you had, then I don't know what it
was.  I recall that you had a longish example that raised a NameError
immediately :-).

Jeremy



From mal at lemburg.com  Thu Dec 14 16:02:33 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 16:02:33 +0100
Subject: [Python-Dev] new draft of PEP 227
References: <14904.21739.804346.650062@bitdiddle.concentric.net>
		<3A38A1DA.7EC49149@lemburg.com> <14904.57013.371474.691948@bitdiddle.concentric.net>
Message-ID: <3A38E109.54C07565@lemburg.com>

Jeremy Hylton wrote:
> 
> >>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:
> 
>   MAL> Jeremy Hylton wrote:
>   >>
>   >> I've got a new draft of PEP 227.  The terminology and wording are
>   >> more convoluted than they need to be.  I'll do at least one
>   >> revision just to say things more clearly, but I'd appreciate
>   >> comments on the proposed spec if you can read the current draft.
> 
>   MAL> The PEP doesn't mention the problems I pointed out about
>   MAL> breaking the lookup schemes w/r to symbols in methods, classes
>   MAL> and globals.
> 
> I believe it does.  There was some discussion on python-dev and
> with others in private email about how classes should be handled.
> 
> The relevant section of the specification is:
> 
>     If a name is used within a code block, but it is not bound there
>     and is not declared global, the use is treated as a reference to
>     the nearest enclosing function region.  (Note: If a region is
>     contained within a class definition, the name bindings that occur
>     in the class block are not visible to enclosed functions.)

Well hidden ;-)

Honestly, I think you should make this specific case more visible
to readers of the PEP, since this single detail would produce most
of the problems with nested scopes.

BTW, what about nested classes ? AFAIR, the PEP only talks about
nested functions.

>   MAL> Please add a comment about this to the PEP + maybe the example
>   MAL> I gave in one the posts to python-dev about it. I consider the
>   MAL> problem serious enough to limit the nested scoping to lambda
>   MAL> functions (or functions in general) only if that's possible.
> 
> If there was some other concern you had, then I don't know what it
> was.  I recall that you had a longish example that raised a NameError
> immediately :-).

The idea behind the example should have been clear, though.

x = 1
class C:
   x = 2
   def test(self):
       print x
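(Completing the example: under the nested-scopes rules eventually adopted, the method still sees the module-level x, because name lookup skips class blocks. A runnable sketch:)

```python
x = 1

class C:
    x = 2                # class attribute; invisible to bare-name lookup in methods
    def test(self):
        return x         # skips the class block and finds the module-level x

assert C().test() == 1   # not 2
assert C.x == 2          # the class attribute is still reachable as C.x
```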
  
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fdrake at acm.org  Thu Dec 14 16:09:57 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 14 Dec 2000 10:09:57 -0500 (EST)
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: <m3snnr14vz.fsf@atrus.jesus.cam.ac.uk>
References: <015101c065d0$717d1680$0900a8c0@SPIFF>
	<m3snnr14vz.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <14904.58053.282537.260186@cj42289-a.reston1.va.home.com>

Michael Hudson writes:
 > 1) Is there anything is the standard library that does the equivalent
 >    of

  No, but I have a chunk of code that does in a different way.  Where
in the library do you think it belongs?  The compiler package sounds
like the best place, but that's not installed by default.  (Jeremy, is
that likely to change soon?)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From mwh21 at cam.ac.uk  Thu Dec 14 16:47:33 2000
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 14 Dec 2000 15:47:33 +0000
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: "Fred L. Drake, Jr."'s message of "Thu, 14 Dec 2000 10:09:57 -0500 (EST)"
References: <015101c065d0$717d1680$0900a8c0@SPIFF> <m3snnr14vz.fsf@atrus.jesus.cam.ac.uk> <14904.58053.282537.260186@cj42289-a.reston1.va.home.com>
Message-ID: <m3vgsnovhm.fsf@atrus.jesus.cam.ac.uk>

"Fred L. Drake, Jr." <fdrake at acm.org> writes:

> Michael Hudson writes:
>  > 1) Is there anything is the standard library that does the equivalent
>  >    of
> 
>   No, but I have a chunk of code that does in a different way.  

I'm guessing everyone who's played with the parser much does, hence
the suggestion.  I agree my implementation is probably not optimal - I
just threw it together as quickly as I could!

> Where in the library do you think it belongs?  The compiler package
> sounds like the best place, but that's not installed by default.
> (Jeremy, is that likely to change soon?)

Actually, I'd have thought the parser module would be most natural,
but that would probably mean doing the _module.c trick, and it's
probably not worth the bother.  OTOH, it seems that wrapping any given
extension module in a python module is becoming if anything the norm,
so maybe it is.

Cheers,
M.

-- 
  I don't remember any dirty green trousers.
                                             -- Ian Jackson, ucam.chat




From nowonder at nowonder.de  Thu Dec 14 16:50:10 2000
From: nowonder at nowonder.de (Peter Schneider-Kamp)
Date: Thu, 14 Dec 2000 16:50:10 +0100
Subject: [Python-Dev] [PEP-212] new draft
Message-ID: <3A38EC32.210BD1A2@nowonder.de>

In an attempt to revive PEP 212 - Loop counter iteration I have
updated the draft. The HTML version can be found at:

http://python.sourceforge.net/peps/pep-0212.html

I would appreciate any form of comments and/or criticism.

Peter

P.S.: Now I have posted it - should I update the Post-History?
      Or is that for posts to c.l.py?
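(For context, the itch PEP 212 addresses -- iterating over a sequence together with a loop counter -- is the use case that was later covered by the builtin `enumerate()`; a minimal sketch of that style:)

```python
seq = ["spam", "egg"]
pairs = []
for i, item in enumerate(seq):   # yields (index, element) pairs
    pairs.append((i, item))
assert pairs == [(0, "spam"), (1, "egg")]
```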



From pf at artcom-gmbh.de  Thu Dec 14 16:56:08 2000
From: pf at artcom-gmbh.de (Peter Funk)
Date: Thu, 14 Dec 2000 16:56:08 +0100 (MET)
Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py)
In-Reply-To: <200012141451.JAA15637@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 14, 2000  9:51:26 am"
Message-ID: <m146ajo-000DmFC@artcom0.artcom-gmbh.de>

Hi,

Moshe's checkin indeed makes a lot of sense.  Sorry for the irritation.

Guido van Rossum:
> The reason for the patch is that without this, if you kill a TCP server
> and restart it right away, you'll get a 'port in use" error -- TCP has
> some kind of strange wait period after a connection is closed before
> it can be reused.  The patch avoids this error.
> 
> As far as I know, with TCP, code using SO_REUSEADDR still cannot bind
> to the port when another process is already using it, but for UDP, the
> semantics may be different.
> 
> Is your server using UDP?

No, and I must admit that I didn't test carefully enough:  From
a quick look at my process listing I assumed there were indeed
two server processes running concurrently, which would have broken
the needed mutual exclusion.  But the second process went into
a sleep-and-retry-to-connect loop which I had simply forgotten about.
This loop was initially built into my server to wait until the
"strange wait period" you mentioned above was over or a certain
number of retries had been exceeded.

I guess I can take this ugly work-around out with Python 2.0 and newer,
since the BaseHTTPServer.py shipped with Python 2.0 already contained
allow_reuse_address = 1 default in the HTTPServer class.

BTW: I took my old W. Richard Stevens Unix Network Programming
down from the shelf.  After rereading the rather terse paragraph about
SO_REUSEADDR I guess the wait period is necessary to make sure that
there is no connect pending from an outside client on this TCP port.
I can't find anything about UDP and REUSE.

Regards, Peter



From guido at python.org  Thu Dec 14 17:17:27 2000
From: guido at python.org (Guido van Rossum)
Date: Thu, 14 Dec 2000 11:17:27 -0500
Subject: [Python-Dev] Online help scope
In-Reply-To: Your message of "Wed, 13 Dec 2000 23:29:35 PST."
             <3A3876DF.5554080C@ActiveState.com> 
References: <3A3876DF.5554080C@ActiveState.com> 
Message-ID: <200012141617.LAA16179@cj20424-a.reston1.va.home.com>

> I think Guido and I are pretty far apart on the scope and requirements
> of this online help thing so I'd like some clarification and opinions
> from the peanut gallery.

I started replying but I think Tim's said it all.  Let's do something
dead simple.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Thu Dec 14 18:14:01 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 12:14:01 -0500
Subject: [Python-Dev] [PEP-212] new draft
References: <3A38EC32.210BD1A2@nowonder.de>
Message-ID: <14904.65497.940293.975775@anthem.concentric.net>

>>>>> "PS" == Peter Schneider-Kamp <nowonder at nowonder.de> writes:

    PS> P.S.: Now I have posted it - should I update the Post-History?
    PS> Or is that for posts to c.l.py?

Originally, I'd thought of it as tracking the posting history to
c.l.py.  I'm not sure how useful that header is after all -- maybe
just as a starting point into the python-list archives...

-Barry



From tim.one at home.com  Thu Dec 14 18:33:41 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 12:33:41 -0500
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEOPIDAA.tim.one@home.com>

Note that the behavior of both functions is undefined ("Names listed in a
global statement must not be used in the same code block textually preceding
that global statement", from the Lang Ref, and "if" does not introduce a new
code block in Python's terminology).

But you'll get the same outcome via these trivial variants, which sidestep
that problem:

def spam():
    if (0):
        global a
        print "global a"
    a = 2

def egg():
    if 0:
        global b
        print "global b"
    b = 2

*Now* you can complain <wink>.
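(Later Pythons removed the ambiguity altogether: binding a name before its `global` declaration became a compile-time SyntaxError, even when the declaration sits in a dead branch. A quick check, assuming CPython 3:)

```python
src = """
def spam():
    a = 1
    if (0):
        global a
    a = 2
"""
try:
    compile(src, "<test>", "exec")
    raised = False
except SyntaxError:
    # "name 'a' is assigned to before global declaration"
    raised = True
assert raised
```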


> -----Original Message-----
> From: python-dev-admin at python.org [mailto:python-dev-admin at python.org]On
> Behalf Of Fredrik Lundh
> Sent: Thursday, December 14, 2000 8:19 AM
> To: python-dev at python.org
> Subject: [Python-Dev] fuzzy logic?
>
>
> here's a simple (but somewhat strange) test program:
>
> def spam():
>     a = 1
>     if (0):
>         global a
>         print "global a"
>     a = 2
>
> def egg():
>     b = 1
>     if 0:
>         global b
>         print "global b"
>     b = 2
>
> egg()
> spam()
>
> print a
> print b
>
> if I run this under 1.5.2, I get:
>
>     2
>     Traceback (innermost last):
>         File "<stdin>", line 19, in ?
>     NameError: b
>
> </F>
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://www.python.org/mailman/listinfo/python-dev




From tim.one at home.com  Thu Dec 14 19:46:09 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 13:46:09 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule)
In-Reply-To: <3A38A72A.4011B5BD@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPFIDAA.tim.one@home.com>

[MAL]
> About the GPL issue: as I understood Guido's post, RMS still regards
> the choice of law clause as being incompatible to the GPL

Yes.  Actually, I don't know what RMS really thinks -- his public opinions
on legal issues appear to be echoes of what Eben Moglen tells him.  Like his
views or not, Moglen is a tenured law professor.

> (heck, doesn't this guy ever think about international trade terms,
> the United Nations Convention on International Sale of Goods
> or local law in one of the 200+ countries where you could deploy
> GPLed software...

Yes.

> is the GPL only meant for US programmers ?).

No.  Indeed, that's why the GPL is grounded in copyright law, because
copyright law is the most uniform (across countries) body of law we've got.
Most commentary I've seen suggests that the GPL has its *weakest* legal legs
in the US!

> I am currently rewriting my open source licenses as well and among
> other things I chose to integrate a choice of law clause as well.
> Seeing RMS' view of things, I guess that my license will be regarded
> as incompatible to the GPL

Yes.

> which is sad even though I'm in good company... e.g. the Apache
> license, the Zope license, etc. Dual licensing is not possible as
> it would reopen the loopholes in the GPL I tried to fix in my
> license. Any idea on how to proceed ?

You can wait to see how the CNRI license turns out, then copy it if it's
successful; you can approach the FSF directly; you can stop trying to do it
yourself and reuse some license that's already been blessed by the FSF; or
you can give up on GPL compatibility (according to the FSF).  I don't see
any other choices.

> Another issue: since Python doesn't link Python scripts, is it
> still true that if one (pure) Python package is covered by the GPL,
> then all other packages needed by that application will also fall
> under GPL ?

Sorry, couldn't make sense of the question.  Just as well, since you should
ask about it on a GNU forum anyway <wink>.




From mal at lemburg.com  Thu Dec 14 21:02:05 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 21:02:05 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <LNBBLJKPBEHFEDALKOLCIEPFIDAA.tim.one@home.com>
Message-ID: <3A39273D.4AE24920@lemburg.com>

Tim Peters wrote:
> 
> [MAL]
> > About the GPL issue: as I understood Guido's post, RMS still regards
> > the choice of law clause as being incompatible to the GPL
> 
> Yes.  Actually, I don't know what RMS really thinks -- his public opinions
> on legal issues appear to be echoes of what Eben Moglen tells him.  Like his
> views or not, Moglen is a tenured law professor

But it's his piece of work, isn't it ? He's the one who can change it.
 
> > (heck, doesn't this guy ever think about international trade terms,
> > the United Nations Convention on International Sale of Goods
> > or local law in one of the 200+ countries where you could deploy
> > GPLed software...
> 
> Yes.

Strange, then how come he sees the choice of law clause as a problem:
without explicitly ruling out the applicability of the UN CISC,
this clause is waived by it anyway... at least according to a 
specialist on software law here in Germany.

> > is the GPL only meant for US programmers ?).
> 
> No.  Indeed, that's why the GPL is grounded in copyright law, because
> copyright law is the most uniform (across countries) body of law we've got.
> Most commentary I've seen suggests that the GPL has its *weakest* legal legs
> in the US!

Huh ? Just one example: in Germany consumer rights assure a 6-month
warranty on everything you buy or obtain in some other way.
is another issue: there are some very unpleasant laws which render
most of the "no liability" paragraphs in licenses useless in Germany.

Even better: since the license itself is written in English a
German party could simply consider the license non-binding, since
he or she hasn't agreed to accept contracts in foreign languages.
France has similar interpretations.

> > I am currently rewriting my open source licenses as well and among
> > other things I chose to integrate a choice of law clause as well.
> > Seeing RMS' view of things, I guess that my license will be regarded
> > as incompatible to the GPL
> 
> Yes.
> 
> > which is sad even though I'm in good company... e.g. the Apache
> > license, the Zope license, etc. Dual licensing is not possible as
> > it would reopen the loop-wholes in the GPL I tried to fix in my
> > license. Any idea on how to proceed ?
> 
> You can wait to see how the CNRI license turns out, then copy it if it's
> successful; you can approach the FSF directly; you can stop trying to do it
> yourself and reuse some license that's already been blessed by the FSF; or
> you can give up on GPL compatibility (according to the FSF).  I don't see
> any other choices.

I guess I'll go with the latter.
 
> > Another issue: since Python doesn't link Python scripts, is it
> > still true that if one (pure) Python package is covered by the GPL,
> > then all other packages needed by that application will also fall
> > under GPL ?
> 
> Sorry, couldn't make sense of the question.  Just as well, since you should
> ask about it on a GNU forum anyway <wink>.

Isn't this question (whether the GPL virus applies to byte-code)
important to Python programmers as well ?

Oh well, nevermind... it's still nice to hear that CNRI and RMS
have finally made up their minds to render Python GPL-compatible --
whatever this means ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From cgw at fnal.gov  Thu Dec 14 22:06:43 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Thu, 14 Dec 2000 15:06:43 -0600 (CST)
Subject: [Python-Dev] memory leaks
Message-ID: <14905.13923.659879.100243@buffalo.fnal.gov>

The following code (extracted from test_extcall.py) leaks memory:

class Foo:
   def method(self, arg1, arg2):
        return arg1 + arg2

def f():
    err = None
    try:
        Foo.method(*(1, 2, 3))
    except TypeError, err:
        pass
    del err



One-line fix (also posted to Sourceforge):

--- Python/ceval.c	2000/10/30 17:15:19	2.213
+++ Python/ceval.c	2000/12/14 20:54:02
@@ -1905,8 +1905,7 @@
 							 class))) {
 				    PyErr_SetString(PyExc_TypeError,
 	    "unbound method must be called with instance as first argument");
-				    x = NULL;
-				    break;
+				    goto extcall_fail;
 				}
 			    }
 			}



I think that there are a bunch more memory leaks lurking around...
this only fixes one of them.  I'll send more info as I find out what's
going on.
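(The code path being patched disappeared in Python 3, where `Foo.method` is a plain function rather than an unbound method carrying a first-argument type check; the same starred call then simply succeeds:)

```python
class Foo:
    def method(self, arg1, arg2):
        return arg1 + arg2

# In Python 3 there are no unbound methods: Foo.method is a plain
# function, so the three positional args bind to (self, arg1, arg2).
result = Foo.method(*(1, 2, 3))
assert result == 5   # 2 + 3, with self bound to 1
```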




From tim.one at home.com  Thu Dec 14 22:28:09 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 16:28:09 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A39273D.4AE24920@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEPLIDAA.tim.one@home.com>

I'm not going to argue about the GPL.  Take it up with the FSF!  I will say
that if you do get the FSF's attention, Moglen will have an instant counter
to any objection you're likely to raise -- he's been thinking about this for
10 years, and he's heard it all.  And in our experience, RMS won't commit to
anything before running it past Moglen.

[MAL]
> But it's his [RMS's] piece of work, isn't it ? He's the one who can
> change it.

Akin to saying Python is Guido's piece of work.  Yes, no, kinda, more true
at some times than others, ditto respects.  RMS has consistently said that
any changes for the next version of the GPL will take at least a year, due
to extensive legal review required first.  Would be more clearly true to say
that the first version of the GPL was RMS's alone -- but version 2 came out
in 1991.

> ...
> Strange, then how come he sees the choice of law clause as a problem:
> without explicitely ruling out the applicability of the UN CISC,
> this clause is waived by it anyway... at least according to a
> specialist on software law here in Germany.
> ... [and other "who knows?" objections] ...

Guido quoted the text of your Wed, 06 Sep 2000 14:19:09 +0200 "Re:
[License-py20] Re: GPL incompability as seen from Europe" msg to Moglen, who
dismissed it almost offhandedly as "layman's commentary".  You'll have to
ask him why:  MAL, we're not lawyers.  We're incompetent to have this
discussion -- or at least I am, and Moglen thinks you are too <wink>.

>>> Another issue: since Python doesn't link Python scripts, is it
>>> still true that if one (pure) Python package is covered by the GPL,
>>> then all other packages needed by that application will also fall
>>> under GPL ?

[Tim]
>> Sorry, couldn't make sense of the question.  Just as well,
>> since you should ask about it on a GNU forum anyway <wink>.

[MAL]
> Isn't this question (whether the GPL virus applies to byte-code
> as well) important to Python programmers as well ?

I don't know -- like I said, I couldn't make sense of the question, i.e. I
couldn't figure out what it is you're asking.  I *suspect* it's based on a
misunderstanding of the GPL; for example, gcc is a GPL'ed application that
requires stuff from the OS in order to do its job of compiling, but that
doesn't mean that every OS it runs on falls under the GPL.  The GPL contains
no restrictions on *use*, it restricts only copying, modifying and
distributing (the specific rights granted by copyright law).  I don't see
any way to read the GPL as restricting your ability to distribute a GPL'ed
program P on its own, no matter what the status of the packages that P may
rely upon for operation.

The GPL is also not viral in the sense that it cannot infect an unwitting
victim.  Nothing whatsoever you do or don't do can make *any* other program
Q "fall under" the GPL -- only Q's owner can set the license for Q.  The GPL
purportedly can prevent you from distributing (but not from using) a program
that links with a GPL'ed program, but that doesn't appear to be what you're
asking about.  Or is it?

If you were to put, say, mxDateTime, under the GPL, then yes, I believe the
FSF would claim I could not distribute my program T that uses mxDateTime
unless T were also under the GPL or a GPL-compatible license.  But if
mxDateTime is not under the GPL, then nothing I do with T can magically
change the mxDateTime license to the GPL (although if your mxDateTime
license allows me to redistribute mxDateTime under a different license, then
it allows me to ship a copy of mxDateTime under the GPL).

That said, the whole theory of GPL linking is muddy to me, especially since
the word "link" (and its variants) doesn't appear in the GPL.

> Oh well, nevermind... it's still nice to hear that CNRI and RMS
> have finally made up their minds to render Python GPL-compatible --
> whatever this means ;-)

I'm not sure it means anything yet.  CNRI and the FSF believed they reached
agreement before, but that didn't last after Moglen and Kahn each figured
out what the other was really suggesting.




From mal at lemburg.com  Thu Dec 14 23:25:31 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 14 Dec 2000 23:25:31 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <LNBBLJKPBEHFEDALKOLCEEPLIDAA.tim.one@home.com>
Message-ID: <3A3948DB.9165E404@lemburg.com>

Tim Peters wrote:
> 
> I'm not going to argue about the GPL.  Take it up with the FSF! 

Sorry, I got a bit carried away -- I don't want to take it up
with the FSF, simply because I couldn't care less. What's bugging
me is that this one guy is splitting the OSS world in two even 
though both halves actually want the same thing: software which
you can use for free with full source code. I find that a very 
poor situation.

> I will say
> that if you do get the FSF's attention, Moglen will have an instant counter
> to any objection you're likely to raise -- he's been thinking about this for
> 10 years, and he's heard it all.  And in our experience, RMS won't commit to
> anything before running it past Moglen.
> 
> [MAL]
> > But it's his [RMS's] piece of work, isn't it ? He's the one who can
> > change it.
> 
> Akin to saying Python is Guido's piece of work.  Yes, no, kinda, more true
> at some times than others, ditto respects.  RMS has consistently said that
> any changes for the next version of the GPL will take at least a year, due
> to extensive legal review required first.  Would be more clearly true to say
> that the first version of the GPL was RMS's alone -- but version 2 came out
> in 1991.

Point taken.
 
> > ...
> > Strange, then how come he sees the choice of law clause as a problem:
> > without explicitely ruling out the applicability of the UN CISC,
> > this clause is waived by it anyway... at least according to a
> > specialist on software law here in Germany.
> > ... [and other "who knows?" objections] ...
> 
> Guido quoted the text of your Wed, 06 Sep 2000 14:19:09 +0200 "Re:
> [License-py20] Re: GPL incompability as seen from Europe" msg to Moglen, who
> dismissed it almost offhandedly as "layman's commentary".  You'll have to
> ask him why:  MAL, we're not lawyers.  We're incompetent to have this
> discussion -- or at least I am, and Moglen thinks you are too <wink>.

I'm not a lawyer either, but I am able to apply common sense and 
know about German trade laws. Anyway, here a reference which
covers all the controversial subjects. It's in German, but these
guys qualify as lawyers ;-) ...

	http://www.ifross.de/ifross_html/index.html

There's also a book on the subject in German which covers
all aspects of software licensing. Here's the reference in
case anyone cares:

	Jochen Marly, Softwareüberlassungsverträge
	C.H. Beck, München, 2000
 
> >>> Another issue: since Python doesn't link Python scripts, is it
> >>> still true that if one (pure) Python package is covered by the GPL,
> >>> then all other packages needed by that application will also fall
> >>> under GPL ?
> 
> [Tim]
> >> Sorry, couldn't make sense of the question.  Just as well,
> >> since you should ask about it on a GNU forum anyway <wink>.
> 
> [MAL]
> > Isn't this question (whether the GPL virus applies to byte-code
> > as well) important to Python programmers as well ?
> 
> I don't know -- like I said, I couldn't make sense of the question, i.e. I
> couldn't figure out what it is you're asking.  I *suspect* it's based on a
> misunderstanding of the GPL; for example, gcc is a GPL'ed application that
> requires stuff from the OS in order to do its job of compiling, but that
> doesn't mean that every OS it runs on falls under the GPL.  The GPL contains
> no restrictions on *use*, it restricts only copying, modifying and
> distributing (the specific rights granted by copyright law).  I don't see
> any way to read the GPL as restricting your ability to distribute a GPL'ed
> program P on its own, no matter what the status of the packages that P may
> rely upon for operation.

This is very controversial: if an application Q needs a GPLed 
library P to work, then P and Q form a new whole in the sense of
the GPL. And this even though P wasn't even distributed together
with Q. Don't ask me why, but that's how RMS and folks look at it.

It can be argued that the dynamic linker actually integrates
P into Q, but is the same argument valid for a Python program Q
which relies on a GPLed package P ? (The relationship between
Q and P is one of providing interfaces -- there is no call address
patching required for the setup to work.)

> The GPL is also not viral in the sense that it cannot infect an unwitting
> victim.  Nothing whatsoever you do or don't do can make *any* other program
> Q "fall under" the GPL -- only Q's owner can set the license for Q.  The GPL
> purportedly can prevent you from distributing (but not from using) a program
> that links with a GPL'ed program, but that doesn't appear to be what you're
> asking about.  Or is it?

No. What's viral about the GPL is that you can turn an application
into a GPLed one by merely linking the two together -- that's why
e.g. the libc is distributed under the LGPL which doesn't have this
viral property.
 
> If you were to put, say, mxDateTime, under the GPL, then yes, I believe the
> FSF would claim I could not distribute my program T that uses mxDateTime
> unless T were also under the GPL or a GPL-compatible license.  But if
> mxDateTime is not under the GPL, then nothing I do with T can magically
> change the mxDateTime license to the GPL (although if your mxDateTime
> license allows me to redistribute mxDateTime under a different license, then
> it allows me to ship a copy of mxDateTime under the GPL).
>
> That said, the whole theory of GPL linking is muddy to me, especially since
> the word "link" (and its variants) doesn't appear in the GPL.

True.
 
> > Oh well, nevermind... it's still nice to hear that CNRI and RMS
> > have finally made up their minds to render Python GPL-compatible --
> > whatever this means ;-)
> 
> I'm not sure it means anything yet.  CNRI and the FSF believed they reached
> agreement before, but that didn't last after Moglen and Kahn each figured
> out what the other was really suggesting.

Oh boy...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From greg at cosc.canterbury.ac.nz  Fri Dec 15 00:19:09 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Dec 2000 12:19:09 +1300 (NZDT)
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A3948DB.9165E404@lemburg.com>
Message-ID: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz>

"M.-A. Lemburg" <mal at lemburg.com>:
> if an application Q needs a GPLed 
> library P to work, then P and Q form a new whole in the sense of
> the GPL.

I don't see how Q can *need* any particular library P
to work. The most it can need is some library with
an API which is compatible with P's. So I don't
buy that argument.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From greg at cosc.canterbury.ac.nz  Fri Dec 15 00:58:24 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Dec 2000 12:58:24 +1300 (NZDT)
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <3A387005.6725DAAE@ActiveState.com>
Message-ID: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz>

Paul Prescod <paulp at ActiveState.com>:

> We could say that a local can only shadow a global 
> if the local is formally declared.

How do you intend to enforce that? Seems like it would
require a test on every assignment to a local, to make
sure nobody has snuck in a new global since the function
was compiled.

> Actually, one could argue that there is no good reason to 
> even *allow* the shadowing of globals.

If shadowing were completely disallowed, it would make it
impossible to write a completely self-contained function
whose source could be moved from one environment to another
without danger of it breaking. I wouldn't like the language
to have a characteristic like that.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From greg at cosc.canterbury.ac.nz  Fri Dec 15 01:06:12 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Dec 2000 13:06:12 +1300 (NZDT)
Subject: [Python-Dev] Online help scope
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEOBIDAA.tim.one@home.com>
Message-ID: <200012150006.NAA02154@s454.cosc.canterbury.ac.nz>

Tim Peters <tim.one at home.com>:

> [Paul Prescod]

> > Keywords have no docstrings.

> Neither do integers, but they're obvious too <wink>.

Oh, I don't know, it could be useful.

>>> help(2)
The first prime number.

>>> help(2147483647)
sys.maxint, the largest Python small integer.

>>> help(42)
The answer to the ultimate question of life, the universe
and everything. See also: ultimate_question.

>>> help("ultimate_question")
[Importing research.mice.earth]
[Calling earth.find_ultimate_question]
This may take about 10 million years, please be patient...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From barry at digicool.com  Fri Dec 15 01:33:16 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Thu, 14 Dec 2000 19:33:16 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <3A3948DB.9165E404@lemburg.com>
	<200012142319.MAA02140@s454.cosc.canterbury.ac.nz>
Message-ID: <14905.26316.407495.981198@anthem.concentric.net>

>>>>> "GE" == Greg Ewing <greg at cosc.canterbury.ac.nz> writes:

    GE> I don't see how Q can *need* any particular library P to
    GE> work. The most it can need is some library with an API which
    GE> is compatible with P's. So I don't buy that argument.

It's been my understanding that the FSF's position on this is as
follows.  If the only functional implementation of the API is GPL'd
software then simply writing your code against that API is tantamount
to linking with that software.  Their reasoning is that the clear
intent of the programmer (shut up, Chad) is to combine the program
with GPL code.  As soon as there is a second, non-GPL implementation
of the API, you're fine because while you may not distribute your
program with the GPL'd software linked in, those who receive your
software wouldn't be forced to combine GPL and non-GPL code.

-Barry



From tim.one at home.com  Fri Dec 15 04:01:36 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 22:01:36 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A3948DB.9165E404@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEAEIEAA.tim.one@home.com>

[MAL]
> Sorry, I got a bit carried away -- I don't want to take it up
> with the FSF, simply because I couldn't care less.

Well, nobody else is able to Pronounce on what the FSF believes or will do.
Which tells me that you're not really interested in playing along with the
FSF here after all -- which we both knew from the start anyway <wink>.

> What's bugging me is that this one guy is splitting the OSS world

There are many people on the FSF bandwagon.  I'm not one of them, but I can
count.

> in two even though both halves actually want the same thing: software
> which you can use for free with full source code. I find that a very
> poor situation.

RMS would not agree that both halves want the same thing; to the contrary,
he's openly contemptuous of the Open Source movement -- which you also knew
from the start.

> [stuff about German law I won't touch with 12-foot schnitzel]

OTOH, a German FSF advocate assured me:

    I also tend to forget that the system of the law works different
    in the US as in Germany. In Germany something that most people
    will believe (called "common grounds") play a role in the court.
    So if you knew, because it is widely known what the GPL means,
    than it is harder to attack that in court.

In the US, when something gets to court it doesn't matter at all what people
believed about it.  Heck, we'll let mass murderers go free if a comma was in
the wrong place in a 1592 statute, or send a kid to jail for life for using
crack cocaine instead of the flavor favored by stockbrokers <wink>.  I hope
the US is unique in that respect, but it does make the GPL weaker here
because even if *everyone* in our country believed the GPL means what RMS
says it means, a US court would give that no weight in its logic-chopping.

>>> Another issue: since Python doesn't link Python scripts, is it
>>> still true that if one (pure) Python package is covered by the GPL,
>>> then all other packages needed by that application will also fall
>>> under GPL ?

> This is very controversial: if an application Q needs a GPLed
> library P to work, then P and Q form a new whole in the sense of
> the GPL. And this even though P wasn't even distributed together
> with Q. Don't ask me why, but that's how RMS and folks look at it.

Understood, but have you reread your question above, which I've said twice I
can't make sense of?  That's not what you were asking about.  Your question
above asks, if anything, the opposite:  the *application* Q is GPL'ed, and
the question above asks whether that means the *Ps* it depends on must also
be GPL'ed.  To the best of my ability, I've answered "NO" to that one, and
"YES" to the question it appears you meant to ask.

> It can be argued that the dynamic linker actually integrates
> P into Q, but is the same argument valid for a Python program Q
> which relies on a GPLed package P ? (The relationship between
> Q and P is one of providing interfaces -- there is no call address
> patching required for the setup to work.)

As before, I believe the FSF will say YES.  Unless there's also a non-GPL'ed
implementation of the same interface that people could use just as well.
See my extended mxDateTime example too.

> ...
> No. What's viral about the GPL is that you can turn an application
> into a GPLed one by merely linking the two together

No, you cannot.  You can link them together all day without any hassle.
What you cannot do is *distribute* it unless the aggregate is first placed
under the GPL (or a GPL-compatible license) too.  If you distribute it
without taking that step, that doesn't turn it into a GPL'ed application
either -- in that case you've simply (& supposedly) violated the license on
P, so your distribution was simply (& supposedly) illegal.  And that is in
fact the end result that people who knowingly use the GPL want (granting
that it appears most people who use the GPL do so unknowing of its
consequences).

> -- that's why e.g. the libc is distributed under the LGPL which
> doesn't have this viral property.

You should read RMS on why glibc is under the LGPL:

    http://www.fsf.org/philosophy/why-not-lgpl.html

It will at least disabuse you of the notion that RMS and you are after the
same thing <wink>.




From paulp at ActiveState.com  Fri Dec 15 05:02:08 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Thu, 14 Dec 2000 20:02:08 -0800
Subject: [Python-Dev] new draft of PEP 227
References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz>
Message-ID: <3A3997C0.F977AF51@ActiveState.com>

Greg Ewing wrote:
> 
> Paul Prescod <paulp at ActiveState.com>:
> 
> > We could say that a local can only shadow a global
> > if the local is formally declared.
> 
> How do you intend to enforce that? Seems like it would
> require a test on every assignment to a local, to make
> sure nobody has snuck in a new global since the function
> was compiled.

I would expect that all of the checks would be at compile-time. Except
for __dict__ hackery, I think it is doable. Python already keeps track
of all assignments to locals and all assignments to globals in a
function scope. The only addition is keeping track of assignments at a
global scope.

> > Actually, one could argue that there is no good reason to
> > even *allow* the shadowing of globals.
> 
> If shadowing were completely disallowed, it would make it
> impossible to write a completely self-contained function
> whose source could be moved from one environment to another
> without danger of it breaking. I wouldn't like the language
> to have a characteristic like that.

That seems like a very esoteric requirement. How often do you have
functions that do not rely *at all* on their environment (other
functions, import statements, global variables).

When you move code you have to do some rewriting or customizing of the
environment in 94% of the cases. How much effort do you want to spend on
the other 6%? Also, there are tools that are designed to help you move
code without breaking programs (refactoring editors). They can just as
easily handle renaming local variables as adding import statements and
fixing up function calls.

 Paul Prescod



From mal at lemburg.com  Fri Dec 15 11:05:59 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 11:05:59 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz>
Message-ID: <3A39ED07.6B3EE68E@lemburg.com>

Greg Ewing wrote:
> 
> "M.-A. Lemburg" <mal at lemburg.com>:
> > if an application Q needs a GPLed
> > library P to work, then P and Q form a new whole in the sense of
> > the GPL.
> 
> I don't see how Q can *need* any particular library P
> to work. The most it can need is some library with
> an API which is compatible with P's. So I don't
> buy that argument.

It's the view of the FSF, AFAIK. You can't distribute an application
in binary which dynamically links against libreadline (which is GPLed)
on the user's machine, since even though you don't distribute
libreadline the application running on the user's machine is
considered the "whole" in terms of the GPL.

FWIW, I don't agree with that view either, but that's probably
because I'm a programmer and not a lawyer :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Fri Dec 15 11:25:12 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 11:25:12 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <LNBBLJKPBEHFEDALKOLCKEAEIEAA.tim.one@home.com>
Message-ID: <3A39F188.E366B481@lemburg.com>

Tim Peters wrote:
> 
> [Tim and MAL talking about the FSF and their views]
> 
> [Tim and MAL showing off as hobby advocates ;-)]
> 
> >>> Another issue: since Python doesn't link Python scripts, is it
> >>> still true that if one (pure) Python package is covered by the GPL,
> >>> then all other packages needed by that application will also fall
> >>> under GPL ?
> 
> > This is very controversial: if an application Q needs a GPLed
> > library P to work, then P and Q form a new whole in the sense of
> > the GPL. And this even though P wasn't even distributed together
> > with Q. Don't ask me why, but that's how RMS and folks look at it.
> 
> Understood, but have you reread your question above, which I've said twice I
> can't make sense of? 

I know, it was backwards. 

Take an example: I have a program which
wants to process MP3 files in some way. Now because of some stroke
of luck, all Python MP3 modules out there are covered by the GPL.

Now I could write an application which uses a certain interface
and then tell the user to install the MP3 module separately.
As Barry mentioned, this setup will make distribution of my 
application illegal unless I put the application under the GPL.

> You should read RMS on why glibc is under the LGPL:
> 
>     http://www.fsf.org/philosophy/why-not-lgpl.html
> 
> It will at least disabuse you of the notion that RMS and you are after the
> same thing <wink>.

:-) 

Let's stop this discussion and get back to those cheerful things
like Christmas Bells and Santa Claus... :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From amk at mira.erols.com  Fri Dec 15 14:27:24 2000
From: amk at mira.erols.com (A.M. Kuchling)
Date: Fri, 15 Dec 2000 08:27:24 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
Message-ID: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com>

unicodeobject.c contains this code:

                PyErr_Format(PyExc_ValueError,
                            "unsupported format character '%c' (0x%x) "
                            "at index %i",
                            c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat));

c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits,
so '%\u3000' % 1 results in an error message containing "'\000'
(0x3000)".  Is this worth fixing?  I'd say no, since the hex value is
more useful for Unicode strings anyway.  (I still wanted to mention
this little buglet, since I just touched this bit of code.)

--amk




From jack at oratrix.nl  Fri Dec 15 15:26:15 2000
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 15 Dec 2000 15:26:15 +0100
Subject: [Python-Dev] reuse of address default value (was Re: 
 [Python-checkins] CVS: python/dist/src/Lib SocketServer.py)
In-Reply-To: Message by Guido van Rossum <guido@python.org> ,
	     Thu, 14 Dec 2000 09:51:26 -0500 , <200012141451.JAA15637@cj20424-a.reston1.va.home.com> 
Message-ID: <20001215142616.705993B9B44@snelboot.oratrix.nl>

> The reason for the patch is that without this, if you kill a TCP server
> and restart it right away, you'll get a 'port in use" error -- TCP has
> some kind of strange wait period after a connection is closed before
> it can be reused.  The patch avoids this error.

Well, actually there's a pretty good reason for the "port in use" behaviour: 
the TCP standard more-or-less requires it. A srchost/srcport/dsthost/dstport 
combination should not be reused until the maximum TTL has passed, because 
there may still be "old" retransmissions around. Especially the "open" packets 
are potentially dangerous.

Setting the reuse bit while you're debugging is fine, but setting it in 
general is not a very good idea...
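For reference, the patch in question boils down to setting SO_REUSEADDR
before bind(); a minimal sketch (port 0 here just asks the OS for any
free port, purely for illustration):

```python
import socket

# Set SO_REUSEADDR *before* bind() so a restarted server can rebind a
# port whose previous connection is still in the TIME_WAIT state Jack
# describes.  This is what allow_reuse_address does in SocketServer.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('127.0.0.1', 0))          # 0 = let the OS pick a free port
port = s.getsockname()[1]
reuse = s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
s.close()
```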
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 





From guido at python.org  Fri Dec 15 15:31:19 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 09:31:19 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: Your message of "Thu, 14 Dec 2000 20:02:08 PST."
             <3A3997C0.F977AF51@ActiveState.com> 
References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz>  
            <3A3997C0.F977AF51@ActiveState.com> 
Message-ID: <200012151431.JAA19799@cj20424-a.reston1.va.home.com>

> Greg Ewing wrote:
> > 
> > Paul Prescod <paulp at ActiveState.com>:
> > 
> > > We could say that a local can only shadow a global
> > > if the local is formally declared.
> > 
> > How do you intend to enforce that? Seems like it would
> > require a test on every assignment to a local, to make
> > sure nobody has snuck in a new global since the function
> > was compiled.
> 
> I would expect that all of the checks would be at compile-time. Except
> for __dict__ hackery, I think it is doable. Python already keeps track
> of all assignments to locals and all assignments to globals in a
> function scope. The only addition is keeping track of assignments at a
> global scope.
> 
> > > Actually, one could argue that there is no good reason to
> > > even *allow* the shadowing of globals.
> > 
> > If shadowing were completely disallowed, it would make it
> > impossible to write a completely self-contained function
> > whose source could be moved from one environment to another
> > without danger of it breaking. I wouldn't like the language
> > to have a characteristic like that.
> 
> That seems like a very esoteric requirement. How often do you have
> functions that do not rely *at all* on their environment (other
> functions, import statements, global variables).
> 
> When you move code you have to do some rewriting or customizing of the
> environment in 94% of the cases. How much effort do you want to spend on
> the other 6%? Also, there are tools that are designed to help you move
> code without breaking programs (refactoring editors). They can just as
> easily handle renaming local variables as adding import statements and
> fixing up function calls.

Can we cut this out please?  Paul is misguided.  There's no reason to
forbid a local shadowing a global.  All languages with nested scopes
allow this.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Fri Dec 15 17:17:08 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Fri, 15 Dec 2000 11:17:08 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz>
	<3A39ED07.6B3EE68E@lemburg.com>
Message-ID: <14906.17412.221040.895357@anthem.concentric.net>

>>>>> "M" == M  <mal at lemburg.com> writes:

    M> It's the view of the FSF, AFAIK. You can't distribute an
    M> application in binary which dynamically links against
    M> libreadline (which is GPLed) on the user's machine, since even
    M> though you don't distribute libreadline the application running
    M> on the user's machine is considered the "whole" in terms of the
    M> GPL.

    M> FWIW, I don't agree with that view either, but that's probably
    M> because I'm a programmer and not a lawyer :)

I'm not sure I agree with that view either, but mostly because there
is a non-GPL replacement for parts of the readline API:

    http://www.cstr.ed.ac.uk/downloads/editline.html

Don't know anything about it, so it may not be featureful enough for
Python's needs, but if licensing is really a problem, it might be
worth looking into.

-Barry



From paulp at ActiveState.com  Fri Dec 15 17:16:37 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Fri, 15 Dec 2000 08:16:37 -0800
Subject: [Python-Dev] new draft of PEP 227
References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz>  
	            <3A3997C0.F977AF51@ActiveState.com> <200012151431.JAA19799@cj20424-a.reston1.va.home.com>
Message-ID: <3A3A43E5.347AAF6C@ActiveState.com>

Guido van Rossum wrote:
> 
> ...
> 
> Can we cut this out please?  Paul is misguided.  There's no reason to
> forbid a local shadowing a global.  All languages with nested scopes
> allow this.

Python is the only one I know of that implicitly shadows without
requiring some form of declaration. JavaScript has it right: reading and
writing of globals are symmetrical. In the rare case that you explicitly
want to shadow, you need a declaration. Python's rule is confusing,
implicit and error causing. In my opinion, of course. If you are
dead-set against explicit declarations then I would say that disallowing
the ambiguous construct is better than silently treating it as a
declaration.

 Paul Prescod



From guido at python.org  Fri Dec 15 17:23:07 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 11:23:07 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: Your message of "Fri, 15 Dec 2000 08:16:37 PST."
             <3A3A43E5.347AAF6C@ActiveState.com> 
References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> <3A3997C0.F977AF51@ActiveState.com> <200012151431.JAA19799@cj20424-a.reston1.va.home.com>  
            <3A3A43E5.347AAF6C@ActiveState.com> 
Message-ID: <200012151623.LAA27630@cj20424-a.reston1.va.home.com>

> Python is the only one I know of that implicitly shadows without
> requiring some form of declaration. JavaScript has it right: reading and
> writing of globals are symmetrical. In the rare case that you explicitly
> want to shadow, you need a declaration. Python's rule is confusing,
> implicit and error causing. In my opinion, of course. If you are
> dead-set against explicit declarations then I would say that disallowing
> the ambiguous construct is better than silently treating it as a
> declaration.

Let's agree to differ.  This will never change.  In Python, assignment
is declaration.
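For the record, the rule being argued over can be seen directly at the
interpreter (a sketch; the names are purely illustrative):

```python
x = 1  # a module-level global

def f():
    x = 2       # assignment *is* declaration: x is local, global shadowed
    return x

def g():
    global x    # the explicit escape hatch: rebind the module-level name
    x = 3
    return x

result_f = f()  # uses the local x; the global is untouched here
result_g = g()  # rebinds the module-level x to 3
```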

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Dec 15 18:01:33 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 12:01:33 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: Your message of "Fri, 15 Dec 2000 08:27:24 EST."
             <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> 
References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> 
Message-ID: <200012151701.MAA28058@cj20424-a.reston1.va.home.com>

> unicodeobject.c contains this code:
> 
>                 PyErr_Format(PyExc_ValueError,
>                             "unsupported format character '%c' (0x%x) "
>                             "at index %i",
>                             c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat));
> 
> c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits,
> so '%\u3000' % 1 results in an error message containing "'\000'
> (0x3000)".  Is this worth fixing?  I'd say no, since the hex value is
> more useful for Unicode strings anyway.  (I still wanted to mention
> this little buglet, since I just touched this bit of code.)

Sounds like the '%c' should just be deleted.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From bckfnn at worldonline.dk  Fri Dec 15 18:05:42 2000
From: bckfnn at worldonline.dk (Finn Bock)
Date: Fri, 15 Dec 2000 17:05:42 GMT
Subject: [Python-Dev] CWD in sys.path.
Message-ID: <3a3a480b.28490597@smtp.worldonline.dk>

Hi,

I'm trying to understand the initialization of sys.path and especially
if CWD is supposed to be included in sys.path by default. (I understand
the purpose of sys.path[0], that is not the focus of my question).

My setup is Python2.0 on Win2000, no PYTHONHOME or PYTHONPATH envvars.

In this setup, an empty string exists as sys.path[1], but I'm unsure if
this is by careful design or some freak accident. The empty entry is
added because

  HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.0\PythonPath 

does *not* have any subkey. There is a default value, but it
appears to be ignored. If I add a subkey "foo":

  HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.0\PythonPath\foo 

with a default value of "d:\foo", the CWD is no longer in sys.path.

i:\java\jython.cvs\org\python\util>d:\Python20\python.exe  -S
Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', 'd:\\foo', 'D:\\PYTHON20\\DLLs', 'D:\\PYTHON20\\lib',
'D:\\PYTHON20\\lib\\plat-win', 'D:\\PYTHON20\\lib\\lib-tk',
'D:\\PYTHON20']
>>>

I noticed that some of the PYTHONPATH macros in PC/config.h include
'.', while others do not.

So, to put it as a question (for jython): Should CWD be included in
sys.path? Are there some situation (like embedding) where CWD shouldn't
be in sys.path?
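Whatever the intended policy, it can at least be checked at runtime;
a small sketch (sys.path[0] is the separate script-directory mechanism,
so only the later entries matter for this question):

```python
import sys

# Any '' entry *past* sys.path[0] means CWD was added by the default
# path setup (e.g. the registry fallback described above), not by the
# usual script-directory mechanism.
extra_cwd = [i for i, p in enumerate(sys.path[1:], 1) if p == '']
```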

regards,
finn



From guido at python.org  Fri Dec 15 18:12:03 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 12:12:03 -0500
Subject: [Python-Dev] CWD in sys.path.
In-Reply-To: Your message of "Fri, 15 Dec 2000 17:05:42 GMT."
             <3a3a480b.28490597@smtp.worldonline.dk> 
References: <3a3a480b.28490597@smtp.worldonline.dk> 
Message-ID: <200012151712.MAA02544@cj20424-a.reston1.va.home.com>

On Unix, CWD is not in sys.path unless as sys.path[0].

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Sat Dec 16 02:43:41 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Sat, 16 Dec 2000 03:43:41 +0200 (IST)
Subject: [Python-Dev] new draft of PEP 227
Message-ID: <20001216014341.5BA97A82E@darjeeling.zadka.site.co.il>

On Fri, 15 Dec 2000 08:16:37 -0800, Paul Prescod <paulp at ActiveState.com> wrote:

> Python is the only one I know of that implicitly shadows without
> requiring some form of declaration.

Perl and Scheme permit implicit shadowing too.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tismer at tismer.com  Fri Dec 15 17:42:18 2000
From: tismer at tismer.com (Christian Tismer)
Date: Fri, 15 Dec 2000 18:42:18 +0200
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL (Splitting up 
 _cursesmodule)
References: <LNBBLJKPBEHFEDALKOLCIEPFIDAA.tim.one@home.com>
Message-ID: <3A3A49EA.5D9418E@tismer.com>


Tim Peters wrote:
...
> > Another issue: since Python doesn't link Python scripts, is it
> > still true that if one (pure) Python package is covered by the GPL,
> > then all other packages needed by that application will also fall
> > under GPL ?
> 
> Sorry, couldn't make sense of the question.  Just as well, since you should
> ask about it on a GNU forum anyway <wink>.

The GNU license is transitive. It automatically extends on other
parts of a project, unless they are identifiable, independent
developments. As soon as a couple of modules are published together,
based upon one GPL-ed module, this propagates. I think this is
what MAL meant?
Anyway, I'd be interested to hear what the GNU forum says.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From amk at mira.erols.com  Fri Dec 15 19:10:34 2000
From: amk at mira.erols.com (A.M. Kuchling)
Date: Fri, 15 Dec 2000 13:10:34 -0500
Subject: [Python-Dev] What to do about PEP 229?
Message-ID: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com>

I began writing the fabled fancy setup script described in PEP 229,
and then realized there was duplication going on here.  The code in
setup.py would need to know what libraries, #defines, &c., are needed
by each module in order to check if they're needed and set them.  But
if Modules/Setup can be used to override setup.py's behaviour, then
much of this information would need to be in that file, too; the
details of compiling a module would then live in two places. 

Possibilities:

1) Setup contains fully-loaded module descriptions, and the setup
   script drops unneeded bits.  For example, the socket module
   requires -lnsl on some platforms.  The Setup file would contain
   "socket socketmodule.c -lnsl" on all platforms, and setup.py would 
   check for an nsl library and only use it if it's there.  

   This seems dodgy to me; what if -ldbm is needed on one platform and
   -lndbm on another?
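The "check for an nsl library and only use it if it's there" probe can be
sketched as a throwaway compile-and-link test.  This is a guess at the
approach, not the actual setup.py code; the function and file names are
illustrative:

```python
import os
import shutil
import subprocess
import tempfile

def have_library(libname, cc="cc"):
    """Return True if the C compiler can link a trivial program
    against -l<libname>; False if compiler or library is absent."""
    if shutil.which(cc) is None:
        return False
    tmpdir = tempfile.mkdtemp()
    try:
        src = os.path.join(tmpdir, "conftest.c")
        with open(src, "w") as f:
            f.write("int main(void) { return 0; }\n")
        exe = os.path.join(tmpdir, "conftest")
        rc = subprocess.call(
            [cc, src, "-o", exe, "-l" + libname],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return rc == 0
    finally:
        shutil.rmtree(tmpdir)

# Only put -lnsl on the socket module's link line if it links here.
socket_libs = ["nsl"] if have_library("nsl") else []
print(socket_libs)
```

The -ldbm/-lndbm case would then just probe each candidate name in turn
and take the first that links.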

2) Drop setup completely and just maintain setup.py, with some
   different overriding mechanism.  This is more radical.  Adding a
   new module is then not just a matter of editing a simple text file;
   you'd have to modify setup.py, making it more like maintaining an
   autoconf script.
  
Remember, the underlying goal of PEP 229 is to have the out-of-the-box
Python installation you get from "./configure;make" contain many more
useful modules; right now you wouldn't get zlib, syslog, resource, any
of the DBM modules, PyExpat, &c.  I'm not wedded to using Distutils to
get that, but think that's the only practical way; witness the hackery
required to get the DB module automatically compiled.  

You can also wave your hands in the direction of packagers such as
ActiveState or Red Hat, and say "let them take care of compiling everything".
But this problem actually inconveniences *me*, since I always build
Python myself and have to extensively edit Setup, so I'd like to fix
the problem.

Thoughts? 

--amk




From nas at arctrix.com  Fri Dec 15 13:03:04 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 15 Dec 2000 04:03:04 -0800
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <14906.17412.221040.895357@anthem.concentric.net>; from barry@digicool.com on Fri, Dec 15, 2000 at 11:17:08AM -0500
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net>
Message-ID: <20001215040304.A22056@glacier.fnational.com>

On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote:
> I'm not sure I agree with that view either, but mostly because there
> is a non-GPL replacement for parts of the readline API:
> 
>     http://www.cstr.ed.ac.uk/downloads/editline.html

It doesn't work with the current readline module.  It is much
smaller than readline and works just as well in my experience.
Would there be any interest in including a copy with the standard
distribution?  The license is quite nice (X11 type).

  Neil



From nas at arctrix.com  Fri Dec 15 13:14:50 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 15 Dec 2000 04:14:50 -0800
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012151509.HAA18093@slayer.i.sourceforge.net>; from gvanrossum@users.sourceforge.net on Fri, Dec 15, 2000 at 07:09:46AM -0800
References: <200012151509.HAA18093@slayer.i.sourceforge.net>
Message-ID: <20001215041450.B22056@glacier.fnational.com>

On Fri, Dec 15, 2000 at 07:09:46AM -0800, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Lib
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv18082
> 
> Modified Files:
> 	httplib.py 
> Log Message:
> Get rid of string functions.

Can you explain the logic behind this recent interest in removing
string functions from the standard library?  Is it performance?
Some unicode issue?  I don't have a great attachment to string.py
but I also don't see the justification for the amount of work it
requires.

  Neil



From guido at python.org  Fri Dec 15 20:29:37 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 14:29:37 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Fri, 15 Dec 2000 04:14:50 PST."
             <20001215041450.B22056@glacier.fnational.com> 
References: <200012151509.HAA18093@slayer.i.sourceforge.net>  
            <20001215041450.B22056@glacier.fnational.com> 
Message-ID: <200012151929.OAA03073@cj20424-a.reston1.va.home.com>

> Can you explain the logic behind this recent interest in removing
> string functions from the standard library?  Is it performance?
> Some unicode issue?  I don't have a great attachment to string.py
> but I also don't see the justification for the amount of work it
> requires.

I figure that at *some* point we should start putting our money where
our mouth is, deprecate most uses of the string module, and start
warning about it.  Not in 2.1 probably, given my experience below.

As a realistic test of the warnings module I played with some warnings
about the string module, and then found that most of the std
library modules use it, triggering an extraordinary amount of
warnings.  I then decided to experiment with the conversion.  I
quickly found out it's too much work to do manually, so I'll hold off
until someone comes up with a tool that does 99% of the work.

(The selection of std library modules to convert manually was
triggered by something pretty random -- I decided to silence a
particular cron job I was running. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From Barrett at stsci.edu  Fri Dec 15 20:32:10 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Fri, 15 Dec 2000 14:32:10 -0500 (EST)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>
References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>
Message-ID: <14906.17712.830224.481130@nem-srvr.stsci.edu>

Guido,

Here are my comments on PEP 207.  (I've also gone back and read most
of the 1998 discussion.  What a tedious, in terms of time, but
enlightening, in terms of content, discussion that was.)

| - New function:
|
|      PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op)
|
|      This performs the requested rich comparison, returning a Python
|      object or raising an exception.  The 3rd argument must be one of
|      LT, LE, EQ, NE, GT or GE.

I'd much prefer '<', '<=', '=', etc. to LT, LE, EQ, etc.


|    Classes
|
|    - Classes can define new special methods __lt__, __le__, __gt__,
|      __ge__, __eq__, __ne__ to override the corresponding operators.
|      (You gotta love the Fortran heritage.)  If a class overrides
|      __cmp__ as well, it is only used by PyObject_Compare().

Likewise, I'd prefer __less__, __lessequal__, __equal__, etc. to
__lt__, __le__, __eq__, etc.  I'm not keen on the FORTRAN derived
symbolism.  I also find it contrary to Python's heritage of being
clear and concise.  I don't mind typing __lessequal__ (or
__less_equal__) once per class for the additional clarity.


|    - Should we even bother upgrading the existing types?

Isn't this question partly related to the coercion issue and which
type of comparison takes precedence?  And if so, then I would think
the answer would be 'yes'.  Or better still see below my suggestion of
adding poor and rich comparison operators along with matrix-type
operators. 


    - If so, how should comparisons on container types be defined?
      Suppose we have a list whose items define rich comparisons.  How
      should the itemwise comparisons be done?  For example:

        def __lt__(a, b): # a<b for lists
            for i in range(min(len(a), len(b))):
                ai, bi = a[i], b[i]
                if ai < bi: return 1
                if ai == bi: continue
                if ai > bi: return 0
                raise TypeError, "incomparable item types"
            return len(a) < len(b)

      This uses the same sequence of comparisons as cmp(), so it may
      as well use cmp() instead:

        def __lt__(a, b): # a<b for lists
            for i in range(min(len(a), len(b))):
                c = cmp(a[i], b[i])
                if c < 0: return 1
                if c == 0: continue
                if c > 0: return 0
                assert 0 # unreachable
            return len(a) < len(b)

      And now there's not really a reason to change lists to rich
      comparisons.

I don't understand this example.  If a[i] and b[i] define rich
comparisons, then 'a[i] < b[i]' is likely to return a non-boolean
value.  Yet the 'if' statement expects a boolean value.  I don't see
how the above example will work.

This example also makes me think that the proposals for new operators
(ie.  PEP 211 and 225) are a good idea.  The discussion of rich
comparisons in 1998 also lends some support to this.  I can see many
uses for two types of comparison operators (as well as the proposed
matrix-type operators), one set for poor or boolean comparisons and
one for rich or non-boolean comparisons.  For example, numeric arrays
can define both.  Rich comparison operators would return an array of
boolean values, while poor comparison operators return a boolean value
by performing an implied 'and.reduce' operation.  These operators
provide clarity and conciseness, without much change to current Python 
behavior.
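Paul's elementwise/reduced distinction can be made concrete with a toy
class (written in today's spelling; under PEP 207 such a `__lt__` is
legal precisely because rich comparisons may return any object):

```python
class Vec:
    """Toy numeric vector whose "rich" __lt__ is elementwise."""
    def __init__(self, items):
        self.items = list(items)

    def __lt__(self, other):
        # Rich comparison: a list of per-element outcomes, not a bool.
        return [a < b for a, b in zip(self.items, other.items)]

a = Vec([1, 5, 3])
b = Vec([2, 4, 9])
print(a < b)       # [True, False, True]
print(all(a < b))  # False -- the implied "and.reduce" Paul mentions
```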

 -- Paul



From guido at python.org  Fri Dec 15 20:51:04 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 14:51:04 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: Your message of "Fri, 15 Dec 2000 14:32:10 EST."
             <14906.17712.830224.481130@nem-srvr.stsci.edu> 
References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>  
            <14906.17712.830224.481130@nem-srvr.stsci.edu> 
Message-ID: <200012151951.OAA03219@cj20424-a.reston1.va.home.com>

> Here are my comments on PEP 207.  (I've also gone back and read most
> of the 1998 discussion.  What a tedious, in terms of time, but
> enlightening, in terms of content, discussion that was.)
> 
> | - New function:
> |
> |      PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op)
> |
> |      This performs the requested rich comparison, returning a Python
> |      object or raising an exception.  The 3rd argument must be one of
> |      LT, LE, EQ, NE, GT or GE.
> 
> I'd much prefer '<', '<=', '=', etc. to LT, LE, EQ, etc.

This is only at the C level.  Having to do a string compare is too
slow.  Since some of these are multi-character symbols, a character
constant doesn't suffice (multi-character character constants are not
portable).

> |    Classes
> |
> |    - Classes can define new special methods __lt__, __le__, __gt__,
> |      __ge__, __eq__, __ne__ to override the corresponding operators.
> |      (You gotta love the Fortran heritage.)  If a class overrides
> |      __cmp__ as well, it is only used by PyObject_Compare().
> 
> Likewise, I'd prefer __less__, __lessequal__, __equal__, etc. to
> __lt__, __le__, __eq__, etc.  I'm not keen on the FORTRAN derived
> symbolism.  I also find it contrary to Python's heritage of being
> clear and concise.  I don't mind typing __lessequal__ (or
> __less_equal__) once per class for the additional clarity.

I don't care about Fortran, but you just showed why I think the short
operator names are better: there's less guessing or disagreement about
how they are to be spelled.  E.g. should it be __lessthan__ or
__less_than__ or __less__?

> |    - Should we even bother upgrading the existing types?
> 
> Isn't this question partly related to the coercion issue and which
> type of comparison takes precedence?  And if so, then I would think
> the answer would be 'yes'.

It wouldn't make much of a difference -- comparisons between numbers
of different types would get the same outcome either way.

> Or better still see below my suggestion of
> adding poor and rich comparison operators along with matrix-type
> operators. 
> 
> 
>     - If so, how should comparisons on container types be defined?
>       Suppose we have a list whose items define rich comparisons.  How
>       should the itemwise comparisons be done?  For example:
> 
>         def __lt__(a, b): # a<b for lists
>             for i in range(min(len(a), len(b))):
>                 ai, bi = a[i], b[i]
>                 if ai < bi: return 1
>                 if ai == bi: continue
>                 if ai > bi: return 0
>                 raise TypeError, "incomparable item types"
>             return len(a) < len(b)
> 
>       This uses the same sequence of comparisons as cmp(), so it may
>       as well use cmp() instead:
> 
>         def __lt__(a, b): # a<b for lists
>             for i in range(min(len(a), len(b))):
>                 c = cmp(a[i], b[i])
>                 if c < 0: return 1
>                 if c == 0: continue
>                 if c > 0: return 0
>                 assert 0 # unreachable
>             return len(a) < len(b)
> 
>       And now there's not really a reason to change lists to rich
>       comparisons.
> 
> I don't understand this example.  If a[i] and b[i] define rich
> comparisons, then 'a[i] < b[i]' is likely to return a non-boolean
> value.  Yet the 'if' statement expects a boolean value.  I don't see
> how the above example will work.

Sorry.  I was thinking of list items that contain objects that respond
to the new overloading protocol, but still return Boolean outcomes.
My conclusion is that __cmp__ is just as well.

> This example also makes me think that the proposals for new operators
> (ie.  PEP 211 and 225) are a good idea.  The discussion of rich
> comparisons in 1998 also lends some support to this.  I can see many
> uses for two types of comparison operators (as well as the proposed
> matrix-type operators), one set for poor or boolean comparisons and
> one for rich or non-boolean comparisons.  For example, numeric arrays
> can define both.  Rich comparison operators would return an array of
> boolean values, while poor comparison operators return a boolean value
> by performing an implied 'and.reduce' operation.  These operators
> provide clarity and conciseness, without much change to current Python 
> behavior.

Maybe.  That can still be decided later.  Right now, adding operators
is not on the table for 2.1 (if only because there are two conflicting
PEPs); adding rich comparisons *is* on the table because it doesn't
change the parser (and because the rich comparisons idea was already
pretty much worked out two years ago).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Fri Dec 15 22:08:02 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 15 Dec 2000 16:08:02 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012151929.OAA03073@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com>

[Neil Schemenauer]
> Can you explain the logic behind this recent interest in removing
> string functions from the standard library?  Is it performance?
> Some unicode issue?  I don't have a great attachment to string.py
> but I also don't see the justification for the amount of work it
> requires.

[Guido]
> I figure that at *some* point we should start putting our money where
> our mouth is, deprecate most uses of the string module, and start
> warning about it.  Not in 2.1 probably, given my experience below.

I think this begs Neil's questions:  *is* our mouth there <ahem>, and if so,
why?  The only public notice of impending string module deprecation anyone
came up with was a vague note on the 1.6 web page, and one not repeated in
any of the 2.0 release material.

"string" is right up there with "os" and "sys" as a FIM (Frequently Imported
Module), so the required code changes will be massive.  As a user, I don't
see what's in it for me to endure that pain:  the string module functions
work fine!  Neither are they warts in the language, any more than it is
a wart that we say sin(pi) instead of pi.sin().  Keeping the functions
around doesn't hurt anybody that I can see.

> As a realistic test of the warnings module I played with some warnings
> about the string module, and then found that say most of the std
> library modules use it, triggering an extraordinary amount of
> warnings.  I then decided to experiment with the conversion.  I
> quickly found out it's too much work to do manually, so I'll hold off
> until someone comes up with a tool that does 99% of the work.

Ah, so that's the *easy* way to kill this crusade -- forget I said anything
<wink>.




From Barrett at stsci.edu  Fri Dec 15 22:20:20 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Fri, 15 Dec 2000 16:20:20 -0500 (EST)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <200012151951.OAA03219@cj20424-a.reston1.va.home.com>
References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>
	<14906.17712.830224.481130@nem-srvr.stsci.edu>
	<200012151951.OAA03219@cj20424-a.reston1.va.home.com>
Message-ID: <14906.33325.5784.118110@nem-srvr.stsci.edu>

>> This example also makes me think that the proposals for new operators
>> (ie.  PEP 211 and 225) are a good idea.  The discussion of rich
>> comparisons in 1998 also lends some support to this.  I can see many
>> uses for two types of comparison operators (as well as the proposed
>> matrix-type operators), one set for poor or boolean comparisons and
>> one for rich or non-boolean comparisons.  For example, numeric arrays
>> can define both.  Rich comparison operators would return an array of
>> boolean values, while poor comparison operators return a boolean value
>> by performing an implied 'and.reduce' operation.  These operators
>> provide clarity and conciseness, without much change to current Python 
>> behavior.
>
> Maybe.  That can still be decided later.  Right now, adding operators
> is not on the table for 2.1 (if only because there are two conflicting
> PEPs); adding rich comparisons *is* on the table because it doesn't
> change the parser (and because the rich comparisons idea was already
> pretty much worked out two years ago).

Yes, it was worked out previously _assuming_ rich comparisons do not
use any new operators.  

But let's stop for a moment and contemplate adding rich comparisons 
along with new comparison operators.  What do we gain?

1. The current boolean operator behavior does not have to change, and
   hence will be backward compatible.

2. It eliminates the need to decide whether or not rich comparisons
   takes precedence over boolean comparisons.

3. The new operators add additional behavior without directly impacting 
   current behavior and the use of them is unambiguous, at least in
   relation to current Python behavior.  You know by the operator what 
   type of comparison will be returned.  This should appease Jim
   Fulton, based on his arguments in 1998 about comparison operators
   always returning a boolean value.

4. Compound objects, such as lists, could implement both rich
   and boolean comparisons.  The boolean comparison would remain as
   is, while the rich comparison would return a list of boolean
   values.  Current behavior doesn't change; just a new feature, which
   you may or may not choose to use, is added.

If we go one step further and add the matrix-style operators along
with the comparison operators, we can provide a consistent user
interface to array/complex operations without changing current Python
behavior.  If a user has no need for these new operators, he doesn't
have to use them or even know about them.  All we've done is made
Python richer, but I believe without making it more complex.  For
example, all element-wise operations could have a ':' appended to
them, e.g. '+:', '<:', etc.; and will define element-wise addition,
element-wise less-than, etc.  The traditional '*', '/', etc. operators
can then be used for matrix operations, which will appease the Matlab
people.

Therefore, I don't think rich comparisons and matrix-type operators
should be considered separable.  I really think you should consider
this suggestion.  It appeases many groups while providing a consistent 
and clear user interface, without greatly impacting current Python
behavior. 

Always-causing-havoc-at-the-last-moment-ly Yours,
Paul

-- 
Dr. Paul Barrett       Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Group
FAX:   410-338-4767    Baltimore, MD 21218



From guido at python.org  Fri Dec 15 22:23:46 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 16:23:46 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Fri, 15 Dec 2000 16:08:02 EST."
             <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com> 
Message-ID: <200012152123.QAA03566@cj20424-a.reston1.va.home.com>

> "string" is right up there with "os" and "sys" as a FIM (Frequently
> Imported Module), so the required code changes will be massive.  As
> a user, I don't see what's in it for me to endure that pain: the
> string module functions work fine!  Neither are they warts in the
> language, any more than that we say sin(pi) instead of pi.sin().
> Keeping the functions around doesn't hurt anybody that I can see.

Hm.  I'm not saying that this one will be easy.  But I don't like
having "two ways to do it".  It means more learning, etc. (you know
the drill).  We could have chosen to make the strop module support
Unicode; instead, we chose to give string objects methods and promote
the use of those methods instead of the string module.  (And in a
generous mood, we also supported Unicode in the string module -- by
providing wrappers that invoke string methods.)

If you're saying that we should give users ample time for the
transition, I'm with you.

If you're saying that you think the string module is too prominent to
ever start deprecating its use, I'm afraid we have a problem.

I'd also like to note that using the string module's wrappers incurs
the overhead of a Python function call -- using string methods is
faster.

Finally, I like the look of fields[i].strip().lower() much better than
that of string.lower(string.strip(fields[i])) -- an actual example
from mimetools.py.
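The readability contrast runs like this; since the string module's
function forms are gone from modern Python, two stand-in helpers emulate
the old spelling:

```python
# Stand-ins for the removed string.strip()/string.lower() wrappers.
def strip(s):
    return s.strip()

def lower(s):
    return s.lower()

fields = ["  Content-Type  ", " Text/Plain\t"]
for i in range(len(fields)):
    old = lower(strip(fields[i]))       # old spelling, inside-out
    new = fields[i].strip().lower()     # new spelling, left-to-right
    assert old == new

print(fields[0].strip().lower())  # content-type
```

The method chain also reads in the order the operations happen, which is
the point of the mimetools.py example.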

Ideally, I would like to deprecate the entire string module, so that I
can place a single warning at its top.  This will cause a single
warning to be issued for programs that still use it (no matter how
many times it is imported).  Unfortunately, there are a couple of
things that still need it: string.letters etc., and
string.maketrans().
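Both stragglers survived, as it happens, under changed spellings in
later Pythons (string.letters became string.ascii_letters, and
maketrans() eventually moved onto str); a sketch in the modern names:

```python
import string

# string.letters lives on as string.ascii_letters.
print(string.ascii_letters[:6])   # abcdef

# string.maketrans() became a str static method.
table = str.maketrans("abc", "xyz")
print("aabbcc".translate(table))  # xxyyzz
```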

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gvwilson at nevex.com  Fri Dec 15 22:43:47 2000
From: gvwilson at nevex.com (Greg Wilson)
Date: Fri, 15 Dec 2000 16:43:47 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <14906.33325.5784.118110@nem-srvr.stsci.edu>
Message-ID: <002901c066e0$1b3f13c0$770a0a0a@nevex.com>

Hi, Paul; thanks for your mail.  W.r.t. adding matrix operators to Python,
you may want to take a look at the counter-arguments in PEP 0211 (attached).
Basically, I spoke with the authors of GNU Octave (a GPL'd clone of MATLAB)
about what users really used.  They felt that the only matrix operator that
really mattered was matrix-matrix multiply; other operators (including the
left and right division operators that even experienced MATLAB users often
mix up) were second order at best, and were better handled with methods or
functions.

Thanks,
Greg

p.s. PEP 0225 (also attached) is an alternative to PEP 0211 which would add
most of the MATLAB-ish operators to Python.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pep-0211.txt
URL: <http://mail.python.org/pipermail/python-dev/attachments/20001215/3bf6c282/attachment-0002.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pep-0225.txt
URL: <http://mail.python.org/pipermail/python-dev/attachments/20001215/3bf6c282/attachment-0003.txt>

From guido at python.org  Fri Dec 15 22:55:46 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 16:55:46 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: Your message of "Fri, 15 Dec 2000 16:20:20 EST."
             <14906.33325.5784.118110@nem-srvr.stsci.edu> 
References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> <14906.17712.830224.481130@nem-srvr.stsci.edu> <200012151951.OAA03219@cj20424-a.reston1.va.home.com>  
            <14906.33325.5784.118110@nem-srvr.stsci.edu> 
Message-ID: <200012152155.QAA03879@cj20424-a.reston1.va.home.com>

> > Maybe.  That can still be decided later.  Right now, adding operators
> > is not on the table for 2.1 (if only because there are two conflicting
> > PEPs); adding rich comparisons *is* on the table because it doesn't
> > change the parser (and because the rich comparisons idea was already
> > pretty much worked out two years ago).
> 
> Yes, it was worked out previously _assuming_ rich comparisons do not
> use any new operators.  
> 
> But let's stop for a moment and contemplate adding rich comparisons 
> along with new comparison operators.  What do we gain?
> 
> 1. The current boolean operator behavior does not have to change, and
>    hence will be backward compatible.

What incompatibility do you see in the current proposal?

> 2. It eliminates the need to decide whether or not rich comparisons
>    takes precedence over boolean comparisons.

Only if you want different semantics -- that's only an issue for NumPy.

> 3. The new operators add additional behavior without directly impacting 
>    current behavior and the use of them is unambiguous, at least in
>    relation to current Python behavior.  You know by the operator what 
>    type of comparison will be returned.  This should appease Jim
>    Fulton, based on his arguments in 1998 about comparison operators
>    always returning a boolean value.

As you know, I'm now pretty close to Jim. :-)  He seemed pretty mellow
about this now.

> 4. Compound objects, such as lists, could implement both rich
>    and boolean comparisons.  The boolean comparison would remain as
>    is, while the rich comparison would return a list of boolean
>    values.  Current behavior doesn't change; just a new feature, which
>    you may or may not choose to use, is added.
> 
> If we go one step further and add the matrix-style operators along
> with the comparison operators, we can provide a consistent user
> interface to array/complex operations without changing current Python
> behavior.  If a user has no need for these new operators, he doesn't
> have to use them or even know about them.  All we've done is made
> Python richer, but I believe without making it more complex.  For
> example, all element-wise operations could have a ':' appended to
> them, e.g. '+:', '<:', etc.; and will define element-wise addition,
> element-wise less-than, etc.  The traditional '*', '/', etc. operators
> can then be used for matrix operations, which will appease the Matlab
> people.
> 
> Therefore, I don't think rich comparisons and matrix-type operators
> should be considered separable.  I really think you should consider
> this suggestion.  It appeases many groups while providing a consistent 
> and clear user interface, without greatly impacting current Python
> behavior. 
> 
> Always-causing-havoc-at-the-last-moment-ly Yours,

I think you misunderstand.  Rich comparisons are mostly about allowing
the separate overloading of <, <=, ==, !=, >, and >=.  This is useful
in its own light.

If you don't want to use this overloading facility for elementwise
comparisons in NumPy, that's fine with me.  Nobody says you have to --
it's just that you *could*.

Read my lips: there won't be *any* new operators in 2.1.

There will be a better way to overload the existing Boolean operators,
and they will be able to return non-Boolean results.  That's useful in
other situations besides NumPy.
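One such non-NumPy situation is an expression builder, sketched here in
modern spelling (the class and its output format are invented for
illustration):

```python
class Field:
    """Column stand-in: comparisons build expression strings
    instead of returning True/False."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return "%s = %r" % (self.name, other)

    def __lt__(self, other):
        return "%s < %r" % (self.name, other)

age = Field("age")
print(age < 21)        # age < 21
print(age == "guido")  # age = 'guido'
```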

Feel free to lobby for elementwise operators -- but based on the
discussion about this subject so far, I don't give it much of a chance
even past Python 2.1.  They would add a lot of baggage to the language
(e.g. the table of operators in all Python books would be about twice
as long) and by far the most users don't care about them.  (Read the
intro to 211 for some of the concerns -- this PEP tries to make the
addition palatable by adding exactly *one* new operator.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Dec 15 23:16:34 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 15 Dec 2000 17:16:34 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Fri, 08 Dec 2000 17:58:03 EST."
             <200012082258.RAA02389@cj20424-a.reston1.va.home.com> 
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>  
            <200012082258.RAA02389@cj20424-a.reston1.va.home.com> 
Message-ID: <200012152216.RAA11098@cj20424-a.reston1.va.home.com>

I've checked in the essential parts of the warnings PEP, and closed
the SF patch.  I haven't checked in the examples in the patch -- it's
too early for that.  But I figured that it's easier to revise the code
once it's checked in.  I'm pretty confident that it works as
advertised.

Still missing is documentation: the warnings module, the new API
functions, and the new command line option should all be documented.
I'll work on that over the holidays.

I consider the PEP done.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Fri Dec 15 23:21:24 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:21:24 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com>
Message-ID: <3A3A9964.A6B3DD11@lemburg.com>

Neil Schemenauer wrote:
> 
> On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote:
> > I'm not sure I agree with that view either, but mostly because there
> > is a non-GPL replacement for parts of the readline API:
> >
> >     http://www.cstr.ed.ac.uk/downloads/editline.html
> 
> It doesn't work with the current readline module.  It is much
> smaller than readline and works just as well in my experience.
> Would there be any interest in including a copy with the standard
> distribution?  The license is quite nice (X11 type).

+1 from here -- line editing is simply a very important part of
an interactive prompt and readline is not only big, slow and
full of strange surprises, but also GPLed ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Fri Dec 15 23:24:34 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:24:34 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com>
Message-ID: <3A3A9A22.E9BA9551@lemburg.com>

"A.M. Kuchling" wrote:
> 
> unicodeobject.c contains this code:
> 
>                 PyErr_Format(PyExc_ValueError,
>                             "unsupported format character '%c' (0x%x) "
>                             "at index %i",
>                             c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat));
> 
> c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits,
> so '%\u3000' % 1 results in an error message containing "'\000'
> (0x3000)".  Is this worth fixing?  I'd say no, since the hex value is
> more useful for Unicode strings anyway.  (I still wanted to mention
> this little buglet, since I just touched this bit of code.)

Why would you want to fix it ? Format characters will always
be ASCII and thus 7-bit -- there's really no need to expand the
set of possibilities beyond 8 bits ;-)
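
The truncation amk describes is easy to see from Python itself: C's
"%c" keeps only the low 8 bits of the Py_UNICODE value (a rough
illustration of the effect, not the C code):

```python
# What C's "%c" effectively does to a Py_UNICODE value: keep the
# low 8 bits, so U+3000 (IDEOGRAPHIC SPACE) comes out as '\000'.
c = 0x3000
truncated = c & 0xFF
print(repr(chr(truncated)))   # '\x00' -- the "'\000'" in amk's message
```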

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fdrake at acm.org  Fri Dec 15 23:22:34 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 15 Dec 2000 17:22:34 -0500 (EST)
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <200012152216.RAA11098@cj20424-a.reston1.va.home.com>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
	<200012082258.RAA02389@cj20424-a.reston1.va.home.com>
	<200012152216.RAA11098@cj20424-a.reston1.va.home.com>
Message-ID: <14906.39338.795843.947683@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > Still missing is documentation: the warnings module, the new API
 > functions, and the new command line option should all be documented.
 > I'll work on that over the holidays.

  I've assigned a bug to you in case you forget.  I've given it a
"show-stopper" priority level, so I'll feel good ripping the code out
if you don't get docs written in time.  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From mal at lemburg.com  Fri Dec 15 23:39:18 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:39:18 +0100
Subject: [Python-Dev] What to do about PEP 229?
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com>
Message-ID: <3A3A9D96.80781D61@lemburg.com>

"A.M. Kuchling" wrote:
> 
> I began writing the fabled fancy setup script described in PEP 229,
> and then realized there was duplication going on here.  The code in
> setup.py would need to know what libraries, #defines, &c., are needed
> by each module in order to check if they're needed and set them.  But
> if Modules/Setup can be used to override setup.py's behaviour, then
> much of this information would need to be in that file, too; the
> details of compiling a module are in two places.
> 
> Possibilities:
> 
> 1) Setup contains fully-loaded module descriptions, and the setup
>    script drops unneeded bits.  For example, the socket module
>    requires -lnsl on some platforms.  The Setup file would contain
>    "socket socketmodule.c -lnsl" on all platforms, and setup.py would
>    check for an nsl library and only use if it's there.
> 
>    This seems dodgy to me; what if -ldbm is needed on one platform and
>    -lndbm on another?

Can't distutils try both and then settle for the working combination ?

[distutils isn't really ready for auto-configure yet, but Greg
has already provided most of the needed functionality -- it's just
not well integrated into the rest of the build process in version 1.0.1
... BTW, where is Greg ? I haven't heard from him in quite a while.]
 
> 2) Drop setup completely and just maintain setup.py, with some
>    different overriding mechanism.  This is more radical.  Adding a
>    new module is then not just a matter of editing a simple text file;
>    you'd have to modify setup.py, making it more like maintaining an
>    autoconf script.

Why not parse Setup and use it as input to distutils setup.py ?
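
A minimal sketch of such a bridge, assuming only the simple
"module sources -llib" form from amk's example (the helper name is
made up):

```python
def parse_setup_line(line):
    """Split a Modules/Setup entry like 'socket socketmodule.c -lnsl'
    into (module_name, source_files, libraries)."""
    parts = line.split()
    name = parts[0]
    sources = [p for p in parts[1:] if not p.startswith("-")]
    libraries = [p[2:] for p in parts[1:] if p.startswith("-l")]
    return name, sources, libraries

print(parse_setup_line("socket socketmodule.c -lnsl"))
# ('socket', ['socketmodule.c'], ['nsl'])
```

Each tuple would then map straightforwardly onto a distutils
Extension(name, sources, libraries=...) description.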
 
> Remember, the underlying goal of PEP 229 is to have the out-of-the-box
> Python installation you get from "./configure;make" contain many more
> useful modules; right now you wouldn't get zlib, syslog, resource, any
> of the DBM modules, PyExpat, &c.  I'm not wedded to using Distutils to
> get that, but think that's the only practical way; witness the hackery
> required to get the DB module automatically compiled.
> 
> You can also wave your hands in the direction of packagers such as
> ActiveState or Red Hat, and say "let them make to compile everything".
> But this problem actually inconveniences *me*, since I always build
> Python myself and have to extensively edit Setup, so I'd like to fix
> the problem.
> 
> Thoughts?

Nice idea :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Fri Dec 15 23:44:15 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:44:15 +0100
Subject: [Python-Dev] Death to string functions!
References: <200012151509.HAA18093@slayer.i.sourceforge.net>  
	            <20001215041450.B22056@glacier.fnational.com> <200012151929.OAA03073@cj20424-a.reston1.va.home.com>
Message-ID: <3A3A9EBF.3F9306B6@lemburg.com>

Guido van Rossum wrote:
> 
> > Can you explain the logic behind this recent interest in removing
> > string functions from the standard library?  It it performance?
> > Some unicode issue?  I don't have a great attachment to string.py
> > but I also don't see the justification for the amount of work it
> > requires.
> 
> I figure that at *some* point we should start putting our money where
> our mouth is, deprecate most uses of the string module, and start
> warning about it.  Not in 2.1 probably, given my experience below.
> 
> As a realistic test of the warnings module I played with some warnings
> about the string module, and then found that say most of the std
> library modules use it, triggering an extraordinary amount of
> warnings.  I then decided to experiment with the conversion.  I
> quickly found out it's too much work to do manually, so I'll hold off
> until someone comes up with a tool that does 99% of the work.

This would also help a lot of programmers out there who are
stuck with 100k LOCs of Python code using string.py ;)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Fri Dec 15 23:49:01 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 15 Dec 2000 23:49:01 +0100
Subject: [Python-Dev] Death to string functions!
References: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com> <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
Message-ID: <3A3A9FDD.E6F021AF@lemburg.com>

Guido van Rossum wrote:
> 
> Ideally, I would like to deprecate the entire string module, so that I
> can place a single warning at its top.  This will cause a single
> warning to be issued for programs that still use it (no matter how
> many times it is imported).  Unfortunately, there are a couple of
> things that still need it: string.letters etc., and
> string.maketrans().

Can't we come up with a module similar to unicodedata[.py] ? 

string.py could then still provide the interfaces, but the
implementation would live in stringdata.py

[Perhaps we won't need stringdata by then... Unicode will have
 taken over and the discussion be moot ;-)]

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Fri Dec 15 23:54:25 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 15 Dec 2000 23:54:25 +0100
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <20001215040304.A22056@glacier.fnational.com>; from nas@arctrix.com on Fri, Dec 15, 2000 at 04:03:04AM -0800
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com>
Message-ID: <20001215235425.A29681@xs4all.nl>

On Fri, Dec 15, 2000 at 04:03:04AM -0800, Neil Schemenauer wrote:
> On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote:
> > I'm not sure I agree with that view either, but mostly because there
> > is a non-GPL replacement for parts of the readline API:
> > 
> >     http://www.cstr.ed.ac.uk/downloads/editline.html
> 
> It doesn't work with the current readline module.  It is much
> smaller than readline and works just as well in my experience.
> Would there be any interest in including a copy with the standard
> distribution?  The license is quite nice (X11 type).

Definitely +1 from here. Readline reminds me of the cold war, for some
reason. (Actually, multiple reasons ;) I don't have time to do it myself,
unfortunately, or I would. (Looking at editline has been on my TODO list for
a while... :P)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From martin at loewis.home.cs.tu-berlin.de  Sat Dec 16 13:32:30 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 16 Dec 2000 13:32:30 +0100
Subject: [Python-Dev] PEP 226
Message-ID: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de>

I remember earlier discussion on the Python 2.1 release schedule, and
never managed to comment on those.

I believe that Python contributors and maintainers did an enormous
job in releasing Python 2, which took quite some time from everybody's
life. I think it is unrealistic to expect the same amount of
commitment for the next release, especially if that release appears
just a few months after the previous release (that is, one month from
now).

So I'd like to ask the release manager to take that into
account. I'm not quite sure what kind of action I expect; possible
alternatives are:
- declare 2.1 a pure bug fix release only; with a minimal set of new
  features. In particular, don't push for completion of PEPs; everybody
  should then accept that most features that are currently discussed
  will appear in Python 2.2.

- move the schedule for Python 2.1 back (or is it forward?) by, say, a
  few months. This will give people some time to do the things that did
  not get the right amount of attention during the 2.0 release, and will
  still allow work on new and interesting features.

Just my 0.02EUR,

Martin



From guido at python.org  Sat Dec 16 17:38:28 2000
From: guido at python.org (Guido van Rossum)
Date: Sat, 16 Dec 2000 11:38:28 -0500
Subject: [Python-Dev] PEP 226
In-Reply-To: Your message of "Sat, 16 Dec 2000 13:32:30 +0100."
             <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> 
References: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> 
Message-ID: <200012161638.LAA13888@cj20424-a.reston1.va.home.com>

> I remember earlier discussion on the Python 2.1 release schedule, and
> never managed to comment on those.
> 
> I believe that Python contributors and maintainers did an enormous
> job in releasing Python 2, which took quite some time from everybody's
> life. I think it is unrealistic to expect the same amount of
> commitment for the next release, especially if that release appears
> just a few months after the previous release (that is, one month from
> now).
> 
> So I'd like to ask the release manager to take that into
> account. I'm not quite sure what kind of action I expect; possible
> alternatives are:
> - declare 2.1 a pure bug fix release only; with a minimal set of new
>   features. In particular, don't push for completion of PEPs; everybody
>   should then accept that most features that are currently discussed
>   will appear in Python 2.2.
> 
> - move the schedule for Python 2.1 back (or is it forward?) by, say, a
>   few months. This will give people some time to do the things that did
>   not get the right amount of attention during the 2.0 release, and will
>   still allow work on new and interesting features.
> 
> Just my 0.02EUR,

You're right -- 2.0 (including 1.6) was a monumental effort, and I'm
grateful to all who contributed.

I don't expect that 2.1 will be anywhere near the same amount of work!

Let's look at what's on the table.

0042  Small Feature Requests                 Hylton
 SD  205  pep-0205.txt  Weak References                        Drake
 S   207  pep-0207.txt  Rich Comparisons                       Lemburg, van Rossum
 S   208  pep-0208.txt  Reworking the Coercion Model           Schemenauer
 S   217  pep-0217.txt  Display Hook for Interactive Use       Zadka
 S   222  pep-0222.txt  Web Library Enhancements               Kuchling
 I   226  pep-0226.txt  Python 2.1 Release Schedule            Hylton
 S   227  pep-0227.txt  Statically Nested Scopes               Hylton
 S   230  pep-0230.txt  Warning Framework                      van Rossum
 S   232  pep-0232.txt  Function Attributes                    Warsaw
 S   233  pep-0233.txt  Python Online Help                     Prescod



From guido at python.org  Sat Dec 16 17:46:32 2000
From: guido at python.org (Guido van Rossum)
Date: Sat, 16 Dec 2000 11:46:32 -0500
Subject: [Python-Dev] PEP 226
In-Reply-To: Your message of "Sat, 16 Dec 2000 13:32:30 +0100."
             <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> 
References: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> 
Message-ID: <200012161646.LAA13947@cj20424-a.reston1.va.home.com>

[Oops, I posted a partial edit of this message by mistake before.]

> I remember earlier discussion on the Python 2.1 release schedule, and
> never managed to comment on those.
> 
> I believe that Python contributors and maintainers did an enormous
> job in releasing Python 2, which took quite some time from everybody's
> life. I think it is unrealistic to expect the same amount of
> commitment for the next release, especially if that release appears
> just a few months after the previous release (that is, one month from
> now).
> 
> So I'd like to ask the release manager to take that into
> account. I'm not quite sure what kind of action I expect; possible
> alternatives are:
> - declare 2.1 a pure bug fix release only; with a minimal set of new
>   features. In particular, don't push for completion of PEPs; everybody
>   should then accept that most features that are currently discussed
>   will appear in Python 2.2.
> 
> - move the schedule for Python 2.1 back (or is it forward?) by, say, a
>   few months. This will give people some time to do the things that did
>   not get the right amount of attention during the 2.0 release, and will
>   still allow work on new and interesting features.
> 
> Just my 0.02EUR,

You're right -- 2.0 (including 1.6) was a monumental effort, and I'm
grateful to all who contributed.

I don't expect that 2.1 will be anywhere near the same amount of work!

Let's look at what's on the table.  These are listed as Active PEPs --
under serious consideration for Python 2.1:

> 0042  Small Feature Requests                 Hylton

We can do some of these or leave them.

> 0205  Weak References                        Drake

This one's open.

> 0207  Rich Comparisons                       Lemburg, van Rossum

This is really not that much work -- I would've done it already if I
weren't distracted by the next one.

> 0208  Reworking the Coercion Model           Schemenauer

Neil has most of this under control.  I don't doubt for a second that
it will be finished.

> 0217  Display Hook for Interactive Use       Zadka

Probably a 20-line fix.
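
PEP 217's proposal amounts to a replaceable hook that the interpreter
calls for each interactive result; a rough sketch of such a hook (the
names and prefix are illustrative, not the eventual API):

```python
import sys

def render(value):
    # Hypothetical formatting the hook would apply to each result.
    return "--> %r" % (value,)

def display(value):
    # Replacement display hook: called once per interactive result;
    # None results are suppressed, as at the default prompt.
    if value is not None:
        sys.stdout.write(render(value) + "\n")

# Once the interpreter consults such a hook, installing it is one line:
sys.displayhook = display
```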

> 0222  Web Library Enhancements               Kuchling

Up to Andrew.  If he doesn't get to it, no big deal.

> 0226  Python 2.1 Release Schedule            Hylton

I still think this is realistic -- a release before the conference
seems doable!

> 0227  Statically Nested Scopes               Hylton

This one's got a 50% chance at least.  Jeremy seems motivated to do
it.

> 0230  Warning Framework                      van Rossum

Done except for documentation.

> 0232  Function Attributes                    Warsaw

We need to discuss this more, but it's not much work to implement.

> 0233  Python Online Help                     Prescod

If Paul can control his urge to want to solve everything at once, I
see no reason why this one couldn't find its way into 2.1.

Now, officially the PEP deadline is closed today: the schedule says
"16-Dec-2000: 2.1 PEPs ready for review".  That means that no new PEPs
will be considered for inclusion in 2.1, and PEPs not in the active
list won't be considered either.  But the PEPs in the list above are
all ready for review, even if we don't agree with all of them.

I'm actually more worried about the ever-growing number of bug reports
and submitted patches.  But that's for another time.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Sun Dec 17 01:09:28 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Sat, 16 Dec 2000 19:09:28 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: <3A3A9A22.E9BA9551@lemburg.com>; from mal@lemburg.com on Fri, Dec 15, 2000 at 11:24:34PM +0100
References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> <3A3A9A22.E9BA9551@lemburg.com>
Message-ID: <20001216190928.A6703@kronos.cnri.reston.va.us>

On Fri, Dec 15, 2000 at 11:24:34PM +0100, M.-A. Lemburg wrote:
>Why would you want to fix it ? Format characters will always
>be ASCII and thus 7-bit -- theres really no need to expand the
>set of possibilities beyond 8 bits ;-)

This message is for characters that aren't format characters, which
therefore includes all characters >127.  

--amk




From akuchlin at mems-exchange.org  Sun Dec 17 01:17:39 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Sat, 16 Dec 2000 19:17:39 -0500
Subject: [Python-Dev] What to do about PEP 229?
In-Reply-To: <3A3A9D96.80781D61@lemburg.com>; from mal@lemburg.com on Fri, Dec 15, 2000 at 11:39:18PM +0100
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com>
Message-ID: <20001216191739.B6703@kronos.cnri.reston.va.us>

On Fri, Dec 15, 2000 at 11:39:18PM +0100, M.-A. Lemburg wrote:
>Can't distutils try both and then settle for the working combination ?

I'm worried about subtle problems; what if an unneeded -lfoo drags in
a customized malloc, or has symbols which conflict with some other
library?

>... BTW, where is Greg ? I haven't heard from him in quite a while.]

Still around; he just hasn't been posting much these days.

>Why not parse Setup and use it as input to distutils setup.py ?

That was option 1.  The existing Setup format doesn't really contain
enough intelligence, though; the intelligence is usually in comments
such as "Uncomment the following line for Solaris".  So either the
Setup format is modified (bad, since we'd break existing 3rd-party
packages that still use a Makefile.pre.in), or I give up and just do
everything in a setup.py.

--amk



From guido at python.org  Sun Dec 17 03:38:01 2000
From: guido at python.org (Guido van Rossum)
Date: Sat, 16 Dec 2000 21:38:01 -0500
Subject: [Python-Dev] What to do about PEP 229?
In-Reply-To: Your message of "Sat, 16 Dec 2000 19:17:39 EST."
             <20001216191739.B6703@kronos.cnri.reston.va.us> 
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com>  
            <20001216191739.B6703@kronos.cnri.reston.va.us> 
Message-ID: <200012170238.VAA14466@cj20424-a.reston1.va.home.com>

> >Why not parse Setup and use it as input to distutils setup.py ?
> 
> That was option 1.  The existing Setup format doesn't really contain
> enough intelligence, though; the intelligence is usually in comments
> such as "Uncomment the following line for Solaris".  So either the
> Setup format is modified (bad, since we'd break existing 3rd-party
> packages that still use a Makefile.pre.in), or I give up and just do
> everything in a setup.py.

Forget Setup.  Convert it and be done with it.  There really isn't
enough there to hang on to.  We'll support Setup format (through the
makesetup script and the Misc/Makefile.pre.in file) for 3rd party b/w
compatibility, but we won't need to use it ourselves.  (Too bad for
3rd party documentation that describes the Setup format. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Sun Dec 17 08:34:27 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 17 Dec 2000 02:34:27 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: <20001216190928.A6703@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEEHIEAA.tim.one@home.com>

[MAL]
> Why would you want to fix it ? Format characters will always
> be ASCII and thus 7-bit -- theres really no need to expand the
> set of possibilities beyond 8 bits ;-)

[AMK]
> This message is for characters that aren't format characters, which
> therefore includes all characters >127.

I'm with the wise man who suggested to drop the %c in this case and just
display the hex value.  Although it would be more readable to drop the %c if
and only if the bogus format character isn't printable 7-bit ASCII.  Which
is obvious, yes?  A new if/else isn't going to hurt anything.
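
Tim's if/else, sketched in Python (the C version would use the same
test; the helper name is made up):

```python
def format_char_error(c):
    # Show the character itself only when it is printable 7-bit ASCII;
    # otherwise the hex value alone is the readable part.
    if 0x20 <= ord(c) < 0x7F:
        return "unsupported format character '%s' (0x%x)" % (c, ord(c))
    return "unsupported format character 0x%x" % ord(c)

print(format_char_error("q"))        # unsupported format character 'q' (0x71)
print(format_char_error("\u3000"))   # unsupported format character 0x3000
```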




From tim.one at home.com  Sun Dec 17 08:57:01 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 17 Dec 2000 02:57:01 -0500
Subject: [Python-Dev] PEP 226
In-Reply-To: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEHIEAA.tim.one@home.com>

[Martin v. Loewis]
> ...
> - move the schedule for Python 2.1 back (or is it forward?) by, say, a
>   few months. This will give people some time to do the things that did
>   not get the right amount of attention during the 2.0 release, and will
>   still allow work on new and interesting features.

Just a stab in the dark, but is one of your real concerns the spotty state
of Unicode support in the std libraries?  If so, nobody working on the PEPs
Guido identified would be likely to work on improving Unicode support even
if the PEPs vanished.  I don't know how Unicode support is going to improve,
but in the absence of visible work in that direction-- or even A Plan to get
some --I doubt we're going to hold up 2.1 waiting for magic.

no-feature-is-ever-done-ly y'rs  - tim




From tim.one at home.com  Sun Dec 17 09:30:24 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 17 Dec 2000 03:30:24 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <3A387D6A.782E6A3B@prescod.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEIIEAA.tim.one@home.com>

[Tim]
>> I've rarely seen problems due to shadowing a global, but have often
>> seen problems due to shadowing a builtin.

[Paul Prescod]
> Really?

Yes.

> I think that there are two different issues here. One is consciously
> choosing to create a new variable but not understanding that there
> already exists a variable by that name. (i.e. str, list).

Yes, and that's what I've often seen, typically long after the original code
is written:  someone sticks in some debugging output, or makes a small
change to the implementation, and introduces e.g.

    str = some_preexisting_var + ":"
    yadda(str)

"Suddenly" the program misbehaves in baffling ways.  They're "baffling"
because the errors do not occur on the lines where the changes were made,
and are almost never related to the programmer's intent when making the
changes.
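
A condensed version of that failure mode; note that the error surfaces
on a line nobody touched:

```python
def report(items):
    # A later edit introduces a local named after a builtin...
    str = ", ".join(items)
    print(str)
    # ...and an unrelated pre-existing line further down now fails,
    # because the name no longer refers to the builtin.
    return str(len(items))

try:
    report(["a", "b"])
except TypeError as e:
    print("baffling:", e)
```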

> Another is trying to assign to a global but actually shadowing it.

I've rarely seen that.

> There is no way that anyone coming from another language is going
> to consider this transcript reasonable:

True, but I don't really care:  everyone gets burned once, the better ones
eventually learn to use classes instead of mutating globals, and even the
dull get over it.  It is not, in my experience, an on-going problem for
anyone.  But I still get burned regularly by shadowing builtins.  The burns
are not fatal, however, and I can't think of an ointment less painful than
the blisters.

> >>> a=5
> >>> def show():
> ...    print a
> ...
> >>> def set(val):
> ...     a=val
> ...
> >>> a
> 5
> >>> show()
> 5
> >>> set(10)
> >>> show()
> 5
>
> It doesn't seem to make any sense. My solution is to make the assignment
> in "set" illegal unless you add a declaration that says: "No, really. I
> mean it. Override that sucker." As the PEP points out, overriding is
> seldom a good idea so the requirement to declare would be rarely
> invoked.
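
For reference, the transcript behaves that way because the assignment
in "set" binds a new local; the declaration Python already provides
for "I mean it" is the global statement:

```python
a = 5

def set_val(val):
    global a      # without this line, "a = val" binds a fresh local
    a = val

set_val(10)
print(a)          # 10, as the newcomer expected
```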

I expect it would do less harm to introduce a compile-time warning for
locals that are never referenced (such as the "a" in "set").

> ...
> The "right answer" in terms of namespace theory is to consistently refer
> to builtins with a prefix (whether "__builtins__" or "$") but that's
> pretty unpalatable from an aesthetic point of view.

Right, that's one of the ointments I won't apply to my own code, so wouldn't
think of asking others to either.

WRT mutable globals, people who feel they have to use them would be well
served to adopt a naming convention.  For example, begin each name with "g"
and capitalize the second letter.  This can make global-rich code much
easier to follow (I've done-- and very happily --similar things in
Javascript and C++).
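
The convention, sketched with invented names:

```python
# Globals wear a "g" prefix with the second letter capitalized, so
# every use site announces that shared state is being touched.
gRetryLimit = 3
gSeenErrors = []

def record_error(msg):
    # Mutating a global -- and visibly so, thanks to the name.
    gSeenErrors.append(msg)
    return len(gSeenErrors) <= gRetryLimit

print(record_error("timeout"))   # True
```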




From pf at artcom-gmbh.de  Sun Dec 17 10:59:11 2000
From: pf at artcom-gmbh.de (Peter Funk)
Date: Sun, 17 Dec 2000 10:59:11 +0100 (MET)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 15, 2000  4:23:46 pm"
Message-ID: <m147ab2-000CxUC@artcom0.artcom-gmbh.de>

Hi,

Guido van Rossum:
> If you're saying that you think the string module is too prominent to
> ever start deprecating its use, I'm afraid we have a problem.
 
I strongly believe the string module is too prominent.

> I'd also like to note that using the string module's wrappers incurs
> the overhead of a Python function call -- using string methods is
> faster.

I think most people care more about readability than about run time performance.
For people without much OOP experience, the method syntax hurts
readability.

> Finally, I like the look of fields[i].strip().lower() much better than
> that of string.lower(string.strip(fields[i])) -- an actual example
> from mimetools.py.

Hmmmm.... Maybe this is just a matter of taste?  Like my preference
for '<>' instead of '!='?  Personally I still like the old fashioned
form more.  Especially if string.join() or string.split() are
involved.
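
For what it's worth, for string arguments the module functions under
discussion are essentially thin wrappers around the methods, roughly
(a re-creation for illustration, not the actual string.py source):

```python
def strip(s):
    # Forwards to the method, at the cost of an extra Python call.
    return s.strip()

def lower(s):
    return s.lower()

field = "  MIME-Version  "
# Old style and new style give the same result:
assert lower(strip(field)) == field.strip().lower() == "mime-version"
```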

Since Python 1.5.2 will stay around for several years, keeping backward 
compatibility in our Python coding is still a major issue for us.
So we won't change our Python coding style soon if ever.  

> Ideally, I would like to deprecate the entire string module, so that I
[...]
I share Mark Lutz's and Tim Peters' opinion that this crusade will do 
more harm than good to the Python community.  IMO this is a really bad idea.

Just my $0.02, Peter



From martin at loewis.home.cs.tu-berlin.de  Sun Dec 17 12:13:09 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 17 Dec 2000 12:13:09 +0100
Subject: [Python-Dev] PEP 226
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEEHIEAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCOEEHIEAA.tim.one@home.com>
Message-ID: <200012171113.MAA00733@loewis.home.cs.tu-berlin.de>

> Just a stab in the dark, but is one of your real concerns the spotty state
> of Unicode support in the std libraries?  

Not at all. I really responded to amk's message

# All the PEPs for 2.1 are supposed to be complete for Dec. 16, and
# some of those PEPs are pretty complicated.  I'm a bit worried that
# it's been so quiet on python-dev lately, especially after the
# previous two weeks of lively discussion.

I just thought that something was wrong here - contributing to a free
software project ought to be fun for contributors, not a cause for
worries.

There-are-other-things-but-i18n-although-they-are-not-that-interesting y'rs,
Martin



From guido at python.org  Sun Dec 17 15:38:07 2000
From: guido at python.org (Guido van Rossum)
Date: Sun, 17 Dec 2000 09:38:07 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: Your message of "Sun, 17 Dec 2000 03:30:24 EST."
             <LNBBLJKPBEHFEDALKOLCKEEIIEAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCKEEIIEAA.tim.one@home.com> 
Message-ID: <200012171438.JAA21603@cj20424-a.reston1.va.home.com>

> I expect it would do less harm to introduce a compile-time warning for
> locals that are never referenced (such as the "a" in "set").

Another warning that would be quite useful (and trap similar cases)
would be "local variable used before set".
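
The runtime face of that warning already exists; the proposal would
move the diagnosis to compile time:

```python
def f():
    print(total)   # "total" is local (assigned below), but not yet bound
    total = 1

try:
    f()
except NameError as e:     # UnboundLocalError is a NameError subclass
    print("runtime error:", e)
```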

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Sun Dec 17 15:40:40 2000
From: guido at python.org (Guido van Rossum)
Date: Sun, 17 Dec 2000 09:40:40 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Sun, 17 Dec 2000 10:59:11 +0100."
             <m147ab2-000CxUC@artcom0.artcom-gmbh.de> 
References: <m147ab2-000CxUC@artcom0.artcom-gmbh.de> 
Message-ID: <200012171440.JAA21620@cj20424-a.reston1.va.home.com>

> I think most people care more about readability than about run time performance.
> For people without much OOP experience, the method syntax hurts
> readability.

I don't believe one bit of this.  By that standard, we would do better
to define a new module "list" and start writing list.append(L, x) for
L.append(x).

> I share Mark Lutz and Tim Peters oppinion, that this crusade will do 
> more harm than good to Python community.  IMO this is a really bad
> idea.

You are entitled to your opinion, but given that your arguments seem
very weak I will continue to ignore it (except to argue with you :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Sun Dec 17 17:17:12 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Sun, 17 Dec 2000 11:17:12 -0500
Subject: [Python-Dev] Death to string functions!
References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
	<m147ab2-000CxUC@artcom0.artcom-gmbh.de>
Message-ID: <14908.59144.321167.419762@anthem.concentric.net>

>>>>> "PF" == Peter Funk <pf at artcom-gmbh.de> writes:

    PF> Hmmmm.... Maybe this is just a matter of taste?  Like my
    PF> preference for '<>' instead of '!='?  Personally I still like
    PF> the old fashioned form more.  Especially if string.join() or
    PF> string.split() are involved.

Hey cool!  I prefer <> over != too, but I also (not surprisingly)
strongly prefer string methods over string module functions.

TOOWTDI-MA-ly y'rs,
-Barry



From gvwilson at nevex.com  Sun Dec 17 17:25:17 2000
From: gvwilson at nevex.com (Greg Wilson)
Date: Sun, 17 Dec 2000 11:25:17 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <14908.59144.321167.419762@anthem.concentric.net>
Message-ID: <000201c06845$f1afdb40$770a0a0a@nevex.com>

+1 on deprecating string functions.  Every Python book and tutorial
(including mine) emphasizes Python's simplicity and lack of Perl-ish
redundancy; the more we practice what we preach, the more persuasive
this argument is.

Greg (who admittedly only has a few thousand lines of Python to maintain)



From pf at artcom-gmbh.de  Sun Dec 17 18:40:06 2000
From: pf at artcom-gmbh.de (Peter Funk)
Date: Sun, 17 Dec 2000 18:40:06 +0100 (MET)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012171440.JAA21620@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 17, 2000  9:40:40 am"
Message-ID: <m147hn4-000CxUC@artcom0.artcom-gmbh.de>

[string.function(S, ...) vs. S.method(...)]

Guido van Rossum:
> I don't believe one bit of this.  By that standard, we would do better
> to define a new module "list" and start writing list.append(L, x) for
> L.append(x).

list objects have only a few methods, while strings have a great
many.  Some of them have names that clash easily with the method
names of other kinds of objects.  Since there are no type
declarations in Python, looking at the code in isolation and seeing
a line
	i = string.index(some_parameter, sub)
tells at first glance that some_parameter should be a string
object, even if the doc string of this function is too terse.
However, in
	i = some_parameter.index(sub)
it could be a list, a database or whatever.  
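Peter's ambiguity point is easy to illustrate: several built-in types share method names, so the method spelling alone does not pin down the receiver's type (a small sketch of my own, not from the thread):

```python
# Several built-in types share an .index method, so the method
# spelling alone does not reveal the type of the receiver:
assert "spam".index("a") == 2                 # a string
assert ["s", "p", "a", "m"].index("a") == 2   # a list, same spelling
```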

> You are entitled to your opinion, but given that your arguments seem
> very weak I will continue to ignore it (except to argue with you :-).

I see.  But given that the string module won't go away any time
soon, I guess I have a lot of time either to think of some stronger
arguments or to finally get accustomed to that new style of coding.
But since we have to keep compatibility with Python 1.5.2 for at
least the next two years, chances for the latter are bad.

Regards and have a nice vacation, Peter



From mwh21 at cam.ac.uk  Sun Dec 17 19:18:24 2000
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 17 Dec 2000 18:18:24 +0000
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: Thomas Wouters's message of "Fri, 15 Dec 2000 23:54:25 +0100"
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> <20001215235425.A29681@xs4all.nl>
Message-ID: <m3hf42q5cf.fsf@atrus.jesus.cam.ac.uk>

Thomas Wouters <thomas at xs4all.net> writes:

> On Fri, Dec 15, 2000 at 04:03:04AM -0800, Neil Schemenauer wrote:
> > On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote:
> > > I'm not sure I agree with that view either, but mostly because there
> > > is a non-GPL replacement for parts of the readline API:
> > > 
> > >     http://www.cstr.ed.ac.uk/downloads/editline.html
> > 
> > It doesn't work with the current readline module.  It is much
> > smaller than readline and works just as well in my experience.
> > Would there be any interest in including a copy with the standard
> > distribution?  The license is quite nice (X11 type).
> 
> Definitely +1 from here. Readline reminds me of the cold war, for
> some reason. (Actually, multiple reasons ;) I don't have time to do
> it myself, unfortunately, or I would. (Looking at editline has been
> on my TODO list for a while... :P)

It wouldn't be particularly hard to rewrite editline in Python (we
have termios & the terminal handling functions in curses - and even
ioctl if we get really keen).

I've been hacking on my own Python line reader on and off for a while;
it's still pretty buggy, but if you're feeling brave you could look at:

http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.0.0.tar.gz

To try it out, unpack it, cd into the ./pyrl directory and try:

>>> import foo # sorry
>>> foo.test_loop()

It sort of imitates the Python command prompt, except that it doesn't
actually execute the code you type.

You need a recent _cursesmodule.c for it to work.

Cheers,
M.

-- 
41. Some programming languages manage to absorb change, but 
    withstand progress.
  -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html




From thomas at xs4all.net  Sun Dec 17 19:30:38 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 17 Dec 2000 19:30:38 +0100
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <000201c06845$f1afdb40$770a0a0a@nevex.com>; from gvwilson@nevex.com on Sun, Dec 17, 2000 at 11:25:17AM -0500
References: <14908.59144.321167.419762@anthem.concentric.net> <000201c06845$f1afdb40$770a0a0a@nevex.com>
Message-ID: <20001217193038.C29681@xs4all.nl>

On Sun, Dec 17, 2000 at 11:25:17AM -0500, Greg Wilson wrote:

> +1 on deprecating string functions.

How wonderfully ambiguous ! Do you mean string methods, or the string module?
:)

FWIW, I agree that in time, the string module should be deprecated. But I
also think that 'in time' should be a considerable timespan. Don't deprecate
it before everything it provides is available through some other means. Wait
a bit longer than that, even, before calling it deprecated -- that scares
people off. And then keep it practically forever (until Py3K) just to
support old code. And don't forget to document it as 'deprecated' everywhere,
not just in one minor release note.
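Thomas's "document it as deprecated everywhere" pairs naturally with a runtime notice. A minimal sketch of how a one-shot module-level warning behaves, using the warnings framework that landed in Python 2.1 shortly after this thread (the message text here is hypothetical):

```python
import warnings

def deprecated_module_warning():
    # Hypothetical single warning at the top of Lib/string.py: issued
    # once per import, no matter how many client modules import it.
    warnings.warn("the string module is deprecated; use string methods",
                  DeprecationWarning, stacklevel=2)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    deprecated_module_warning()

assert issubclass(caught[0].category, DeprecationWarning)
```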

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tismer at tismer.com  Sun Dec 17 18:38:31 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 17 Dec 2000 19:38:31 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com>
Message-ID: <3A3CFA17.ED26F51A@tismer.com>

Old topic: {}.popitem() (was Re: {}.first[key,value,item] ...)

Christian Tismer wrote:
> 
> Fredrik Lundh wrote:
> >
> > christian wrote:
> > > That algorithm is really a gem which you should know,
> > > so let me try to explain it.
> >
> > I think someone just won the "brain exploder 2000" award ;-)

<snip>

> As you might have guessed, I didn't do this just for fun.
> It is the old game of explaining what is there, convincing
> everybody that you at least know what you are talking about,
> and then three days later coming up with an improved
> application of the theory.
> 
> Today is Monday, 2 days left. :-)

Ok, today is Sunday, I had no time to finish this.
But now it is here.

                  ===========================
                  =====    Claim:       =====
                  ===========================

-  Dictionary access time can be improved with a minimal change -

On the hash() function:
All Objects are supposed to provide a hash function which is
as good as possible.
Good means to provide a wide range of different keys for different
values.

Problem: There are hash functions which are "good" in this sense,
but they do not spread their randomness uniformly over the
32 bits.

Example: Integers use their own value as their hash.
This is fine as long as the integers are uniformly distributed.
But if they all contain a high power of two as a factor, for
instance, the low bits give a very bad hash function.

Take a dictionary with the integers range(1000) as keys and access
all entries.  Then use a dictionary with the same integers shifted
left by 16.
Access time is slowed down by a factor of 100, since every
access is now a linear search.
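The collision can be seen directly: for small integers hash(i) == i, so keys that differ only above the table-size bits all land in the same slot after masking (a quick sketch of my own; table size 256 chosen for illustration):

```python
# For small integers, hash(i) == i, so keys of the form i << 16 share
# identical low bits; with a table of (say) 256 slots they all mask
# down to slot 0, and every lookup degrades into a linear probe.
keys = [i << 16 for i in range(1000)]
mask = 256 - 1                        # table size 256
slots = set(hash(k) & mask for k in keys)
assert slots == {0}                   # 1000 keys, a single slot
```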

This is not an urgent problem, although applications exist
where it can play a role (memory addresses, for instance,
can contain high powers of two when people do statistics on
page accesses...)

While this is not a big problem, it is ugly enough to
think of a solution.

Solution 1:
-------------
Try to involve more bits of the hash value by doing extra
shuffling, either 
a) in the dictlook function, or
b) in the hash generation itself.

I believe neither can really be justified for a rare problem.
But how about changing the existing solution in a way that
gains an improvement without extra cost?

Solution 2: (*the* solution)
----------------------------
Some people may remember what I wrote about re-hashing
functions through the multiplicative group GF(2^n)*,
and I don't want to repeat this here.
The simple idea can be summarized quickly:

The original algorithm uses multiplication by polynomials,
and it is guaranteed that these re-hash values are jittering
through all possible nonzero patterns of the n bits.

Observation: We are using an operation of a finite field.
This means that the inverse of multiplication also exists!

Old algorithm (multiplication):
      shift the index left by 1
      if index > mask:
          xor the index with the generator polynomial

New algorithm (division):
      if low bit of index set:
          xor the index with the generator polynomial
      shift the index right by 1

What does this mean? Not so much, we are just cycling through
our bit patterns in reverse order.
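Both update rules, and the claim that division just runs the multiplication cycle backwards, can be checked in miniature with the n=4 polynomial from the table in the attached script (16 + 3, i.e. x^4 + x + 1). This sketch is the editor's, not part of the original patch:

```python
# Both rehash steps cycle through GF(2^4)-{0}: multiplication (old)
# and division (new) each visit all 15 nonzero 4-bit patterns.
poly, mask = 16 + 3, 15   # x^4 + x + 1, table size 16

def cycle(step):
    incr, seen = 1, []
    while True:
        seen.append(incr)
        incr = step(incr)
        if incr == 1:     # back at the start: full cycle recorded
            return seen

def mul_step(incr):       # old: shift left, reduce on overflow
    incr <<= 1
    if incr > mask:
        incr ^= poly
    return incr

def div_step(incr):       # new: reduce if odd, then shift right
    if incr & 1:
        incr ^= poly
    return incr >> 1

assert sorted(cycle(mul_step)) == list(range(1, 16))
assert sorted(cycle(div_step)) == list(range(1, 16))
# Division is exactly the multiplication cycle in reverse order:
assert cycle(div_step) == [1] + cycle(mul_step)[:0:-1]
```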

But now for the big difference.

First change:    We change from multiplication to division.
Second change:   We do not mask the hash value before!

The second change is what I was after: By not masking the
hash value when computing the initial index, all the
existing bits in the hash come into play.

This can be seen like a polynomial division, but the initial
remainder (the hash value) was not normalized. After a number
of steps, all the extra bits are wheeled into our index,
but not wasted by masking them off. That gives our re-hash
some more randomness. When all the extra bits are sucked
in, the guaranteed single-visit cycle begins. There cannot
be more than 27 extra cycles in the worst case (dict size
= 32, so there are 27 bits to consume).

I do not expect any bad effect from this modification.

Here some results, dictionaries have 1000 entries:

timing for strings              old=  5.097 new= 5.088
timing for bad integers (<<10)  old=101.540 new=12.610
timing for bad integers (<<16)  old=571.210 new=19.220

On strings, both algorithms behave the same.
On numbers, they differ dramatically.
While the current algorithm is 110 times slower on a worst-case
dict (quadratic behavior), the new algorithm pays a little for the
extra cycling, but is only 4 times slower.

Alternative implementation:
The above approach is conservative in the sense that it
tries not to slow down the current implementation in any
way. An alternative would be to consume all of the extra
bits at once. But this would add an extra "warmup" loop
like this to the algorithm:

    while index > mask:
        if low bit of index set:
            xor the index with the generator polynomial
        shift the index right by 1

This is of course a very good digest of the higher bits,
since it is a polynomial division and not just some
bit xor-ing which might give quite predictable cancellations,
therefore it is "the right way" in my sense.
It might be cheap, but it would add over 20 cycles to every
small dict.  I therefore don't think it is worth doing.

Personally, I prefer the solution to merge the bits during
the actual lookup, since it suffices to get access time
from quadratic down to logarithmic.

Attached is a direct translation of the relevant parts
of dictobject.c into Python, with both algorithms
implemented.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]


class NULL: pass

class Dictionary:
    dummy = "<dummy key>"
    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # even the shifting may not be worth it.
            incr = _hash ^ (_hash >> 3)
###### TO HERE
            
        if (not incr):
            incr = mask
        while 1:
            ep = ep0[(i+incr)&mask]
            if (ep[me_key] is NULL) :
                if (freeslot != NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                 (ep[me_hash] == _hash and
                  cmp(ep[me_key], key) == 0)) :
                return ep

            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots

        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))      

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots

        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
##      /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def timing(func, args=None, n=1, **keywords) :
	import time
	time=time.time
	appl=apply
	if args is None: args = ()
	if type(args) != type(()) : args=(args,)
	rep=range(n)
	dummyarg = ("",)
	dummykw = {}
	dummyfunc = len
	if keywords:
		before=time()
		for i in rep: res=appl(dummyfunc, dummyarg, dummykw)
		empty = time()-before
		before=time()
		for i in rep: res=appl(func, args, keywords)
	else:
		before=time()
		for i in rep: res=appl(dummyfunc, dummyarg)
		empty = time()-before
		before=time()
		for i in rep: res=appl(func, args)
	after = time()
	return round(after-before-empty,4), res

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts():
	d1 = Dictionary()   # original
	d2 = Dictionary(1)  # other rehash
	for i in range(1000):
		s = str(i) * 5
		d1[s] = d2[s] = i
	return d1, d2

def badnum_dicts():
	d1 = Dictionary()   # original
	d2 = Dictionary(1)  # other rehash
	for i in range(1000):
		bad = i << 16
		d1[bad] = d2[bad] = i
	return d1, d2

def do_test(dict, keys, n):
	t0 = timing(nulltest, (keys, dict), n)[0]
	t1 = timing(test, (keys, dict), n)[0]
	return t1-t0

if __name__ == "__main__":
	sdold, sdnew = string_dicts()
	bdold, bdnew = badnum_dicts()
	print "timing for strings old=%.3f new=%.3f" % (
		  do_test(sdold, sdold.keys(), 100),
		  do_test(sdnew, sdnew.keys(), 100) )
	print "timing for bad integers old=%.3f new=%.3f" % (
		  do_test(bdold, bdold.keys(), 10) *10,
		  do_test(bdnew, bdnew.keys(), 10) *10)

		  
"""
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610
"""

From fdrake at acm.org  Sun Dec 17 19:49:58 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sun, 17 Dec 2000 13:49:58 -0500 (EST)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <20001217193038.C29681@xs4all.nl>
References: <14908.59144.321167.419762@anthem.concentric.net>
	<000201c06845$f1afdb40$770a0a0a@nevex.com>
	<20001217193038.C29681@xs4all.nl>
Message-ID: <14909.2774.158973.760077@cj42289-a.reston1.va.home.com>

Thomas Wouters writes:
 > FWIW, I agree that in time, the string module should be deprecated. But I
 > also think that 'in time' should be a considerable timespan. Don't deprecate

  *If* most functions in the string module are going to be deprecated,
that should be done *now*, so that the documentation will include the
appropriate warning to users.  When they should actually be removed is
another matter, and I think Guido is sufficiently aware of their
widespread use and won't remove them too quickly -- his creation of
Python isn't the reason he's *accepted* as BDFL, it just made it a
possibility.  He's had to actually *earn* the BDFL position, I think.
  With regard to converting the standard library to string methods:
that needs to be done as part of the deprecation.  The code in the
library is commonly used as example code, and should be good example
code wherever possible.

 > support old code. And don't forget to document it 'deprecated' everywhere,
 > not just one minor release note.

  When Guido tells me exactly what is deprecated, the documentation
will be updated with proper deprecation notices in the appropriate
places.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tismer at tismer.com  Sun Dec 17 19:10:07 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 17 Dec 2000 20:10:07 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com>
Message-ID: <3A3D017F.62AD599F@tismer.com>



Christian Tismer wrote:

...
(my timings)
Attached is the updated script with the timings mentioned
in the last posting. Sorry, I posted an older version before.

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]


class NULL: pass

class Dictionary:
    dummy = "<dummy key>"
    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # even the shifting may not be worth it.
            incr = _hash ^ (_hash >> 3)
###### TO HERE
            
        if (not incr):
            incr = mask
        while 1:
            ep = ep0[(i+incr)&mask]
            if (ep[me_key] is NULL) :
                if (freeslot != NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                 (ep[me_hash] == _hash and
                  cmp(ep[me_key], key) == 0)) :
                return ep

            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots

        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))      

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots

        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
##      /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def timing(func, args=None, n=1, **keywords) :
	import time
	time=time.time
	appl=apply
	if args is None: args = ()
	if type(args) != type(()) : args=(args,)
	rep=range(n)
	dummyarg = ("",)
	dummykw = {}
	dummyfunc = len
	if keywords:
		before=time()
		for i in rep: res=appl(dummyfunc, dummyarg, dummykw)
		empty = time()-before
		before=time()
		for i in rep: res=appl(func, args, keywords)
	else:
		before=time()
		for i in rep: res=appl(dummyfunc, dummyarg)
		empty = time()-before
		before=time()
		for i in rep: res=appl(func, args)
	after = time()
	return round(after-before-empty,4), res

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts():
	d1 = Dictionary()   # original
	d2 = Dictionary(1)  # other rehash
	for i in range(1000):
		s = str(i) * 5
		d1[s] = d2[s] = i
	return d1, d2

def badnum_dicts():
	d1 = Dictionary()   # original
	d2 = Dictionary(1)  # other rehash
	shift = 10
	if EXTREME:
		shift = 16
	for i in range(1000):
		bad = i << shift
		d1[bad] = d2[bad] = i
	return d1, d2

def do_test(dict, keys, n):
	t0 = timing(nulltest, (keys, dict), n)[0]
	t1 = timing(test, (keys, dict), n)[0]
	return t1-t0

EXTREME=1

if __name__ == "__main__":
	sdold, sdnew = string_dicts()
	bdold, bdnew = badnum_dicts()
	print "timing for strings old=%.3f new=%.3f" % (
		  do_test(sdold, sdold.keys(), 100),
		  do_test(sdnew, sdnew.keys(), 100) )
	print "timing for bad integers old=%.3f new=%.3f" % (
		  do_test(bdold, bdold.keys(), 10) *10,
		  do_test(bdnew, bdnew.keys(), 10) *10)
  
"""
Results with a shift of 10 (EXTREME=0):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""

From lutz at rmi.net  Sun Dec 17 20:09:47 2000
From: lutz at rmi.net (Mark Lutz)
Date: Sun, 17 Dec 2000 12:09:47 -0700
Subject: [Python-Dev] Death to string functions!
References: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com>  <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
Message-ID: <001f01c0685c$ef555200$7bdb5da6@vaio>

As a longstanding Python advocate and user, I find this
thread disturbing, and feel compelled to add a few words:

> > [Tim wrote:]
> > "string" is right up there with "os" and "sys" as a FIM (Frequently
> > Imported Module), so the required code changes will be massive.  As
> > a user, I don't see what's in it for me to endure that pain: the
> > string module functions work fine!  Neither are they warts in the
> > language, any more than that we say sin(pi) instead of pi.sin().
> > Keeping the functions around doesn't hurt anybody that I can see.
> 
> [Guido wrote:]
> Hm.  I'm not saying that this one will be easy.  But I don't like
> having "two ways to do it".  It means more learning, etc. (you know
> the drill).

But with all due respect, there are already _lots_ of places 
in Python that provide at least two ways to do something.
Why be so strict on this one alone?

Consider lambda and def; tuples and lists; map and for 
loops; the loop else and boolean exit flags; and so on.  
The notion of Python forcing a single solution is largely a 
myth. And as someone who makes a living teaching this 
stuff, I can tell you that none of the existing redundancies 
prevent anyone from learning Python.

More to the point, many of those shiny new features added 
to 2.0 fall squarely into this category too, and are completely 
redundant with other tools.  Consider list comprehensions
and simple loops; extended print statements and sys.std* 
assignments; augmented assignment statements and simpler
ones.  Eliminating redundancy at a time when we're also busy
introducing it seems a tough goal to sell.

I understand the virtues of aesthetics too, but removing the
string module seems an incredibly arbitrary application of it.  


> If you're saying that you think the string module is too prominent to
> ever start deprecating its use, I'm afraid we have a problem.
>
> [...]
> Ideally, I'd like to deprecate the entire string module, so that I
> can place a single warning at its top.  This will cause a single
> warning to be issued for programs that still use it (no matter how
> many times it is imported).

And to me, this seems the real crux of the matter. For a 
decade now, the string module _has_ been the right way to
do it.  And today, half a million Python developers absolutely
rely on it as an essential staple in their toolbox.  What could 
possibly be wrong with keeping it around for backward 
compatibility, albeit as a less recommended option?

If almost every Python program ever written suddenly starts 
issuing warning messages, then I think we do have a problem
indeed.  Frankly, a Python that changes without regard to its
user base seems an ominous thing to me.  And keep in mind 
that I like Python; others will look much less generously upon
a tool that seems inclined to rip the rug out from under its users.
Trust me on this; I've already heard the rumblings out there.

So please: can we keep string around?  Like it or not, we're 
way past the point of removing such core modules.
Such a radical change might pass in a future non-backward-
compatible Python mutation; I'm not sure such a different
system will still be "Python", but that's a topic for another day.

All IMHO, of course,
--Mark Lutz  (http://www.rmi.net/~lutz)
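
Guido's "single warning at its top" idea quoted above maps directly onto a
warnings framework of the kind under discussion: a module deprecated as a
whole can emit one DeprecationWarning at import time. A minimal sketch (the
module name is illustrative, not a real stdlib module):

```python
import warnings

# Placed at the top of a hypothetical deprecated module.  Because module
# code runs only on first import, the warning fires once per process no
# matter how many times the module is subsequently imported.
warnings.warn(
    "the oldutils module is deprecated; use string methods instead",
    DeprecationWarning,
    stacklevel=2,  # attribute the warning to the importing code
)
```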





From tim.one at home.com  Sun Dec 17 20:50:55 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 17 Dec 2000 14:50:55 -0500
Subject: [Python-Dev] SourceForge SSH silliness
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFCIEAA.tim.one@home.com>

Starting last night, I get this msg whenever I update Python code w/
CVSROOT=:ext:tim_one at cvs.python.sourceforge.net:/cvsroot/python:

"""
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the host key has just been changed.
Please contact your system administrator.
Add correct host key in C:\Code/.ssh/known_hosts to get rid of this message.
Password authentication is disabled to avoid trojan horses.
"""

This is SourceForge's doing, and is permanent (they've changed keys on their
end).  Here's a link to a thread that may or may not make sense to you:

http://sourceforge.net/forum/forum.php?forum_id=52867

Deleting the sourceforge entries from my .ssh/known_hosts file worked for
me.  But everyone in the thread above who tried it says that they haven't
been able to get scp working again (I haven't tried it yet ...).




From paulp at ActiveState.com  Sun Dec 17 21:04:27 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 17 Dec 2000 12:04:27 -0800
Subject: [Python-Dev] Pragmas and warnings
Message-ID: <3A3D1C4B.8F08A744@ActiveState.com>

A couple of other threads started me thinking that there are a couple
of things missing from our warnings framework.

Many languages have pragmas that allow you to turn warnings on and off in
code. For instance, I should be able to put a pragma at the top of a
module that uses string functions to say: "I know that this module
doesn't adhere to the latest Python conventions. Please don't warn me
about it." I should also be able to put a declaration that says: "I'm
really paranoid about shadowing globals and builtins. Please warn me
when I do that."

Batch and visual linters could also use the declarations to customize
their behaviors.

And of course we have a stack of other features that could use pragmas:

 * type signatures
 * Unicode syntax declarations
 * external object model language binding hints
 * ...

A case could be made that warning pragmas could use a totally different
syntax from "user-defined" pragmas. I don't care much.

 Paul
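
Both directions described above can be sketched with ordinary filter calls,
assuming a warnings framework along the lines of the one being designed
(the warning texts and categories are illustrative):

```python
import warnings

# "Please don't warn me about it": silence deprecation warnings for this
# module's own use of older interfaces.
warnings.filterwarnings("ignore", category=DeprecationWarning)

# "I'm really paranoid ... please warn me": escalate a chosen category to
# a hard error so the condition cannot pass silently.
warnings.filterwarnings("error", category=UserWarning)

try:
    warnings.warn("local name shadows builtin 'list'", UserWarning)
except UserWarning as exc:
    print("caught:", exc)
```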



From thomas at xs4all.net  Sun Dec 17 22:00:08 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 17 Dec 2000 22:00:08 +0100
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEFCIEAA.tim.one@home.com>; from tim.one@home.com on Sun, Dec 17, 2000 at 02:50:55PM -0500
References: <LNBBLJKPBEHFEDALKOLCCEFCIEAA.tim.one@home.com>
Message-ID: <20001217220008.D29681@xs4all.nl>

On Sun, Dec 17, 2000 at 02:50:55PM -0500, Tim Peters wrote:
> Starting last night, I get this msg whenever I update Python code w/
> CVSROOT=:ext:tim_one at cvs.python.sourceforge.net:/cvsroot/python:

> """
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> @       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
> Someone could be eavesdropping on you right now (man-in-the-middle attack)!
> It is also possible that the host key has just been changed.
> Please contact your system administrator.
> Add correct host key in C:\Code/.ssh/known_hosts to get rid of this message.
> Password authentication is disabled to avoid trojan horses.
> """

> This is SourceForge's doing, and is permanent (they've changed keys on their
> end).  Here's a link to a thread that may or may not make sense to you:

> http://sourceforge.net/forum/forum.php?forum_id=52867

> Deleting the sourceforge entries from my .ssh/known_hosts file worked for
> me.  But everyone in the thread above who tried it says that they haven't
> been able to get scp working again (I haven't tried it yet ...).

What sourceforge did was switch Linux distributions, and upgrade. The switch
doesn't really matter for the SSH problem, because recent Debian and recent
RedHat releases both use a new ssh, the OpenBSD ssh implementation.
Apparently, it isn't entirely backwards compatible with old versions of
F-secure ssh. For one thing, it doesn't support the 'idea' cipher. This
might or might not be your problem; if it is, you should get a relatively
clear message such as "cipher type 'idea' not supported". You should be able
to pass the '-c' option to scp/ssh to use a different cipher, like 3des
(aka triple-DES.) Or maybe the windows versions have a menu to configure
that kind of thing :) 

Another possible problem is that it might not have good support for older
protocol versions. The 'current' protocol version, at least for 'ssh1', is
1.5. The one message on the sourceforge thread above that actually mentions
a version in the *cough* bugreport is using an older ssh that only supports
protocol version 1.4. Since that particular version of F-secure ssh has
known problems (why else would they release 16 more versions?) I'd suggest
anyone with problems first try a newer version. I hope that doesn't break
WinCVS, but it would suck if it did :P

If that doesn't work, which is entirely possible, it might be an honest bug
in the OpenBSD ssh that Sourceforge is using. If anyone cared, we could do a
bit of experimenting with the openssh-2.0 betas installed by Debian woody
(unstable) to see if the problem occurs there as well.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From greg at cosc.canterbury.ac.nz  Mon Dec 18 00:05:41 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 18 Dec 2000 12:05:41 +1300 (NZDT)
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <20001216014341.5BA97A82E@darjeeling.zadka.site.co.il>
Message-ID: <200012172305.MAA02512@s454.cosc.canterbury.ac.nz>

Moshe Zadka <moshez at zadka.site.co.il>:

> Perl and Scheme permit implicit shadowing too.

But Scheme always requires declarations!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From martin at loewis.home.cs.tu-berlin.de  Mon Dec 18 00:45:56 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 18 Dec 2000 00:45:56 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID: <200012172345.AAA00877@loewis.home.cs.tu-berlin.de>

> But with all due respect, there are already _lots_ of places in
> Python that provide at least two ways to do something.

Exactly. My favourite one here is string exceptions, which have quite
some analogy to the string module.

At some time, there were only string exceptions. Then, instance
exceptions were added, some releases later they were considered the
better choice, so the standard library was converted to use them.
Still, there is no sign whatsoever that anybody plans to deprecate
string exceptions.

I believe the string module will lose importance over
time. Comparing it with string exceptions, that may well take 5 years.
It seems there are two ways of "deprecation": a loud "we will remove
that, change your code", and a silent "strings have methods"
(i.e. don't mention the module when educating people). The latter
approach requires educators to agree that the module is
"uninteresting", and people to really not use it once they find out it
exists.

I think deprecation should only be attempted once there is a clear
sign that people don't use it massively for new code anymore. Removal
should only occur if keeping the module is less pain than maintaining it.

Regards,
Martin




From skip at mojam.com  Mon Dec 18 00:55:10 2000
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 17 Dec 2000 17:55:10 -0600 (CST)
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
Message-ID: <14909.21086.92774.940814@beluga.mojam.com>

I executed cvs update today (removing the sourceforge machines from
.ssh/known_hosts worked fine for me, btw) followed by a configure and a make
clean.  The last step failed with this output:

    ...
    make[1]: Entering directory `/home/beluga/skip/src/python/dist/src/Modules'
    Makefile.pre.in:20: *** missing separator.  Stop.
    make[1]: Leaving directory `/home/beluga/skip/src/python/dist/src/Modules'
    make: [clean] Error 2 (ignored)

I found the following at line 20 of Modules/Makefile.pre.in:

    @SET_CXX@

I then tried a cvs annotate on that file but saw that line 20 had been there
since rev 1.60 (16-Dec-99).  I then checked the top-level Makefile.in
thinking something must have changed in the clean target recently, but cvs
annotate shows no recent changes there either:

    1.1          (guido    24-Dec-93): clean:		localclean
    1.1          (guido    24-Dec-93): 		-for i in $(SUBDIRS); do \
    1.74         (guido    19-May-98): 		    if test -d $$i; then \
    1.24         (guido    20-Jun-96): 			(echo making clean in subdirectory $$i; cd $$i; \
    1.4          (guido    01-Aug-94): 			 if test -f Makefile; \
    1.4          (guido    01-Aug-94): 			 then $(MAKE) clean; \
    1.4          (guido    01-Aug-94): 			 else $(MAKE) -f Makefile.*in clean; \
    1.4          (guido    01-Aug-94): 			 fi); \
    1.74         (guido    19-May-98): 		    else true; fi; \
    1.1          (guido    24-Dec-93): 		done

Make distclean succeeded so I tried the following:

    make distclean
    ./configure
    make clean

but the last step still failed.  Any idea why make clean is now failing (for
me)?  Can anyone else reproduce this problem?

Skip



From greg at cosc.canterbury.ac.nz  Mon Dec 18 01:02:32 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 18 Dec 2000 13:02:32 +1300 (NZDT)
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: <3A3A9A22.E9BA9551@lemburg.com>
Message-ID: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>

"M.-A. Lemburg" <mal at lemburg.com>:

> Format characters will always
> be ASCII and thus 7-bit -- theres really no need to expand the
> set of possibilities beyond 8 bits ;-)

But the error message is being produced because the
character is NOT a valid format character. One of the
reasons for that might be because it's not in the
7-bit range!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From MarkH at ActiveState.com  Mon Dec 18 07:02:27 2000
From: MarkH at ActiveState.com (Mark Hammond)
Date: Mon, 18 Dec 2000 17:02:27 +1100
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
In-Reply-To: <14909.21086.92774.940814@beluga.mojam.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPIEJJCMAA.MarkH@ActiveState.com>

> I found the following at line 20 of Modules/Makefile.pre.in:
>
>     @SET_CXX@

I don't have time to investigate this specific problem, but I definitely had
problems with SET_CXX around 6 months back.  This was trying to build an
external C++ application, so may be different.  My message and other
followups at the time implied no one really knew, and everyone agreed it was
likely SET_CXX was broken :-(  I even referenced the CVS checkin that I
thought broke it.

Mark.




From mal at lemburg.com  Mon Dec 18 10:58:37 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 10:58:37 +0100
Subject: [Python-Dev] Pragmas and warnings
References: <3A3D1C4B.8F08A744@ActiveState.com>
Message-ID: <3A3DDFCD.34AB05B2@lemburg.com>

Paul Prescod wrote:
> 
> A couple of other threads started me thinking that there are a couple
> of things missing from our warnings framework.
> 
> Many languages have pragmas that allow you to turn warnings on and off in
> code. For instance, I should be able to put a pragma at the top of a
> module that uses string functions to say: "I know that this module
> doesn't adhere to the latest Python conventions. Please don't warn me
> about it." I should also be able to put a declaration that says: "I'm
> really paranoid about shadowing globals and builtins. Please warn me
> when I do that."
> 
> Batch and visual linters could also use the declarations to customize
> their behaviors.
> 
> And of course we have a stack of other features that could use pragmas:
> 
>  * type signatures
>  * Unicode syntax declarations
>  * external object model language binding hints
>  * ...
> 
> A case could be made that warning pragmas could use a totally different
> syntax from "user-defined" pragmas. I don't care much.

There was a long thread about this some months ago. We agreed
to add a new keyword to the language (I think it was "define")
which then uses a very simple syntax that can be interpreted 
at compile time to modify the behaviour of the compiler, e.g.

define <identifier> = <literal>

There was also a discussion about allowing limited forms of
expressions instead of the constant literal.

define source_encoding = "utf-8"

was the original motivation for this, but (as always ;) the
usefulness for other application areas was quickly recognized, e.g.
to enable compilation in optimization mode on a per module
basis.

PS: "define" is perhaps not obscure enough as keyword...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Dec 18 11:04:08 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:04:08 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>
Message-ID: <3A3DE118.3355896D@lemburg.com>

Greg Ewing wrote:
> 
> "M.-A. Lemburg" <mal at lemburg.com>:
> 
> > Format characters will always
> > be ASCII and thus 7-bit -- theres really no need to expand the
> > set of possibilities beyond 8 bits ;-)
> 
> But the error message is being produced because the
> character is NOT a valid format character. One of the
> reasons for that might be because it's not in the
> 7-bit range!

True. 

I think removing %c completely in that case is the right
solution (in case you don't want to convert the Unicode char
using the default encoding to a string first).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Dec 18 11:09:16 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:09:16 +0100
Subject: [Python-Dev] What to do about PEP 229?
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com> <20001216191739.B6703@kronos.cnri.reston.va.us>
Message-ID: <3A3DE24C.DA0B2F6C@lemburg.com>

Andrew Kuchling wrote:
> 
> On Fri, Dec 15, 2000 at 11:39:18PM +0100, M.-A. Lemburg wrote:
> >Can't distutils try both and then settle for the working combination ?
> 
> I'm worried about subtle problems; what if an unneeded -lfoo drags in
> a customized malloc, or has symbols which conflict with some other
> library.

In that case, I think the user will have to decide. setup.py should
then default to not integrating the module in question and issue
a warning telling the user what to look for and how to call setup.py
in order to add the right combination of libs.
 
> >... BTW, where is Greg ? I haven't heard from him in quite a while.]
> 
> Still around; he just hasn't been posting much these days.

Good to know :)
 
> >Why not parse Setup and use it as input to distutils setup.py ?
> 
> That was option 1.  The existing Setup format doesn't really contain
> enough intelligence, though; the intelligence is usually in comments
> such as "Uncomment the following line for Solaris".  So either the
> Setup format is modified (bad, since we'd break existing 3rd-party
> packages that still use a Makefile.pre.in), or I give up and just do
> everything in a setup.py.

I would still like a simple input to setup.py -- one that doesn't
require hacking setup.py just to enable a few more modules.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fredrik at effbot.org  Mon Dec 18 11:15:26 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 18 Dec 2000 11:15:26 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz> <3A3DE118.3355896D@lemburg.com>
Message-ID: <004a01c068db$72403170$3c6340d5@hagrid>

mal wrote:

> > But the error message is being produced because the
> > character is NOT a valid format character. One of the
> > reasons for that might be because it's not in the
> > 7-bit range!
> 
> True. 
> 
> I think removing %c completely in that case is the right
> solution (in case you don't want to convert the Unicode char
> using the default encoding to a string first).

how likely is it that a human programmer will use a bad formatting
character that's not in the ASCII range?

-1 on removing it -- people shouldn't have to learn the octal ASCII
table just to be able to fix trivial typos.

+1 on mapping the character back to a string in the same way as
"repr" -- that is, print ASCII characters as is, map anything else to
an octal escape.

+0 on leaving it as it is, or mapping non-printables to "?".

</F>
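
The "+1" option above can be sketched in a few lines: echo printable ASCII
as-is and fall back to an octal escape, the way repr() does (the helper
name is made up for illustration):

```python
def describe_char(ch):
    # Printable ASCII passes through unchanged; everything else becomes
    # an octal escape, so the error message stays readable either way.
    code = ord(ch)
    if 32 <= code < 127:
        return ch
    return '\\%03o' % code

# Error messages built with it:
for bad in ('q', '\xff'):
    print("unsupported format character '%s'" % describe_char(bad))
```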




From mal at lemburg.com  Mon Dec 18 11:34:02 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:34:02 +0100
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com>
Message-ID: <3A3DE81A.4B825D89@lemburg.com>

> Here some results, dictionaries have 1000 entries:
> 
> timing for strings              old=  5.097 new= 5.088
> timing for bad integers (<<10)  old=101.540 new=12.610
> timing for bad integers (<<16)  old=571.210 new=19.220

Even though I think concentrating on string keys would provide more
performance boost for Python in general, I think you have a point
there. +1 from here.

BTW, would changing the hash function on strings from the simple
XOR scheme to something a little smarter help improve the performance
too (e.g. most strings used in programming never use the 8-th
bit) ?

I also think that we could inline the string compare function
in dictobject:lookdict_string to achieve even better performance.
Currently it uses a function which doesn't trigger compiler
inlining.

And finally: I think a generic PyString_Compare() API would
be useful in a lot of places where strings are being compared
(e.g. dictionaries and keyword parameters). Unicode already
has such an API (along with dozens of other useful APIs which
are not available for strings).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Dec 18 11:41:38 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:41:38 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz> <3A3DE118.3355896D@lemburg.com> <004a01c068db$72403170$3c6340d5@hagrid>
Message-ID: <3A3DE9E2.77FF0FA9@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> 
> > > But the error message is being produced because the
> > > character is NOT a valid format character. One of the
> > > reasons for that might be because it's not in the
> > > 7-bit range!
> >
> > True.
> >
> > I think removing %c completely in that case is the right
> > solution (in case you don't want to convert the Unicode char
> > using the default encoding to a string first).
> 
> how likely is it that a human programmer will use a bad formatting
> character that's not in the ASCII range?

Not very likely... the most common case of this error is probably
the use of % as percent sign in a formatting string. The next
character in those cases is usually whitespace.
 
> -1 on removing it -- people shouldn't have to learn the octal ASCII
> table just to be able to fix trivial typos.
> 
> +1 on mapping the character back to a string in the same was as
> "repr" -- that is, print ASCII characters as is, map anything else to
> an octal escape.
> 
> +0 on leaving it as it is, or mapping non-printables to "?".

Agreed.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tismer at tismer.com  Mon Dec 18 12:08:34 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 18 Dec 2000 13:08:34 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com> <3A3DE81A.4B825D89@lemburg.com>
Message-ID: <3A3DF032.5F86AD15@tismer.com>


"M.-A. Lemburg" wrote:
> 
> > Here some results, dictionaries have 1000 entries:
> >
> > timing for strings              old=  5.097 new= 5.088
> > timing for bad integers (<<10)  old=101.540 new=12.610
> > timing for bad integers (<<16)  old=571.210 new=19.220
> 
> Even though I think concentrating on string keys would provide more
> performance boost for Python in general, I think you have a point
> there. +1 from here.
> 
> BTW, would changing the hash function on strings from the simple
> XOR scheme to something a little smarter help improve the performance
> too (e.g. most strings used in programming never use the 8-th
> bit) ?

Yes, it would. I spent the rest of last night doing more
accurate tests, also refined the implementation (using
longs for the shifts etc), and turned from timing over to
trip counting, i.e. a dict counts every round through the
re-hash. That showed two things:
- The bits used from the string hash are not well distributed
- using a "warmup wheel" on the hash to suck all bits in
  gives the same quality of hashes as random numbers.

I will publish some results later today.

> I also think that we could inline the string compare function
> in dictobject:lookdict_string to achieve even better performance.
> Currently it uses a function which doesn't trigger compiler
> inlining.

Sure!

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From guido at python.org  Mon Dec 18 15:20:22 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 09:20:22 -0500
Subject: [Python-Dev] The Dictionary Gem is polished!
In-Reply-To: Your message of "Sun, 17 Dec 2000 19:38:31 +0200."
             <3A3CFA17.ED26F51A@tismer.com> 
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com>  
            <3A3CFA17.ED26F51A@tismer.com> 
Message-ID: <200012181420.JAA25063@cj20424-a.reston1.va.home.com>

> Problem: There are hash functions which are "good" in this sense,
> but they do not spread their randomness uniformly over the
> 32 bits.
> 
> Example: Integers use their own value as hash.
> This is ok, as far as the integers are uniformly distributed.
> But if they all contain a high power of two, for instance,
> the low bits give a very bad hash function.
> 
> Take a dictionary with integers range(1000) as keys and access
> all entries. Then use a dictionary with the integers shifted
> left by 16.
> Access time is slowed down by a factor of 100, since every
> access is a linear search now.

Ai.  I think what happened is this: long ago, the hash table sizes
were primes, or at least not powers of two!

I'll leave it to the more mathematically-inclined to judge your
solution...

--Guido van Rossum (home page: http://www.python.org/~guido/)
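
Christian's integer example is easy to reproduce by hand: with a
power-of-two table size the slot is just hash & (size - 1), so keys of the
form i << 16 all collide, while a prime modulus spreads the same keys out
(the table sizes below are illustrative):

```python
keys = [i << 16 for i in range(1000)]

# Power-of-two table: slot = hash & (size - 1).  For small ints the hash
# is the value itself, and the low 16 bits here are all zero.
mask_slots = {k & (1024 - 1) for k in keys}
print(len(mask_slots))   # 1 -- every key lands in the same slot

# Prime-sized table: slot = hash % prime spreads the very same keys.
prime_slots = {k % 1021 for k in keys}
print(len(prime_slots))  # 1000 -- no collisions at all here
```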



From guido at python.org  Mon Dec 18 15:52:35 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 09:52:35 -0500
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
In-Reply-To: Your message of "Sun, 17 Dec 2000 17:55:10 CST."
             <14909.21086.92774.940814@beluga.mojam.com> 
References: <14909.21086.92774.940814@beluga.mojam.com> 
Message-ID: <200012181452.JAA04372@cj20424-a.reston1.va.home.com>

> Make distclean succeeded so I tried the following:
> 
>     make distclean
>     ./configure
>     make clean
> 
> but the last step still failed.  Any idea why make clean is now failing (for
> me)?  Can anyone else reproduce this problem?

Yes.  I don't understand it, but this takes care of it:

    make distclean
    ./configure
    make Makefiles		# <--------- !!!
    make clean

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Dec 18 15:54:20 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 09:54:20 -0500
Subject: [Python-Dev] Pragmas and warnings
In-Reply-To: Your message of "Mon, 18 Dec 2000 10:58:37 +0100."
             <3A3DDFCD.34AB05B2@lemburg.com> 
References: <3A3D1C4B.8F08A744@ActiveState.com>  
            <3A3DDFCD.34AB05B2@lemburg.com> 
Message-ID: <200012181454.JAA04394@cj20424-a.reston1.va.home.com>

> There was a long thread about this some months ago. We agreed
> to add a new keyword to the language (I think it was "define")

I don't recall agreeing.  :-)

This is PEP material.  For 2.2, please!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Mon Dec 18 15:56:33 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 15:56:33 +0100
Subject: [Python-Dev] Pragmas and warnings
References: <3A3D1C4B.8F08A744@ActiveState.com>  
	            <3A3DDFCD.34AB05B2@lemburg.com> <200012181454.JAA04394@cj20424-a.reston1.va.home.com>
Message-ID: <3A3E25A1.DFD2BDBF@lemburg.com>

Guido van Rossum wrote:
> 
> > There was a long thread about this some months ago. We agreed
> > to add a new keyword to the language (I think it was "define")
> 
> I don't recall agreeing.  :-)

Well, maybe it was a misinterpretation on my part... you said
something like "add a new keyword and live with the consequences".
AFAIR, of course :-)

> This is PEP material.  For 2.2, please!

Ok.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at python.org  Mon Dec 18 16:15:26 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 10:15:26 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Sun, 17 Dec 2000 12:09:47 MST."
             <001f01c0685c$ef555200$7bdb5da6@vaio> 
References: <LNBBLJKPBEHFEDALKOLCEECKIEAA.tim.one@home.com> <200012152123.QAA03566@cj20424-a.reston1.va.home.com>  
            <001f01c0685c$ef555200$7bdb5da6@vaio> 
Message-ID: <200012181515.KAA04571@cj20424-a.reston1.va.home.com>

[Mark Lutz]
> So please: can we keep string around?  Like it or not, we're 
> way past the point of removing such core modules.

Of course we're keeping string around.  I already said that for
backwards compatibility reasons it would not disappear before Py3K.

I think there's a misunderstanding about the meaning of deprecation,
too.  That word doesn't mean to remove a feature.  It doesn't even
necessarily mean to warn every time a feature is used.  It just means
(to me) that at some point in the future the feature will change or
disappear, there's a new and better way to do it, and that we
encourage users to start using the new way, to save them from work
later.

In my mind, there's no reason to start emitting warnings about every
deprecated feature.  The warnings are only needed late in the
deprecation cycle.  PEP 5 says "There must be at least a one-year
transition period between the release of the transitional version of
Python and the release of the backwards incompatible version."

Can we now stop getting all bent out of shape over this?  String
methods *are* recommended over equivalent string functions.  Those
string functions *are* already deprecated, in the informal sense
(i.e. just that it is recommended to use string methods instead).
This *should* (take notice, Fred!) be documented per 2.1.  We won't
however be issuing run-time warnings about the use of string functions
until much later.  (Lint-style tools may start warning sooner --
that's up to the author of the lint tool to decide.)

Note that I believe Java makes a useful distinction that PEP 5 misses:
it defines both deprecated features and obsolete features.
*Deprecated* features are simply features for which a better
alternative exists.  *Obsolete* features are features that are only
being kept around for backwards compatibility.  Deprecated features
may also be (and usually are) *obsolescent*, meaning they will become
obsolete in the future.

--Guido van Rossum (home page: http://www.python.org/~guido/)
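
The recommended spelling is easy to show side by side; each string method
below replaces the string-module function of the same name
(string.strip(s), string.split(s), string.join(words, sep), ...):

```python
s = "  spam and eggs  "

# Method calls bound to the string object itself, no module import needed:
assert s.strip() == "spam and eggs"
assert s.split() == ["spam", "and", "eggs"]
assert "-".join(["a", "b", "c"]) == "a-b-c"
assert "spam".upper() == "SPAM"
```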



From guido at python.org  Mon Dec 18 16:22:09 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 10:22:09 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Mon, 18 Dec 2000 00:45:56 +0100."
             <200012172345.AAA00877@loewis.home.cs.tu-berlin.de> 
References: <200012172345.AAA00877@loewis.home.cs.tu-berlin.de> 
Message-ID: <200012181522.KAA04597@cj20424-a.reston1.va.home.com>

> At some time, there were only string exceptions. Then, instance
> exceptions were added, some releases later they were considered the
> better choice, so the standard library was converted to use them.
> Still, there is no sign whatsoever that anybody plans to deprecate
> string exceptions.

Now there is: I hereby state that I officially deprecate string
exceptions.  Py3K won't support them, and it *may* even require that
all exception classes are derived from Exception.
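(A sketch of the two styles for readers following along; the old string
form is left as a comment, since it was indeed removed later:)

```python
# Deprecated style: raising a plain string, e.g.  raise "parse error"
# Class-based style, which Py3K was expected to require:
class ParseError(Exception):
    pass

def parse(token):
    if not token:
        raise ParseError("empty token")
    return token

try:
    parse("")
except ParseError:
    handled = True

print(handled)  # True
```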

> I believe the string module will become less important over
> time.  Comparing it with string exceptions, that may well take 5 years.
> It seems there are two ways of "deprecation": a loud "we will remove
> that, change your code", and a silent "strings have methods"
> (i.e. don't mention the module when educating people). The latter
> approach requires educators to agree that the module is
> "uninteresting", and people to really not use it once they find out it
> exists.

Exactly.  This is what I hope will happen.  I certainly hope that Mark
Lutz has already started teaching string methods!

> I think deprecation should be only attempted once there is a clear
> sign that people don't use it massively for new code anymore.

Right.  So now we're on the first step: get the word out!

> Removal should only occur if keeping the module [is] more pain than
> removing it.

Exactly.  Guess where the string module falls today. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From Barrett at stsci.edu  Mon Dec 18 17:50:49 2000
From: Barrett at stsci.edu (Paul Barrett)
Date: Mon, 18 Dec 2000 11:50:49 -0500 (EST)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons 
Message-ID: <14910.16431.554136.374725@nem-srvr.stsci.edu>

Guido van Rossum writes:
 > > 
 > > 1. The current boolean operator behavior does not have to change, and
 > >    hence will be backward compatible.
 > 
 > What incompatibility do you see in the current proposal?

You have to choose between using rich comparisons or boolean
comparisons.  You can't use both for the same (rich/complex) object.

 > > 2. It eliminates the need to decide whether or not rich comparisons
 > >    take precedence over boolean comparisons.
 > 
 > Only if you want different semantics -- that's only an issue for NumPy.

No. I think NumPy is the tip of the iceberg, when discussing new
semantics.  Most users don't consider these broader semantic issues,
because Python doesn't give them the opportunity to do so.  I can see
possible scenarios of using both boolean and non-boolean comparisons
for Python lists and dictionaries in addition to NumPy.

I chose to use Python because it provides a richer framework than
other languages.  When Python fails to provide such benefits, I'll
move to another language.  I moved from Perl to Python because the
multi-dimensional array syntax is vastly better in Python than Perl,
though as a novice I don't have to know that it exists.  What I'm
proposing here is in a similar vein.

 > > 3. The new operators add additional behavior without directly impacting 
 > >    current behavior and the use of them is unambiguous, at least in
 > >    relation to current Python behavior.  You know by the operator what 
 > >    type of comparison will be returned.  This should appease Jim
 > >    Fulton, based on his arguments in 1998 about comparison operators
 > >    always returning a boolean value.
 > 
 > As you know, I'm now pretty close to Jim. :-)  He seemed pretty mellow
 > about this now.

Yes, I would hope so!

It appears though that you misunderstand me.  My point was that I tend
to agree with Jim Fulton's arguments for a limited interpretation of
the current comparison operators.  I too expect them to return a
boolean result.  I have never felt comfortable using such comparison
operators in an array context, e.g. as in the array language, IDL. It
just looks wrong.  So my suggestion is to create new ones whose
implicit meaning is to provide element-wise or rich comparison
behavior.  And to add similar behavior for the other operators for
consistency.

Can someone provide an example in mathematics where comparison
operators are used in a non-boolean, i.e. rich comparison, context?
If so, this might shut me up!

 > > 4. Compound objects, such as lists, could implement both rich
 > >    and boolean comparisons.  The boolean comparison would remain as
 > >    is, while the rich comparison would return a list of boolean
 > >    values.  Current behavior doesn't change; just a new feature, which
 > >    you may or may not choose to use, is added.
 > > 
 > > If we go one step further and add the matrix-style operators along
 > > with the comparison operators, we can provide a consistent user
 > > interface to array/complex operations without changing current Python
 > > behavior.  If a user has no need for these new operators, he doesn't
 > > have to use them or even know about them.  All we've done is made
 > > Python richer, but I believe without making it more complex.  For
 > > example, all element-wise operations could have a ':' appended to
 > > them, e.g. '+:', '<:', etc.; and will define element-wise addition,
 > > element-wise less-than, etc.  The traditional '*', '/', etc. operators
 > > can then be used for matrix operations, which will appease the Matlab
 > > people.
 > > 
 > > Therefore, I don't think rich comparisons and matrix-type operators
 > > should be considered separable.  I really think you should consider
 > > this suggestion.  It appeases many groups while providing a consistent 
 > > and clear user interface, while not greatly impacting current Python
 > > behavior.

 > > 
 > > Always-causing-havoc-at-the-last-moment-ly Yours,
 > 
 > I think you misunderstand.  Rich comparisons are mostly about allowing
 > the separate overloading of <, <=, ==, !=, >, and >=.  This is useful
 > in its own light.

No, I do understand.  I've read most of the early discussions on this
issue and one of those issues was about having to choose between
boolean and rich comparisons and what should take precedence, when
both may be appropriate.  I'm suggesting an alternative here.

 > If you don't want to use this overloading facility for elementwise
 > comparisons in NumPy, that's fine with me.  Nobody says you have to --
 > it's just that you *could*.

Yes, I understand.

 > Read my lips: there won't be *any* new operators in 2.1.

OK, I didn't expect this to make it into 2.1.

 > There will be a better way to overload the existing Boolean operators,
 > and they will be able to return non-Boolean results.  That's useful in
 > other situations besides NumPy.

Yes, I agree, this should be done anyway.  I'm just not sure that the
implicit meaning that these comparison operators are being given is
the best one.  I'm just looking for ways to incorporate rich
comparisons into a broader framework, numpy just currently happens to
be the primary example of this proposal.

Assuming the current comparison operator overloading is already
implemented and has been used to implement rich comparisons for some
objects, then my rich comparison proposal would cause confusion.  This 
is what I'm trying to avoid.

 > Feel free to lobby for elementwise operators -- but based on the
 > discussion about this subject so far, I don't give it much of a chance
 > even past Python 2.1.  They would add a lot of baggage to the language
 > (e.g. the table of operators in all Python books would be about twice
 > as long) and by far the most users don't care about them.  (Read the
 > intro to PEP 211 for some of the concerns -- this PEP tries to make the
 > addition palatable by adding exactly *one* new operator.)

So!  Introductory books don't have to discuss these additional
operators.  I don't have to know about XML and socket modules to start
using Python effectively, nor do I have to know about 'zip' or list
comprehensions.  These additions decrease the code size and increase
efficiency, but don't really add any expressive power beyond what a
'for' loop can already do.

I'll try to convince myself that this suggestion is crazy and not
bother you with this issue for awhile.

Cheers,
Paul




From guido at python.org  Mon Dec 18 18:18:11 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 12:18:11 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: Your message of "Mon, 18 Dec 2000 11:50:49 EST."
             <14910.16431.554136.374725@nem-srvr.stsci.edu> 
References: <14910.16431.554136.374725@nem-srvr.stsci.edu> 
Message-ID: <200012181718.MAA14030@cj20424-a.reston1.va.home.com>

Paul Barrett:
>  > > 1. The current boolean operator behavior does not have to change, and
>  > >    hence will be backward compatible.

Guido van Rossum:
>  > What incompatibility do you see in the current proposal?

Paul Barrett:
> You have to choose between using rich comparisons or boolean
> comparisons.  You can't use both for the same (rich/complex) object.

Sure.  I thought that the NumPy folks were happy with this.  Certainly
two years ago they seemed to be.

>  > > 2. It eliminates the need to decide whether or not rich comparisons
>  > >    take precedence over boolean comparisons.
>  > 
>  > Only if you want different semantics -- that's only an issue for NumPy.
> 
> No. I think NumPy is the tip of the iceberg, when discussing new
> semantics.  Most users don't consider these broader semantic issues,
> because Python doesn't give them the opportunity to do so.  I can see
> possible scenarios of using both boolean and non-boolean comparisons
> for Python lists and dictionaries in addition to NumPy.

That's the same argument that has been made for new operators all
along.  I've explained already why they are not on the table for 2.1.

> I chose to use Python because it provides a richer framework than
> other languages.  When Python fails to provide such benefits, I'll
> move to another language.  I moved from Perl to Python because the
> multi-dimensional array syntax is vastly better in Python than Perl,
> though as a novice I don't have to know that it exists.  What I'm
> proposing here is in a similar vein.
> 
>  > > 3. The new operators add additional behavior without directly impacting 
>  > >    current behavior and the use of them is unambiguous, at least in
>  > >    relation to current Python behavior.  You know by the operator what 
>  > >    type of comparison will be returned.  This should appease Jim
>  > >    Fulton, based on his arguments in 1998 about comparison operators
>  > >    always returning a boolean value.
>  > 
>  > As you know, I'm now pretty close to Jim. :-)  He seemed pretty mellow
>  > about this now.
> 
> Yes, I would hope so!
> 
> It appears though that you misunderstand me.  My point was that I tend
> to agree with Jim Fulton's arguments for a limited interpretation of
> the current comparison operators.  I too expect them to return a
> boolean result.  I have never felt comfortable using such comparison
> operators in an array context, e.g. as in the array language, IDL. It
> just looks wrong.  So my suggestion is to create new ones whose
> implicit meaning is to provide element-wise or rich comparison
> behavior.  And to add similar behavior for the other operators for
> consistency.
> 
> Can someone provide an example in mathematics where comparison
> operators are used in a non-boolean, i.e. rich comparison, context?
> If so, this might shut me up!

Not me (I no longer consider myself a mathematician :-).  Why are you
requiring an example from math though?

Again, you will be able to make this argument to the NumPy folks when
they are ready to change the meaning of A<B to mean an array of
Booleans rather than a single Boolean.  Since you're part of the
design team for NumPy NG, you're in a pretty good position to hold out
for elementwise operators!

However, what would you do if elementwise operators were turned down
for ever (which is a realistic possibility)?

In the mean time, I see no harm in *allowing* the comparison operators
to be overridden to return something else besides a Boolean.  Someone
else may find this useful.  (Note that standard types won't use this
new freedom, so I'm not imposing this on anybody -- I'm only giving a
new option.)
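(A minimal sketch of that new option -- a toy sequence type, name
hypothetical, whose `<` returns an elementwise result instead of a single
Boolean:)

```python
class Vec:
    """Toy vector: comparison operators return elementwise results."""
    def __init__(self, data):
        self.data = list(data)

    def __lt__(self, other):
        # Rich comparison returning a list of Booleans, not one Boolean.
        return [a < b for a, b in zip(self.data, other.data)]

print(Vec([1, 5, 3]) < Vec([2, 4, 3]))  # [True, False, False]
```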

>  > > 4. Compound objects, such as lists, could implement both rich
>  > >    and boolean comparisons.  The boolean comparison would remain as
>  > >    is, while the rich comparison would return a list of boolean
>  > >    values.  Current behavior doesn't change; just a new feature, which
>  > >    you may or may not choose to use, is added.
>  > > 
>  > > If we go one step further and add the matrix-style operators along
>  > > with the comparison operators, we can provide a consistent user
>  > > interface to array/complex operations without changing current Python
>  > > behavior.  If a user has no need for these new operators, he doesn't
>  > > have to use them or even know about them.  All we've done is made
>  > > Python richer, but I believe without making it more complex.  For
>  > > example, all element-wise operations could have a ':' appended to
>  > > them, e.g. '+:', '<:', etc.; and will define element-wise addition,
>  > > element-wise less-than, etc.  The traditional '*', '/', etc. operators
>  > > can then be used for matrix operations, which will appease the Matlab
>  > > people.
>  > > 
>  > > Therefore, I don't think rich comparisons and matrix-type operators
>  > > should be considered separable.  I really think you should consider
>  > > this suggestion.  It appeases many groups while providing a consistent 
>  > > and clear user interface, while not greatly impacting current Python
>  > > behavior.

I don't see any argument for elementwise operators here that I haven't
heard before, and AFAIK it's all in the two PEPs.

>  > > Always-causing-havoc-at-the-last-moment-ly Yours,
>  > 
>  > I think you misunderstand.  Rich comparisons are mostly about allowing
>  > the separate overloading of <, <=, ==, !=, >, and >=.  This is useful
>  > in its own light.
> 
> No, I do understand.  I've read most of the early discussions on this
> issue and one of those issues was about having to choose between
> boolean and rich comparisons and what should take precedence, when
> both may be appropriate.  I'm suggesting an alternative here.

Note that Python doesn't decide which should take precedence.  The
implementer of an individual extension type decides what his
comparison operators will return.

>  > If you don't want to use this overloading facility for elementwise
>  > comparisons in NumPy, that's fine with me.  Nobody says you have to --
>  > it's just that you *could*.
> 
> Yes, I understand.
> 
>  > Read my lips: there won't be *any* new operators in 2.1.
> 
> OK, I didn't expect this to make it into 2.1.
> 
>  > There will be a better way to overload the existing Boolean operators,
>  > and they will be able to return non-Boolean results.  That's useful in
>  > other situations besides NumPy.
> 
> Yes, I agree, this should be done anyway.  I'm just not sure that the
> implicit meaning that these comparison operators are being given is
> the best one.  I'm just looking for ways to incorporate rich
> comparisons into a broader framework, numpy just currently happens to
> be the primary example of this proposal.
> 
> Assuming the current comparison operator overloading is already
> implemented and has been used to implement rich comparisons for some
> objects, then my rich comparison proposal would cause confusion.  This 
> is what I'm trying to avoid.

AFAIK, rich comparisons haven't been used anywhere to return
non-Boolean results.

>  > Feel free to lobby for elementwise operators -- but based on the
>  > discussion about this subject so far, I don't give it much of a chance
>  > even past Python 2.1.  They would add a lot of baggage to the language
>  > (e.g. the table of operators in all Python books would be about twice
>  > as long) and by far the most users don't care about them.  (Read the
>  > intro to PEP 211 for some of the concerns -- this PEP tries to make the
>  > addition palatable by adding exactly *one* new operator.)
> 
> So!  Introductory books don't have to discuss these additional
> operators.  I don't have to know about XML and socket modules to start
> using Python effectively, nor do I have to know about 'zip' or list
> comprehensions.  These additions decrease the code size and increase
> efficiency, but don't really add any expressive power beyond what a
> 'for' loop can already do.
> 
> I'll try to convince myself that this suggestion is crazy and not
> bother you with this issue for awhile.

Happy holidays nevertheless. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Mon Dec 18 19:38:13 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 18 Dec 2000 13:38:13 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons 
In-Reply-To: <14910.16431.554136.374725@nem-srvr.stsci.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEGLIEAA.tim.one@home.com>

[Paul Barrett]
> ...
> Can someone provide an example in mathematics where comparison
> operators are used in a non-boolean, i.e. rich comparison, context?
> If so, this might shut me up!

By my informal accounting, over the years there have been more requests for
three-outcome comparison operators than for elementwise ones, although the
three-outcome lobby isn't organized so is less visible.  It's a natural
request for anyone working with partial orderings (a < b -> one of {yes, no,
unordered}).  Another large group of requests comes from people working with
variants of fuzzy logic, where it's desired that the comparison operators be
definable to return floats (intuitively corresponding to the probability
that the stated relation "is true").  Another desire comes from the symbolic
math camp, which would like to be able to-- as is possible for "+", "*",
etc --define "<" so that e.g. "x < y" return an object capturing that
somebody *asked* for "x < y"; they're not interested in numeric or Boolean
results so much as symbolic expressions.  "<" is used for all these things
in the literature too.
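(To make the partial-ordering case concrete, a sketch -- class name
hypothetical -- where `<` can answer yes, no, or unordered, using set
inclusion as the ordering:)

```python
class POSet:
    """Partially ordered by set inclusion: '<' has three outcomes."""
    def __init__(self, items):
        self.s = frozenset(items)

    def __lt__(self, other):
        if self.s < other.s:
            return True           # proper subset: ordered, yes
        if self.s >= other.s:
            return False          # superset or equal: ordered, no
        return "unordered"        # neither contains the other

print(POSet([1]) < POSet([1, 2]))  # True
print(POSet([1]) < POSet([2]))     # unordered
```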

Whatever.  "<" and friends are just collections of pixels.  Add 300 new
operator symbols, and people will want to redefine all of them at will too.

draw-a-line-in-the-sand-and-the-wind-blows-it-away-ly y'rs  - tim




From tim.one at home.com  Mon Dec 18 21:37:13 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 18 Dec 2000 15:37:13 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHBIEAA.tim.one@home.com>

[Guido]
> ...
> If you're saying that we should give users ample time for the
> transition, I'm with you.

Then we're with each other, for suitably large values of "ample" <wink>.

> If you're saying that you think the string module is too prominent to
> ever start deprecating its use, I'm afraid we have a problem.

We may.  Time will tell.  It needs a conversion tool, else I think it's
unsellable.

> ...
> I'd also like to note that using the string module's wrappers incurs
> the overhead of a Python function call -- using string methods is
> faster.
>
> Finally, I like the look of fields[i].strip().lower() much better than
> that of string.lower(string.strip(fields[i])) -- an actual example
> from mimetools.py.

I happen to like string methods better myself; I don't think that's at issue
(except that loads of people apparently don't like "join" as a string
method -- idiots <wink>).

The issue to me is purely breaking old code someday -- "string" is in very
heavy use, and unlike as when deprecating regex in favor of re (either pre
or especially sre), string methods aren't orders of magnitude better than
the old way; and also unlike regex-vs-re it's not the case that the string
module has become unmaintainable (to the contrary, string.py has become
trivial).  IOW, this one would be unprecedented fiddling.
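(For the record, the two spellings being compared -- the function form is
shown as a comment, since those wrappers were eventually removed:)

```python
fields = ["  MIME-Version  "]
i = 0

# Method style, the recommended spelling:
cleaned = fields[i].strip().lower()

# The function style it replaces:
#     import string
#     cleaned = string.lower(string.strip(fields[i]))

print(cleaned)  # mime-version
```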

> ...
> Note that I believe Java makes a useful distinction that PEP 5 misses:
> it defines both deprecated features and obsolete features.
> *Deprecated* features are simply features for which a better
> alternative exists.  *Obsolete* features are features that are only
> being kept around for backwards compatibility.  Deprecated features
> may also be (and usually are) *obsolescent*, meaning they will become
> obsolete in the future.

I agree it would be useful to define these terms, although those particular
definitions appear to be missing the most important point from the user's
POV (not a one says "going away someday").  A Google search on "java
obsolete obsolescent deprecated" doesn't turn up anything useful, so I doubt
the usages you have in mind come from Java (it has "deprecated", but doesn't
appear to have any well-defined meaning for the others).

In keeping with the religious nature of the battle-- and religion offers
precise terms for degrees of damnation! --I suggest:

    struggling -- a supported feature; the initial state of
        all features; may transition to Anathematized

    anathematized -- this feature is now cursed, but is supported;
        may transition to Condemned or Struggling; intimacy with
        Anathematized features is perilous

    condemned -- a feature scheduled for crucifixion; may transition
        to Crucified, Anathematized (this transition is called "a pardon"),
        or Struggling (this transition is called "a miracle"); intimacy
        with Condemned features is suicidal

    crucified -- a feature that is no longer supported; may transition
        to Resurrected

    resurrected -- a once-Crucified feature that is again supported;
        may transition to Condemned, Anathematized or Struggling;
        although since Resurrection is a state of grace, there may be
        no point in human time at which a feature is identifiably
        Resurrected (i.e., it may *appear*, to the unenlightened, that
        a feature moved directly from Crucified to Anathematized or
        Struggling or Condemned -- although saying so out loud is heresy).




From tismer at tismer.com  Mon Dec 18 23:58:03 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 18 Dec 2000 23:58:03 +0100
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCAEADICAA.tim.one@home.com> <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com>  
	            <3A3CFA17.ED26F51A@tismer.com> <200012181420.JAA25063@cj20424-a.reston1.va.home.com>
Message-ID: <3A3E967B.BE404114@tismer.com>


Guido van Rossum wrote:
[me, expanding on hashes, integers, and how to tame them cheaply]

> Ai.  I think what happened is this: long ago, the hash table sizes
> were primes, or at least not powers of two!

At some time I will wake up and they tell me that I'm reducible :-)

> I'll leave it to the more mathematically-inclined to judge your
> solution...

I love small lists! - ciao - chris

+1   (being a member, hopefully)

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From greg at cosc.canterbury.ac.nz  Tue Dec 19 00:04:42 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 19 Dec 2000 12:04:42 +1300 (NZDT)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEGLIEAA.tim.one@home.com>
Message-ID: <200012182304.MAA02642@s454.cosc.canterbury.ac.nz>

[Paul Barrett]
> ...
> Can someone provide an example in mathematics where comparison
> operators are used in a non-boolean, i.e. rich comparison, context?
> If so, this might shut me up!

Not exactly mathematical, but some day I'd like to create
a database access module which lets you say things like

  mydb = OpenDB("inventory")
  parts = mydb.parts
  tuples = mydb.retrieve(parts.name, parts.number).where(parts.quantity >= 42)

Of course, to really make this work I need to be able
to overload "and" and "or" as well, but that's a whole
'nother PEP...
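(A rough sketch of the mechanism -- every name below is hypothetical: the
column object's `__ge__` returns a deferred condition rather than a
Boolean, which a `where()` method could later evaluate against rows:)

```python
class Condition:
    """A comparison captured for later evaluation, not computed eagerly."""
    def __init__(self, col, op, value):
        self.col, self.op, self.value = col, op, value

    def test(self, row):
        if self.op == ">=":
            return row[self.col] >= self.value
        raise ValueError("unsupported op: " + self.op)

class Column:
    def __init__(self, name):
        self.name = name

    def __ge__(self, value):
        # Rich comparison returning a Condition instead of True/False.
        return Condition(self.name, ">=", value)

quantity = Column("quantity")
cond = quantity >= 42
rows = [{"quantity": 10}, {"quantity": 99}]
print([r for r in rows if cond.test(r)])  # [{'quantity': 99}]
```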

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From guido at python.org  Tue Dec 19 00:32:51 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 18:32:51 -0500
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: Your message of "Tue, 19 Dec 2000 12:04:42 +1300."
             <200012182304.MAA02642@s454.cosc.canterbury.ac.nz> 
References: <200012182304.MAA02642@s454.cosc.canterbury.ac.nz> 
Message-ID: <200012182332.SAA18456@cj20424-a.reston1.va.home.com>

> Not exactly mathematical, but some day I'd like to create
> a database access module which lets you say things like
> 
>   mydb = OpenDB("inventory")
>   parts = mydb.parts
>   tuples = mydb.retrieve(parts.name, parts.number).where(parts.quantity >= 42)
> 
> Of course, to really make this work I need to be able
> to overload "and" and "or" as well, but that's a whole
> 'nother PEP...

Believe it or not, in 1998 we already had a suggestion for overloading
these too.  This is hinted at in David Ascher's proposal (the Appendix
of PEP 208) where objects could define __boolean_and__ to overload
x<y<z.  It doesn't get all the details right: it's not enough to check
if the left operand is true, since that leaves 'or' out in the cold,
but a different test (i.e. the presence of __boolean_and__) would
work.

I am leaving this out of the current PEP because the bytecode you have
to generate for this is very hairy.  A simple expression like ``f()
and g()'' would become something like:

  outcome = f()
  if hasattr(outcome, '__boolean_and__'):
      outcome = outcome.__boolean_and__(g())
  elif outcome:
      outcome = g()

The problem I have with this is that the code to evaluate g() has to
be generated twice!  In general, g() could be an arbitrary expression.
We can't evaluate g() ahead of time, because it should not be
evaluated at all when outcome is false and doesn't define
__boolean_and__().

For the same reason the current PEP doesn't support x<y<z when x<y
doesn't return a Boolean result; a similar solution would be possible.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Tue Dec 19 01:09:35 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 18 Dec 2000 19:09:35 -0500
Subject: [Python-Dev] RE: The Dictionary Gem is polished!
In-Reply-To: <3A3CFA17.ED26F51A@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHJIEAA.tim.one@home.com>

Sounds good to me!  It's a very cheap way to get the high bits into play.

>        i = (~_hash) & mask

The ~ here seems like pure superstition to me (and the comments in the C
code don't justify it at all -- I added a nag of my own about that the last
time I checked in dictobject.c -- and see below for a bad consequence of
doing ~).

>            # note that we do not mask!
>            # even the shifting may not be worth it.
>            incr = _hash ^ (_hash >> 3)

The shifting was just another cheap trick to get *some* influence from the
high bits.  It's very limited, though.  Toss it (it appears to be from the
"random operations yield random results" <wink/sigh> matchbook school of
design).

[MAL]
> BTW, would changing the hash function on strings from the simple
> XOR scheme to something a little smarter help improve the performance
> too (e.g. most strings used in programming never use the 8-th
> bit) ?

Don't understand -- the string hash uses multiplication:

    x = (1000003*x) ^ *p++;

in a loop.  Replacing "^" there by "+" should yield slightly better results.
As is, string hashes are a lot like integer hashes, in that "consecutive"
strings

   J001
   J002
   J003
   J004
   ...

yield hashes very close together in value.  But, because the current dict
algorithm uses ~ on the full hash but does not use ~ on the initial
increment, (~hash)+incr too often yields the same result for distinct hashes
(i.e., there's a systematic (but weak) form of clustering).
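(The loop above, transliterated to Python with 32-bit truncation --
modeled on the 2.0-era string hash, and shown here only to illustrate the
clustering of "consecutive" strings:)

```python
def string_hash(s):
    """Sketch of the CPython 2.0-era string hash, kept to 32 bits."""
    if not s:
        return 0
    x = ord(s[0]) << 7
    for c in s:
        x = ((1000003 * x) ^ ord(c)) & 0xFFFFFFFF
    return x ^ len(s)

# "Consecutive" strings differ only in the last character, so their
# hashes land within a few units of each other:
print(abs(string_hash("J001") - string_hash("J002")) <= 3)  # True
```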

Note that Python is doing something very unusual:  hashes are *usually*
designed to yield an approximation to randomness across all bits.  But
Python's hashes never achieve that.  This drives theoreticians mad (like the
fellow who originally came up with the GF idea), but tends to work "better
than random" in practice (e.g., a truly random hash function would almost
certainly produce many collisions when fed a fat range of consecutive
integers but still less than half the table size; but Python's trivial
"identity" integer hash produces no collisions in that common case).

[Christian]
> - The bits used from the string hash are not well distributed
> - using a "warmup wheel" on the hash to suck all bits in
>   gives the same quality of hashes as random numbers.

See above and be very cautious:  none of Python's hash functions produce
well-distributed bits, and-- in effect --that's why Python dicts often
perform "better than random" on common data.  Even what you've done so far
appears to provide marginally worse statistics for Guido's favorite kind of
test case ("worse" in two senses:  total number of collisions (a measure of
amortized lookup cost), and maximum collision chain length (a measure of
worst-case lookup cost)):

   d = {}
   for i in range(N):
       d[repr(i)] = i

check-in-one-thing-then-let-it-simmer-ly y'rs  - tim




From tismer at tismer.com  Tue Dec 19 02:16:27 2000
From: tismer at tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 02:16:27 +0100
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <Pine.LNX.4.10.10012180848140.20569-100000@akbar.nevex.com>
Message-ID: <3A3EB6EB.C79A3896@tismer.com>



Greg Wilson wrote:
> 
> > > > Here some results, dictionaries have 1000 entries:
> > I will publish some results later today.
> 
> In Doctor Dobb's Journal, right? :-)  We'd *really* like this article...

Well, the results are not so bad:

I stopped testing computation time for the Python dictionary
implementation, in favor of "trips". How many trips does
the re-hash take in a dictionary?

Tests were run for dictionaries of size 1000, 2000, 3000, 4000.

Dictionary 1 consists of i, formatted as string.
Dictionary 2 consists of strings containig the binary of i.
Dictionary 3 consists of random numbers.
Dictionary 4 consists of i << 16.

Algorithms:
old is the original dictionary algorithm implemented
in Python (probably quite correct now, using longs :-)

new is the proposed incremental bits-suck-in-division
algorithm.

new2 is a version of new, where all extra bits of the
hash function are wheeled in in advance.  The computation
time of this is not negligible, so please use this result
for reference only.


Here the results:
(bad integers (old) not computed for N>1000)

"""
D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293 new=302 new2=221
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=499500 new=13187 new2=999
trips for random integers old=377 new=371 new2=393
trips for windows names old=230 new=207 new2=200
N=2000
trips for strings old=1093 new=1109 new2=786
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=26455 new2=1999
trips for random integers old=691 new=710 new2=741
trips for windows names old=503 new=542 new2=564
N=3000
trips for strings old=810 new=839 new2=609
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=38681 new2=2999
trips for random integers old=762 new=740 new2=735
trips for windows names old=712 new=711 new2=691
N=4000
trips for strings old=1850 new=1843 new2=1375
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=52994 new2=3999
trips for random integers old=1440 new=1450 new2=1414
trips for windows names old=1449 new=1434 new2=1457

D:\crml_doc\platf\py>
"""

Interpretation:
---------------
Short numeric strings show a slightly elevated trip count.
This means that the hash() function could be enhanced.
But the gain would be below 10 percent compared to
random hashes, and is therefore not worth it.

Binary representations of numbers as strings still create
perfect hash numbers.

Bad integers (complete hash clash due to a high power-of-2
factor) are handled fairly well by the new algorithm. "new2"
shows that they can be brought down to nearly perfect
hashes just by applying the "hash melting wheel".
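The clash itself is easy to reproduce outside the dict code: with Python's
identity hash for ints, every key of the form i << 16 has zero low bits, so
in a table of, say, 1024 slots they all start probing from the same place
(a tiny sketch, not the dict implementation itself):

```python
# identity hash: hash(i) == i for small non-negative ints, so the
# initial slot is just the key masked by (table size - 1)
mask = 1024 - 1
slots = set((i << 16) & mask for i in range(1000))
assert slots == {0}  # every "bad integer" key collides on slot 0
```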

Windows names are mostly upper case, and fairly verbose.
They appear to perform nearly as well as random numbers.
This means: the Python string hash function is very good
for a wide range of applications.

In summary: I would try to modify the string hash function
slightly for short strings, but only if this does not
negatively affect the results above.

Summary of summary:
There is no really low hanging fruit in string hashing.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]
polys = map(long, polys)

class NULL: pass

class Dictionary:
    dummy = "<dummy key>"
    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg
        mp.warmup = newalg>1
        mp.trips = 0

    def getTrips(self):
        trips = self.trips
        self.trips = 0
        return trips

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # the shifting is worth it in the incremental case.

            ## added after posting to python-dev:
            uhash = _hash & 0xffffffffl
            if mp.warmup:
                incr = uhash
                mask2 = 0xffffffffl ^ mask
                while mask2 > mask:
                    if (incr & 1):
                        incr = incr ^ mp.ma_poly
                    incr = incr >> 1
                    mask2 = mask2>>1
                # this loop *can* be sped up by tables
                # with precomputed multiple shifts.
                # But I'm not sure if it is worth it at all.
            else:
                incr = uhash ^ (uhash >> 3)

###### TO HERE
            
        if (not incr):
            incr = mask
        while 1:
            mp.trips = mp.trips+1
            
            ep = ep0[int((i+incr)&mask)]
            if (ep[me_key] is NULL) :
                if (freeslot is not NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                 (ep[me_hash] == _hash and
                  cmp(ep[me_key], key) == 0)) :
                return ep

            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots

        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))      

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots

        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
##      /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    for i in range(n):
        s = str(i) #* 5
        #s = chr(i%256) + chr(i>>8)##
        d1[s] = d2[s] = d3[s] = i
    return d1, d2, d3

def istring_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    for i in range(n):
        s = chr(i%256) + chr(i>>8)
        d1[s] = d2[s] = d3[s] = i
    return d1, d2, d3

def random_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    from whrandom import randint
    import sys
    keys = []
    for i in range(n):
        keys.append(randint(0, sys.maxint-1))
    for i in keys:
        d1[i] = d2[i] = d3[i] = i
    return d1, d2, d3

def badnum_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(n):
        bad = i << shift    # was a hard-coded 16, ignoring the EXTREME-dependent shift
        d2[bad] = d3[bad] = i
        if n <= 1000: d1[bad] = i
    return d1, d2, d3

def names_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    import win32con
    keys = win32con.__dict__.keys()
    if len(keys) < n:
        keys = []
    for s in keys[:n]:
        d1[s] = d2[s] = d3[s] = s
    return d1, d2, d3

def do_test(dict):
    keys = dict.keys()
    dict.getTrips() # reset
    test(keys, dict)
    return dict.getTrips()

EXTREME=1

if __name__ == "__main__":

    for N in (1000,2000,3000,4000):    

        sdold, sdnew, sdnew2 = string_dicts(N)
        idold, idnew, idnew2 = istring_dicts(N)
        bdold, bdnew, bdnew2 = badnum_dicts(N)
        rdold, rdnew, rdnew2 = random_dicts(N)
        ndold, ndnew, ndnew2 = names_dicts(N)

        print "N=%d" %N        
        print "trips for strings old=%d new=%d new2=%d" % tuple(
            map(do_test, (sdold, sdnew, sdnew2)) )
        print "trips for bin strings old=%d new=%d new2=%d" % tuple(
            map(do_test, (idold, idnew, idnew2)) )
        print "trips for bad integers old=%d new=%d new2=%d" % tuple(
            map(do_test, (bdold, bdnew, bdnew2)))
        print "trips for random integers old=%d new=%d new2=%d" % tuple(
            map(do_test, (rdold, rdnew, rdnew2)))
        print "trips for windows names old=%d new=%d new2=%d" % tuple(
            map(do_test, (ndold, ndnew, ndnew2)))

"""
Results with a shift of 10 (EXTREME=0):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""


From tismer at tismer.com  Tue Dec 19 02:51:32 2000
From: tismer at tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 02:51:32 +0100
Subject: [Python-Dev] Re: The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCGEHJIEAA.tim.one@home.com>
Message-ID: <3A3EBF23.750CF761@tismer.com>


Tim Peters wrote:
> 
> Sounds good to me!  It's a very cheap way to get the high bits into play.

That's what I wanted to hear. It's also the reason why I try
to stay conservative: Just do an obviously useful bit, but
do not break any of the inherent benefits, like those
"better than random" amenities.
Python's dictionary algorithm appears to be "near perfect",
and of the "never touch but very carefully, or redo it completely"
kind. I tried the tightrope walk of just adding a tiny topping.

> >        i = (~_hash) & mask

Yes that stuff was 2 hours last nite :-)
I just decided to not touch it. Arbitrary crap!
Although an XOR with hash >> number of mask bits
would perform much better (in many cases but not all).
Anyway, simple shifting cannot solve general bit
distribution problems. Nor can I :-)

> The ~ here seems like pure superstition to me (and the comments in the C
> code don't justify it at all -- I added a nag of my own about that the last
> time I checked in dictobject.c -- and see below for a bad consequence of
> doing ~).
> 
> >            # note that we do not mask!
> >            # even the shifting may not be worth it.
> >            incr = _hash ^ (_hash >> 3)
> 
> The shifting was just another cheap trick to get *some* influence from the
> high bits.  It's very limited, though.  Toss it (it appears to be from the
> "random operations yield random results" <wink/sigh> matchbook school of
> design).

Now, comment it out, and you see my new algorithm perform much worse.
I just kept it since it had an advantage on "my case". (bad guy I know).
And I wanted to have an argument for my change to get accepted.
"No cost, just profit, nearly the same" was what I tried to sell.

> [MAL]
> > BTW, would changing the hash function on strings from the simple
> > XOR scheme to something a little smarter help improve the performance
> > too (e.g. most strings used in programming never use the 8-th
> > bit) ?
> 
> Don't understand -- the string hash uses multiplication:
> 
>     x = (1000003*x) ^ *p++;
> 
> in a loop.  Replacing "^" there by "+" should yield slightly better results.

For short strings, this prime has a bad influence on the low bits,
making it perform suboptimally for small dicts.
See the new2 algo, which funnily corrects for that.
The reason is obvious: Just look at the bit pattern
of 1000003:  '0xf4243'

Without giving proof, this smells like bad bit distribution on small
strings to me. You smell it too, right?
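The bit pattern in question, and a rough Python transcription of the string
hash loop (a sketch of the x = (1000003*x) ^ *p++ recurrence quoted above,
with the 32-bit masking made explicit; not the exact C code):

```python
m = 1000003
assert hex(m) == '0xf4243'  # only 9 of the low 20 bits are set

def old_string_hash(s):
    # sketch: x is seeded from the first char, then the multiply/XOR
    # recurrence runs per char, and the length is mixed in at the end
    if not s:
        return 0
    x = ord(s[0]) << 7
    for c in s:
        x = ((1000003 * x) ^ ord(c)) & 0xffffffff
    return x ^ len(s)

assert old_string_hash('J001') != old_string_hash('J002')
```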

> As is, string hashes are a lot like integer hashes, in that "consecutive"
> strings
> 
>    J001
>    J002
>    J003
>    J004
>    ...
> 
> yield hashes very close together in value. 

A bad generator in that case. I'll look for a better one.

> But, because the current dict
> algorithm uses ~ on the full hash but does not use ~ on the initial
> increment, (~hash)+incr too often yields the same result for distinct hashes
> (i.e., there's a systematic (but weak) form of clustering).

You name it.

> Note that Python is doing something very unusual:  hashes are *usually*
> designed to yield an approximation to randomness across all bits.  But
> Python's hashes never achieve that.  This drives theoreticians mad (like the
> fellow who originally came up with the GF idea), but tends to work "better
> than random" in practice (e.g., a truly random hash function would almost
> certainly produce many collisions when fed a fat range of consecutive
> integers but still less than half the table size; but Python's trivial
> "identity" integer hash produces no collisions in that common case).

A good reason to be careful with changes(ahem).

> [Christian]
> > - The bits used from the string hash are not well distributed
> > - using a "warmup wheel" on the hash to suck all bits in
> >   gives the same quality of hashes like random numbers.
> 
> See above and be very cautious:  none of Python's hash functions produce
> well-distributed bits, and-- in effect --that's why Python dicts often
> perform "better than random" on common data.  Even what you've done so far
> appears to provide marginally worse statistics for Guido's favorite kind of
> test case ("worse" in two senses:  total number of collisions (a measure of
> amortized lookup cost), and maximum collision chain length (a measure of
> worst-case lookup cost)):
> 
>    d = {}
>    for i in range(N):
>        d[repr(i)] = i

Nah, I did quite a lot of tests, and the trip number shows a
variation of about 10%, without favoring either old or new.
This is just the inherent randomness.

> check-in-one-thing-then-let-it-simmer-ly y'rs  - tim

This is why I think to be even more conservative:
Try to use a division wheel, but with the inverses
of the original primitive roots, just in order to
get at Guido's results :-)

making-perfect-hashes-of-interneds-still-looks-promising - ly y'rs

   - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From greg at cosc.canterbury.ac.nz  Tue Dec 19 04:07:56 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 19 Dec 2000 16:07:56 +1300 (NZDT)
Subject: [Python-Dev] PEP 207 -- Rich Comparisons
In-Reply-To: <200012182332.SAA18456@cj20424-a.reston1.va.home.com>
Message-ID: <200012190307.QAA02663@s454.cosc.canterbury.ac.nz>

> The problem I have with this is that the code to evaluate g() has to
> be generated twice!

I have an idea how to fix that. There need to be two methods,
__boolean_and_1__ and __boolean_and_2__. The first operand
is evaluated and passed to __boolean_and_1__. If it returns
a result, that becomes the result of the expression, and the
second operand is short-circuited.

If __boolean_and_1__ raises a NeedOtherOperand exception
(or there is no __boolean_and_1__ method), the second operand 
is evaluated, and both operands are passed to __boolean_and_2__.

The bytecode would look something like

        <evaluate first operand>
        BOOLEAN_AND_1 label
        <evaluate second operand>
        BOOLEAN_AND_2
label:  ...

I'll make a PEP out of this one day if I get enthusiastic
enough.
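In pure Python, the proposed protocol might look roughly like this (the
method and exception names are from the proposal above; the driver function
and the Both class are hypothetical stand-ins for the two bytecodes and a
user type):

```python
class NeedOtherOperand(Exception):
    pass

def boolean_and(first, second_thunk):
    # BOOLEAN_AND_1: let the first operand try to short-circuit
    try:
        return first.__boolean_and_1__()
    except (AttributeError, NeedOtherOperand):
        pass
    # BOOLEAN_AND_2: evaluate the second operand and combine both
    return first.__boolean_and_2__(second_thunk())

class Both:
    # toy example: element-wise "and" that never short-circuits
    def __init__(self, vals):
        self.vals = vals
    def __boolean_and_1__(self):
        raise NeedOtherOperand
    def __boolean_and_2__(self, other):
        return Both([a and b for a, b in zip(self.vals, other.vals)])
```

Here boolean_and(Both([1, 0]), lambda: Both([1, 1])) yields Both([1, 0]),
the lambda playing the role of the not-yet-evaluated second operand.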

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From tim.one at home.com  Tue Dec 19 05:55:33 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 18 Dec 2000 23:55:33 -0500
Subject: [Python-Dev] The Dictionary Gem is polished!
In-Reply-To: <3A3EB6EB.C79A3896@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEIAIEAA.tim.one@home.com>

Something else to ponder:  my tests show that the current ("old") algorithm
performs much better (somewhat worse than "new2" == new algorithm + warmup)
if incr is simply initialized like so instead:

        if mp.oldalg:
            incr = (_hash & 0xffffffffL) % (mp.ma_size - 1)

That's another way to get all the bits to contribute to the result.  Note
that a mod by size-1 is analogous to "casting out nines" in decimal:  it's
the same as breaking hash into fixed-sized pieces from the right (10 bits
each if size=2**10, etc), adding the pieces together, and repeating that
process until only one piece remains.  IOW, it's a degenerate form of
division, but works well all the same.  It didn't improve over that when I
tried a mod by the largest prime less than the table size (which suggests
we're sucking all we can out of the *probe* sequence given a sometimes-poor
starting index).
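The "casting out nines" analogy can be checked directly (a sketch: folding
k-bit chunks by addition is the same as reducing mod 2**k - 1):

```python
def fold(x, bits):
    # repeatedly add 'bits'-sized chunks; since 2**bits == 1 modulo
    # (2**bits - 1), each step preserves the residue
    mask = (1 << bits) - 1
    while x > mask:
        x = (x & mask) + (x >> bits)
    return x % mask  # final correction: the value 'mask' itself folds to 0

for h in (0, 12345, 0xdeadbeef, 2**32 - 1):
    assert fold(h, 10) == h % (2**10 - 1)
```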

However, it's subject to the same weak clustering phenomenon as the old
method due to the ill-advised "~hash" operation in computing the initial
index.  If ~ is also thrown away, it's as good as new2 (here I've tossed out
the "windows names", and "old" == existing algorithm except (a) get rid of ~
when computing index and (b) do mod by size-1 when computing incr):

N=1000
trips for strings old=230 new=261 new2=239
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=999 new=13212 new2=999
trips for random integers old=399 new=421 new2=410
N=2000
trips for strings old=787 new=1066 new2=827
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=26481 new2=1999
trips for random integers old=652 new=733 new2=650
N=3000
trips for strings old=547 new=760 new2=617
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=38701 new2=2999
trips for random integers old=724 new=743 new2=768
N=4000
trips for strings old=1311 new=1657 new2=1315
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=53014 new2=3999
trips for random integers old=1476 new=1587 new2=1493

The new and new2 values differ in minor ways from the ones you posted
because I got rid of the ~ (the ~ has a bad interaction with "additive"
means of computing incr, because the ~ tends to move the index in the
opposite direction, and these moves in opposite directions tend to cancel
out when computing incr+index the first time).

too-bad-mod-is-expensive!-ly y'rs  - tim




From tim.one at home.com  Tue Dec 19 06:50:01 2000
From: tim.one at home.com (Tim Peters)
Date: Tue, 19 Dec 2000 00:50:01 -0500
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: <20001217220008.D29681@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIDIEAA.tim.one@home.com>

[Tim]
> Starting last night, I get this msg whenever I update Python code w/
> CVSROOT=:ext:tim_one at cvs.python.sourceforge.net:/cvsroot/python:
>
> """
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> @       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
> Someone could be eavesdropping on you right now
> (man-in-the-middle attack)!
> It is also possible that the host key has just been changed.
> Please contact your system administrator.
> Add correct host key in C:\Code/.ssh/known_hosts to get rid of
> this message.
> Password authentication is disabled to avoid trojan horses.
> """
>
> This is SourceForge's doing, and is permanent (they've changed
> keys on their end). ...

[Thomas Wouters]
> What sourceforge did was switch Linux distributions, and upgrade.
> The switch doesn't really matter for the SSH problem, because recent
> Debian and recent RedHat releases both use a new ssh, the OpenBSD
> ssh implementation.
> Apparently, it isn't entirely backwards compatible to old versions of
> F-secure ssh. For one thing, it doesn't support the 'idea' cypher. This
> might or might not be your problem; if it is, you should get a decent
> message that gives a relatively clear message such as 'cypher type 'idea'
> not supported'.
> ... [and quite a bit more] ...

I hope you're feeling better today <wink>.  "The problem" was the one the
warning msg spelled out:  "It is also possible that the host key has just
been changed."  SF changed keys.  That's the whole banana right there.
Deleting the sourceforge keys from known_hosts fixed it (== convinced ssh
to install new SF keys the next time I connected).




From tim.one at home.com  Tue Dec 19 06:58:45 2000
From: tim.one at home.com (Tim Peters)
Date: Tue, 19 Dec 2000 00:58:45 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <200012171438.JAA21603@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIEIEAA.tim.one@home.com>

[Tim]
> I expect it would do less harm to introduce a compile-time warning for
> locals that are never referenced (such as the "a" in "set").

[Guido]
> Another warning that would be quite useful (and trap similar cases)
> would be "local variable used before set".

Java elevated that last one to a compile-time error, via its "definite
assignment" rules:  you not only have to make sure a local is bound before
reference, you have to make it *obvious* to the compiler that it's bound
before reference.  I think this is a Good Thing, because with intense
training, people can learn to think like a compiler too <wink>.

Seriously, in several of the cases where gcc warned about "maybe used before
set" in the Python implementation, the warnings were bogus but it was
non-trivial to deduce that.  Such code is very brittle under modification,
and the definite assignment rules make that path to error a non-starter.

Example:

def f(N):
    if N > 0:
        for i in range(N):
            if i == 0:
                j = 42
            else:
                f2(i)
    elif N <= 0:
        j = 24
    return j

It's a Crime Against Humanity to make the code reader *deduce* that j is
always bound by the time "return" is executed.
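A version that would satisfy Java-style definite assignment makes the
binding unconditional (a hypothetical rewrite of the example above; f2 is
passed in as a stand-in, since the original leaves it undefined):

```python
def f(N, f2=lambda i: None):
    # j is now bound on every path the compiler can see, not just on
    # the paths a human can prove are taken
    j = 24
    if N > 0:
        for i in range(N):
            if i == 0:
                j = 42
            else:
                f2(i)
    return j
```

The observable behavior is unchanged: any N > 0 hits the i == 0 branch and
returns 42, and any N <= 0 returns 24, exactly as before.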





From guido at python.org  Tue Dec 19 07:08:14 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 01:08:14 -0500
Subject: [Python-Dev] Error: syncmail script missing
Message-ID: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>

I just checked in the documentation for the warnings module.  (Check
it out!)

When I ran "cvs commit" in the Doc directory, it said, amongst other
things:

    sh: /cvsroot/python/CVSROOT/syncmail: No such file or directory

I suppose this may be a side effect of the transition to new hardware
of the SourceForge CVS archive.  (Which, by the way, has dramatically
improved the performance of typical CVS operations -- I am no longer
afraid to do a cvs diff or cvs log in Emacs, or to do a cvs update
just to be sure.)

Could some of the Powers That Be (Fred or Barry :-) check into what
happened to the syncmail script?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Tue Dec 19 07:10:04 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 19 Dec 2000 01:10:04 -0500 (EST)
Subject: [Python-Dev] Error: syncmail script missing
In-Reply-To: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
Message-ID: <14910.64444.662460.48236@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > Could some of the Powers That Be (Fred or Barry :-) check into what
 > happened to the syncmail script?

  We've seen this before, but I'm not sure what it was.  Barry, do you
recall?  Had the Python interpreter landed in a different directory?
Or perhaps the location of the CVS repository is different, so
syncmail isn't where loginfo says.
  Tomorrow... scp to SF appears broken as well.  ;(


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tim.one at home.com  Tue Dec 19 07:16:15 2000
From: tim.one at home.com (Tim Peters)
Date: Tue, 19 Dec 2000 01:16:15 -0500
Subject: [Python-Dev] Error: syncmail script missing
In-Reply-To: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIFIEAA.tim.one@home.com>

[Guido]
> I just checked in the documentation for the warnings module.  (Check
> it out!)

Everyone should note that this means Guido will be taking his traditional
post-release vacation almost immediately <wink -- but he is about to
leave!>.

> When I ran "cvs commit" in the Doc directory, it said, amongst other
> things:
>
>     sh: /cvsroot/python/CVSROOT/syncmail: No such file or directory
>
> I suppose this may be a side effect of the transition to new hardware
> of the SourceForge CVS archive.

The lack of checkin mail was first noted on a Jython list.  Finn wisely
replied that he'd just sit back and wait for the CPython people to figure
out how to fix it.

> ...
> Could some of the Powers That Be (Fred or Barry :-) check into what
> happened to the syncmail script?

Don't worry, I'll do my part by nagging them in your absence <wink>.  Bon
holiday voyage!




From cgw at fnal.gov  Tue Dec 19 07:32:15 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Tue, 19 Dec 2000 00:32:15 -0600 (CST)
Subject: [Python-Dev] cycle-GC question
Message-ID: <14911.239.12288.546710@buffalo.fnal.gov>

The following program:

import rexec
while 1:
      x = rexec.RExec()
      del x

leaks memory at a fantastic rate.

It seems clear (?) that this is due to the call to "set_rexec" at
rexec.py:140, which creates a circular reference between the `rexec'
and `hooks' objects.  (There's even a nice comment to that effect).

I'm curious however as to why the spiffy new cyclic-garbage collector
doesn't pick this up?

Just-wondering-ly y'rs,
		  cgw



From tim_one at email.msn.com  Tue Dec 19 10:24:18 2000
From: tim_one at email.msn.com (Tim Peters)
Date: Tue, 19 Dec 2000 04:24:18 -0500
Subject: [Python-Dev] RE: The Dictionary Gem is polished!
In-Reply-To: <3A3EBF23.750CF761@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEIKIEAA.tim_one@email.msn.com>

[Christian Tismer]
> ...
> For short strings, this prime has bad influence on the low bits,
> making it perform suboptimally for small dicts.
> See the new2 algo which funnily corrects for that.
> The reason is obvious: Just look at the bit pattern
> of 1000003:  '0xf4243'
>
> Without giving proof, this smells like bad bit distribution on small
> strings to me. You smell it too, right?
> ...

[Tim]
> As is, string hashes are a lot like integer hashes, in that
> "consecutive" strings
>
>    J001
>    J002
>    J003
>    J004
>    ...
>
> yield hashes very close together in value.

[back to Christian]
> A bad generator in that case. I'll look for a better one.

Not necessarily!  It's for that same reason "consecutive strings" can have
"better than random" behavior today.  And consecutive strings-- like
consecutive ints --are a common case.

Here are the numbers for the synthesized string cases:

N=1000
trips for strings old=293 new=302 new2=221
trips for bin strings old=0 new=0 new2=0
N=2000
trips for strings old=1093 new=1109 new2=786
trips for bin strings old=0 new=0 new2=0
N=3000
trips for strings old=810 new=839 new2=609
trips for bin strings old=0 new=0 new2=0
N=4000
trips for strings old=1850 new=1843 new2=1375
trips for bin strings old=0 new=0 new2=0

Here they are again, after doing nothing except changing the "^" to "+" in
the string hash, i.e. replacing

		x = (1000003*x) ^ *p++;
by
		x = (1000003*x) + *p++;

N=1000
trips for strings old=140 new=127 new2=108
trips for bin strings old=0 new=0 new2=0
N=2000
trips for strings old=480 new=434 new2=411
trips for bin strings old=0 new=0 new2=0
N=3000
trips for strings old=821 new=857 new2=631
trips for bin strings old=0 new=0 new2=0
N=4000
trips for strings old=1892 new=1852 new2=1476
trips for bin strings old=0 new=0 new2=0

The first two sizes are dramatically better, the last two a wash.  If you
want to see a real disaster, replace the "+" with "*" <wink>:

N=1000
trips for strings old=71429 new=6449 new2=2048
trips for bin strings old=81187 new=41117 new2=41584
N=2000
trips for strings old=26882 new=9300 new2=6103
trips for bin strings old=96018 new=46932 new2=42408

I got tired of waiting at that point ...
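The ^-vs-+ experiment can be mirrored in pure Python (a sketch of the inner
loop only; the real hash also seeds from the first character and mixes in
the length, omitted here):

```python
import operator

def string_hash(s, combine):
    # x = combine(1000003*x, ord(c)) per character, reduced mod 2**32
    x = 0
    for c in s:
        x = combine(1000003 * x, ord(c)) & 0xffffffff
    return x

keys = ['J%03d' % i for i in range(1000)]
xor_hashes = [string_hash(k, operator.xor) for k in keys]
add_hashes = [string_hash(k, operator.add) for k in keys]
# both variants are collision-free on these keys over the full 32 bits;
# per the trip counts above, the interesting difference is in how well
# the low-order bits (the ones a small table actually uses) spread out
assert len(set(xor_hashes)) == len(set(add_hashes)) == 1000
```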

suspecting-a-better-string-hash-is-hard-to-find-ly y'rs  - tim





From martin at loewis.home.cs.tu-berlin.de  Tue Dec 19 12:58:17 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 12:58:17 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>

> I agree it would be useful to define these terms, although those
> particular definitions appear to be missing the most important point
> from the user's POV (not a one says "going away someday").

PEP 4 says

# Usage of a module may be `deprecated', which means that it may be
# removed from a future Python release.

Proposals for better wording are welcome (and yes, I still have to
incorporate the comments I received into the document).

Regards,
Martin



From guido at python.org  Tue Dec 19 15:48:47 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 09:48:47 -0500
Subject: [Python-Dev] cycle-GC question
In-Reply-To: Your message of "Tue, 19 Dec 2000 00:32:15 CST."
             <14911.239.12288.546710@buffalo.fnal.gov> 
References: <14911.239.12288.546710@buffalo.fnal.gov> 
Message-ID: <200012191448.JAA28737@cj20424-a.reston1.va.home.com>

> The following program:
> 
> import rexec
> while 1:
>       x = rexec.RExec()
>       del x
> 
> leaks memory at a fantastic rate.
> 
> It seems clear (?) that this is due to the call to "set_rexec" at
> rexec.py:140, which creates a circular reference between the `rexec'
> and `hooks' objects.  (There's even a nice comment to that effect).
> 
> I'm curious however as to why the spiffy new cyclic-garbage collector
> doesn't pick this up?

Me too.  I turned on gc debugging (gc.set_debug(077) :-) and got
messages suggesting that it is not collecting everything.  The
output looks like this:

   .
   .
   .
gc: collecting generation 0...
gc: objects in each generation: 764 6726 89174
gc: done.
gc: collecting generation 1...
gc: objects in each generation: 0 8179 89174
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 764 0 97235
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 757 747 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 764 1386 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 757 2082 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 764 2721 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 757 3417 97184
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 764 4056 97184
gc: done.
   .
   .
   .

With the third number growing each time a "generation 1" collection is
done.

Maybe Neil can shed some light?  The gc.garbage list is empty.

This is about as much as I know about the GC stuff...
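For contrast, a cycle between two ordinary instances (the shape of the
rexec/hooks link) is exactly what the collector does handle. A minimal
sketch with stand-in classes, not the real rexec module:

```python
import gc

class RexecLike:
    pass

class HooksLike:
    pass

def make_cycle():
    # mimic set_rexec: each object ends up referencing the other
    r, h = RexecLike(), HooksLike()
    r.hooks = h
    h.rexec = r

gc.collect()          # start from a clean slate
make_cycle()          # the pair is now unreachable, but cyclic
freed = gc.collect()  # the cycle detector finds and frees it
```

Here `freed` comes back nonzero; the question in the thread is why the
rexec case behaves differently.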

--Guido van Rossum (home page: http://www.python.org/~guido/)



From petrilli at amber.org  Tue Dec 19 16:25:18 2000
From: petrilli at amber.org (Christopher Petrilli)
Date: Tue, 19 Dec 2000 10:25:18 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Dec 19, 2000 at 12:58:17PM +0100
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>
Message-ID: <20001219102518.A14288@trump.amber.org>

So I was thinking about this whole thing, and wondering why it was
that seeing things like:

     " ".join(aList)

bugged me to no end, while:

     aString.lower()

didn't seem to look wrong. I finally put my finger on it, and I
haven't seen anyone mention it, so I guess I'll do so.  To me, the
concept of "join" on a string is just not quite kosher, instead it
should be something like this:

     aList.join(" ")

or if you want it without the indirection:

     ['item', 'item', 'item'].join(" ")

Now *THAT* looks right to me.  The example of a join method on a
string just doesn't quite gel in my head, and I did some thinking and
digging, and well, when I pulled up my Smalltalk browser, things like
join are done on Collections, not on Strings.  You're joining the
collection, not the string.

Perhaps in a rush to move some things that were "string related" in
the string module into methods on the strings themselves (something I
whole-heartedly support), we moved a few too many things
there---things that semantically don't really belong as methods on a
string object.

How this gets resolved, I don't know... but I know a lot of people
have looked at the string methods---and they each keep coming back to
1 or 2 that bug them... and I think it's those that really aren't
methods of a string, but instead something that operates with strings, 
but expects other things.

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org



From guido at python.org  Tue Dec 19 16:37:15 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 10:37:15 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Tue, 19 Dec 2000 10:25:18 EST."
             <20001219102518.A14288@trump.amber.org> 
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>  
            <20001219102518.A14288@trump.amber.org> 
Message-ID: <200012191537.KAA28909@cj20424-a.reston1.va.home.com>

> So I was thinking about this whole thing, and wondering why it was
> that seeing things like:
> 
>      " ".join(aList)
> 
> bugged me to no end, while:
> 
>      aString.lower()
> 
> didn't seem to look wrong. I finally put my finger on it, and I
> haven't seen anyone mention it, so I guess I'll do so.  To me, the
> concept of "join" on a string is just not quite kosher, instead it
> should be something like this:
> 
>      aList.join(" ")
> 
> or if you want it without the indirection:
> 
>      ['item', 'item', 'item'].join(" ")
> 
> Now *THAT* looks right to me.  The example of a join method on a
> string just doesn't quite gel in my head, and I did some thinking and
> digging, and well, when I pulled up my Smalltalk browser, things like
> join are done on Collections, not on Strings.  You're joining the
> collection, not the string.
> 
> Perhaps in a rush to move some things that were "string related" in
> the string module into methods on the strings themselves (something I
> whole-heartedly support), we moved a few too many things
> there---things that semantically don't really belong as methods on a
> string object.
> 
> How this gets resolved, I don't know... but I know a lot of people
> have looked at the string methods---and they each keep coming back to
> 1 or 2 that bug them... and I think it's those that really aren't
> methods of a string, but instead something that operates with strings, 
> but expects other things.

Boy, are you stirring up a can of worms that we've been through many
times before!  Nothing you say hasn't been said at least a hundred
times before, on this list as well as on c.l.py.

The problem is that if you want to make this a method on lists, you'll
also have to make it a method on tuples, and on arrays, and on NumPy
arrays, and on any user-defined type that implements the sequence
protocol...  That's just not reasonable to expect.

There really seem to be only two possibilities that don't have this
problem: (1) make it a built-in, or (2) make it a method on strings.

We chose (2) for uniformity, and to avoid the potential confusion
with os.path.join(), which is sometimes imported as a local.  If
" ".join(L) bugs you, try this:

    space = " "	 # This could be a global
     .
     .
     .
    s = space.join(L)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Tue Dec 19 16:46:55 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 10:46:55 -0500
Subject: [Python-Dev] Death to string functions!
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>
	<20001219102518.A14288@trump.amber.org>
Message-ID: <14911.33519.764029.306876@anthem.concentric.net>

>>>>> "CP" == Christopher Petrilli <petrilli at amber.org> writes:

    CP> So I was thinking about this whole thing, and wondering why it
    CP> was that seeing things like:

    CP>      " ".join(aList)

    CP> bugged me to no end, while:

    CP>      aString.lower()

    CP> didn't seem to look wrong. I finally put my finger on it, and
    CP> I haven't seen anyone mention it, so I guess I'll do so.

Actually, it has been debated to death. ;)  This looks better:

    SPACE = ' '
    SPACE.join(aList)

That reads good to me ("space-join this list") and that's how I always
write it.  That said, there are certainly lots of people who agree
with you.  You can't put join() on sequences though, until you have
builtin base-classes, or interfaces, or protocols or some such
construct, because otherwise you'd have to add it to EVERY sequence,
including classes that act like sequences.

One idea that I believe has merit is to consider adding join() to the
builtins, probably with a signature like:

    join(aList, aString) -> aString

This horse has been whacked pretty good too, but I don't remember
seeing a patch or a pronouncement.
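Such a builtin would presumably be a thin wrapper over the method; a sketch
matching that signature:

```python
def join(words, sep=' '):
    # hypothetical builtin with string.join semantics:
    # join(aList, aString) -> aString, default separator a space
    return sep.join(words)
```

so `join(['spam', 'eggs'])` yields `'spam eggs'`, and the two-argument form
matches the signature above.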

-Barry



From nas at arctrix.com  Tue Dec 19 09:53:36 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 19 Dec 2000 00:53:36 -0800
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <200012191448.JAA28737@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Dec 19, 2000 at 09:48:47AM -0500
References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com>
Message-ID: <20001219005336.A303@glacier.fnational.com>

On Tue, Dec 19, 2000 at 09:48:47AM -0500, Guido van Rossum wrote:
> > import rexec
> > while 1:
> >       x = rexec.RExec()
> >       del x
> > 
> > leaks memory at a fantastic rate.
> > 
> > It seems clear (?) that this is due to the call to "set_rexec" at
> > rexec.py:140, which creates a circular reference between the `rexec'
> > and `hooks' objects.  (There's even a nice comment to that effect).

Line 140 is not the only place a circular reference is created.
There is another one which is trickier to find:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        m.__builtins__ = self.modules['__builtin__']
        return m

If the module being added is __builtin__ then m.__builtins__ = m.
The GC currently doesn't track modules.  I guess it should.  It
might be possible to avoid this circular reference but I don't
know enough about how RExec works.  Would something like:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        if mname != '__builtin__':
            m.__builtins__ = self.modules['__builtin__']
        return m
    
do the trick?

  Neil



From fredrik at effbot.org  Tue Dec 19 16:39:49 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 19 Dec 2000 16:39:49 +0100
Subject: [Python-Dev] Death to string functions!
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> <20001219102518.A14288@trump.amber.org>
Message-ID: <008301c069d3$76560a20$3c6340d5@hagrid>

"Christopher Petrilli" wrote:
> didn't seem to look wrong. I finally put my finger on it, and I
> haven't seen anyone mention it, so I guess I'll do so.  To me, the
> concept of "join" on a string is just not quite kosher, instead it
> should be something like this:
> 
>      aList.join(" ")
> 
> or if you want it without the indirection:
> 
>      ['item', 'item', 'item'].join(" ")
> 
> Now *THAT* looks right to me.

why do we keep coming back to this?

aString.join can do anything string.join can do, but aList.join
cannot.  if you don't understand why, check the archives.
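The asymmetry is easy to demonstrate in today's Python: the string method
accepts any sequence of strings, which a method on one sequence type never
could:

```python
sep = ", "
# str.join accepts any iterable of strings, not just lists:
print(sep.join(["a", "b"]))          # list -> "a, b"
print(sep.join(("a", "b")))          # tuple -> "a, b"
print(sep.join(c for c in "ab"))     # any iterable -> "a, b"
```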

</F>




From martin at loewis.home.cs.tu-berlin.de  Tue Dec 19 16:44:48 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 16:44:48 +0100
Subject: [Python-Dev] cycle-GC question
Message-ID: <200012191544.QAA11408@loewis.home.cs.tu-berlin.de>

> It seems clear (?) that this is due to the call to "set_rexec" at
> rexec.py:140, which creates a circular reference between the `rexec'
> and `hooks' objects.  (There's even a nice comment to that effect).

It's not all that clear that *this* is the cycle. In fact, it is not.

> I'm curious however as to why the spiffy new cyclic-garbage
> collector doesn't pick this up?

It's an interesting problem, so I spent this afternoon investigating
it. I soon found that I need a tool, so I introduced a new function
gc.getreferents which, when given an object, returns a list of objects
referring to that object. The patch for that feature is in

http://sourceforge.net/patch/?func=detailpatch&patch_id=102925&group_id=5470

Applying that function recursively, I can get an output that looks
like that:

<rexec.RExec instance at 0x81f5dcc>
 <method RExec.r_import of RExec instance at 0x81f5dcc>
  dictionary 0x81f4f24
 <method RExec.r_reload of RExec instance at 0x81f5dcc>
  dictionary 0x81f4f24 (seen)
 <method RExec.r_open of RExec instance at 0x81f5dcc>
  dictionary 0x81f4f24 (seen)
 <method RExec.r_exc_info of RExec instance at 0x81f5dcc>
  dictionary 0x8213bc4
 dictionary 0x820869c
  <rexec.RHooks instance at 0x8216cbc>
   dictionary 0x820866c
    <rexec.RExec instance at 0x81f5dcc> (seen)
   dictionary 0x8213bf4
    <ihooks.FancyModuleLoader instance at 0x81f7464>
     dictionary 0x820866c (seen)
     dictionary 0x8214144
      <ihooks.ModuleImporter instance at 0x8214124>
       dictionary 0x820866c (seen)

Each indentation level shows the objects which refer to the outer-next
object, e.g. the dictionary 0x820869c refers to the RExec instance,
and the RHooks instance refers to that dictionary. Clearly, the
dictionary 0x820869c is the RHooks' __dict__, and the reference
belongs to the 'rexec' key in that dictionary.

The recursion stops only when an object has been seen before (so it's a
cycle, or other non-tree graph), or if there are no referents (the
lists created to do the iteration are ignored).
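In current Pythons the same dump can be sketched with gc.get_referrers (the
stock successor of the gc.getreferents patch); the function name, the depth
cap, and the dict-only filter here are all illustrative:

```python
import gc

def referrer_tree(obj, depth=0, seen=None, max_depth=3):
    # recursively list the objects referring to obj, indenting one
    # level per hop and marking already-visited objects, as above
    if seen is None:
        seen = set()
    line = "%s%.60r" % ("  " * depth, obj)
    if id(obj) in seen:
        return [line + " (seen)"]
    seen.add(id(obj))
    lines = [line]
    if depth < max_depth:
        for ref in gc.get_referrers(obj):
            if isinstance(ref, dict):  # follow __dict__-style links only
                lines.extend(referrer_tree(ref, depth + 1, seen, max_depth))
    return lines
```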

So it appears that the r_import method is referenced from some
dictionary, but that dictionary is not referenced anywhere???

Checking the actual structures shows that rexec creates a __builtin__
module, which has a dictionary that has an __import__ key. So the
reference to the method comes from the __builtin__ module, which in
turn is referenced as the RExec's .modules attribute, giving another
cycle.

However, module objects don't participate in garbage
collection. Therefore, gc.getreferents cannot traverse a module, and
the garbage collector won't find a cycle involving a garbage module.
I just submitted a bug report,

http://sourceforge.net/bugs/?func=detailbug&bug_id=126345&group_id=5470

which suggests that modules should also participate in garbage
collection.

Regards,
Martin



From guido at python.org  Tue Dec 19 17:01:46 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 11:01:46 -0500
Subject: [Python-Dev] cycle-GC question
In-Reply-To: Your message of "Tue, 19 Dec 2000 00:53:36 PST."
             <20001219005336.A303@glacier.fnational.com> 
References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com>  
            <20001219005336.A303@glacier.fnational.com> 
Message-ID: <200012191601.LAA29015@cj20424-a.reston1.va.home.com>

> might be possible to avoid this circular reference but I don't
> know enough about how RExec works.  Would something like:
> 
>     def add_module(self, mname):
>         if self.modules.has_key(mname):
>             return self.modules[mname]
>         self.modules[mname] = m = self.hooks.new_module(mname)
>         if mname != '__builtin__':
>             m.__builtins__ = self.modules['__builtin__']
>         return m
>     
> do the trick?

That's certainly a good thing to do (__builtin__ has no business
having a __builtins__!), but (in my feeble experiment) it doesn't make
the leaks go away.

Note that almost every module participates heavily in cycles: whenever
you define a function f(), f.func_globals is the module's __dict__,
which also contains a reference to f.  Similar for classes, with an
extra hop via the class object and its __dict__.
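The function/module cycle is easy to see directly (`__globals__` is the
modern spelling of `func_globals`):

```python
def f():
    pass

g = f.__globals__   # the defining module's __dict__
# ... which in turn contains a reference to f, closing the cycle:
assert g['f'] is f
```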

--Guido van Rossum (home page: http://www.python.org/~guido/)



From cgw at fnal.gov  Tue Dec 19 17:06:06 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Tue, 19 Dec 2000 10:06:06 -0600 (CST)
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <20001219005336.A303@glacier.fnational.com>
References: <14911.239.12288.546710@buffalo.fnal.gov>
	<200012191448.JAA28737@cj20424-a.reston1.va.home.com>
	<20001219005336.A303@glacier.fnational.com>
Message-ID: <14911.34670.664178.418523@buffalo.fnal.gov>

Neil Schemenauer writes:
 > 
 > Line 140 is not the only place a circular reference is created.
 > There is another one which is trickier to find:
 > 
 >     def add_module(self, mname):
 >         if self.modules.has_key(mname):
 >             return self.modules[mname]
 >         self.modules[mname] = m = self.hooks.new_module(mname)
 >         m.__builtins__ = self.modules['__builtin__']
 >         return m
 > 
 > If the module being added is __builtin__ then m.__builtins__ = m.
 > The GC currently doesn't track modules.  I guess it should.  It
 > might be possible to avoid this circular reference but I don't
 > know enough about how RExec works.  Would something like:
 > 
 >     def add_module(self, mname):
 >         if self.modules.has_key(mname):
 >             return self.modules[mname]
 >         self.modules[mname] = m = self.hooks.new_module(mname)
 >         if mname != '__builtin__':
 >             m.__builtins__ = self.modules['__builtin__']
 >         return m
 >     
 > do the trick?

No... if you change "add_module" in exactly the way you suggest
(without worrying about whether it breaks the functionality of rexec!)
and run the test

while 1:
      rexec.RExec()

you will find that it still leaks memory at a prodigious rate.

So, (unless there is yet another module-level cyclic reference) I
don't think this theory explains the problem.



From martin at loewis.home.cs.tu-berlin.de  Tue Dec 19 17:07:04 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 17:07:04 +0100
Subject: [Python-Dev] cycle-GC question
Message-ID: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>

> There is another one which is trickier to find:
[__builtin__.__builtins__ == __builtin__]

> Would something like:
[do not add __builtins__ to __builtin__]
> work?

No, because there is another one that is even trickier to find :-)

>>> print r
<rexec.RExec instance at 0x81f7cac>
>>> print r.modules['__builtin__'].open.im_self
<rexec.RExec instance at 0x81f7cac>

Please see my other message; I think modules should be gc'ed.

Regards,
Martin



From nas at arctrix.com  Tue Dec 19 10:24:29 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 19 Dec 2000 01:24:29 -0800
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Dec 19, 2000 at 05:07:04PM +0100
References: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>
Message-ID: <20001219012429.A520@glacier.fnational.com>

On Tue, Dec 19, 2000 at 05:07:04PM +0100, Martin v. Loewis wrote:
> I think modules should be gc'ed.

I agree.  Its easy to do.  If no one does over Christmas I will
do it before 2.1 is released.

  Neil



From tismer at tismer.com  Tue Dec 19 16:48:58 2000
From: tismer at tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 17:48:58 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCMEIAIEAA.tim.one@home.com>
Message-ID: <3A3F836A.DEDF1011@tismer.com>


Tim Peters wrote:
> 
> Something else to ponder:  my tests show that the current ("old") algorithm
> performs much better (somewhat worse than "new2" == new algorithm + warmup)
> if incr is simply initialized like so instead:
> 
>         if mp.oldalg:
>             incr = (_hash & 0xffffffffL) % (mp.ma_size - 1)

Sure. I did this as well, but didn't consider a division
since it is said to be too slow. But this is very platform
dependent. On Pentiums it might not be noticeable.

> That's another way to get all the bits to contribute to the result.  Note
> that a mod by size-1 is analogous to "casting out nines" in decimal:  it's
> the same as breaking hash into fixed-sized pieces from the right (10 bits
> each if size=2**10, etc), adding the pieces together, and repeating that
> process until only one piece remains.  IOW, it's a degenerate form of
> division, but works well all the same.  It didn't improve over that when I
> tried a mod by the largest prime less than the table size (which suggests
> we're sucking all we can out of the *probe* sequence given a sometimes-poor
> starting index).

Again I tried this too. Instead of the largest prime below the
table size I used the nearest prime. Remarkably, the nearest prime
is identical to the primitive element in a lot of cases.
But no improvement over the modulus.
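The casting-out-nines equivalence Tim describes above is easy to verify:
repeatedly adding the k-bit pieces of a value preserves it mod 2**k - 1
(a sketch, names illustrative):

```python
def fold_mod(x, k):
    # add the k-bit pieces of x together until one piece remains;
    # since 2**k == 1 (mod 2**k - 1), every step preserves
    # x mod 2**k - 1, a degenerate form of division as Tim says
    mask = (1 << k) - 1
    while x > mask:
        x = (x >> k) + (x & mask)
    return x
```

fold_mod(h, 10) agrees with h % 1023, except that a positive multiple of
1023 folds to 1023 rather than 0.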

> 
> However, it's subject to the same weak clustering phenomenon as the old
> method due to the ill-advised "~hash" operation in computing the initial
> index.  If ~ is also thrown away, it's as good as new2 (here I've tossed out
> the "windows names", and "old" == existing algorithm except (a) get rid of ~
> when computing index and (b) do mod by size-1 when computing incr):
...
> The new and new2 values differ in minor ways from the ones you posted
> because I got rid of the ~ (the ~ has a bad interaction with "additive"
> means of computing incr, because the ~ tends to move the index in the
> opposite direction, and these moves in opposite directions tend to cancel
> out when computing incr+index the first time).

Remarkable.

> too-bad-mod-is-expensive!-ly y'rs  - tim

Yes. The wheel is cheapest yet.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From just at letterror.com  Tue Dec 19 18:11:55 2000
From: just at letterror.com (Just van Rossum)
Date: Tue, 19 Dec 2000 18:11:55 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID: <l03102804b6653dd31c4e@[193.78.237.123]>

Barry wrote:
>Actually, it has been debated to death. ;)  This looks better:
>
>    SPACE = ' '
>    SPACE.join(aList)
>
>That reads good to me ("space-join this list") and that's how I always
>write it.

I just did a quick scan through the 1.5.2 library, and _most_
occurrences of string.join() are used with a string constant
for the second argument. There is a whole bunch of one-arg
string.join()'s, too. Recommending replacing all of these (not
to mention all the code "out there") with named constants seems
plain silly.

Sure, " ".join() is the most "logical" choice for Python as it
stands, but it's definitely not the most intuitive, as evidenced
by the number of times this comes up on c.l.py: to many people
it simply "looks wrong". Maybe this is the deal: joiner.join()
makes a whole lot of sense from an _implementation_ standpoint,
but a whole lot less as a public interface.

It's easy to explain why join() can't be a method of sequences
(in Python), but that alone doesn't justify a string method.
string.join() is not unlike map() and friends:
map() wouldn't be so bad as a sequence method, but that isn't
practical for exactly the same reasons: so it's a builtin.
(And not a function method...)

So, making join() a builtin makes a whole lot of sense. Not
doing this because people sometimes use a local reference to
os.path.join seems awfully backward. Hm, maybe joiner.join()
could become a special method: joiner.__join__(), that way
other objects could define their own implementation for
join(). (Hm, wouldn't be the worst thing for, say, a file
path object...)
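Nothing like this exists, but the dispatch being sketched might look like
the following (the `__join__` hook and all names are hypothetical):

```python
def join(seq, joiner=' '):
    # proposed builtin: let the joiner override joining via a
    # hypothetical __join__ hook, else fall back to str.join
    special = getattr(joiner, '__join__', None)
    if special is not None:
        return special(seq)
    return joiner.join(seq)

class PathJoiner:
    # e.g. a file-path object supplying its own joining rule
    def __join__(self, parts):
        return '/'.join(parts)
```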

Just





From barry at digicool.com  Tue Dec 19 18:20:07 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 12:20:07 -0500
Subject: [Python-Dev] Death to string functions!
References: <l03102804b6653dd31c4e@[193.78.237.123]>
Message-ID: <14911.39111.710940.342986@anthem.concentric.net>

>>>>> "JvR" == Just van Rossum <just at letterror.com> writes:

    JvR> Recommending replacing all of these (not to mention all the
    JvR> code "out there") with named constants seems plain silly.

Until there's a tool to do the migration, I don't (personally)
recommend wholesale migration.  For new code I write though, I usually
do it the way I described (which is intuitive to me, but then so is
moving your fingers at a blinding speed up and down 5 long strips of
metal to cause low bowel-tickling rumbly noises).

    JvR> So, making join() a builtin makes a whole lot of sense. Not
    JvR> doing this because people sometimes use a local reference to
    JvR> os.path.join seems awfully backward.

I agree.  Have we agreed on the semantics and signature of builtin
join() though?  Is it just string.join() stuck in builtins?

-Barry



From fredrik at effbot.org  Tue Dec 19 18:25:49 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 19 Dec 2000 18:25:49 +0100
Subject: [Python-Dev] Death to string functions!
References: <l03102804b6653dd31c4e@[193.78.237.123]> <14911.39111.710940.342986@anthem.concentric.net>
Message-ID: <012901c069e0$bd724fb0$3c6340d5@hagrid>

Barry wrote:
>     JvR> So, making join() a builtin makes a whole lot of sense. Not
>     JvR> doing this because people sometimes use a local reference to
>     JvR> os.path.join seems awfully backward.
> 
> I agree.  Have we agreed on the semantics and signature of builtin
> join() though?  Is it just string.join() stuck in builtins?

+1

(let's leave the __join__ slot and other super-generalized
variants for 2.2)

</F>




From thomas at xs4all.net  Tue Dec 19 18:54:34 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 19 Dec 2000 18:54:34 +0100
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEIDIEAA.tim.one@home.com>; from tim.one@home.com on Tue, Dec 19, 2000 at 12:50:01AM -0500
References: <20001217220008.D29681@xs4all.nl> <LNBBLJKPBEHFEDALKOLCIEIDIEAA.tim.one@home.com>
Message-ID: <20001219185434.E29681@xs4all.nl>

On Tue, Dec 19, 2000 at 12:50:01AM -0500, Tim Peters wrote:

> [Thomas Wouters]
> > What sourceforge did was switch Linux distributions, and upgrade.
> > ... [and quite a bit more] ...

> I hope you're feeling better today <wink>.  "The problem" was one the
> warning msg spelled out:  "It is also possible that the host key has just been
> changed.".  SF changed keys.  That's the whole banana right there.  Deleting
> the sourceforge keys from known_hosts fixed it (== convinced ssh to install
> new SF keys the next time I connected).

Well, if you'd read the thread <wink>, you'll notice that other people had
problems even after that. I'm glad you're not one of them, though :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From barry at digicool.com  Tue Dec 19 19:22:19 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 13:22:19 -0500
Subject: [Python-Dev] Error: syncmail script missing
References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCIEIFIEAA.tim.one@home.com>
Message-ID: <14911.42843.284822.935268@anthem.concentric.net>

Folks,

Python wasn't installed on the new SF CVS machine, which was why
syncmail was broken.  My thanks to the SF guys for quickly remedying
this situation!

Please give it a test.
-Barry



From barry at digicool.com  Tue Dec 19 19:23:32 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 13:23:32 -0500
Subject: [Python-Dev] Error: syncmail script missing
References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCIEIFIEAA.tim.one@home.com>
	<14911.42843.284822.935268@anthem.concentric.net>
Message-ID: <14911.42916.573600.922606@anthem.concentric.net>

>>>>> "BAW" == Barry A Warsaw <barry at digicool.com> writes:

    BAW> Python wasn't installed on the new SF CVS machine, which was
    BAW> why syncmail was broken.  My thanks to the SF guys for
    BAW> quickly remedying this situation!

BTW, it's currently Python 1.5.2.



From tismer at tismer.com  Tue Dec 19 18:34:14 2000
From: tismer at tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 19:34:14 +0200
Subject: [Python-Dev] Re: The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCGEHJIEAA.tim.one@home.com>
Message-ID: <3A3F9C16.562F9D9F@tismer.com>

Again...

Tim Peters wrote:
> 
> Sounds good to me!  It's a very cheap way to get the high bits into play.
...
> [Christian]
> > - The bits used from the string hash are not well distributed
> > - using a "warmup wheel" on the hash to suck all bits in
> >   gives the same quality of hashes like random numbers.
> 
> See above and be very cautious:  none of Python's hash functions produce
> well-distributed bits, and-- in effect --that's why Python dicts often
> perform "better than random" on common data.  Even what you've done so far
> appears to provide marginally worse statistics for Guido's favorite kind of
> test case ("worse" in two senses:  total number of collisions (a measure of
> amortized lookup cost), and maximum collision chain length (a measure of
> worst-case lookup cost)):
> 
>    d = {}
>    for i in range(N):
>        d[repr(i)] = i

I will look into this.

> check-in-one-thing-then-let-it-simmer-ly y'rs  - tim

Are you saying I should check the thing in? Really?


In another reply to this message I was saying
"""
This is why I think to be even more conservative:
Try to use a division wheel, but with the inverses
of the original primitive roots, just in order to
get at Guido's results :-)
"""

This was a religious desire, but such an inverse cannot exist.
Well, all inverses exist, but it is an error to think
that they can produce similar bit patterns. Changing the
root means changing the whole system, since we have just
a *representation* of a group, via polynomial coefficients.

A simple example which renders my thought useless is this:
There is no general pattern that can turn a physical right
shift into a left shift, for all bit combinations.

Anyway, how can I produce a nearly complete scheme like today
with the same "cheaper than random" properties?

Ok, we have to stick with the given polymomials to stay
compatible, and we also have to shift left.
How do we then rotate the random bits in?
Well, we can in fact do a rotation of the whole index, moving
the highest bit into the lowest.
Too bad that this isn't supported in C. It is a native
machine instruction on X86 machines.

We would then have:

                incr = ROTATE_LEFT(incr, 1)
                if incr > mask:
                    incr = incr ^ mp.ma_poly

The effect is similar to the "old" algorithm: bits are shifted
left, and only if the hash happens to have high bits do they appear
in the modulus.
On the current "faster than random" cases, I assume that
high bits in the hash are less likely than low bits, so it is
more likely that an entry finds its good place in the dict
before bits are rotated in.  Hence the "good" cases would be kept.
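In portable code the rotation reads as follows (a sketch for a 32-bit word;
most C compilers recognize the idiom and emit the single x86 rotate
instruction):

```python
def rotate_left(x, width=32):
    # move the highest bit into the lowest position,
    # shifting everything else one bit left
    mask = (1 << width) - 1
    return ((x << 1) | (x >> (width - 1))) & mask
```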

I did all tests again, now including maximum trip length, and
added a "rotate-left" version as well:

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293/9 new=302/7 new2=221/7 rot=278/5
trips for bad integers old=499500/999 new=13187/31 new2=999/1 rot=16754/31
trips for random integers old=360/8 new=369/8 new2=358/6 rot=356/7
trips for windows names old=230/5 new=207/7 new2=200/5 rot=225/5
N=2000
trips for strings old=1093/11 new=1109/10 new2=786/6 rot=1082/8
trips for bad integers old=0/0 new=26455/32 new2=1999/1 rot=33524/34
trips for random integers old=704/7 new=686/8 new2=685/7 rot=693/7
trips for windows names old=503/8 new=542/9 new2=564/6 rot=529/7
N=3000
trips for strings old=810/5 new=839/6 new2=609/5 rot=796/5
trips for bad integers old=0/0 new=38681/36 new2=2999/1 rot=49828/38
trips for random integers old=708/5 new=723/7 new2=724/5 rot=722/6
trips for windows names old=712/6 new=711/5 new2=691/5 rot=738/9
N=4000
trips for strings old=1850/9 new=1843/8 new2=1375/11 rot=1848/10
trips for bad integers old=0/0 new=52994/39 new2=3999/1 rot=66356/38
trips for random integers old=1395/9 new=1397/8 new2=1435/9 rot=1394/13
trips for windows names old=1449/8 new=1434/8 new2=1457/11 rot=1513/9

D:\crml_doc\platf\py>

Concerning trip length, rotate is better than old in most cases.
Random integers seem to withstand any of these procedures.
For bad integers, rot naturally takes more trips than new, since
the path to the bits is longer.

All in all I don't see more than marginal differences between
the approaches, and I tend to stick with "new", since it is
the cheapest to implement.
(It does not cost anything and might even be a little cheaper
with some compilers, since it does not reference the mask variable.)
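(Editor's note: the cycling property the irreducible polynomials
guarantee can be sanity-checked for a small field. The following
sketch is mine, not part of the original attachment; it applies the
"new" step, conditional xor with the polynomial followed by a right
shift, for n=3 with the polynomial 8 + 3.)

```python
# The "new" rehash step is division by x in GF(2^3) modulo an
# irreducible polynomial; starting from any non-zero value it must
# visit all 7 non-zero 3-bit values before repeating.
poly = 8 + 3          # x^3 + x + 1, irreducible over GF(2)
incr = 1
seen = set()
while incr not in seen:
    seen.add(incr)
    if incr & 1:
        incr ^= poly
    incr >>= 1

assert seen == set(range(1, 8))  # the full multiplicative group
```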

I'd say let's do the patch   --   ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]
polys = map(long, polys)

class NULL: pass

class Dictionary:
    dummy = "<dummy key>"
    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg
        mp.warmup = newalg==2
        mp.rotleft = newalg==3
        mp.trips = 0
        mp.tripmax = 0

    def getTrips(self):
        trips, tripmax = self.trips, self.tripmax
        self.trips = self.tripmax = 0
        return trips, tripmax

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # the shifting is worth it in the incremental case.

            ## added after posting to python-dev:
            uhash = _hash & 0xffffffffl
            if mp.warmup:
                incr = uhash
                mask2 = 0xffffffffl ^ mask
                while mask2 > mask:
                    if (incr & 1):
                        incr = incr ^ mp.ma_poly
                    incr = incr >> 1
                    mask2 = mask2>>1
                # this loop *can* be sped up by tables
                # with precomputed multiple shifts.
                # But I'm not sure if it is worth it at all.
            else:
                incr = uhash ^ (uhash >> 3)

###### TO HERE
            
        if (not incr):
            incr = mask

        triplen = 0            
        while 1:
            mp.trips = mp.trips+1
            triplen = triplen+1
            if triplen > mp.tripmax:
                mp.tripmax = triplen
            
            ep = ep0[int((i+incr)&mask)]
            if (ep[me_key] is NULL) :
                if (freeslot is not NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                 (ep[me_hash] == _hash and
                  cmp(ep[me_key], key) == 0)) :
                return ep

            # Cycle through GF(2^n)-{0}
###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            elif mp.rotleft:
                if incr &0x80000000L:
                    incr = (incr << 1) | 1
                else:
                    incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots

        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1

        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL

        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))      

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots

        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
##      /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = str(i) #* 5
        #s = chr(i%256) + chr(i>>8)##
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def istring_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = chr(i%256) + chr(i>>8)
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def random_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    from whrandom import randint
    import sys
    keys = []
    for i in range(n):
        keys.append(randint(0, sys.maxint-1))
    for i in keys:
        d1[i] = d2[i] = d3[i] = d4[i] = i
    return d1, d2, d3, d4

def badnum_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(n):
        bad = i << shift  # use the configured shift (10, or 16 if EXTREME)
        d2[bad] = d3[bad] = d4[bad] = i
        if n <= 1000: d1[bad] = i
    return d1, d2, d3, d4

def names_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    import win32con
    keys = win32con.__dict__.keys()
    if len(keys) < n:
        keys = []
    for s in keys[:n]:
        d1[s] = d2[s] = d3[s] = d4[s] = s
    return d1, d2, d3, d4

def do_test(dict):
    keys = dict.keys()
    dict.getTrips() # reset
    test(keys, dict)
    return "%d/%d" % dict.getTrips()

EXTREME=1

if __name__ == "__main__":

    for N in (1000,2000,3000,4000):    

        sdold, sdnew, sdnew2, sdrot = string_dicts(N)
        #idold, idnew, idnew2, idrot = istring_dicts(N)
        bdold, bdnew, bdnew2, bdrot = badnum_dicts(N)
        rdold, rdnew, rdnew2, rdrot = random_dicts(N)
        ndold, ndnew, ndnew2, ndrot = names_dicts(N)
        fmt = "old=%s new=%s new2=%s rot=%s"
        print "N=%d" %N        
        print ("trips for strings "+fmt) % tuple(
            map(do_test, (sdold, sdnew, sdnew2, sdrot)) )
        #print ("trips for bin strings "+fmt) % tuple(
        #    map(do_test, (idold, idnew, idnew2, idrot)) )
        print ("trips for bad integers "+fmt) % tuple(
            map(do_test, (bdold, bdnew, bdnew2, bdrot)))
        print ("trips for random integers "+fmt) % tuple(
            map(do_test, (rdold, rdnew, rdnew2, rdrot)))
        print ("trips for windows names "+fmt) % tuple(
            map(do_test, (ndold, ndnew, ndnew2, ndrot)))

"""
Results with a shift of 10 (EXTREME=0):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""

From just at letterror.com  Tue Dec 19 19:46:18 2000
From: just at letterror.com (Just van Rossum)
Date: Tue, 19 Dec 2000 19:46:18 +0100
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <14911.39111.710940.342986@anthem.concentric.net>
References: <l03102804b6653dd31c4e@[193.78.237.123]>
Message-ID: <l03102806b6655cbd62fa@[193.78.237.123]>

At 12:20 PM -0500 19-12-2000, Barry A. Warsaw wrote:
>I agree.  Have we agreed on the semantics and signature of builtin
>join() though?  Is it just string.join() stuck in builtins?

Yep. I'm with /F that further generalization can be done later. Oh, does
this mean that "".join() becomes deprecated? (Nice test case for the
warning framework...)

Just





From barry at digicool.com  Tue Dec 19 19:56:45 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 13:56:45 -0500
Subject: [Python-Dev] Death to string functions!
References: <l03102804b6653dd31c4e@[193.78.237.123]>
	<l03102806b6655cbd62fa@[193.78.237.123]>
Message-ID: <14911.44909.414520.788073@anthem.concentric.net>

>>>>> "JvR" == Just van Rossum <just at letterror.com> writes:

    JvR> Oh, does this mean that "".join() becomes deprecated?

Please, no.




From guido at python.org  Tue Dec 19 19:56:39 2000
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 13:56:39 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Tue, 19 Dec 2000 13:56:45 EST."
             <14911.44909.414520.788073@anthem.concentric.net> 
References: <l03102804b6653dd31c4e@[193.78.237.123]> <l03102806b6655cbd62fa@[193.78.237.123]>  
            <14911.44909.414520.788073@anthem.concentric.net> 
Message-ID: <200012191856.NAA30524@cj20424-a.reston1.va.home.com>

> >>>>> "JvR" == Just van Rossum <just at letterror.com> writes:
> 
>     JvR> Oh, does this mean that "".join() becomes deprecated?
> 
> Please, no.

No.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From just at letterror.com  Tue Dec 19 20:15:19 2000
From: just at letterror.com (Just van Rossum)
Date: Tue, 19 Dec 2000 20:15:19 +0100
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <14911.44909.414520.788073@anthem.concentric.net>
References: <l03102804b6653dd31c4e@[193.78.237.123]>
 <l03102806b6655cbd62fa@[193.78.237.123]>
Message-ID: <l03102808b665629fc5bf@[193.78.237.123]>

At 1:56 PM -0500 19-12-2000, Barry A. Warsaw wrote:
>>>>>> "JvR" == Just van Rossum <just at letterror.com> writes:
>
>    JvR> Oh, does this mean that "".join() becomes deprecated?
>
>Please, no.

And keep two non-deprecated ways to do the same thing? I'm not saying it
should be removed, just that the powers that be declare that _one_ of them
is the preferred way.

And-if-that-one-isn't-builtin-join()-I-don't-know-why-to-even-bother y'rs
-- Just





From greg at cosc.canterbury.ac.nz  Tue Dec 19 23:35:05 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 20 Dec 2000 11:35:05 +1300 (NZDT)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012191537.KAA28909@cj20424-a.reston1.va.home.com>
Message-ID: <200012192235.LAA02763@s454.cosc.canterbury.ac.nz>

Guido:

> Boy, are you stirring up a can of worms that we've been through many
> times before!  Nothing you say hasn't been said at least a hundred
> times before, on this list as well as on c.l.py.

And I'll wager you'll continue to hear them said at regular intervals
for a long time to come, because you've done something which a lot of
people feel very strongly was a mistake, and they have some very
rational arguments as to why it was a mistake, whereas you don't seem
to have any arguments to the contrary which those people are likely to
find convincing.

> There really seem to be only two possibilities that don't have this
> problem: (1) make it a built-in, or (2) make it a method on strings.

False dichotomy. Some other possibilities:

(3) Use an operator.

(4) Leave it in the string module! Really, I don't see what
would be so bad about that. You still need somewhere to put
all the string-related constants, so why not keep the string
module for those, plus the few functions that don't have
any other obvious place?

> If " ".join(L) bugs you, try this:
>
>    space = " "	 # This could be a global
>     .
>     .
>     .
>    s = space.join(L)

Surely you must realise that this completely fails to
address Mr. Petrilli's concern?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From akuchlin at mems-exchange.org  Wed Dec 20 15:40:58 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 20 Dec 2000 09:40:58 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <E148ZW1-0000xd-00@usw-sf-web3.sourceforge.net>; from noreply@sourceforge.net on Tue, Dec 19, 2000 at 07:02:05PM -0800
References: <E148ZW1-0000xd-00@usw-sf-web3.sourceforge.net>
Message-ID: <20001220094058.A17623@kronos.cnri.reston.va.us>

On Tue, Dec 19, 2000 at 07:02:05PM -0800, noreply at sourceforge.net wrote:
>Date: 2000-Dec-19 19:02
>By: tim_one

>Unrelated to your patch but in the same area: the other msg, "ord()
>expected string or Unicode character", doesn't read right.  The type
>names in question are "string" and "unicode":
>
>>>> type("")
><type 'string'>
>>>> type(u"")
><type 'unicode'>
>>>>
>
>"character" is out of place, or not in enough places.  Just thought I'd mention that, since *you're* so cute!

Is it OK to refer to 8-bit strings under that name?  
How about "expected an 8-bit string or Unicode string", when the object passed to ord() isn't of the right type.

Similarly, when the value is of the right type but has length > 1,
the message is "ord() expected a character, length-%d string found".
Should that be "length-%d (string / unicode) found"?

And should the type names be changed to '8-bit string'/'Unicode
string', maybe?
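(Editor's note: a quick check of the behaviour under discussion, run
on a current interpreter; the exact message wording varies between
Python versions, so only the shape of the behaviour is asserted.)

```python
# ord() accepts any length-1 string and raises TypeError for other
# lengths; the TypeError message is what the thread is debating.
assert ord("A") == 65

try:
    ord("AB")          # length-2 string
except TypeError as e:
    err = str(e)       # message mentions the length actually found
else:
    raise AssertionError("expected TypeError")
```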

--amk



From barry at digicool.com  Wed Dec 20 16:39:30 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Wed, 20 Dec 2000 10:39:30 -0500
Subject: [Python-Dev] IGNORE - this is only a test
Message-ID: <14912.53938.280864.596141@anthem.concentric.net>

Testing the new MX for python.org...



From fdrake at acm.org  Wed Dec 20 17:57:09 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 20 Dec 2000 11:57:09 -0500 (EST)
Subject: [Python-Dev] scp with SourceForge
Message-ID: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>

  I've not been able to get scp to work with SourceForge since they
upgraded their machines.  ssh works fine.  Is this related to the
protocol mismatch problem that was discussed earlier?  My ssh tells me
"SSH Version OpenSSH-1.2.2, protocol version 1.5.", and the remote
sshd is sending its version as "Remote protocol version 1.99, remote
software version OpenSSH_2.2.0p1".
  Was there a reasonable way to deal with this?  I'm running
Linux-Mandrake 7.1 with very little customization or extra stuff.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tismer at tismer.com  Wed Dec 20 17:31:00 2000
From: tismer at tismer.com (Christian Tismer)
Date: Wed, 20 Dec 2000 18:31:00 +0200
Subject: [Python-Dev] Re: The Dictionary Gem is polished!
References: <LNBBLJKPBEHFEDALKOLCGEHJIEAA.tim.one@home.com> <3A3F9C16.562F9D9F@tismer.com>
Message-ID: <3A40DEC4.5F659E8E@tismer.com>


Christian Tismer wrote:
...
When talking about left rotation, an error crept in. Sorry!

> We would then have:
> 
>                 incr = ROTATE_LEFT(incr, 1)
>                 if (incr > mask):
>                     incr = incr ^ mp.ma_poly

If incr contains the high bits of the hash, then the
above must be replaced by

                incr = ROTATE_LEFT(incr, 1)
                if (incr & (mask+1)):
                    incr = incr ^ mp.ma_poly

or the multiplicative group is not guaranteed to be
generated, obviously.
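(Editor's note: a tiny illustration, with hypothetical numbers of my
own choosing, of why the corrected test matters: after a rotation,
incr can exceed mask without the carry bit mask+1 being set, so the
two tests are not equivalent.)

```python
# mask covers the valid index bits; mask+1 is the single "carry" bit
# just above them.  A rotated-in high hash bit can make incr > mask
# while leaving that carry bit clear.
mask = 0x7    # table size 8, valid index bits 0..2
incr = 0x10   # a high hash bit, above the carry bit

assert incr > mask             # the original test would fire here...
assert not (incr & (mask + 1)) # ...but bit 3 (the carry bit) is clear
```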

This doesn't change my results, rotating right is still my choice.

ciao - chris

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293/9 new=302/7 new2=221/7 rot=272/8
trips for bad integers old=499500/999 new=13187/31 new2=999/1 rot=16982/27
trips for random integers old=339/9 new=337/7 new2=343/10 rot=342/8
trips for windows names old=230/5 new=207/7 new2=200/5 rot=225/6
N=2000
trips for strings old=1093/11 new=1109/10 new2=786/6 rot=1090/9
trips for bad integers old=0/0 new=26455/32 new2=1999/1 rot=33985/31
trips for random integers old=747/10 new=733/7 new2=734/7 rot=728/8
trips for windows names old=503/8 new=542/9 new2=564/6 rot=521/11
N=3000
trips for strings old=810/5 new=839/6 new2=609/5 rot=820/6
trips for bad integers old=0/0 new=38681/36 new2=2999/1 rot=50985/26
trips for random integers old=709/4 new=728/5 new2=767/5 rot=711/6
trips for windows names old=712/6 new=711/5 new2=691/5 rot=727/7
N=4000
trips for strings old=1850/9 new=1843/8 new2=1375/11 rot=1861/9
trips for bad integers old=0/0 new=52994/39 new2=3999/1 rot=67986/26
trips for random integers old=1584/9 new=1606/8 new2=1505/9 rot=1579/8
trips for windows names old=1449/8 new=1434/8 new2=1457/11 rot=1476/7
-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From tim.one at home.com  Wed Dec 20 20:52:40 2000
From: tim.one at home.com (Tim Peters)
Date: Wed, 20 Dec 2000 14:52:40 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>

[Fred L. Drake, Jr.]
>   I've not been able to get scp to work with SourceForge since they
> upgraded their machines.  ssh works fine.

Same here.  In particular, I can use ssh to log in to shell.sourceforge.net,
but attempts to scp there act like this (breaking long lines by hand with
\n\t):

> scp -v pep-0042.html
	tim_one at shell.sourceforge.net:/home/groups/python/htdocs/peps
Executing: host shell.sourceforge.net, user tim_one,
	command scp -v -t /home/groups/python/htdocs/peps
SSH Version 1.2.14 [winnt-4.0-x86], protocol version 1.4.
Standard version.  Does not use RSAREF.
ssh_connect: getuid 0 geteuid 0 anon 0
Connecting to shell.sourceforge.net [216.136.171.201] port 22.
Connection established.
Remote protocol version 1.99, remote software version OpenSSH_2.2.0p1
Waiting for server public key.
Received server public key (768 bits) and host key (1024 bits).
Host 'shell.sourceforge.net' is known and matches the host key.
Initializing random; seed file C:\Code/.ssh/random_seed
IDEA not supported, using 3des instead.
Encryption type: 3des
Sent encrypted session key.
Received encrypted confirmation.
Trying RSA authentication with key 'sourceforge'
Server refused our key.
Doing password authentication.
Password:  **** here tim enteredth his password ****
Sending command: scp -v -t /home/groups/python/htdocs/peps
Entering interactive session.

And there it sits forever.  Several others report the same symptom on SF
forums, and assorted unresolved SF Support and Bug reports.  We don't know
what your symptom is!

> Is this related to the protocol mismatch problem that was discussed
> earlier?

Doubt it.  Most commentators pin the blame elsewhere.

> ...
>   Was there a reasonable way to deal with this?

A new note was added to

http://sourceforge.net/support/?func=detailsupport&support_id=110235&group_i
d=1

today, including:

"""
Re: Shell server

We're also aware of the number of problems on the shell server with respect
to restrictive permissions on some programs - and sourcing of shell
environments.  We're also aware of the troubles with scp and transferring
files.  As a work around, we recommend either editing files on the shell
server, or scping files to the shell server from external hosts to the shell
server, whilst logged in to the shell server.
"""

So there you go:  scp files to the shell server from external hosts to the
shell server whilst logged in to the shell server <wink>.

Is scp working for *anyone*???




From fdrake at acm.org  Wed Dec 20 21:17:58 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 20 Dec 2000 15:17:58 -0500 (EST)
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>
References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>
Message-ID: <14913.5110.271684.107030@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > And there it sits forever.  Several others report the same symptom on SF
 > forums, and assorted unresolved SF Support and Bug reports.  We don't know
 > what your symptom is!

  Exactly the same.

 > So there you go:  scp files to the shell server from external hosts to the
 > shell server whilst logged in to the shell server <wink>.

  Yeah, that really helps.... NOT!  All I want to be able to do is
post a new development version of the documentation.  ;-(


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From bckfnn at worldonline.dk  Wed Dec 20 21:23:33 2000
From: bckfnn at worldonline.dk (Finn Bock)
Date: Wed, 20 Dec 2000 20:23:33 GMT
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
Message-ID: <3a411449.5247545@smtp.worldonline.dk>

[Fred L. Drake]

>  I've not been able to get scp to work with SourceForge since they
>upgraded their machines.  ssh works fine.  Is this related to the
>protocol mismatch problem that was discussed earlier?  My ssh tells me
>"SSH Version OpenSSH-1.2.2, protocol version 1.5.", and the remote
>sshd is sending its version as "Remote protocol version 1.99, remote
>software version OpenSSH_2.2.0p1".
>  Was there a reasonable way to deal with this?  I'm running
>Linux-Mandrake 7.1 with very little customization or extra stuff.

I managed to update the jython website by logging into the shell machine
by ssh and doing a ftp back to my machine (using the IP number). That
isn't exactly reasonable, but I was desperate.

regards,
finn



From tim.one at home.com  Wed Dec 20 21:42:11 2000
From: tim.one at home.com (Tim Peters)
Date: Wed, 20 Dec 2000 15:42:11 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14913.5110.271684.107030@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENBIEAA.tim.one@home.com>

[Tim]
> So there you go:  scp files to the shell server from external
> hosts to the shell server whilst logged in to the shell server <wink>.

[Fred]
>   Yeah, that really helps.... NOT!  All I want to be able to do is
> post a new development version of the documentation.  ;-(

All I want to do is make a measly change to a PEP -- I'm afraid it doesn't
ask how trivial your intents are.  If some suck^H^H^H^Hdeveloper admits that
scp works for them, maybe we can mail them stuff and have *them* copy it
over.

no-takers-so-far-though-ly y'rs  - tim




From barry at digicool.com  Wed Dec 20 21:49:00 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Wed, 20 Dec 2000 15:49:00 -0500
Subject: [Python-Dev] scp with SourceForge
References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>
Message-ID: <14913.6972.934625.840781@anthem.concentric.net>

>>>>> "TP" == Tim Peters <tim.one at home.com> writes:

    TP> So there you go: scp files to the shell server from external
    TP> hosts to the shell server whilst logged in to the shell server
    TP> <wink>.

Psheesh, /that/ was obvious.  Did you even have to ask?

    TP> Is scp working for *anyone*???

Nope, same thing happens to me; it just hangs.
-Barry



From tim.one at home.com  Wed Dec 20 21:53:38 2000
From: tim.one at home.com (Tim Peters)
Date: Wed, 20 Dec 2000 15:53:38 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14913.6972.934625.840781@anthem.concentric.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKENEIEAA.tim.one@home.com>

[Tim, quoting a bit of immortal SF support prose]
>     TP> So there you go: scp files to the shell server from external
>     TP> hosts to the shell server whilst logged in to the shell server
>     TP> <wink>.

[Barry]
> Psheesh, /that/ was obvious.  Did you even have to ask?

Actually, isn't this easy to do on Linux?  That is, run an ssh server
(whatever) on your home machine, log in to the SF shell (which everyone
seems able to do), then

   scp  whatever  your_home_IP_address:your_home_path

from the SF shell?  Heck, I can even get that to work on Windows, except I
don't know how to set up anything on my end to accept the connection <wink>.

>     TP> Is scp working for *anyone*???

> Nope, same thing happens to me; it just hangs.

That's good to know -- since nobody else mentioned this, Fred probably
figured he was unique.

not-that-he-isn't-it's-just-that-he's-not-ly y'rs  - tim




From fdrake at acm.org  Wed Dec 20 21:52:10 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 20 Dec 2000 15:52:10 -0500 (EST)
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKENEIEAA.tim.one@home.com>
References: <14913.6972.934625.840781@anthem.concentric.net>
	<LNBBLJKPBEHFEDALKOLCKENEIEAA.tim.one@home.com>
Message-ID: <14913.7162.824838.63143@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > Actually, isn't this easy to do on Linux?  That is, run an ssh server
 > (whatever) on your home machine, log in to the SF shell (which everyone
 > seems able to do), then
 > 
 >    scp  whatever  your_home_IP_address:your_home_path
 > 
 > from the SF shell?  Heck, I can even get that to work on Windows, except I
 > don't know how to set up anything on my end to accept the connection <wink>.

  Err, yes, that's easy to do, but... that means putting your private
key on SourceForge.  They're a great bunch of guys, but they can't
have my private key!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tim.one at home.com  Wed Dec 20 22:06:07 2000
From: tim.one at home.com (Tim Peters)
Date: Wed, 20 Dec 2000 16:06:07 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14913.7162.824838.63143@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENGIEAA.tim.one@home.com>

[Fred]
>   Err, yes, that's easy to do, but... that means putting your private
> key on SourceForge.  They're a great bunch of guys, but they can't
> have my private key!

So generate a unique one-shot key pair for the life of the copy.  I can do
that for you on Windows if you lack a real OS <snort>.




From thomas at xs4all.net  Wed Dec 20 23:59:49 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 20 Dec 2000 23:59:49 +0100
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>; from tim.one@home.com on Wed, Dec 20, 2000 at 02:52:40PM -0500
References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAEMOIEAA.tim.one@home.com>
Message-ID: <20001220235949.F29681@xs4all.nl>

On Wed, Dec 20, 2000 at 02:52:40PM -0500, Tim Peters wrote:

> So there you go:  scp files to the shell server from external hosts to the
> shell server whilst logged in to the shell server <wink>.

> Is scp working for *anyone*???

Not for me, anyway. And I'm not just saying that to avoid scp-duty :) And
I'm using the same ssh version, which works fine on all other machines. It
probably has to do with the funky setup Sourceforge uses. (Try looking at
'df' and 'cat /proc/mounts', and comparing the two -- you'll see what I mean
:) That also means I'm not tempted to try and reproduce it, obviously :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tim.one at home.com  Thu Dec 21 04:24:12 2000
From: tim.one at home.com (Tim Peters)
Date: Wed, 20 Dec 2000 22:24:12 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012192235.LAA02763@s454.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEOEIEAA.tim.one@home.com>

[Guido]
>> Boy, are you stirring up a can of worms that we've been through many
>> times before!  Nothing you say hasn't been said at least a hundred
>> times before, on this list as well as on c.l.py.

[Greg Ewing]
> And I'll wager you'll continue to hear them said at regular intervals
> for a long time to come, because you've done something which a lot of
> people feel very strongly was a mistake, and they have some very
> rational arguments as to why it was a mistake, whereas you don't seem
> to have any arguments to the contrary which those people are likely to
> find convincing.

Then it's a wash:  Guido doesn't find their arguments convincing either, and
ties favor the status quo even in the absence of BDFLness.

>> There really seem to be only two possibilities that don't have this
>> problem: (1) make it a built-in, or (2) make it a method on strings.

> False dichotomy. Some other possibilities:
>
> (3) Use an operator.

Oh, that's likely <wink>.

> (4) Leave it in the string module! Really, I don't see what
> would be so bad about that. You still need somewhere to put
> all the string-related constants, so why not keep the string
> module for those, plus the few functions that don't have
> any other obvious place?

Guido said he wants to deprecate the entire string module, so that Python
can eventually warn on the mere presence of "import string".  That's what he
said when I earlier ranted in favor of keeping the string module around.

My guess is that making it a builtin is the only alternative that stands any
chance at this point.

>> If " ".join(L) bugs you, try this:
>>
>>    space = " "	 # This could be a global
>>     .
>>     .
>>     .
>>    s = space.join(L)

> Surely you must realise that this completely fails to
> address Mr. Petrilli's concern?

Don't know about Guido, but I don't realize that, and we haven't heard back
from Charles.  His objections were raised the first day " ".join was
suggested, space.join was suggested almost immediately after, and that
latter suggestion did seem to pacify at least several objectors.  Don't know
whether it makes Charles happier, but since it *has* made others happier in
the past, it's not unreasonable to imagine that Charles might like it too.

if-we're-to-be-swayed-by-his-continued-outrage-afraid-it-will-
    have-to-come-from-him-ly y'rs  - tim




From tim.one at home.com  Thu Dec 21 08:44:19 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 21 Dec 2000 02:44:19 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <20001220094058.A17623@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEOMIEAA.tim.one@home.com>

[Andrew Kuchling]
> Is it OK to refer to 8-bit strings under that name?
> How about "expected an 8-bit string or Unicode string", when the
> object passed to ord() isn't of the right type.
>
> Similarly, when the value is of the right type but has length>1,
> the message is "ord() expected a character, length-%d string found".
> Should that be "length-%d (string / unicode) found)"
>
> And should the type names be changed to '8-bit string'/'Unicode
> string', maybe?

Actually, upon reflection I think it was a mistake to add all these "or
Unicode" clauses to the error msgs to begin with.  Python used to have only
one string type, we're saying that's also a hope for the future, and in the
meantime I know I'd have no trouble understanding "string" as including both
8-bit strings and Unicode strings.

So we should say "8-bit string" or "Unicode string" when *only* one of those
is allowable.  So

    "ord() expected string ..."

instead of (even a repaired version of)

    "ord() expected string or Unicode character ..."

but-i'm-not-even-motivated-enough-to-finish-this-sig-




From tim.one at home.com  Thu Dec 21 09:52:54 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 21 Dec 2000 03:52:54 -0500
Subject: [Python-Dev] RE: The Dictionary Gem is polished!
In-Reply-To: <3A3F9C16.562F9D9F@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPAIEAA.tim.one@home.com>

[Christian Tismer]
> Are you saying I should check the thing in? Really?

Of course.  The first thing you talked about showed a major improvement in
some bad cases, did no harm in the others, and both results were more than
just plausible -- they made compelling sense and were backed by simulation.
So why not check it in?  It's a clear net win!

Stuff since then has been a spattering of maybe-good maybe-bad maybe-neutral
ideas that hasn't gotten anywhere conclusive.  What I want to avoid is
another "Unicode compression" scenario, where we avoid grabbing a clear win
for months just because it may not be the best possible of all conceivable
compression schemes -- and then mistakes get made in a last-second rush to
get *any* improvement.

Checking in a clear improvement today does not preclude checking in a better
one next week <wink>.

> ...
> Ok, we have to stick with the given polynomials to stay
> compatible,

Na, feel free to explore that too, if you like.  It really should get some
study!  The polys there now are utterly arbitrary:  of all polys that happen
to be irreducible and that have x as a primitive root in the induced
multiplicative group, these are simply the smallest when viewed as binary
integers.  That's because they were *found* by trying all odd binary ints
with odd parity (even ints and ints with even parity necessarily correspond
to reducible polys), starting with 2**N+3 and going up until finding the
first one that was both irreducible and had x as a primitive root.  There's
no theory at all that I know of to say that any such poly is any better for
this purpose than any other.  And they weren't tested for that either --
they're simply the first ones "that worked at all" in a brute search.

Besides, Python's "better than random" dict behavior-- when it obtains! --is
almost entirely due to the fact that its hash functions produce distinct starting
indices more often than a random hash function would.  The contribution of
the GF-based probe sequence in case of collision is to avoid the terrible
behavior most other forms of probe sequence would cause given that Python's
hash functions also tend to fill solid contiguous slices of the table more
often than would a random hash function.
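
The property Tim relies on here -- an irreducible polynomial with x as a
primitive root makes repeated multiplication by x cycle through every nonzero
value -- is easy to check in a few lines of Python.  This is an illustration
of the property only, not the actual dictobject.c probe code; the degree-3
polynomial is chosen small for readability:

```python
def gf_cycle(poly, nbits):
    """Repeatedly multiply by x modulo `poly` over GF(2).

    If `poly` is irreducible and x is a primitive root, this visits
    every nonzero nbits-bit value exactly once before repeating --
    the property a GF-based probe sequence needs so that a collision
    chain eventually touches every slot of a 2**nbits table.
    """
    top = 1 << nbits
    incr = 1
    seen = []
    while True:
        seen.append(incr)
        incr <<= 1              # multiply by x
        if incr & top:
            incr ^= poly        # reduce modulo the polynomial
        if incr == 1:           # back at the start: full cycle
            return seen

# x**3 + x + 1 (0b1011) is irreducible over GF(2) with x primitive:
print(gf_cycle(0b1011, 3))      # visits all of 1..7 before repeating
```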

[stuff about rotation]
> ...
> Too bad that this isn't supported in C. It is a native
> machine instruction on X86 machines.

Guido long ago rejected hash functions based on rotation for this reason;
he's not likely to approve of rotations more in the probe sequence <wink>.

A similar frustration is that almost all modern CPUs have a fast instruction to
get at the high 32 bits of a 32x32->64 bit multiply:  another way to get the
high bits of the hash code into play is to multiply the 32-bit hash code by
a 32-bit constant (see Knuth for "Fibonacci hashing" details), and take the
least-significant N bits of the *upper* 32 bits of the 64-bit product as the
initial table index.  If the constant is chosen correctly, this defines a
permutation on the space of 32-bit unsigned ints, and can be very effective
at "scrambling" arithmetic progressions (which Python's hash functions often
produce).  But C doesn't give a decent way to get at that either.
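
Although C gives no portable access to the high half of the product, Python's
unbounded ints can still demonstrate the scheme Tim describes.  A sketch of
the idea, not proposed dict code; 2654435769 is floor(2**32/phi), the
multiplier Knuth recommends:

```python
GOLDEN = 2654435769          # floor(2**32 / phi), Knuth's multiplier

def fib_index(h, nbits):
    """Fibonacci hashing: multiply the hash by GOLDEN, keep the low 32
    bits of the product, and use its most-significant `nbits` bits as
    the table index.  Because GOLDEN is odd, h -> (h*GOLDEN) mod 2**32
    is a permutation of the 32-bit ints."""
    return ((h * GOLDEN) & 0xFFFFFFFF) >> (32 - nbits)

# An arithmetic progression of hashes -- the kind Python's hash
# functions often produce -- scatters across a 256-slot table:
print([fib_index(h, 8) for h in range(8)])
```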

> ...
> On the current "faster than random" cases, I assume that
> high bits in the hash are less likely than low bits,

I'm not sure what this means.  As the comment in dictobject.c says, it's
common for Python's hash functions to return a result with lots of leading
zeroes.  But the lookup currently applies ~ to those first (which is a bad
idea -- see earlier msgs), so the actual hash that gets *used* often has
lots of leading ones.

> so it is more likely that an entry finds its good place in the dict,
> before bits are rotated in. hence the "good" cases would be kept.

I can agree with this easily if I read the above as asserting that in the
very good cases today, the low bits of hashes (whether or not ~ is applied)
vary more than the high bits.

> ...
> Random integers seem to withstand any of these procedures.

If you wanted to, you could *define* random this way <wink>.

> ...
> I'd say let's do the patch   --   ciao - chris

full-circle-ly y'rs  - tim




From mal at lemburg.com  Thu Dec 21 12:16:27 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 21 Dec 2000 12:16:27 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
References: <LNBBLJKPBEHFEDALKOLCKEOMIEAA.tim.one@home.com>
Message-ID: <3A41E68B.6B12CD71@lemburg.com>

Tim Peters wrote:
> 
> [Andrew Kuchling]
> > Is it OK to refer to 8-bit strings under that name?
> > How about "expected an 8-bit string or Unicode string", when the
> > object passed to ord() isn't of the right type.
> >
> > Similarly, when the value is of the right type but has length>1,
> > the message is "ord() expected a character, length-%d string found".
> > Should that be "length-%d (string / unicode) found)"
> >
> > And should the type names be changed to '8-bit string'/'Unicode
> > string', maybe?
> 
> Actually, upon reflection I think it was a mistake to add all these "or
> Unicode" clauses to the error msgs to begin with.  Python used to have only
> one string type, we're saying that's also a hope for the future, and in the
> meantime I know I'd have no trouble understanding "string" as including both
> 8-bit strings and Unicode strings.
> 
> So we should say "8-bit string" or "Unicode string" when *only* one of those
> is allowable.  So
> 
>     "ord() expected string ..."
> 
> instead of (even a repaired version of)
> 
>     "ord() expected string or Unicode character ..."

I think this has to do with understanding that there are two
string types in Python 2.0 -- a novice won't notice this until
she sees the error message.

My understanding is similar to yours, "string" should mean
"any string object" and in cases where the difference between
8-bit string and Unicode matters, these should be referred to
as "8-bit string" and "Unicode string".

Still, I think it is a good idea to make people aware of the
possibility of passing Unicode objects to these functions, so
perhaps the idea of adding both possibilities to error messages
is not such a bad idea for 2.1. The next phases would be converting
all messages back to "string" and then convert all strings to
Unicode ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From akuchlin at mems-exchange.org  Thu Dec 21 19:37:19 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 21 Dec 2000 13:37:19 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEOMIEAA.tim.one@home.com>; from tim.one@home.com on Thu, Dec 21, 2000 at 02:44:19AM -0500
References: <20001220094058.A17623@kronos.cnri.reston.va.us> <LNBBLJKPBEHFEDALKOLCKEOMIEAA.tim.one@home.com>
Message-ID: <20001221133719.B11880@kronos.cnri.reston.va.us>

On Thu, Dec 21, 2000 at 02:44:19AM -0500, Tim Peters wrote:
>So we should say "8-bit string" or "Unicode string" when *only* one of those
>is allowable.  So

OK... how about this patch?

Index: bltinmodule.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v
retrieving revision 2.185
diff -u -r2.185 bltinmodule.c
--- bltinmodule.c	2000/12/20 15:07:34	2.185
+++ bltinmodule.c	2000/12/21 18:36:54
@@ -1524,13 +1524,14 @@
 		}
 	} else {
 		PyErr_Format(PyExc_TypeError,
-			     "ord() expected string or Unicode character, " \
+			     "ord() expected string of length 1, but " \
 			     "%.200s found", obj->ob_type->tp_name);
 		return NULL;
 	}
 
 	PyErr_Format(PyExc_TypeError, 
-		     "ord() expected a character, length-%d string found",
+		     "ord() expected a character, "
+                     "but string of length %d found",
 		     size);
 	return NULL;
 }



From thomas at xs4all.net  Fri Dec 22 16:21:43 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 22 Dec 2000 16:21:43 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>; from noreply@sourceforge.net on Fri, Dec 22, 2000 at 07:07:03AM -0800
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
Message-ID: <20001222162143.A5515@xs4all.nl>

On Fri, Dec 22, 2000 at 07:07:03AM -0800, noreply at sourceforge.net wrote:

>   * Guido-style:  8-column hard-tab indents.
>   * New style:  4-column space-only indents.

Hm, I must have missed this... Is 'new style' the preferred style, as its
name suggests, or is Guido mounting a rebellion to adhere to the One True
Style (or rather his own version of it, which just has the * in pointer
type declarations wrong ? :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From fdrake at acm.org  Fri Dec 22 16:31:21 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 10:31:21 -0500 (EST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <20001222162143.A5515@xs4all.nl>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222162143.A5515@xs4all.nl>
Message-ID: <14915.29641.806901.661707@cj42289-a.reston1.va.home.com>

Thomas Wouters writes:
 > Hm, I must have missed this... Is 'new style' the preferred style, as its
 > name suggests, or is Guido mounting a rebellion to adhere to the One True
 > Style (or rather his own version of it, which just has the * in pointer
 > type declarations wrong ? :)

  Guido has grudgingly granted that new code in the "New style" is
acceptable, mostly because many people complain that "Guido style"
causes too much code to get scrunched up on the right margin.  The
"New style" is more like the recommendations for Python code as well,
so it's easier for Python programmers to read (Tabs are hard to read
clearly! ;).


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From cgw at fnal.gov  Fri Dec 22 16:43:45 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Fri, 22 Dec 2000 09:43:45 -0600 (CST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add
 panel support
In-Reply-To: <14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222162143.A5515@xs4all.nl>
	<14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
Message-ID: <14915.30385.201343.360880@buffalo.fnal.gov>

Fred L. Drake, Jr. writes:
 > 
 >   Guido has grudgingly granted that new code in the "New style" is
 > acceptable, mostly because many people complain that "Guido style"
 > causes too much code to get scrunched up on the right margin.

I am reminded of Linus Torvalds comments on this subject (see
/usr/src/linux/Documentation/CodingStyle):

  Now, some people will claim that having 8-character indentations
  makes the code move too far to the right, and makes it hard to read
  on a 80-character terminal screen.  The answer to that is that if
  you need more than 3 levels of indentation, you're screwed anyway,
  and should fix your program.




From fdrake at acm.org  Fri Dec 22 16:58:56 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 10:58:56 -0500 (EST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add
 panel support
In-Reply-To: <14915.30385.201343.360880@buffalo.fnal.gov>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222162143.A5515@xs4all.nl>
	<14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
	<14915.30385.201343.360880@buffalo.fnal.gov>
Message-ID: <14915.31296.56181.260479@cj42289-a.reston1.va.home.com>

Charles G Waldman writes:
 > I am reminded of Linus Torvalds comments on this subject (see
 > /usr/src/linux/Documentation/CodingStyle):

  The catch, of course, is Python/ceval.c, where breaking it up can
hurt performance.  People scream when you do things like that....


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From cgw at fnal.gov  Fri Dec 22 17:07:47 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Fri, 22 Dec 2000 10:07:47 -0600 (CST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add
 panel support
In-Reply-To: <14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222162143.A5515@xs4all.nl>
	<14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
	<14915.30385.201343.360880@buffalo.fnal.gov>
	<14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
Message-ID: <14915.31827.250987.283364@buffalo.fnal.gov>

Fred L. Drake, Jr. writes:
 > 
 >   The catch, of course, is Python/cevel.c, where breaking it up can
 > hurt performance.  People scream when you do things like that....

Quoting again from the same source:

 Use helper functions with descriptive names (you can ask the compiler
 to in-line them if you think it's performance-critical, and it will
 probably do a better job of it than you would have done).

But I should have pointed out that I was quoting the great Linus
mostly for entertainment/cultural value, and was not really trying to
add fuel to the fire.  In other words, a message that I thought was
amusing, but probably shouldn't have sent ;-)



From fdrake at acm.org  Fri Dec 22 17:20:52 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 11:20:52 -0500 (EST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add
 panel support
In-Reply-To: <14915.31827.250987.283364@buffalo.fnal.gov>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222162143.A5515@xs4all.nl>
	<14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
	<14915.30385.201343.360880@buffalo.fnal.gov>
	<14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
	<14915.31827.250987.283364@buffalo.fnal.gov>
Message-ID: <14915.32612.252115.562296@cj42289-a.reston1.va.home.com>

Charles G Waldman writes:
 > But I should have pointed out that I was quoting the great Linus
 > mostly for entertainment/cultural value, and was not really trying to
 > add fuel to the fire.  In other words, a message that I thought was
 > amusing, but probably shouldn't have sent ;-)

  I understood the intent; I think he's really got a point.  There are
a few places in Python where it would really help to break things up!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From fredrik at effbot.org  Fri Dec 22 17:33:37 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 22 Dec 2000 17:33:37 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net><20001222162143.A5515@xs4all.nl><14915.29641.806901.661707@cj42289-a.reston1.va.home.com><14915.30385.201343.360880@buffalo.fnal.gov><14915.31296.56181.260479@cj42289-a.reston1.va.home.com><14915.31827.250987.283364@buffalo.fnal.gov> <14915.32612.252115.562296@cj42289-a.reston1.va.home.com>
Message-ID: <004b01c06c34$f08151c0$e46940d5@hagrid>

Fred wrote:
>   I understood the intent; I think he's really got a point.  There are
> a few places in Python where it would really help to break things up!

if that's what you want, maybe you could start by
putting the INLINE stuff back again? <halfwink>

(if C/C++ compatibility is a problem, put it inside a
cplusplus ifdef, and mark it as "for internal use only.
don't use inline on public interfaces")

</F>




From fdrake at acm.org  Fri Dec 22 17:36:15 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 11:36:15 -0500 (EST)
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <004b01c06c34$f08151c0$e46940d5@hagrid>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222162143.A5515@xs4all.nl>
	<14915.29641.806901.661707@cj42289-a.reston1.va.home.com>
	<14915.30385.201343.360880@buffalo.fnal.gov>
	<14915.31296.56181.260479@cj42289-a.reston1.va.home.com>
	<14915.31827.250987.283364@buffalo.fnal.gov>
	<14915.32612.252115.562296@cj42289-a.reston1.va.home.com>
	<004b01c06c34$f08151c0$e46940d5@hagrid>
Message-ID: <14915.33535.520957.215310@cj42289-a.reston1.va.home.com>

Fredrik Lundh writes:
 > if that's what you want, maybe you could start by
 > putting the INLINE stuff back again? <halfwink>

  I could not see the value in the inline stuff that configure was
setting up, and still don't.

 > (if C/C++ compatibility is a problem, put it inside a
 > cplusplus ifdef, and mark it as "for internal use only.
 > don't use inline on public interfaces")

  We should be able to come up with something reasonable, but I don't
have time right now, and my head isn't currently wrapped around C
compilers.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From akuchlin at cnri.reston.va.us  Fri Dec 22 19:01:43 2000
From: akuchlin at cnri.reston.va.us (Andrew Kuchling)
Date: Fri, 22 Dec 2000 13:01:43 -0500
Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>; from noreply@sourceforge.net on Fri, Dec 22, 2000 at 07:07:03AM -0800
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
Message-ID: <20001222130143.B7127@newcnri.cnri.reston.va.us>

On Fri, Dec 22, 2000 at 07:07:03AM -0800, noreply at sourceforge.net wrote:
>  * Guido-style:  8-column hard-tab indents.
>  * New style:  4-column space-only indents.
>  * _curses style: 2 column indents.
>
>I'd prefer "New style", myself.

New style it is.  (Barry, is the "python" style in cc-mode.el going to
be changed to new style, or a "python2" style added?)

I've been wanting to reformat _cursesmodule.c to match the Python
style for some time.  Probably I'll do that a little while after the
panel module has settled down a bit.

Fred, did you look at the use of the CObject for exposing the API?
Did that look reasonable?  Also, should py_curses.h go in the Include/
subdirectory instead of Modules/?

--amk



From fredrik at effbot.org  Fri Dec 22 19:03:43 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 22 Dec 2000 19:03:43 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net><20001222162143.A5515@xs4all.nl><14915.29641.806901.661707@cj42289-a.reston1.va.home.com><14915.30385.201343.360880@buffalo.fnal.gov><14915.31296.56181.260479@cj42289-a.reston1.va.home.com><14915.31827.250987.283364@buffalo.fnal.gov><14915.32612.252115.562296@cj42289-a.reston1.va.home.com><004b01c06c34$f08151c0$e46940d5@hagrid> <14915.33535.520957.215310@cj42289-a.reston1.va.home.com>
Message-ID: <006701c06c41$896a1a00$e46940d5@hagrid>

Fred wrote:
>  > if that's what you want, maybe you could start by
>  > putting the INLINE stuff back again? <halfwink>
> 
>   I could not see the value in the inline stuff that configure was
> setting up, and still don't.

the INLINE stuff guarantees that "inline" is defined to be
whatever directive the compiler uses for explicit inlining.
quoting the autoconf docs:

    If the C compiler supports the keyword inline,
    do nothing. Otherwise define inline to __inline__
    or __inline if it accepts one of those, otherwise
    define inline to be empty

as a result, you can always use "inline" in your code, and
have it do the right thing on all compilers that support
explicit inlining (all modern C compilers, in practice).

:::

to deal with people compiling Python with a C compiler, but
linking it with a C++ compiler, the config.h.in file could be
written as:

/* Define "inline" to be whatever the C compiler calls it.
    To avoid problems when mixing C and C++, make sure
    to only use "inline" for internal interfaces. */
#ifndef __cplusplus
#undef inline
#endif

</F>




From akuchlin at mems-exchange.org  Fri Dec 22 20:40:15 2000
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Fri, 22 Dec 2000 14:40:15 -0500
Subject: [Python-Dev] PEP 222 draft
Message-ID: <200012221940.OAA01936@207-172-57-45.s45.tnt2.ann.va.dialup.rcn.com>

I've completed a draft of PEP 222 (sort of -- note the XXX comments in
the text for things that still need to be resolved).  This is being
posted to python-dev, python-web-modules, and
python-list/comp.lang.python, to get comments on the proposed
interface.  I'm on all three lists, but would prefer to see followups
on python-list/comp.lang.python, so if you can reply there, please do
so.

--amk

Abstract

    This PEP proposes a set of enhancements to the CGI development
    facilities in the Python standard library.  Enhancements might be
    new features, new modules for tasks such as cookie support, or
    removal of obsolete code.

    The intent is to incorporate the proposals emerging from this
    document into Python 2.1, due to be released in the first half of
    2001.


Open Issues

    This section lists changes that have been suggested, but about
    which no firm decision has yet been made.  In the final version of
    this PEP, this section should be empty, as all the changes should
    be classified as accepted or rejected.

    cgi.py: We should not be told to create our own subclass just so
    we can handle file uploads. As a practical matter, I have yet to
    find the time to do this right, so I end up reading cgi.py's temp
    file into, at best, another file. Some of our legacy code actually
    reads it into a second temp file, then into a final destination!
    And even if we did, that would mean creating yet another object
    with its __init__ call and associated overhead.

    cgi.py: Currently, query data with no `=' are ignored.  Even if
    keep_blank_values is set, queries like `...?value=&...' are
    returned with blank values but queries like `...?value&...' are
    completely lost.  It would be great if such data were made
    available through the FieldStorage interface, either as entries
    with None as values, or in a separate list.

    Utility function: build a query string from a list of 2-tuples

    Dictionary-related utility classes: NoKeyErrors (returns an empty
    string, never a KeyError), PartialStringSubstitution (returns 
    the original key string, never a KeyError)
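
The two utilities sketched above are small enough to draft inline.  The class
names follow the PEP's wording, the bodies are one plausible reading of it,
and the import path is the modern spelling (plain urllib.quote_plus in
2.0-era Python):

```python
from urllib.parse import quote_plus   # urllib.quote_plus in Python 2.0

def build_query(pairs):
    """Build a query string from a list of (name, value) 2-tuples."""
    return '&'.join('%s=%s' % (quote_plus(str(k)), quote_plus(str(v)))
                    for k, v in pairs)

class NoKeyErrors(dict):
    """Dictionary that yields '' for missing keys, never a KeyError."""
    def __missing__(self, key):
        return ''

class PartialStringSubstitution(dict):
    """Dictionary that returns the key itself for missing keys."""
    def __missing__(self, key):
        return key

print(build_query([('name', 'Guido van Rossum'), ('lang', 'python')]))
# -> name=Guido+van+Rossum&lang=python
```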


    
New Modules

    This section lists details about entire new packages or modules
    that should be added to the Python standard library.

    * fcgi.py : A new module adding support for the FastCGI protocol.
      Robin Dunn's code needs to be ported to Windows, though.

Major Changes to Existing Modules

    This section lists details of major changes to existing modules,
    whether in implementation or in interface.  The changes in this
    section therefore carry greater degrees of risk, either in
    introducing bugs or a backward incompatibility.

    The cgi.py module would be deprecated.  (XXX A new module or
    package name hasn't been chosen yet: 'web'?  'cgilib'?)

Minor Changes to Existing Modules

    This section lists details of minor changes to existing modules.
    These changes should have relatively small implementations, and
    have little risk of introducing incompatibilities with previous
    versions.


Rejected Changes

    The changes listed in this section were proposed for Python 2.1,
    but were rejected as unsuitable.  For each rejected change, a
    rationale is given describing why the change was deemed
    inappropriate.

    * An HTML generation module is not part of this PEP.  Several such
      modules exist, ranging from HTMLgen's purely programming
      interface to ASP-inspired simple templating to DTML's complex
      templating.  There's no indication of which templating module to
      enshrine in the standard library, and that probably means that
      no module should be so chosen.

    * cgi.py: Allowing a combination of query data and POST data.
      This doesn't seem to be standard at all, and therefore is
      dubious practice.

Proposed Interface

    XXX open issues: naming convention (studlycaps or
    underline-separated?); need to look at the cgi.parse*() functions
    and see if they can be simplified, too.

    Parsing functions: carry over most of the parse* functions from
    cgi.py
    
    # The Response class borrows most of its methods from Zope's
    # HTTPResponse class.
    
    class Response:
        """
        Attributes:
        status: HTTP status code to return
        headers: dictionary of response headers
        body: string containing the body of the HTTP response
        """
        
        def __init__(self, status=200, headers={}, body=""):
            pass
    
        def setStatus(self, status, reason=None):
            "Set the numeric HTTP response code"
            pass
    
        def setHeader(self, name, value):
            "Set an HTTP header"
            pass
    
        def setBody(self, body):
            "Set the body of the response"
            pass
    
        def setCookie(self, name, value,
                      path = '/',  
                      comment = None, 
                      domain = None, 
                      max_age = None,  # 'max-age' in the header; '-' is not legal in an identifier
                      expires = None,
                      secure = 0
                      ):
            "Set a cookie"
            pass
    
        def expireCookie(self, name):
            "Remove a cookie from the user"
            pass
    
        def redirect(self, url):
            "Redirect the browser to another URL"
            pass
    
        def __str__(self):
            "Convert entire response to a string"
            pass
    
        def dump(self):
            "Return a string representation useful for debugging"
            pass
            
        # XXX methods for specific classes of error:serverError, badRequest, etc.?
    
    
    class Request:
    
        """
        Attributes: 

        XXX should these be dictionaries, or dictionary-like objects?
        .headers : dictionary containing HTTP headers
        .cookies : dictionary of cookies
        .fields  : data from the form
        .env     : environment dictionary
        """
        
        def __init__(self, environ=os.environ, stdin=sys.stdin,
                     keep_blank_values=1, strict_parsing=0):
            """Initialize the request object, using the provided environment
            and standard input."""
            pass
    
        # Should people just use the dictionaries directly?
        def getHeader(self, name, default=None):
            pass
    
        def getCookie(self, name, default=None):
            pass
    
        def getField(self, name, default=None):
            "Return field's value as a string (even if it's an uploaded file)"
            pass
            
        def getUploadedFile(self, name):
            """Returns a file object that can be read to obtain the contents
            of an uploaded file.  XXX should this report an error if the 
            field isn't actually an uploaded file?  Or should it wrap
            a StringIO around simple fields for consistency?
            """
            
        def getURL(self, n=0, query_string=0):
            """Return the URL of the current request, chopping off 'n' path
            components from the right.  Eg. if the URL is
            "http://foo.com/bar/baz/quux", n=2 would return
            "http://foo.com/bar".  Does not include the query string (if
            any)
            """

        def getBaseURL(self, n=0):
            """Return the base URL of the current request, adding 'n' path
            components to the end to recreate more of the whole URL.  
            
            Eg. if the request URL is
            "http://foo.com/q/bar/baz/qux", n=0 would return
            "http://foo.com/", and n=2 "http://foo.com/q/bar".
            
            Returned URL does not include the query string, if any.
            """
        
        def dump(self):
            "String representation suitable for debugging output"
            pass
    
        # Possibilities?  I don't know if these are worth doing in the 
        # basic objects.
        def getBrowser(self):
            "Returns Mozilla/IE/Lynx/Opera/whatever"
    
        def isSecure(self):
            "Return true if this is an SSLified request"
            

    # Module-level function        
    def wrapper(func, logfile=sys.stderr):
        """
        Calls the function 'func', passing it the arguments
        (request, response, logfile).  Exceptions are trapped and
        sent to the file 'logfile'.  
        """
        # This wrapper will detect if it's being called from the command-line,
        # and if so, it will run in a debugging mode; name=value pairs 
        # can be entered on standard input to set field values.
        # (XXX how to do file uploads in this syntax?)
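
    For concreteness, here is a sketch of how a handler might use this
    API.  Everything below is illustrative only: the Request/Response
    stand-ins are throwaway minimal versions (the draft bodies above are
    all "pass"), and the write() method is an assumption, not part of
    the draft.

```python
# Hypothetical handler against the draft API.  Only the method names
# getField() and __str__() come from the draft; write() is assumed.

class Request:
    def __init__(self, fields=None):
        self._fields = fields or {}

    def getField(self, name, default=None):
        # Return the form field's value as a string
        return self._fields.get(name, default)

class Response:
    def __init__(self):
        self._parts = []

    def write(self, text):
        # Assumed method: accumulate body text
        self._parts.append(text)

    def __str__(self):
        # "Convert entire response to a string"
        return "".join(self._parts)

def handler(request, response, logfile):
    name = request.getField("name", "world")
    response.write("Hello, %s!" % name)

# Roughly what wrapper() would do, minus the error trapping:
req, resp = Request({"name": "Python-Dev"}), Response()
handler(req, resp, None)
print(str(resp))
```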

    
Copyright
    
    This document has been placed in the public domain.




From tim.one at home.com  Fri Dec 22 20:31:07 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 22 Dec 2000 14:31:07 -0500
Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support)
In-Reply-To: <20001222162143.A5515@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECIIFAA.tim.one@home.com>

[Thomas Wouters]
>>   * Guido-style:  8-column hard-tab indents.
>>   * New style:  4-column space-only indents.
>
> Hm, I must have missed this... Is 'new style' the preferred style, as
> its name suggests, or is Guido mounting a rebellion to adhere to the
> One True Style (or rather his own version of it, which just has
> the * in pointer type declarations wrong ? :)

Every time this comes up wrt C code,

1. Fred repeats that he thinks Guido caved in (but doesn't supply a
reference to anything saying so).

2. Guido repeats that he prefers old-style (but in a wishy-washy way that
leaves it uncertain (*)).

3. Fredrik and/or I repeat a request for a BDFL Pronouncement.

4. And there the thread ends.

It's *very* hard to find this history in the Python-Dev archives because
these threads always have subject lines like this one originally had ("RE:
[Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel
support").

Fred already did the #1 bit in this thread.

You can consider this msg the repeat of #3.

Since Guido is out of town, we can skip #2 and go straight to #4 early
<wink>.


(*) Two examples of #2 from this year:

Subject: Re: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/
Modules mmapmodule.c,2.1,2.2
From: Guido van Rossum <guido at python.org>
Date: Fri, 31 Mar 2000 07:10:45 -0500
> Can we change the 8-space-tab rule for all new C code that goes in?  I
> know that we can't practically change existing code right now, but for
> new C code, I propose we use no tab characters, and we use a 4-space
> block indentation.

Actually, this one was formatted for 8-space indents but using 4-space
tabs, so in my editor it looked like 16-space indents!

Given that we don't want to change existing code, I'd prefer to stick
with 1-tab 8-space indents.



Subject: Re: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules
linuxaudiodev.c,2.2,2.3
From: Guido van Rossum <guido at beopen.com>
Date: Sat, 08 Jul 2000 09:39:51 -0500

> Aren't tabs preferred as C-source indents, instead of 4-spaces ? At
> least, that's what I see in Python/*.c and Object/*.c, but I only
> vaguely recall it from the style document...

Yes, you're right.




From fredrik at effbot.org  Fri Dec 22 21:37:35 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 22 Dec 2000 21:37:35 +0100
Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support)
References: <LNBBLJKPBEHFEDALKOLCAECIIFAA.tim.one@home.com>
Message-ID: <00e201c06c57$052fff00$e46940d5@hagrid>

> 3. Fredrik and/or I repeat a request for a BDFL Pronouncement.

and.

</F>




From akuchlin at mems-exchange.org  Fri Dec 22 22:09:47 2000
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Fri, 22 Dec 2000 16:09:47 -0500
Subject: [Python-Dev] Reviving the bookstore
Message-ID: <200012222109.QAA02737@207-172-57-45.s45.tnt2.ann.va.dialup.rcn.com>

Since the PSA isn't doing anything for us any longer, I've been
working on reviving the bookstore at a new location with a new
affiliate code.  

A draft version is up at its new home,
http://www.kuchling.com/bookstore/ .  Please take a look and offer
comments.  Book authors, please take a look at the entry for your book
and let me know about any corrections.  Links to reviews of books
would also be really welcomed.

I'd like to abolish having book listings with no description or
review, so if you notice a book that you've read has no description,
please feel free to submit a description and/or review.

--amk



From tim.one at home.com  Sat Dec 23 08:15:59 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 23 Dec 2000 02:15:59 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <3A41E68B.6B12CD71@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDJIFAA.tim.one@home.com>

[Tim]
> ...
> So we should say "8-bit string" or "Unicode string" when *only*
> one of those is allowable.  So
>
>     "ord() expected string ..."
>
> instead of (even a repaired version of)
>
>     "ord() expected string or Unicode character ..."

[MAL]
> I think this has to do with understanding that there are two
> string types in Python 2.0 -- a novice won't notice this until
> she sees the error message.

Except that this error msg has nothing to do with how many string types
there are:  they didn't pass *any* flavor of string when they get this msg.
At the time they pass (say) a float to ord(), that there are currently two
flavors of string is more information than they need to know.

> My understanding is similar to yours, "string" should mean
> "any string object" and in cases where the difference between
> 8-bit string and Unicode matters, these should be referred to
> as "8-bit string" and "Unicode string".

In that happy case of universal harmony, the msg above should say just
"string" and leave it at that.

> Still, I think it is a good idea to make people aware of the
> possibility of passing Unicode objects to these functions,

Me too.

> so perhaps the idea of adding both possibilies to error messages
> is not such a bad idea for 2.1.

But not that.  The user is trying to track down their problem.  Advertising
an irrelevant (to their problem) distinction at that time of crisis is
simply spam.

    TypeError:  ord() requires an 8-bit string or a Unicode string.
                On the other hand, you'd be surprised to discover
                all the things you can pass to chr():  it's not just
                ints.  Long ints are also accepted, by design, and
                due to an obscure bug in the Python internals, you
                can also pass floats, which get truncated to ints.

> The next phases would be converting all messages back to "string"
> and then convert all strings to Unicode ;-)

Then we'll save a lot of work by skipping the need for the first half of
that -- unless you're volunteering to do all of it <wink>.





From tim.one at home.com  Sat Dec 23 08:16:29 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 23 Dec 2000 02:16:29 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: <20001221133719.B11880@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEDKIFAA.tim.one@home.com>

[Tim]
> So we should say "8-bit string" or "Unicode string" when *only*
> one of those is allowable.

[Andrew]
> OK... how about this patch?

+1 from me.  And maybe if you offer to send a royalty to Marc-Andre each
time it's printed, he'll back down from wanting to use the error msgs as a
billboard <wink>.

> Index: bltinmodule.c
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v
> retrieving revision 2.185
> diff -u -r2.185 bltinmodule.c
> --- bltinmodule.c	2000/12/20 15:07:34	2.185
> +++ bltinmodule.c	2000/12/21 18:36:54
> @@ -1524,13 +1524,14 @@
>  		}
>  	} else {
>  		PyErr_Format(PyExc_TypeError,
> -			     "ord() expected string or Unicode character, " \
> +			     "ord() expected string of length 1, but " \
>  			     "%.200s found", obj->ob_type->tp_name);
>  		return NULL;
>  	}
>
>  	PyErr_Format(PyExc_TypeError,
> -		     "ord() expected a character, length-%d string found",
> +		     "ord() expected a character, "
> +                     "but string of length %d found",
>  		     size);
>  	return NULL;
>  }




From barry at digicool.com  Sat Dec 23 17:43:37 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 23 Dec 2000 11:43:37 -0500
Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222130143.B7127@newcnri.cnri.reston.va.us>
Message-ID: <14916.54841.418495.194558@anthem.concentric.net>

>>>>> "AK" == Andrew Kuchling <akuchlin at cnri.reston.va.us> writes:

    AK> New style it is.  (Barry, is the "python" style in cc-mode.el
    AK> going to be changed to new style, or a "python2" style added?)

There should probably be a second style added to cc-mode.el.  I
haven't maintained that package in a long time, but I'll work out a
patch and send it to the current maintainer.  Let's call it
"python2".

-Barry




From cgw at fnal.gov  Sat Dec 23 18:09:57 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Sat, 23 Dec 2000 11:09:57 -0600 (CST)
Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: <14916.54841.418495.194558@anthem.concentric.net>
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222130143.B7127@newcnri.cnri.reston.va.us>
	<14916.54841.418495.194558@anthem.concentric.net>
Message-ID: <14916.56421.370499.762023@buffalo.fnal.gov>

Barry A. Warsaw writes:

 > There should probably be a second style added to cc-mode.el.  I
 > haven't maintained that package in a long time, but I'll work out a
 > patch and send it to the current maintainer.  Let's call it
 > "python2".

Maybe we should wait for the BDFL's pronouncement?



From barry at digicool.com  Sat Dec 23 20:24:42 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 23 Dec 2000 14:24:42 -0500
Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net>
	<20001222130143.B7127@newcnri.cnri.reston.va.us>
	<14916.54841.418495.194558@anthem.concentric.net>
	<14916.56421.370499.762023@buffalo.fnal.gov>
Message-ID: <14916.64506.56351.443287@anthem.concentric.net>

>>>>> "CGW" == Charles G Waldman <cgw at fnal.gov> writes:

    CGW> Barry A. Warsaw writes:

    >> There should probably be a second style added to cc-mode.el.  I
    >> haven't maintained that package in a long time, but I'll work
    >> out a patch and send it to the current maintainer.  Let's call
    >> it "python2".

    CGW> Maybe we should wait for the BDFL's pronouncement?

Sure, at least before submitting a patch.  Here's the simple one liner
you can add to your .emacs file to play with the new style in the
meantime.

-Barry

(c-add-style "python2" '("python" (c-basic-offset . 4)))




From tim.one at home.com  Sun Dec 24 05:04:47 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 23 Dec 2000 23:04:47 -0500
Subject: [Python-Dev] PEP 208 and __coerce__
In-Reply-To: <20001209033006.A3737@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEEMIFAA.tim.one@home.com>

[Neil Schemenauer
 Saturday, December 09, 2000 6:30 AM]

> While working on the implementation of PEP 208, I discovered that
> __coerce__ has some surprising properties.  Initially I
> implemented __coerce__ so that the numeric operation currently
> being performed was called on the values returned by __coerce__.
> This caused test_class to blow up due to code like this:
>
>     class Test:
>         def __coerce__(self, other):
>             return (self, other)
>
> The 2.0 "solves" this by not calling __coerce__ again if the
> objects returned by __coerce__ are instances.

If C.__coerce__ doesn't *know* it can do the full job, it should return
None.   This is what's documented, too:  a coerce method should return a
pair consisting of objects of the same type, or return None.

It's always going to be somewhat clumsy since what you really want is double
(or, in the case of pow, sometimes triple) dispatch.

Now there's a deliberate cheat that may not have gotten documented
comprehensibly:  when __coerce__ returns a pair, Python does not check to
verify both elements are of the same class.  That's because "a pair
consisting of objects of the same type" is often not what you *want* from
coerce.  For example, if I've got a matrix class M, then in

    M() + 42

I really don't want M.__coerce__ "promoting" 42 to a multi-gigabyte matrix
matching the shape and size of M().  M.__add__ can deal with that much more
efficiently if it gets 42 directly.  OTOH, M.__coerce__ may want to coerce
types other than scalar numbers to conform to the shape and size of self, or
fiddle self to conform to some other type.

What Python accepts back from __coerce__ has to be flexible enough to allow
all of those without further interference from the interpreter (just ask MAL
<wink>:  the *real* problem in practice is making coerce more of a help than
a burden to the end user; outside of int->long->float->complex (which is
itself partly broken, because long->float can lose precision or even fail
outright), "coercion to a common type" is almost never quite right; note
that C99 introduces distinct imaginary and complex types, because even
auto-conversion of imaginary->complex can be a royal PITA!).

> This has the effect of making code like:
>
>     class A:
>         def __coerce__(self, other):
>             return B(), other
>
>     class B:
>         def __coerce__(self, other):
>             return 1, other
>
>     A() + 1
>
> fail to work in the expected way.

I have no idea how you expected that to work.  Neither coerce() method looks
reasonable:  they don't follow the rules for coerce methods.  If A thinks it
needs to create a B() and have coercion "start over from scratch" with that,
then it should do so explicitly:

    class A:
        def __coerce__(self, other):
            return coerce(B(), other)
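The double dispatch Tim mentions is, in later Python, spelled with
reflected methods rather than __coerce__.  A minimal sketch of the same
idea (toy class, not from the thread):

```python
# Double dispatch without __coerce__: each operand gets a chance.
# __add__ returns NotImplemented to hand control to the other
# operand's __radd__ (reflected add).

class M:
    """Toy 'matrix' that adds ints to itself without promoting them."""
    def __init__(self, value=0):
        self.value = value

    def __add__(self, other):
        if isinstance(other, int):
            # Handle 42 directly -- no multi-gigabyte "promotion" needed
            return M(self.value + other)
        return NotImplemented

    def __radd__(self, other):
        # 42 + M() lands here after int.__add__ gives up
        return self.__add__(other)

assert (M(1) + 41).value == 42
assert (41 + M(1)).value == 42
```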

> The question is: how should __coerce__ work?

This can't be answered by a one-liner:  the intended behavior is documented
by a complex set of rules at the bottom of Lang Ref 3.3.6 ("Emulating
numeric types").  Alternatives should be written up as a diff against those
rules, which Guido worked hard on in years past -- more than once, too
<wink>.




From esr at thyrsus.com  Mon Dec 25 10:17:23 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 25 Dec 2000 04:17:23 -0500
Subject: [Python-Dev] Tkinter support under RH 7.0?
Message-ID: <20001225041723.A9567@thyrsus.com>

I just upgraded to Red Hat 7.0 and installed Python 2.0.  Anybody have
a recipe for making Tkinter support work in this environment?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Government is not reason, it is not eloquence, it is force; like fire, a
troublesome servant and a fearful master. Never for a moment should it be left
to irresponsible action."
	-- George Washington, in a speech of January 7, 1790



From thomas at xs4all.net  Mon Dec 25 11:59:45 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 25 Dec 2000 11:59:45 +0100
Subject: [Python-Dev] Tkinter support under RH 7.0?
In-Reply-To: <20001225041723.A9567@thyrsus.com>; from esr@thyrsus.com on Mon, Dec 25, 2000 at 04:17:23AM -0500
References: <20001225041723.A9567@thyrsus.com>
Message-ID: <20001225115945.A25820@xs4all.nl>

On Mon, Dec 25, 2000 at 04:17:23AM -0500, Eric S. Raymond wrote:

> I just upgraded to Red Hat 7.0 and installed Python 2.0.  Anybody have
> a recipe for making Tkinter support work in this environment?

I installed Python 2.0 + Tkinter both from the BeOpen rpms and later
from source (for various reasons) and both were a breeze. I didn't really
use the 2.0+tkinter rpm version until I needed Numpy and various other
things and had to revert to the self-compiled version, but it seemed to work
fine.

As far as I can recall, there's only two things you have to keep in mind:
the tcl/tk version that comes with RedHat 7.0 is 8.3, so you have to adjust
the Tkinter section of Modules/Setup accordingly, and some of the
RedHat-supplied scripts stop working because they use deprecated modules (at
least 'rand') and use the socket.socket call wrong.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From esr at thyrsus.com  Wed Dec 27 20:37:50 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 27 Dec 2000 14:37:50 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
Message-ID: <20001227143750.A26894@thyrsus.com>

I have 2.0 up and running on RH7.0, compiled from sources.  In the process, 
I discovered a couple of issues:

1. The curses module is commented out in the default Modules/Setup
file.  This is not good, as it may lead careless distribution builders
to ship Python 2.0s that will not be able to support the curses front
end in CML2.  Supporting CML2 (and thus getting Python the "design
win" of being involved in the Linux kernel build) was the major point
of integrating the curses module into the Python core.  It is possible
that one little "#" may have blown that.

2. The default Modules/Setup file assumes that various Tkinter-related libraries
are in /usr/local.  But /usr would be a more appropriate choice under most
circumstances.  Most Linux users now install their Tcl/Tk stuff from RPMs
or .deb packages that place the binaries and libraries under /usr.  Under
most other Unixes (e.g. Solaris) they were there to begin with.

3. The configure machinery could be made to deduce more about the contents
of Modules/Setup than it does now.  In particular, it's silly that the person
building Python has to fill in the locations of X libraries when 
configure is in principle perfectly capable of finding them.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Our society won't be truly free until "None of the Above" is always an option.



From guido at digicool.com  Wed Dec 27 22:04:27 2000
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 27 Dec 2000 16:04:27 -0500
Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support)
In-Reply-To: Your message of "Fri, 22 Dec 2000 14:31:07 EST."
             <LNBBLJKPBEHFEDALKOLCAECIIFAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCAECIIFAA.tim.one@home.com> 
Message-ID: <200012272104.QAA22278@cj20424-a.reston1.va.home.com>

> 2. Guido repeats that he prefers old-style (but in a wishy-washy way that
> leaves it uncertain (*)).

OK, since a pronouncement is obviously needed, here goes: Python C
source code should be indented using tabs only.

Exceptions:

(1) If 3rd party code is already written using a different style, it
    can stay that way, especially if it's a large volume that would be
    hard to reformat.  But only if it is consistent within a file or
    set of files (e.g. a 3rd party patch will have to conform to the
    prevailing style in the patched file).

(2) Occasionally (e.g. in ceval.c) there is code that's very deeply
    nested.  I will allow 4-space indents for the innermost nesting
    levels here.

Other C whitespace nits:

- Always place spaces around assignment operators, comparisons, &&, ||.

- No space between function name and left parenthesis.

- Always a space between a keyword ('if', 'for' etc.) and left paren.

- No space inside parentheses, brackets etc.

- No space before a comma or semicolon.

- Always a space after a comma (and semicolon, if not at end of line).

- Use ``return x;'' instead of ``return(x)''.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From cgw at fnal.gov  Wed Dec 27 23:17:31 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Wed, 27 Dec 2000 16:17:31 -0600 (CST)
Subject: [Python-Dev] sourceforge: problems with bug list?
Message-ID: <14922.27259.456364.750295@buffalo.fnal.gov>

Is it just me, or is anybody else getting this error when trying to
access the bug list?

 > An error occured in the logger. ERROR: pg_atoi: error in "5470/":
 > can't parse "/" 





From akuchlin at mems-exchange.org  Wed Dec 27 23:39:35 2000
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 27 Dec 2000 17:39:35 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001227143750.A26894@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 27, 2000 at 02:37:50PM -0500
References: <20001227143750.A26894@thyrsus.com>
Message-ID: <20001227173935.A25605@kronos.cnri.reston.va.us>

On Wed, Dec 27, 2000 at 02:37:50PM -0500, Eric S. Raymond wrote:
>1. The curses module is commented out in the default Modules/Setup
>file.  This is not good, as it may lead careless distribution builders

It always has been commented out.  Good distributions ship with most
of the available modules enabled; I can't say if RH7.0 counts as a
good distribution or not (still on 6.2).

>3. The configure machinery could be made to deduce more about the contents
>of Modules/Setup than it does now.  In particular, it's silly that the person

This is the point of PEP 229 and patch #102588, which uses a setup.py
script to build extension modules.  (I need to upload an updated
version of the patch which actually includes setup.py -- thought I did
that, but apparently not...)  The patch is still extremely green, but
I think it's the best course; witness the tissue of
hackery required to get the bsddb module automatically detected and
built.

--amk



From guido at digicool.com  Wed Dec 27 23:54:26 2000
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 27 Dec 2000 17:54:26 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: Your message of "Fri, 22 Dec 2000 10:58:56 EST."
             <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> 
References: <E149Tmh-0005KZ-00@usw-sf-web1.sourceforge.net> <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov>  
            <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> 
Message-ID: <200012272254.RAA22931@cj20424-a.reston1.va.home.com>

> Charles G Waldman writes:
>  > I am reminded of Linus Torvalds comments on this subject (see
>  > /usr/src/linux/Documentation/CodingStyle):

Fred replied:
>   The catch, of course, is Python/ceval.c, where breaking it up can
> hurt performance.  People scream when you do things like that....

Funny, Jeremy is doing just that, and it doesn't seem to be hurting
performance at all.  See

 http://sourceforge.net/patch/?func=detailpatch&patch_id=102337&group_id=5470

(though this is not quite finished).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Thu Dec 28 00:05:46 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 27 Dec 2000 18:05:46 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001227173935.A25605@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Wed, Dec 27, 2000 at 05:39:35PM -0500
References: <20001227143750.A26894@thyrsus.com> <20001227173935.A25605@kronos.cnri.reston.va.us>
Message-ID: <20001227180546.A4365@thyrsus.com>

Andrew Kuchling <akuchlin at mems-exchange.org>:
> >1. The curses module is commented out in the default Modules/Setup
> >file.  This is not good, as it may lead careless distribution builders
> 
> It always has been commented out.  Good distributions ship with most
> of the available modules enabled; I can't say if RH7.0 counts as a
> good distribution or not (still on 6.2).

I think this needs to change.  If curses is a core facility now, the
default build should treat it as one.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

If a thousand men were not to pay their tax-bills this year, that would
... [be] the definition of a peaceable revolution, if any such is possible.
	-- Henry David Thoreau



From tim.one at home.com  Thu Dec 28 01:44:29 2000
From: tim.one at home.com (Tim Peters)
Date: Wed, 27 Dec 2000 19:44:29 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Misc python-mode.el,3.108,3.109
In-Reply-To: <E14BKaD-0004JB-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKFIFAA.tim.one@home.com>

[Barry Warsaw]
> Modified Files:
> 	python-mode.el
> Log Message:
> (python-font-lock-keywords): Add highlighting of `as' as a keyword,
> but only in "import foo as bar" statements (including optional
> preceding `from' clause).

Oh, that's right, try to make IDLE look bad, will you?  I've got half a mind
to take up the challenge.  Unfortunately, I only have half a mind in total,
so you may get away with this backstabbing for a while <wink>.




From thomas at xs4all.net  Thu Dec 28 10:53:31 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 28 Dec 2000 10:53:31 +0100
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001227143750.A26894@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 27, 2000 at 02:37:50PM -0500
References: <20001227143750.A26894@thyrsus.com>
Message-ID: <20001228105331.A6042@xs4all.nl>

On Wed, Dec 27, 2000 at 02:37:50PM -0500, Eric S. Raymond wrote:
> I have 2.0 up and running on RH7.0, compiled from sources.  In the process, 
> I discovered a couple of issues:

> 1. The curses module is commented out in the default Modules/Setup
> file.  This is not good, as it may lead careless distribution builders
> to ship Python 2.0s that will not be able to support the curses front
> end in CML2.  Supporting CML2 (and thus getting Python the "design
> win" of being involved in the Linux kernel build) was the major point
> of integrating the curses module into the Python core.  It is possible
> that one little "#" may have blown that.

Note that Tkinter is off by default too. And readline. And ssl. And the use
of shared libraries. We *can't* enable the cursesmodule by default, because
we don't know what the system's curses library is called. We'd have to
auto-detect that before we can enable it (and lots of other modules)
automatically, and that's a lot of work. I personally favour autoconf for
the job, but since amk is already busy on using distutils, I'm not going to
work on that.

> 2.The default Modules/Setup file assumes that various Tkinter-related libraries
> are in /usr/local.  But /usr would be a more appropriate choice under most
> circumstances.  Most Linux users now install their Tcl/Tk stuff from RPMs
> or .deb packages that place the binaries and libraries under /usr.  Under
> most other Unixes (e.g. Solaris) they were there to begin with.

This is nonsense. The line above it specifically states 'edit to reflect
where your Tcl/Tk headers are'. And aside from the issue of whether they
are usually found in /usr (I don't believe so, not even on Solaris, but
'my' Solaris box doesn't even have Tcl/Tk), /usr/local is a perfectly sane
choice, since /usr is already included in the include path, but /usr/local
usually is not.

> 3. The configure machinery could be made to deduce more about the contents
> of Modules/Setup than it does now.  In particular, it's silly that the person
> building Python has to fill in the locations of X librasries when 
> configure is in principle perfectly capable of finding them.

In principle, I agree. It's a lot of work, though. For instance, Debian
stores the Tcl/Tk headers in /usr/include/tcl<version>, which means you can
compile for more than one tcl version, by just changing your include path
and the library you link with. And there are undoubtedly several other
variants out there.

Should we really make the Setup file default to Linux, and leave other
operating systems in the dark about what it might be on their system ? I
think people with Linux and without clue are the least likely people to
compile their own Python, since Linux distributions already come with a
decent enough Python. And, please, let's assume the people assembling
those know how to read?

Maybe we just need a HOWTO document covering Setup ?

(Besides, won't this all be fixed when CML2 comes with a distribution, Eric ?
They'll *have* to have working curses/tkinter then :-)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From MarkH at ActiveState.com  Thu Dec 28 13:34:09 2000
From: MarkH at ActiveState.com (Mark Hammond)
Date: Thu, 28 Dec 2000 23:34:09 +1100
Subject: [Python-Dev] Fwd: try...else
Message-ID: <3A4B3341.5010707@ActiveState.com>

Spotted on c.l.python.  Although Pythonwin is mentioned, python.exe 
gives the same results - as does 1.5.2.

Seems a reasonable question...

[Also, if Robin hasn't been invited to join us here, I think it could 
make some sense...]

Mark.
-------- Original Message --------
Subject: try...else
Date: Fri, 22 Dec 2000 18:02:27 +0000
From: Robin Becker <robin at jessikat.fsnet.co.uk>
Newsgroups: comp.lang.python

I had expected that in try: except: else
the else clause always got executed, but it seems not for return

PythonWin 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on
win32.Portions Copyright 1994-2000 Mark Hammond (MarkH at ActiveState.com)
- see 'Help/About PythonWin' for further copyright information.
 >>> def bang():
....     try:
....             return 'return value'
....     except:
....             print 'bang failed'
....     else:
....             print 'bang succeeded'
....
  >>> bang()
'return value'
 >>>

is this a 'feature' or bug. The 2.0 docs seem not to mention
return/continue except for try finally.
-- 
Robin Becker




From mal at lemburg.com  Thu Dec 28 15:45:49 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Dec 2000 15:45:49 +0100
Subject: [Python-Dev] Fwd: try...else
References: <3A4B3341.5010707@ActiveState.com>
Message-ID: <3A4B521D.4372224A@lemburg.com>

Mark Hammond wrote:
> 
> Spotted on c.l.python.  Although Pythonwin is mentioned, python.exe
> gives the same results - as does 1.5.2.
> 
> Seems a reasonable question...
> 
> [Also, if Robin hasn't been invited to join us here, I think it could
> make some sense...]
> 
> Mark.
> -------- Original Message --------
> Subject: try...else
> Date: Fri, 22 Dec 2000 18:02:27 +0000
> From: Robin Becker <robin at jessikat.fsnet.co.uk>
> Newsgroups: comp.lang.python
> 
> I had expected that in try: except: else
> the else clause always got executed, but it seems not for return

I think Robin mixed up try...finally with try...except...else.
The finally clause is executed even if an exception occurred.

He does have a point however that 'return' will bypass a
try...else clause (a try...finally clause, by contrast, still
runs on return).  I don't think we can change that behaviour,
though, as it would break code.
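A quick check of the semantics (sketch; this combines else and finally
in one statement for brevity):

```python
log = []

def bang():
    try:
        return 'return value'
    except Exception:
        log.append('bang failed')
    else:
        # never reached: the return in the try block exits first
        log.append('bang succeeded')
    finally:
        # a finally clause, by contrast, always runs -- even on return
        log.append('cleanup')

result = bang()
# result == 'return value', log == ['cleanup']
```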
 
> PythonWin 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on
> win32.Portions Copyright 1994-2000 Mark Hammond (MarkH at ActiveState.com)
> - see 'Help/About PythonWin' for further copyright information.
>  >>> def bang():
> ....     try:
> ....             return 'return value'
> ....     except:
> ....             print 'bang failed'
> ....     else:
> ....             print 'bang succeeded'
> ....
>   >>> bang()
> 'return value'
>  >>>
> 
> is this a 'feature' or a bug? The 2.0 docs seem not to mention
> return/continue except for try finally.
> --
> Robin Becker
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://www.python.org/mailman/listinfo/python-dev

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Thu Dec 28 16:04:23 2000
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 28 Dec 2000 10:04:23 -0500
Subject: [Python-Dev] chomp()?
Message-ID: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>

Someone just posted a patch to implement s.chomp() as a string method:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103029&group_id=5470

Pseudo code (for those not aware of the Perl function by that name):

def chomp(s):
    if s[-2:] == '\r\n':
        return s[:-2]
    if s[-1:] == '\r' or s[-1:] == '\n':
        return s[:-1]
    return s

I.e. it removes a trailing \r\n, \r, or \n.

Any comments?  Is this needed given that we have s.rstrip() already?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Thu Dec 28 16:30:57 2000
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 28 Dec 2000 10:30:57 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: Your message of "Wed, 27 Dec 2000 14:37:50 EST."
             <20001227143750.A26894@thyrsus.com> 
References: <20001227143750.A26894@thyrsus.com> 
Message-ID: <200012281530.KAA26049@cj20424-a.reston1.va.home.com>

Eric,

I think your recent posts have shown a worldview that's a bit too
Eric-centered. :-)

Not all the world is Linux.  CML2 isn't the only Python application
that matters.  Python world domination is not a goal.  There is no
Eric conspiracy! :-)

That said, I think that the future is bright: Andrew is already
working on a much more intelligent configuration manager.

I believe it would be a mistake to enable curses by default using the
current approach to module configuration: it doesn't compile out of
the box on every platform, and you wouldn't believe how much email I
get from clueless Unix users trying to build Python when there's a
problem like that in the distribution.

So I'd rather wait for Andrew's work.  You could do worse than help
him with that, to further your goal!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Thu Dec 28 16:41:23 2000
From: fdrake at acm.org (Fred L. Drake)
Date: Thu, 28 Dec 2000 10:41:23 -0500
Subject: [Python-Dev] chomp()?
In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <web-403062@digicool.com>

On Thu, 28 Dec 2000 10:04:23 -0500, Guido
<guido at digicool.com> wrote:
 > Someone just posted a patch to implement s.chomp() as a
 > string method:
...
 > Any comments?  Is this needed given that we have
 > s.rstrip() already?

  I've always considered this a different operation from
rstrip().  When you intend to be as surgical in your changes
as possible, it is important *not* to use rstrip().
  I don't feel strongly that it needs to be implemented in
C, though I imagine people who do a lot of string processing
feel otherwise.  It's just hard to beat the performance
difference if you are doing this a lot.
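A small sketch of the difference (chomp here is transcribed from the
pseudo-code in Guido's post; it is not an existing string method):

```python
def chomp(s):
    # remove exactly one trailing line ending, per the proposed patch
    if s[-2:] == '\r\n':
        return s[:-2]
    if s[-1:] == '\r' or s[-1:] == '\n':
        return s[:-1]
    return s

# chomp() is surgical; rstrip() removes *all* trailing whitespace:
a = chomp('data  \n')      # 'data  '  -- trailing spaces survive
b = 'data  \n'.rstrip()    # 'data'    -- spaces are gone too
c = chomp('data\n\n')      # 'data\n'  -- only one newline removed
d = 'data\n\n'.rstrip()    # 'data'    -- both newlines removed
```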


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From barry at digicool.com  Thu Dec 28 17:00:36 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Thu, 28 Dec 2000 11:00:36 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Misc python-mode.el,3.108,3.109
References: <E14BKaD-0004JB-00@usw-pr-cvs1.sourceforge.net>
	<LNBBLJKPBEHFEDALKOLCGEKFIFAA.tim.one@home.com>
Message-ID: <14923.25508.668453.186209@anthem.concentric.net>

>>>>> "TP" == Tim Peters <tim.one at home.com> writes:

    TP> [Barry Warsaw]
    >> Modified Files: python-mode.el Log Message:
    >> (python-font-lock-keywords): Add highlighting of `as' as a
    >> keyword, but only in "import foo as bar" statements (including
    >> optional preceding `from' clause).

    TP> Oh, that's right, try to make IDLE look bad, will you?  I've
    TP> got half a mind to take up the challenge.  Unfortunately, I
    TP> only have half a mind in total, so you may get away with this
    TP> backstabbing for a while <wink>.

With my current network (un)connectivity, I feel like a nuclear sub
which can only surface once a month to receive low frequency orders
from some remote antenna farm out in New Brunswick.  Just think of me
as a rogue commander who tries to do as much damage as possible when
he's not joyriding in the draft-wake of giant squids.

rehoming-all-remaining-missiles-at-the-Kingdom-of-Timbotia-ly y'rs,
-Barry




From esr at thyrsus.com  Thu Dec 28 17:01:54 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 11:01:54 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <200012281530.KAA26049@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 10:30:57AM -0500
References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com>
Message-ID: <20001228110154.D32394@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Not all the world is Linux.  CML2 isn't the only Python application
> that matters.  Python world domination is not a goal.  There is no
> Eric conspiracy! :-)

Perhaps I misunderstood you, then.  I thought you considered CML2 a
potentially important design win, and that was why curses didn't get
dropped from the core.  Have you changed your mind about this?

If Python world domination is not a goal then I can only conclude that
you haven't had your morning coffee yet :-).

There's a more general question here about what it means for something
to be in the core language.  Developers need to have a clear,
bright-line picture of what they can count on to be present.  To me
this implies that it's the job of the Python maintainers to make sure
that a facility declared "core" by its presence in the standard
library documentation is always present, for maximum "batteries are
included" effect.  

Yes, dealing with cross-platform variations in linking curses is a
pain -- but dealing with that kind of pain so the Python user doesn't
have to is precisely our job.  Or so I understand it, anyway.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Conservatism is the blind and fear-filled worship of dead radicals.



From moshez at zadka.site.co.il  Thu Dec 28 17:51:32 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: 28 Dec 2000 16:51:32 -0000
Subject: [Python-Dev] chomp()?
In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <20001228165132.8025.qmail@stimpy.scso.com>

On Thu, 28 Dec 2000, Guido van Rossum <guido at digicool.com> wrote:

> Someone just posted a patch to implement s.chomp() as a string method:
...
> Any comments?  Is this needed given that we have s.rstrip() already?

Yes.

import fileinput

i = 0
for line in fileinput.input():
    print '%d: %s' % (i, line.chomp())
    i += 1

I want that operation to be invertable by

sed 's/^[0-9]*: //'
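The round trip Moshe wants can be spelled out (chomp is the proposed
method, simulated here as a plain function; unnumber stands in for the
sed expression):

```python
def chomp(s):
    # the proposed behaviour: strip exactly one trailing line ending
    if s.endswith('\r\n'):
        return s[:-2]
    if s.endswith('\r') or s.endswith('\n'):
        return s[:-1]
    return s

def number(i, line):
    return '%d: %s\n' % (i, chomp(line))

def unnumber(s):
    # the moral equivalent of sed 's/^[0-9]*: //'
    return s.split(': ', 1)[1]

line = 'trailing spaces   \n'
round_trip = unnumber(number(0, line))
# round_trip == line: lossless with chomp(); with rstrip() instead,
# the trailing spaces would be destroyed and the inverse would fail
```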



From guido at digicool.com  Thu Dec 28 18:08:18 2000
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 28 Dec 2000 12:08:18 -0500
Subject: [Python-Dev] scp to sourceforge
Message-ID: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>

I've seen a thread on this but there was no conclusive answer, so I'm
reopening this.

I can't SCP updated PEPs to the SourceForge machine.  The "pep2html.py
-i" command just hangs.  I can ssh into shell.sourceforge.net just
fine, but scp just hangs.  "scp -v" prints a bunch of things
suggesting that it can authenticate itself just fine, ending with
these three lines:

  cj20424-a.reston1.va.home.com: RSA authentication accepted by server.
  cj20424-a.reston1.va.home.com: Sending command: scp -v -t .
  cj20424-a.reston1.va.home.com: Entering interactive session.

and then nothing.  It just sits there.

Would somebody please figure out a way to update the PEPs?  It's kind
of pathetic to see the website not have the latest versions...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Thu Dec 28 17:28:07 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: 28 Dec 2000 16:28:07 -0000
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <3A4B521D.4372224A@lemburg.com>
References: <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com>
Message-ID: <20001228162807.7229.qmail@stimpy.scso.com>

On Thu, 28 Dec 2000, "M.-A. Lemburg" <mal at lemburg.com> wrote:

> He does have a point however that 'return' will bypass 
> try...else and try...finally clauses. I don't think we can change
> that behaviour, though, as it would break code.

It doesn't bypass try..finally:

>>> def foo():
...     try:
...             print "hello"
...             return
...     finally:
...             print "goodbye"
...
>>> foo()
hello
goodbye




From guido at digicool.com  Thu Dec 28 17:43:26 2000
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 28 Dec 2000 11:43:26 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: Your message of "Thu, 28 Dec 2000 11:01:54 EST."
             <20001228110154.D32394@thyrsus.com> 
References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com>  
            <20001228110154.D32394@thyrsus.com> 
Message-ID: <200012281643.LAA26687@cj20424-a.reston1.va.home.com>

> Guido van Rossum <guido at digicool.com>:
> > Not all the world is Linux.  CML2 isn't the only Python application
> > that matters.  Python world domination is not a goal.  There is no
> > Eric conspiracy! :-)
> 
> Perhaps I misunderstood you, then.  I thought you considered CML2 a
> potentially important design win, and that was why curses didn't get
> dropped from the core.  Have you changed your mind about this?

Supporting CML2 was one of the reasons to keep curses in the core, but
not the only one.  Linux kernel configuration is so far removed from
my daily use of computers that I don't have a good way to judge its
importance in the grand scheme of things.  Since you obviously
consider it very important, and since I generally trust your judgement
(except on the issue of firearms :-), your plea for keeping, and
improving, curses support in the Python core made a difference in my
decision.  And don't worry, I don't expect to change that decision
-- though I personally still find it curious that curses is so important.
I find curses-style user interfaces pretty pathetic, and wished that
Linux migrated to a real GUI for configuration.  (And the linuxconf
approach does *not* qualify as a real GUI. :-)

> If Python world domination is not a goal then I can only conclude that
> you haven't had your morning coffee yet :-).

Sorry to disappoint you, Eric.  I gave up coffee years ago. :-)

I was totally serious though: my personal satisfaction doesn't come
from Python world domination.  Others seem to have that goal, and if it
doesn't inconvenience me too much I'll play along, but in the end I've
got some goals in my personal life that are much more important.

> There's a more general question here about what it means for something
> to be in the core language.  Developers need to have a clear,
> bright-line picture of what they can count on to be present.  To me
> this implies that it's the job of the Python maintainers to make sure
> that a facility declared "core" by its presence in the standard
> library documentation is always present, for maximum "batteries are
> included" effect.  

We do the best we can.  Using the current module configuration system,
it's a one-character edit to enable curses if you need it.  With
Andrew's new scheme, it will be automatic.
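What that one-character edit amounts to, sketched in Python (the exact
Modules/Setup entry is platform-dependent; the lines below are only
illustrative):

```python
# Enabling curses in Modules/Setup means deleting the leading '#'
# from its entry. Hypothetical file contents:
setup_lines = [
    'posix posixmodule.c\n',
    '#curses cursesmodule.c -lcurses\n',   # disabled by default
]

enabled = [line.lstrip('#') if line.startswith('#curses') else line
           for line in setup_lines]
# enabled[1] == 'curses cursesmodule.c -lcurses\n'
```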

> Yes, dealing with cross-platform variations in linking curses is a
> pain -- but dealing with that kind of pain so the Python user doesn't
> have to is precisely our job.  Or so I understand it, anyway.

So help Andrew: http://python.sourceforge.net/peps/pep-0229.html

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Thu Dec 28 17:52:36 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Dec 2000 17:52:36 +0100
Subject: [Python-Dev] Fwd: try...else
References: <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com>
Message-ID: <3A4B6FD3.9B576E9A@lemburg.com>

Moshe Zadka wrote:
> 
> On Thu, 28 Dec 2000, "M.-A. Lemburg" <mal at lemburg.com> wrote:
> 
> > He does have a point however that 'return' will bypass
> > try...else and try...finally clauses. I don't think we can change
> > that behaviour, though, as it would break code.
> 
> It doesn't bypass try..finally:
> 
> >>> def foo():
> ...     try:
> ...             print "hello"
> ...             return
> ...     finally:
> ...             print "goodbye"
> ...
> >>> foo()
> hello
> goodbye

Hmm, that must have changed between Python 1.5 and more recent
versions:

Python 1.5:
>>> def f():
...     try:
...             return 1
...     finally:
...             print 'finally'
... 
>>> f()
1
>>> 

Python 2.0:
>>> def f():
...     try:
...             return 1
...     finally:
...             print 'finally'
... 
>>> f()
finally
1
>>>

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From moshez at stimpy.scso.com  Thu Dec 28 17:59:32 2000
From: moshez at stimpy.scso.com (Moshe Zadka)
Date: 28 Dec 2000 16:59:32 -0000
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <3A4B6FD3.9B576E9A@lemburg.com>
References: <3A4B6FD3.9B576E9A@lemburg.com>, <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com>
Message-ID: <20001228165932.8143.qmail@stimpy.scso.com>

On Thu, 28 Dec 2000 17:52:36 +0100, "M.-A. Lemburg" <mal at lemburg.com> wrote:

[about try..finally not playing well with return]
> Hmm, that must have changed between Python 1.5 and more recent
> versions:

I posted a 1.5.2 test. So it changed between 1.5 and 1.5.2?



From esr at thyrsus.com  Thu Dec 28 18:20:48 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 12:20:48 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001228105331.A6042@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 10:53:31AM +0100
References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl>
Message-ID: <20001228122048.A1381@thyrsus.com>

Thomas Wouters <thomas at xs4all.net>:
> > 1. The curses module is commented out in the default Modules/Setup
> > file.  This is not good, as it may lead careless distribution builders
> > to ship Python 2.0s that will not be able to support the curses front
> > end in CML2.  Supporting CML2 (and thus getting Python the "design
> > win" of being involved in the Linux kernel build) was the major point
> > of integrating the curses module into the Python core.  It is possible
> > that one little "#" may have blown that.
> 
> Note that Tkinter is off by default too. And readline. And ssl. And the use
> of shared libraries.

IMO ssl isn't an issue because it's not documented as being in the standard
module set.  Readline is a minor issue because raw_input()'s functionality
changes somewhat if it's not linked, but I think we can live with this -- the
change isn't visible to calling programs.  

Hm.  It appears tkinter isn't documented in the standard set of modules 
either.  Interesting.  Technically this means I don't have a problem with
it not being built in by default, but I think there is a problem here...

My more general point is that right now Python has three classes of modules:

1. Documented as being in the core and built in by default.
2. Not documented as being in the core and not built in by default.
3. Documented as being in the core but not built in by default.

My more general claim is that the existence of class 3 is a problem,
because it compromises the "batteries are included" effect -- it means
Python users don't have a bright-line test for what will be present in
every Python (or at least every Python on an operating system
theoretically feature-compatible with theirs).

My struggle to get CML2 adopted brings this problem into particularly
sharp focus because the kernel group is allergic to big footprints or
having to download extension modules to do a build.  But the issue is
really broader than that.  I think we ought to be migrating stuff out
of class 3 into class 1 where possible and to class 2 only where
unavoidable.

>         We *can't* enable the cursesmodule by default, because
> we don't know what the system's curses library is called. We'd have to
> auto-detect that before we can enable it (and lots of other modules)
> automatically, and that's a lot of work. I personally favour autoconf for
> the job, but since amk is already busy on using distutils, I'm not going to
> work on that.

Yes, we need to do a lot more autodetection -- this is a large part of
my point.  I have nothing against distutils, but I don't see how it
solves this problem unless we assume that we'll always have Python
already available on any platform where we're building Python.

I'm willing to put my effort where my mouth is on this.  I have a lot
of experience with autoconf; I'm willing to write some of these nasty
config tests.

> > 2.The default Modules/Setup file assumes that various Tkinter-related libraries
> > are in /usr/local.  But /usr would be a more appropriate choice under most
> > circumstances.  Most Linux users now install their Tcl/Tk stuff from RPMs
> > or .deb packages that place the binaries and libraries under /usr.  Under
> > most other Unixes (e.g. Solaris) they were there to begin with.
> 
> This is nonsense. The line above it specifically states 'edit to reflect
> where your Tcl/Tk headers are'. And aside from the issue of whether they are
> usually found in /usr (I don't believe so, not even on Solaris, but 'my'
> Solaris box doesn't even have tcl/tk,) /usr/local is a perfectly sane
> choice, since /usr is already included in the include-path, but /usr/local
> usually is not.

Is it?  That is not clear from the comment.  Perhaps this is just a 
documentation problem.  I'll look again.
 
> > 3. The configure machinery could be made to deduce more about the contents
> > of Modules/Setup than it does now.  In particular, it's silly that the
person building Python has to fill in the locations of X libraries when 
> > configure is in principle perfectly capable of finding them.
> 
> In principle, I agree. It's a lot of work, though. For instance, Debian
> stores the Tcl/Tk headers in /usr/include/tcl<version>, which means you can
> compile for more than one tcl version, by just changing your include path
> and the library you link with. And there are undoubtedly several other
> variants out there.

As I said to Guido, I think it is exactly our job to deal with this sort
of grottiness.  One of Python's major selling points is supposed to be
cross-platform consistency of the API.  If we fail to do what you're
describing, we're failing to meet Python users' reasonable expectations
for the language.

> Should we really make the Setup file default to Linux, and leave other
> operating systems in the dark about what it might be on their system? I
> think people with Linux and without clue are the least likely people to
> compile their own Python, since Linux distributions already come with a
> decent enough Python. And, please, let's assume the people assembling
> those know how to read?

Please note that I am specifically *not* advocating making the build defaults
Linux-centric.  That's not my point at all.

> Maybe we just need a HOWTO document covering Setup ?

That would be a good idea.

> (Besides, won't this all be fixed when CML2 comes with a distribution, Eric?
> They'll *have* to have working curses/tkinter then :-)

I'm concerned that it will work the other way around, that CML2 won't happen
if the core does not reliably include these facilities.  In itself CML2 
not happening wouldn't be the end of the world of course, but I'm pushing on
this because I think the larger issue of class 3 modules is actually important
to the health of Python and needs to be attacked seriously.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The Bible is not my book, and Christianity is not my religion.  I could never
give assent to the long, complicated statements of Christian dogma.
	-- Abraham Lincoln



From cgw at fnal.gov  Thu Dec 28 18:36:06 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Thu, 28 Dec 2000 11:36:06 -0600 (CST)
Subject: [Python-Dev] chomp()?
In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <14923.31238.65155.496546@buffalo.fnal.gov>

Guido van Rossum writes:
 > Someone just posted a patch to implement s.chomp() as a string method:
 > I.e. it removes a trailing \r\n, \r, or \n.
 > 
 > Any comments?  Is this needed given that we have s.rstrip() already?

-1 from me.  P=NP (Python is not Perl).  "Chomp" is an excessively cute name.
And like you said, this is too much like "rstrip" to merit a separate
method.





From esr at thyrsus.com  Thu Dec 28 18:41:17 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 12:41:17 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <200012281643.LAA26687@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 11:43:26AM -0500
References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com> <20001228110154.D32394@thyrsus.com> <200012281643.LAA26687@cj20424-a.reston1.va.home.com>
Message-ID: <20001228124117.B1381@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Supporting CML2 was one of the reasons to keep curses in the core, but
> not the only one.  Linux kernel configuration is so far removed from
> my daily use of computers that I don't have a good way to judge its
> importance in the grand scheme of things.  Since you obviously
> consider it very important, and since I generally trust your judgement
> (except on the issue of firearms :-), your plea for keeping, and
> improving, curses support in the Python core made a difference in my
> decision.  And don't worry, I don't expect to change that decision
> -- though I personally still find it curious that curses is so important.
> I find curses-style user interfaces pretty pathetic, and wished that
> Linux migrated to a real GUI for configuration.  (And the linuxconf
> approach does *not* qualify as a real GUI. :-)

Thank you, that makes your priorities much clearer.

Actually I agree with you that curses interfaces are mostly pretty
pathetic.  A lot of people still like them, though, because they tend
to be fast and lightweight.  Then, too, a really well-designed curses
interface can in fact be good enough that the usability gain from
GUIizing is marginal.  My favorite examples of this are mutt and slrn.
The fact that GUI programs have failed to make much headway against
this is not simply due to user conservatism, it's genuinely hard to
see how a GUI interface could be made significantly better.

And unfortunately, there is a niche where it is still important to
support curses interfacing independently of anyone's preferences in
interface style -- early in the system-configuration process before
one has bootstrapped to the point where X is reliably available.  I
hasten to add that this is not just *my* problem -- one of your
more important Python constituencies in a practical sense is 
the guys who maintain Red Hat's installer.

> I was totally serious though: my personal satisfaction doesn't come
> from Python world domination.  Others seem to have that goal, and if it
> doesn't inconvenience me too much I'll play along, but in the end I've
> got some goals in my personal life that are much more important.

There speaks the new husband :-).  OK.  So what *do* you want from Python?

Personally, BTW, my goal is not exactly Python world domination either
-- it's that the world should be dominated by the language that has
the least tendency to produce grotty fragile code (remember that I
tend to obsess about the software-quality problem :-)).  Right now
that's Python.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The people of the various provinces are strictly forbidden to have in their
possession any swords, short swords, bows, spears, firearms, or other types
of arms. The possession of unnecessary implements makes difficult the
collection of taxes and dues and tends to foment uprisings.
        -- Toyotomi Hideyoshi, dictator of Japan, August 1588



From mal at lemburg.com  Thu Dec 28 18:43:13 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Dec 2000 18:43:13 +0100
Subject: [Python-Dev] chomp()?
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <3A4B7BB1.F09660ED@lemburg.com>

Guido van Rossum wrote:
> 
> Someone just posted a patch to implement s.chomp() as a string method:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=103029&group_id=5470
> 
> Pseudo code (for those not aware of the Perl function by that name):
> 
> def chomp(s):
>     if s[-2:] == '\r\n':
>         return s[:-2]
>     if s[-1:] == '\r' or s[-1:] == '\n':
>         return s[:-1]
>     return s
> 
> I.e. it removes a trailing \r\n, \r, or \n.
> 
> Any comments?  Is this needed given that we have s.rstrip() already?

We already have .splitlines() which does the above (remove
line breaks) not only for a single line, but for many lines at once.

Even better: .splitlines() also does the right thing for Unicode.
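For reference, a quick look at what .splitlines() does with the three
line-ending conventions:

```python
s = 'one\r\ntwo\rthree\n'
parts = s.splitlines()            # ['one', 'two', 'three']

# On a single line it matches the proposed chomp() whenever a line
# ending is actually present:
x = 'data\r\n'.splitlines()[0]    # 'data'
y = 'no newline'.splitlines()[0]  # 'no newline'
```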

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Thu Dec 28 20:06:33 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 28 Dec 2000 20:06:33 +0100
Subject: [Python-Dev] Fwd: try...else
References: <3A4B6FD3.9B576E9A@lemburg.com>, <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com> <20001228165932.8143.qmail@stimpy.scso.com>
Message-ID: <3A4B8F39.58C64EFB@lemburg.com>

Moshe Zadka wrote:
> 
> On Thu, 28 Dec 2000 17:52:36 +0100, "M.-A. Lemburg" <mal at lemburg.com> wrote:
> 
> [about try..finally not playing well with return]
> > Hmm, that must have changed between Python 1.5 and more recent
> > versions:
> 
> I posted a 1.5.2 test. So it changed between 1.5 and 1.5.2?

Sorry, false alarm: there was a bug in my patched 1.5 version.
The original 1.5 version does not show the described behaviour.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Thu Dec 28 21:21:15 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 28 Dec 2000 21:21:15 +0100
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <3A4B521D.4372224A@lemburg.com>; from mal@lemburg.com on Thu, Dec 28, 2000 at 03:45:49PM +0100
References: <3A4B3341.5010707@ActiveState.com> <3A4B521D.4372224A@lemburg.com>
Message-ID: <20001228212115.C1811@xs4all.nl>

On Thu, Dec 28, 2000 at 03:45:49PM +0100, M.-A. Lemburg wrote:

> > I had expected that in try: except: else
> > the else clause always got executed, but it seems not for return

> I think Robin mixed up try...finally with try...except...else.
> The finally clause is executed even in case an exception occurred.

(MAL and I already discussed this in private mail: Robin did mean
try/except/else, and 'finally' already executes when returning directly from
the 'try' block, even in Python 1.5)

> He does have a point however that 'return' will bypass 
> try...else and try...finally clauses. I don't think we can change
> that behaviour, though, as it would break code.

This code:

try:
   return
except:
   pass
else:
   print "returning"

will indeed not print 'returning', but I believe it's by design. I'm against
changing it, in any case, and not just because it'd break code :) If you
want something that always executes, use a 'finally'. Or don't return from
the 'try', but return in the 'else' clause. 

The 'except' clause is documented to execute if a matching exception occurs,
and 'else' if no exception occurs. Maybe the intent of the 'else' clause
would be clearer if it was documented to 'execute if the try: clause
finishes without an exception being raised' ? The 'else' clause isn't
executed when you 'break' or (after applying my continue-in-try patch ;)
'continue' out of the 'try', either.
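A minimal sketch of the return-from-'else' pattern Thomas suggests,
using a list instead of print so the effect shows up in the return
value:

```python
def bang(fail=False):
    steps = []
    try:
        if fail:
            raise ValueError('boom')
        steps.append('try ran')
    except ValueError:
        steps.append('bang failed')
    else:
        # runs, because the return was moved out of the 'try' block
        steps.append('bang succeeded')
        return steps
    return steps

r1 = bang()            # ['try ran', 'bang succeeded']
r2 = bang(fail=True)   # ['bang failed']
```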

Robin... Did I already reply to this, on python-list or to you directly? I
distinctly remember writing that post, but I'm not sure if it arrived. Maybe
I didn't send it after all, or maybe something on mail.python.org is
detaining it?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Thu Dec 28 19:19:06 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 28 Dec 2000 19:19:06 +0100
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001228122048.A1381@thyrsus.com>; from esr@thyrsus.com on Thu, Dec 28, 2000 at 12:20:48PM -0500
References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> <20001228122048.A1381@thyrsus.com>
Message-ID: <20001228191906.F1281@xs4all.nl>

On Thu, Dec 28, 2000 at 12:20:48PM -0500, Eric S. Raymond wrote:

> My more general point is that right now Python has three classes of
> modules:

> 1. Documented as being in the core and built in by default.
> 2. Not documented as being in the core and not built in by default.
> 3. Documented as being in the core but not built in by default.

> My more general claim is that the existence of class 3 is a problem,
> because it compromises the "batteries are included" effect -- it means
> Python users don't have a bright-line test for what will be present in
> every Python (or at least every Python on an operating system
> theoretically feature-compatible with theirs).

It depends on your definition of 'being in the core'. Some of the things
that are 'in the core' are simply not possible on all platforms. So if you
want really portable code, you don't want to use them. Other features are
available on all systems that matter [to you], so you don't really care
about it, just use them, and at best document that they need feature X.

There is also the subtle difference between a Python user and a Python
compiler/assembler (excuse my overloading of the terms, but you know what I
mean). People who choose to compile their own Python should realize that
they might disable or misconfigure some parts of it. I personally trust most
people that assemble OS distributions to compile a proper Python binary +
modules, but I think a HOWTO isn't a bad idea -- unless we autoconf
everything.

> I think we ought to be migrating stuff out
> of class 3 into class 1 where possible and to class 2 only where
> unavoidable.

[ and ]

> I'm willing to put my effort where my mouth is on this.  I have a lot
> of experience with autoconf; I'm willing to write some of these nasty
> config tests.

[ and ]

> As I said to Guido, I think it is exactly our job to deal with this sort
> of grottiness.  One of Python's major selling points is supposed to be
> cross-platform consistency of the API.  If we fail to do what you're
> describing, we're failing to meet Python users' reasonable expectations
> for the language.

[ and ]

> Please note that I am specifically *not* advocating making the build defaults
> Linux-centric.  That's not my point at all.

I apologize for the tone of my previous post, and the above snippet. I'm not
trying to block progress here ;) I'm actually all for autodetecting as much
as possible, and more than willing to put effort into it as well (as long as
it's deemed useful, and isn't supplanted by a distutils variant weeks
later.) And I personally have my doubts about the distutils variant, too,
but that's partly because I have little experience with distutils. If we can
work out a deal where both autoconf and distutils are an option, I'm happy
to write a few, if not all, autoconf tests for the currently disabled
modules.

So, Eric, let's split the work. I'll do Tkinter if you do curses. :)

However, I'm also keeping those oddball platforms that just don't support
some features in mind. If you want truly portable code, you have to work at
it. I think it's perfectly okay to say "your Python needs to have the curses
module or the tkinter module compiled in -- contact your administrator if it
has neither". There will still be platforms that don't have curses, or
syslog, or crypt(), though hopefully none of them will be Linux.

Oh, and I also apologize for possibly duplicating what has already been said
by others. I haven't seen anything but this post (which was CC'd to me
directly) since I posted my reply to Eric, due to the ululating bouts of
delay on mail.python.org. Maybe DC should hire some *real* sysadmins,
instead of those silly programmer-kniggits? >:->

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mwh21 at cam.ac.uk  Thu Dec 28 19:27:48 2000
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: Thu, 28 Dec 2000 18:27:48 +0000 (GMT)
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <3A4B521D.4372224A@lemburg.com>
Message-ID: <Pine.SOL.4.21.0012281820240.3801-100000@yellow.csi.cam.ac.uk>

On Thu, 28 Dec 2000, M.-A. Lemburg wrote:

> I think Robin mixed up try...finally with try...except...else.

I think so too.

> The finally clause is executed even in case an exception occurred.
> 
> He does have a point however that 'return' will bypass 
> try...else and try...finally clauses. I don't think we can change
> that behaviour, though, as it would break code.

return does not skip finally clauses[1].  In my not especially humble
opinion, the current behaviour is the Right Thing.  I'd have to think for
a moment before saying what Robin's example would print, but I think the
alternative would disturb me far more.

Cheers,
M.

[1] In fact the flow of control on return is very similar to that of an
    exception - ooh, look at that implementation...
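A quick demonstration of that behaviour (written with function/print()
syntax rather than 1.5-isms, but the semantics are the same):

```python
def f():
    events = []
    try:
        return events          # 'return' does not skip the finally clause
    finally:
        events.append("finally ran")

print(f())  # -> ['finally ran']
```

The 'finally' suite runs after the return value is evaluated but before
control actually leaves the function, which is why the returned list is
non-empty.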




From esr at thyrsus.com  Thu Dec 28 20:17:51 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 14:17:51 -0500
Subject: [Python-Dev] Miscellaneous 2.0 installation issues
In-Reply-To: <20001228191906.F1281@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 07:19:06PM +0100
References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> <20001228122048.A1381@thyrsus.com> <20001228191906.F1281@xs4all.nl>
Message-ID: <20001228141751.B2528@thyrsus.com>

Thomas Wouters <thomas at xs4all.net>:
> > My more general claim is that the existence of class 3 is a problem,
> > because it compromises the "batteries are included" effect -- it means
> > Python users don't have a bright-line test for what will be present in
> > every Python (or at least every Python on an operating system
> > theoretically feature-compatible with theirs).
> 
> It depends on your definition of 'being in the core'. Some of the things
> that are 'in the core' are simply not possible on all platforms. So if you
> want really portable code, you don't want to use them. Other features are
> available on all systems that matter [to you], so you don't really care;
> you just use them, and at best document that they need feature X.

I understand.  We can't, for example, guarantee to duplicate the Windows-
specific stuff in the Unix port (nor would we want to in most cases :-)).
However, I think "we build in curses/Tkinter everywhere the corresponding
libraries exist" is a guarantee we can and should make.  Similarly for
other modules presently in class 3.
 
> There is also the subtle difference between a Python user and a Python
> compiler/assembler (excuse my overloading of the terms, but you know what I
> mean).

Yes.  We have three categories here:

1. People who use python for applications (what I've been calling users)
2. People who configure Python binary packages for distribution (what
   you call a "compiler/assembler" and I think of as a "builder").
3. People who hack Python itself.

Problem is that "developer" is very ambiguous in this context...

>           People who choose to compile their own Python should realize that
> they might disable or misconfigure some parts of it. I personally trust most
> people that assemble OS distributions to compile a proper Python binary +
> modules, but I think a HOWTO isn't a bad idea -- unless we autoconf
> everything.

I'd like to see both things happen (HOWTO and autoconfing) and am willing to
work on both.

> I apologize for the tone of my previous post, and the above snippet.

No offense taken at all, I assure you.

>                                                                    I'm not
> trying to block progress here ;) I'm actually all for autodetecting as much
> as possible, and more than willing to put effort into it as well (as long as
> it's deemed useful, and isn't supplanted by a distutils variant weeks
> later.) And I personally have my doubts about the distutils variant, too,
> but that's partly because I have little experience with distutils. If we can
> work out a deal where both autoconf and distutils are an option, I'm happy
> to write a few, if not all, autoconf tests for the currently disabled
> modules.

I admit I'm not very clear on the scope of what distutils is supposed to
handle, and how.  Perhaps amk can enlighten us?

> So, Eric, let's split the work. I'll do Tkinter if you do curses. :)

You've got a deal.  I'll start looking at the autoconf code.  I've already
got a fair idea how to do this.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

No one who's seen it in action can say the phrase "government help" without
either laughing or crying.



From tim.one at home.com  Fri Dec 29 03:59:53 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 28 Dec 2000 21:59:53 -0500
Subject: [Python-Dev] scp to sourceforge
In-Reply-To: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEMKIFAA.tim.one@home.com>

[Guido]
> I've seen a thread on this but there was no conclusive answer, so I'm
> reopening this.

It hasn't budged an inch since then:  your "Entering interactive session"
problem is the same one everyone has; it gets reported on SF's bug and/or
support managers at least daily; SF has not fixed it yet; these days they
don't even respond to scp bug reports anymore; the cause appears to be SF's
custom sfshell, and only SF can change that; the only known workarounds are
to (a) modify files on SF directly (they suggest vi <wink>), or (b) initiate
scp *from* SF, using your local machine as a server (if you can do that -- I
cannot, or at least haven't succeeded).




From martin at loewis.home.cs.tu-berlin.de  Thu Dec 28 23:52:02 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 28 Dec 2000 23:52:02 +0100
Subject: [Python-Dev] curses in the core?
Message-ID: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>

> If curses is a core facility now, the default build should treat it
> as one.
...
> IMO ssl isn't an issue because it's not documented as being in the
> standard module set.
...
> 3. Documented as being in the core but not built in by default.
> My more general claim is that the existence of class 3 is a problem

In the case of curses, I believe there is a documentation error in the
2.0 documentation. The curses package is listed under "Generic
Operating System Services". I believe this is wrong; it should be listed
under "Unix Specific Services".

Unless I'm mistaken, the curses module is not available on the Mac and
on Windows. With that change, the curses module would then fall into
Eric's category 2 (Not documented as being in the core and not built
in by default).

That documentation change should be carried out even if curses is
autoconfigured; autoconf is only used on Unix anyway.

Regards,
Martin

P.S. The "Python Library Reference" content page does not mention the
word "core" at all, except as part of asyncore...



From thomas at xs4all.net  Thu Dec 28 23:58:25 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 28 Dec 2000 23:58:25 +0100
Subject: [Python-Dev] scp to sourceforge
In-Reply-To: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 12:08:18PM -0500
References: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>
Message-ID: <20001228235824.E1811@xs4all.nl>

On Thu, Dec 28, 2000 at 12:08:18PM -0500, Guido van Rossum wrote:

> I've seen a thread on this but there was no conclusive answer, so I'm
> reopening this.

Actually there was: it's all SourceForge's fault. (At least that's my
professional opinion ;) They honestly have a strange setup, though how
strange and to what end I cannot tell.

> Would somebody please figure out a way to update the PEPs?  It's kind
> of pathetic to see the website not have the latest versions...

The way to update the peps is by ssh'ing into shell.sourceforge.net, and
then scp'ing the files from your work repository to the htdocs/peps
directory. That is, until SF fixes the scp problem. This method works (I
just updated all PEPs to up-to-date CVS versions) but it's a bit cumbersome.
And it only works if you have ssh access to your work environment. And it's
damned hard to script; I tried playing with a single ssh command that did
all the work, but between shell weirdness, scp weirdness and a genuine bash
bug I couldn't figure it out.

I assume that SF is aware of the severity of this problem, and is working on
something akin to a fix or workaround. Until then, I can do an occasional
update of the PEPs for those who can't do it themselves.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Fri Dec 29 00:05:28 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 29 Dec 2000 00:05:28 +0100
Subject: [Python-Dev] scp to sourceforge
In-Reply-To: <20001228235824.E1811@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 11:58:25PM +0100
References: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> <20001228235824.E1811@xs4all.nl>
Message-ID: <20001229000528.F1811@xs4all.nl>

On Thu, Dec 28, 2000 at 11:58:25PM +0100, Thomas Wouters wrote:
> On Thu, Dec 28, 2000 at 12:08:18PM -0500, Guido van Rossum wrote:

> > Would somebody please figure out a way to update the PEPs?  It's kind
> > of pathetic to see the website not have the latest versions...
> 
> The way to update the peps is by ssh'ing into shell.sourceforge.net, and
> then scp'ing the files from your work repository to the htdocs/peps

[ blah blah ]

And then they fixed it! At least, for me, direct scp now works fine. (I
should've tested that before posting my blah blah, sorry.) Anybody else,
like people using F-Secure ssh (Unix or Windows), experience the same?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From MarkH at ActiveState.com  Fri Dec 29 00:15:01 2000
From: MarkH at ActiveState.com (Mark Hammond)
Date: Fri, 29 Dec 2000 10:15:01 +1100
Subject: [Python-Dev] chomp()?
In-Reply-To: <14923.31238.65155.496546@buffalo.fnal.gov>
Message-ID: <LCEPIIGDJPKCOIHOBJEPIEILCNAA.MarkH@ActiveState.com>

> -1 from me.  P=NP (Python is not Perl).  "Chomp" is an
> excessively cute name.
> And like you said, this is too much like "rstrip" to merit a separate
> method.

My thoughts exactly.  I can't remember _ever_ wanting to chomp() when
rstrip() wasn't perfectly suitable.  I'm sure it happens, but not often
enough to introduce an ambiguous new function purely for "feature parity"
with Perl.

Mark.




From esr at thyrsus.com  Fri Dec 29 00:25:28 2000
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 28 Dec 2000 18:25:28 -0500
Subject: [Python-Dev] Re: curses in the core?
In-Reply-To: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Thu, Dec 28, 2000 at 11:52:02PM +0100
References: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>
Message-ID: <20001228182528.A10743@thyrsus.com>

Martin v. Loewis <martin at loewis.home.cs.tu-berlin.de>:
> In the case of curses, I believe there is a documentation error in the
> 2.0 documentation. The curses package is listed under "Generic
> Operating System Services". I believe this is wrong; it should be listed
> under "Unix Specific Services".

I agree that this is an error and should be fixed.
 
> Unless I'm mistaken, the curses module is not available on the Mac and
> on Windows. With that change, the curses module would then fall into
> Eric's category 2 (Not documented as being in the core and not built
> in by default).

Well...that's a definitional question that is part of the larger issue here.

What does being in the Python core mean?  There are two potential definitions:

1. Documentation says it's available on all platforms.

2. Documentation restricts it to one of the three platform groups 
   (Unix/Windows/Mac) but implies that it will be available on any
   OS in that group.  

I think the second one is closer to what application programmers, thinking
about which batteries are included, expect.  But I could be persuaded
otherwise by a good argument.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The difference between death and taxes is death doesn't get worse
every time Congress meets
	-- Will Rogers



From akuchlin at mems-exchange.org  Fri Dec 29 01:33:36 2000
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Thu, 28 Dec 2000 19:33:36 -0500
Subject: [Python-Dev] Bookstore completed
Message-ID: <200012290033.TAA01295@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com>

OK, I think I'm ready to declare the Python bookstore complete enough
to go public.  Before I set up redirects from www.python.org, please
take another look.  (More book descriptions would be helpful...)

http://www.amk.ca/bookstore/

--amk





From akuchlin at mems-exchange.org  Fri Dec 29 01:46:16 2000
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Thu, 28 Dec 2000 19:46:16 -0500
Subject: [Python-Dev] Help wanted with setup.py script
Message-ID: <200012290046.TAA01346@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com>

Want to help with the laudable goal of automating the Python build
process?  It'll need lots of testing on many different platforms, and
I'd like to start the process now.

First, download the setup.py script from  
       http://www.amk.ca/files/python/setup.py

Next, drop it in the root directory of your Python source tree and run
"python setup.py build".  

If it dies with an exception, let me know.  (Replies to this list are
OK.)

If it runs to completion, look in the Modules/build/lib.<something>
directory to see which modules got built.  (On my system, <something>
is "linux-i686-2.0", but of course this will depend on your platform.)

Is anything missing that should have been built?  (_tkinter.so is the
prime candidate; the autodetection code is far too simple at the
moment and assumes one particular version of Tcl and Tk.)  Did an
attempt at building a module fail?  These indicate problems
autodetecting something, so if you can figure out how to find the
required library or include file, let me know what to do.

--amk



From fdrake at acm.org  Fri Dec 29 05:12:18 2000
From: fdrake at acm.org (Fred L. Drake)
Date: Thu, 28 Dec 2000 23:12:18 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <20001228212115.C1811@xs4all.nl>
Message-ID: <web-404134@digicool.com>

On Thu, 28 Dec 2000 21:21:15 +0100,
 Thomas Wouters <thomas at xs4all.net> wrote:
 > The 'except' clause is documented to execute if a matching
 > exception occurs, and 'else' if no exception occurs. Maybe the
 > intent of the 'else' clause

  This can certainly be clarified in the documentation --
please file a bug report at http://sourceforge.net/projects/python/.
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From tim.one at home.com  Fri Dec 29 05:25:44 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 28 Dec 2000 23:25:44 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <20001228212115.C1811@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEMMIFAA.tim.one@home.com>

[Fred, suggested doc change near the end]

[Thomas Wouters]
> (MAL and I already discussed this in private mail: Robin did mean
> try/except/else, and 'finally' already executes when returning
> directly from the 'try' block, even in Python 1.5)
>
> This code:
>
> try:
>    return
> except:
>    pass
> else:
>    print "returning"
>
> will indeed not print 'returning', but I believe it's by design.
> I'm against changing it, in any case, and not just because it'd
> break code :) If you want something that always executes, use a
> 'finally'. Or don't return from the 'try', but return in the
> 'else' clause.

Guido's out of town again, so I'll channel him:  Thomas is correct on all
counts.  In try/else, the "else" clause should execute if and only if
control "falls off the end" of the "try" block.

IOW, consider:

    try:
        arbitrary stuff
    x = 1

An "else" clause added to that "try" should execute when and only when the
code as written executes the "x = 1" after the block.  When "arbitrary
stuff" == "return", control does not fall off the end, so "else" shouldn't
trigger.  Same thing if "arbitrary stuff" == "break" and we're inside a
loop, or "continue" after Thomas's patch gets accepted.

> The 'except' clause is documented to execute if a matching
> exception occurs, and 'else' if no exception occurs.

Yup, and that's imprecise:  the same words are used to describe (part of)
when 'finally' executes, but they weren't intended to be the same.

> Maybe the intent of the 'else' clause would be clearer if it
> was documented to 'execute if the try: clause finishes without
> an exception being raised' ?

Sorry, I don't find that any clearer.  Let's be explicit:

    The optional 'else' clause is executed when the 'try' clause
    terminates by any means other than an exception or executing a
    'return', 'continue' or 'break' statement.  Exceptions in the
    'else' clause are not handled by the preceding 'except' clauses.

> The 'else' clause isn't executed when you 'break' or (after
> applying my continue-in-try patch ;) 'continue' out of the
> 'try', either.

Hey, now you're channeling me <wink>!  Be afraid -- be very afraid.
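To pin the rule down, a small sketch (the run() helper is made up for
illustration, and 'continue' is left out since it isn't legal inside a
'try' until Thomas's patch lands):

```python
def run(action):
    log = []
    for _ in range(1):
        try:
            if action == "return":
                return log
            if action == "break":
                break
        except Exception:
            log.append("except")
        else:
            # runs only when control falls off the end of the 'try' block
            log.append("else")
    return log

print(run("fall through"))  # -> ['else']
print(run("return"))        # -> []
print(run("break"))         # -> []
```

Both 'return' and 'break' leave the 'try' block without falling off its
end, so in those two cases the 'else' clause never fires.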




From moshez at zadka.site.co.il  Fri Dec 29 15:42:44 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 29 Dec 2000 16:42:44 +0200 (IST)
Subject: [Python-Dev] chomp()?
In-Reply-To: <3A4B7BB1.F09660ED@lemburg.com>
References: <3A4B7BB1.F09660ED@lemburg.com>, <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
Message-ID: <20001229144244.D5AD0A84F@darjeeling.zadka.site.co.il>

On Thu, 28 Dec 2000, "M.-A. Lemburg" <mal at lemburg.com> wrote:

[about chomp]
> We already have .splitlines() which does the above (remove
> line breaks) not only for a single line, but for many lines at once.
> 
> Even better: .splitlines() also does the right thing for Unicode.

OK, I retract my earlier +1, and instead I move that this be added
to the FAQ. Where is the FAQ maintained nowadays? The grail link
doesn't work anymore.
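If it does go in the FAQ, the entry is short; a quick sketch of the
difference (print() syntax modernized):

```python
# .splitlines() removes any line break (\n, \r\n, \r) and nothing else;
# .rstrip() removes *all* trailing whitespace, including spaces and tabs.
print("data  \r\n".splitlines())         # -> ['data  ']
print("data  \r\n".rstrip())             # -> 'data'
print("one\ntwo\r\nthree".splitlines())  # -> ['one', 'two', 'three']
```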

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From loewis at informatik.hu-berlin.de  Fri Dec 29 17:52:13 2000
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Fri, 29 Dec 2000 17:52:13 +0100 (MET)
Subject: [Python-Dev] Re: [Patch #103002] Fix for #116285: Properly raise UnicodeErrors
Message-ID: <200012291652.RAA20251@pandora.informatik.hu-berlin.de>

[resent since python.org ran out of disk space]
> My only problem with it is your copyright notice. AFAIK, patches to
> the Python core cannot contain copyright notices without proper
> license information. OTOH, I don't think that these minor changes
> really warrant adding a complete license paragraph.

I'd like to get an "official" clarification on this question. Is it
the case that patches containing copyright notices are only accepted
if they are accompanied with license information?

I agree that the changes are minor, I also believe that I hold the
copyright to the changes whether I attach a notice or not (at least
according to our local copyright law).

What concerns me that without such a notice, gencodec.py looks as if
CNRI holds the copyright to it. I'm not willing to assign the
copyright of my changes to CNRI, and I'd like to avoid the impression
of doing so.

What is even more concerning is that CNRI also holds the copyright to
the generated files, even though they are derived from information
made available by the Unicode consortium!

Regards,
Martin



From tim.one at home.com  Fri Dec 29 20:56:36 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 29 Dec 2000 14:56:36 -0500
Subject: [Python-Dev] scp to sourceforge
In-Reply-To: <20001229000528.F1811@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEOBIFAA.tim.one@home.com>

[Thomas Wouters]
> And then they fixed it ! At least, for me, direct scp now works
> fine. (I should've tested that before posting my blah blah, sorry.)

I tried it immediately before posting my blah-blah yesterday, and it was
still hanging.

> Anybody else, like people using F-secure ssh (unix or windows)
> experience the same ?

Same here:  I tried it again just now (under Win98 cmdline ssh/scp) and it
worked fine!  We're in business again.  Thanks for fixing it, Thomas <wink>.

now-if-only-we-could-get-python-dev-email-on-an-approximation-to-the-
    same-day-it's-sent-ly y'rs  - tim




From tim.one at home.com  Fri Dec 29 21:27:40 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 29 Dec 2000 15:27:40 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <EC$An3AHZGT6EwJP@jessikat.fsnet.co.uk>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEODIFAA.tim.one@home.com>

[Robin Becker]
> The 2.0 docs clearly state 'The optional else clause is executed when no
> exception occurs in the try clause.' This makes it sound as though it
> gets executed on the 'way out'.

Of course.  That's not what the docs meant, though, and Guido is not going
to change the implementation now because that would break code that relies
on how Python has always *worked* in these cases.  The way Python works is
also the way Guido intended it to work (I'm allowed to channel him when he's
on vacation <0.9 wink>).

Indeed, that's why I suggested a specific doc change.  If your friend would
also be confused by that, then we still have a problem; else we don't.




From tim.one at home.com  Fri Dec 29 21:37:08 2000
From: tim.one at home.com (Tim Peters)
Date: Fri, 29 Dec 2000 15:37:08 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <web-404134@digicool.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEOFIFAA.tim.one@home.com>

[Fred]
>   This can certainly be clarified in the documentation --
> please file a bug report at http://sourceforge.net/projects/python/.

Here you go:

https://sourceforge.net/bugs/?func=detailbug&bug_id=127098&group_id=5470




From thomas at xs4all.net  Fri Dec 29 21:59:16 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 29 Dec 2000 21:59:16 +0100
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEODIFAA.tim.one@home.com>; from tim.one@home.com on Fri, Dec 29, 2000 at 03:27:40PM -0500
References: <EC$An3AHZGT6EwJP@jessikat.fsnet.co.uk> <LNBBLJKPBEHFEDALKOLCKEODIFAA.tim.one@home.com>
Message-ID: <20001229215915.L1281@xs4all.nl>

On Fri, Dec 29, 2000 at 03:27:40PM -0500, Tim Peters wrote:

> Indeed, that's why I suggested a specific doc change.  If your friend would
> also be confused by that, then we still have a problem; else we don't.

Note that I already uploaded a patch to fix the docs, assigned to fdrake,
using Tim's wording exactly. (patch #103045)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From moshez at zadka.site.co.il  Sun Dec 31 01:33:30 2000
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Sun, 31 Dec 2000 02:33:30 +0200 (IST)
Subject: [Python-Dev] FAQ Horribly Out Of Date
Message-ID: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>

Hi!
The current FAQ is horribly out of date. I think the FAQ-Wizard method
has proven itself not very efficient (for example, apparently no one
noticed until now that it's not working <0.2 wink>). Is there any
hope of putting the FAQ in Misc/, having a script which scp's it
to the SF page, and making that the official FAQ?

On a related note, what is the current status of the PSA? Is it officially
dead?
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tim.one at home.com  Sat Dec 30 21:48:08 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 30 Dec 2000 15:48:08 -0500
Subject: [Python-Dev] Most everything is busted
Message-ID: <LNBBLJKPBEHFEDALKOLCCEOMIFAA.tim.one@home.com>

Add this error to the pot:

"""
http://www.python.org/cgi-bin/moinmoin

Proxy Error
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /cgi-bin/moinmoin.

Reason: Document contains no data
-------------------------------------------------------------------

Apache/1.3.9 Server at www.python.org Port 80
"""

Also, as far as I can tell:

+ news->mail for c.l.py hasn't delivered anything for well over 24 hours.

+ No mail to Python-Dev has shown up in the archives (let alone been
delivered) since Fri, 29 Dec 2000 16:42:44 +0200 (IST).

+ The other Python mailing lists appear equally dead.

time-for-a-new-year!-ly y'rs  - tim




From barry at wooz.org  Sun Dec 31 02:06:23 2000
From: barry at wooz.org (Barry A. Warsaw)
Date: Sat, 30 Dec 2000 20:06:23 -0500
Subject: [Python-Dev] Re: Most everything is busted
References: <LNBBLJKPBEHFEDALKOLCCEOMIFAA.tim.one@home.com>
Message-ID: <14926.34447.60988.553140@anthem.concentric.net>

>>>>> "TP" == Tim Peters <tim.one at home.com> writes:

    TP> + news->mail for c.l.py hasn't delivered anything for well
    TP> over 24 hours.

    TP> + No mail to Python-Dev has showed up in the archives (let
    TP> alone been delivered) since Fri, 29 Dec 2000 16:42:44 +0200
    TP> (IST).

    TP> + The other Python mailing lists appear equally dead.

There's a stupid, stupid bug in Mailman 2.0, which I've just fixed and
(hopefully) unjammed things on the Mailman end[1].  We're still
probably subject to the Postfix delays unfortunately; I think those
are DNS related, and I've gotten a few other reports of DNS oddities,
which I've forwarded off to the DC sysadmins.  I don't think that
particular problem will be fixed until after the New Year.

relax-and-enjoy-the-quiet-ly y'rs,
-Barry

[1] For those who care: there's a resource throttle in qrunner which
limits the number of files any single qrunner process will handle.
qrunner does a listdir() on the qfiles directory and ignores any .msg
file it finds (it only does the bulk of the processing on the
corresponding .db files).  But it performs the throttle check on every
file in listdir() so depending on the order that listdir() returns and
the number of files in the qfiles directory, the throttle check might
get triggered before any .db file is seen.  Wedge city.  This is
serious enough to warrant a Mailman 2.0.1 release, probably mid-next
week.
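For the curious, the shape of the bug looks something like this (a
paraphrase, not the actual Mailman code; the names and the throttle value
are made up):

```python
MAX_FILES = 2  # the per-process resource throttle

def run_queue(entries):
    """Paraphrase of the buggy qrunner loop: the throttle is checked
    against every directory entry, including the .msg files that are
    otherwise ignored."""
    handled = []
    count = 0
    for name in entries:
        if count >= MAX_FILES:
            break
        count += 1
        if name.endswith(".db"):
            handled.append(name)  # the real work happens only for .db files
    return handled

# If listdir() happens to return the ignored .msg files first, the
# throttle trips before any .db file is seen -- wedge city:
print(run_queue(["a.msg", "b.msg", "a.db", "b.db"]))  # -> []
```

The fix is simply to count only the files that are actually processed.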




From gstein at lyra.org  Sun Dec 31 11:19:50 2000
From: gstein at lyra.org (Greg Stein)
Date: Sun, 31 Dec 2000 02:19:50 -0800
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Sun, Dec 31, 2000 at 02:33:30AM +0200
References: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
Message-ID: <20001231021950.M28628@lyra.org>

On Sun, Dec 31, 2000 at 02:33:30AM +0200, Moshe Zadka wrote:
>...
> On a related note, what is the current status of the PSA? Is it officially
> dead?

The PSA was always kind of a (legal) fiction with the basic intent to help
provide some funding for Python development. Since that isn't occurring at
CNRI any more, the PSA is a bit moot. There was always some idea that maybe
the PSA would be the "sponsor" (and possibly the beneficiary) of the
conferences. That wasn't ever really formalized either.


From akuchlin at cnri.reston.va.us  Sun Dec 31 16:58:12 2000
From: akuchlin at cnri.reston.va.us (Andrew Kuchling)
Date: Sun, 31 Dec 2000 10:58:12 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Sun, Dec 31, 2000 at 02:33:30AM +0200
References: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
Message-ID: <20001231105812.A12168@newcnri.cnri.reston.va.us>

On Sun, Dec 31, 2000 at 02:33:30AM +0200, Moshe Zadka wrote:
>The current FAQ is horribly out of date. I think the FAQ-Wizard method
>has proven itself not very efficient (for example, apparently no one
>noticed until now that it's not working <0.2 wink>). Is there any 

It also leads to one section of the FAQ (#3, I think) having something
like 60 questions jumbled together.  IMHO the FAQ should be a text
file, perhaps in the PEP format so it can be converted to HTML, and it
should have an editor who'll arrange it into smaller sections.  Any
volunteers?  (Must ... resist ...  urge to volunteer myself...  help
me, Spock...)

--amk





From skip at mojam.com  Sun Dec 31 20:25:18 2000
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 31 Dec 2000 13:25:18 -0600 (CST)
Subject: [Python-Dev] plz test bsddb using shared linkage
Message-ID: <14927.34846.153117.764547@beluga.mojam.com>

A bug was filed on SF contending that the default linkage for bsddb should
be shared instead of static because some Linux systems ship multiple
versions of libdb.

Would those of you who can and do build bsddb (probably only unixoids of
some variety) please give this simple test a try?  Uncomment the *shared*
line in Modules/Setup.config.in, re-run configure, build Python and then
try:

    import bsddb
    db = bsddb.btopen("/tmp/dbtest.db", "c")
    db["1"] = "1"
    print db["1"]
    db.close()
    del db
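For anyone reading along without a bsddb build handy, the open/write/read/close
pattern Skip's snippet exercises can be sketched with the standard-library dbm
module instead (a stand-in for illustration only -- it does not test the bsddb
linkage itself; the temporary-file path is made up here):

```python
# A minimal smoke test mirroring the bsddb snippet above, using the
# stdlib "dbm" module as a stand-in (assumption: any dbm backend will do,
# since we only exercise create / store / fetch / close).
import dbm
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "dbtest")
db = dbm.open(path, "c")       # "c": create the database if it doesn't exist
db[b"1"] = b"1"                # dbm keys and values are bytes
assert db[b"1"] == b"1"        # read back what we just wrote
db.close()
```

If the import or any of the operations raise, the shared build is broken in
the same way Skip's snippet would reveal for bsddb.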

If this doesn't fail for anyone, I'll check the change in and close the bug
report; otherwise I'll add a(nother) comment to the bug report noting that
*shared* breaks bsddb for some people, and close it.

Thx,

Skip




From skip at mojam.com  Sun Dec 31 20:26:16 2000
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 31 Dec 2000 13:26:16 -0600 (CST)
Subject: [Python-Dev] plz test bsddb using shared linkage
Message-ID: <14927.34904.20832.319647@beluga.mojam.com>

oops, forgot the bug report is at

  http://sourceforge.net/bugs/?func=detailbug&bug_id=126564&group_id=5470

for those of you who do not monitor python-bugs-list.

S



From tim.one at home.com  Sun Dec 31 21:28:47 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 31 Dec 2000 15:28:47 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEBBIGAA.tim.one@home.com>

[Moshe Zadka]
> The current FAQ is horribly out of date.

The password is Spam.  Fix it <wink>.

> I think the FAQ-Wizard method has proven itself not very
> efficient (for example, apparently no one noticed until now
> that it's not working <0.2 wink>).

I'm afraid almost nothing on python.org with an active component works today
(not searches, not the FAQ Wizard, not the 2.0 Wiki, ...).  If history is
any clue, these will remain broken until Guido gets back from vacation.

> Is there any hope putting the FAQ in Misc/, having a script
> which scp's it to the SF page, and making that the official FAQ?

Would be OK by me.  I'm more concerned that the whole of python.org has
barely been updated since March; huge chunks of the FAQ are still relevant,
but, e.g., the Job Board hasn't been touched in over 3 months; the News got
so out of date Guido deleted the whole section; etc.

> On a related note, what is the current status of the PSA? Is it
> officially dead?

It appears that CNRI can only think about one thing at a time <0.5 wink>.
For the last 6 months, that thing has been the license.  If they ever
resolve the GPL compatibility issue, maybe they can be persuaded to think
about the PSA.  In the meantime, I'd suggest you not renew <ahem>.




From tim.one at home.com  Sun Dec 31 23:12:43 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 31 Dec 2000 17:12:43 -0500
Subject: [Python-Dev] plz test bsddb using shared linkage
In-Reply-To: <14927.34846.153117.764547@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGECEIGAA.tim.one@home.com>

[Skip Montanaro]
> ...
> Would those of you who can and do build bsddb (probably only
> unixoids of some variety) please give this simple test a try?

Just noting that bsddb already ships with the Windows installer as a
(shared) DLL.  But it's an old (1.85?) Windows port from Sam Rushing.